Preface 


Since the last edition of this book appeared, more than five million scientific papers 
have been published. There has been a parallel increase in the quantity of digital 
information: new data on genome sequences, protein interactions, molecular struc- 
tures, and gene expression—all stored in vast databases. The challenge, for both sci- 
entists and textbook writers, is to convert this overwhelming amount of information 
into an accessible and up-to-date understanding of how cells work. 

Help comes from a large increase in the number of review articles that attempt 
to make raw material easier to digest, although the vast majority of these reviews 
are still quite narrowly focused. Meanwhile, a rapidly growing collection of online 
resources tries to convince us that understanding is only a few mouse-clicks away. 
In some areas this change in the way we access knowledge has been highly suc- 
cessful—in discovering the latest information about our own medical problems, for 
example. But to understand something of the beauty and complexity of how living 
cells work, one needs more than just a wiki-this or wiki-that; it is enormously hard 
to identify the valuable and enduring gems from so much confusing landfill. Much 
more effective is a carefully wrought narrative that leads logically and progressively 
through the key ideas, components, and experiments in such a way that readers 
can build for themselves a memorable, conceptual framework for cell biology— 
a framework that will allow them to critically evaluate all of the new science and, 
more importantly, to understand it. That is what we have tried to do in Molecular 
Biology of the Cell. 

In preparing this new edition, we have inevitably had to make some difficult 
decisions. In order to incorporate exciting new discoveries, while at the same time 
keeping the book portable, much has had to be excised. We have added new sec- 
tions, such as those on new RNA functions, advances in stem cell biology, new 
methods for studying proteins and genes and for imaging cells, advances in the 
genetics and treatment of cancer, and timing, growth control, and morphogenesis 
in development. 

The chemistry of cells is extremely complex, and any list of cell parts and their 
interactions—no matter how complete—will leave huge gaps in our understanding. 
We now realize that to produce convincing explanations of cell behavior will require 
quantitative information about cells that is coupled to sophisticated mathematical/ 
computational approaches—some not yet invented. As a consequence, an emerg- 
ing goal for cell biologists is to shift their studies more toward quantitative descrip- 
tion and mathematical deduction. We highlight this approach and some of its meth- 
ods in a new section at the end of Chapter 8. 

Faced with the immensity of what we have learned about cell biology, it might 
be tempting for a student to imagine that there is little left to discover. In fact, the 
more we find out about cells, the more new questions emerge. To emphasize that 
our understanding of cell biology is incomplete, we have highlighted some of the 
major gaps in our knowledge by including What We Don’t Know at the end of each 
chapter. These brief lists include only a tiny sample of the critical unanswered ques- 
tions and challenges for the next generation of scientists. We derive great pleasure 
from the knowledge that some of our readers will provide future answers. 

The more than 1500 illustrations have been designed to create a parallel narra- 
tive, closely interwoven with the text. We have increased their consistency between 
chapters, particularly in the use of color and of common icons; membrane pumps 
and channels are a good example. To avoid interruptions to the text, some material 
has been moved into new, readily accessible panels. Most of the important pro- 
tein structures depicted have now been redrawn and consistently colored. In each 
case, we now provide the corresponding Protein Data Bank (PDB) code for the 
protein, which can be used to access online tools that provide more information 
about it, such as those on the RCSB PDB website (www.rcsb.org). These connec- 
tions allow readers of the book to explore more fully the proteins that lie at the core 
of cell biology. 

John Wilson and Tim Hunt have again contributed their distinctive and imagi- 
native problems to help students gain a more active understanding of the text. 
The problems emphasize quantitative approaches and encourage critical thinking 
about published experiments; they are now present at the end of all chapters. The 
answers to these problems, plus more than 1800 additional problems and solutions, 
all appear in the companion volume that John and Tim have written, Molecular 
Biology of the Cell, Sixth Edition: The Problems Book. 

We live in a world that presents us with many complex issues related to cell 
biology: biodiversity, climate change, food security, environmental degradation, 
resource depletion, and human disease. We hope that our textbook will help the 
reader better understand and possibly contribute to meeting these challenges. 
Knowledge and understanding bring the power to intervene. 

We are indebted to a large number of scientists whose generous help we men- 
tion separately in the detailed acknowledgments. Here we must mention some par- 
ticularly significant contributors. For Chapter 8, Hana El-Samad provided the core 
of the section on Mathematical Analysis of Cell Functions, and Karen Hopkin made 
valuable contributions to the section on Studying Gene Expression and Function. 
Werner Kuhlbrandt helped to reorganize and rewrite Chapter 14 (Energy Conver- 
sion: Mitochondria and Chloroplasts). Rebecca Heald did the same for Chapter 16 
(The Cytoskeleton), as did Alexander Schier for Chapter 21 (Development of Mul- 
ticellular Organisms), and Matt Welch for Chapter 23 (Pathogens and Infection). 
Lewis Lanier aided in the writing of Chapter 24 (The Innate and Adaptive Immune 
Systems). Hossein Amiri generated the enormous online instructor’s question bank. 

Before starting out on the revision cycle for this edition, we asked a number of 
scientists who had used the last edition to teach cell biology students to meet with 
us and suggest improvements. They gave us useful feedback that has helped inform 
the new edition. We also benefited from the valuable input of groups of students 
who read most of the chapters in page proofs. 

Many people and much effort are needed to convert a long manuscript and a 
large pile of sketches into a finished textbook. The team at Garland Science that 
managed this conversion was outstanding. Denise Schanck, directing operations, 
displayed forbearance, insight, tact, and energy throughout the journey; she guided 
us all unerringly, ably assisted by Allie Bochicchio and Janette Scobie. Nigel Orme 
oversaw our revamped illustration program, put all the artwork into its final form, 
and again enhanced the back cover with his graphics skills. Tiago Barros helped us 
refresh our presentation of protein structures. Matthew McClements designed the 
book and its front cover. Emma Jeffcock again laid out the final pages, managing end- 
less rounds of proofs and last-minute changes with remarkable skill and patience; 
Georgina Lucas provided her with help. Michael Morales, assisted by Leah Chris- 
tians, produced and assembled the complex web of videos, animations, and other 
materials that form the core of the online resources that accompany the book. Adam 
Sendroff provided us with the valuable feedback from book users around the world 
that informed our revision cycle. Casting expert eyes over the manuscript, Eliza- 
beth Zayatz and Sherry Granum Lewis acted as development editors, Jo Clayton as 
copyeditor, and Sally Huish as proofreader. Bill Johncocks compiled the index. In 
London, Emily Preece fed us, while the Garland team’s professional help, skills, and 
energy, together with their friendship, nourished us in every other way throughout 
the revision, making the whole process a pleasure. The authors are extremely fortu- 
nate to be supported so generously. 

We thank our spouses, families, friends, and colleagues for their continuing sup- 
port, which has once again made the writing of this book possible. 

Just as we were completing this edition, Julian Lewis, our coauthor, friend, and 
colleague, finally succumbed to the cancer that he had fought so heroically for ten 
years. Starting in 1979, Julian made major contributions to all six editions, and, 
as our most elegant wordsmith, he elevated and enhanced both the style and tone 
of all the many chapters he touched. He was noted for his careful scholarly approach, 
and clarity and simplicity were at the core of his writing. Julian is irreplaceable, and we 
will all deeply miss his friendship and collaboration. We dedicate this Sixth Edition 
to his memory. 


Note to the Reader 


Structure of the Book 

Although the chapters of this book can be read independently of one another, they 
are arranged in a logical sequence of five parts. The first three chapters of Part I 
cover elementary principles and basic biochemistry. They can serve either as an 
introduction for those who have not studied biochemistry or as a refresher course 
for those who have. Part II deals with the storage, expression, and transmission 
of genetic information. Part III presents the principles of the main experimental 
methods for investigating and analyzing cells; here, a new section entitled “Math- 
ematical Analysis of Cell Functions” in Chapter 8 provides an extra dimension in 
our understanding of cell regulation and function. Part IV describes the internal 
organization of the cell. Part V follows the behavior of cells in multicellular sys- 
tems, starting with development of multicellular organisms and concluding with 
chapters on pathogens and infection and on the innate and adaptive immune 
systems. 


End-of-Chapter Problems 

A selection of problems, written by John Wilson and Tim Hunt, appears in the text 
at the end of each chapter. New to this edition are problems for the last four chap- 
ters on multicellular organisms. The complete solutions to all of these problems 
can be found in Molecular Biology of the Cell, Sixth Edition: The Problems Book. 


References 

A concise list of selected references is included at the end of each chapter. These 
are arranged in alphabetical order under the main chapter section headings. 
These references sometimes include the original papers in which important dis- 
coveries were first reported. 


Glossary Terms 

Throughout the book, boldface type has been used to highlight key terms at the 
point in a chapter where the main discussion occurs. Italic type is used to set off 
important terms with a lesser degree of emphasis. At the end of the book is an 
expanded glossary, covering technical terms that are part of the common cur- 
rency of cell biology; it should be the first resort for a reader who encounters an 
unfamiliar term. The complete glossary as well as a set of flashcards is available 
on the Student Website. 


Nomenclature for Genes and Proteins 
Each species has its own conventions for naming genes; the only common fea- 
ture is that they are always set in italics. In some species (such as humans), gene 
names are spelled out all in capital letters; in other species (such as zebrafish), all 
in lowercase; in yet others (most mouse genes), with the first letter in uppercase 
and the rest in lowercase; or (as in Drosophila) with different combinations of uppercase 
and lowercase, according to whether the first mutant allele to be discovered 
produced a dominant or recessive phenotype. Conventions for naming protein 
products are equally varied. 

This typographical chaos drives everyone crazy. It is not just tiresome and 
absurd; it is also unsustainable. We cannot independently define a fresh conven- 
tion for each of the next few million species whose genes we may wish to study. 
Moreover, there are many occasions, especially in a book such as this, where we 
need to refer to a gene generically—without specifying the mouse version, the 
human version, the chick version, or the hippopotamus version—because they 
are all equivalent for the purposes of our discussion. What convention then 
should we use? 

We have decided in this book to cast aside the different conventions that are 
used in individual species and follow a uniform rule: we write all gene names, like 
the names of people and places, with the first letter in uppercase and the rest in 
lowercase, but all in italics, thus: Apc, Bazooka, Cdc2, Dishevelled, Egl1. The cor- 
responding protein, where it is named after the gene, will be written in the same 
way, but in roman rather than italic letters: Apc, Bazooka, Cdc2, Dishevelled, Egl1. 
When it is necessary to specify the organism, this can be done with a prefix to the 
gene name. 

For completeness, we list a few further details of naming rules that we shall 
follow. In some instances, an added letter in the gene name is traditionally used 
to distinguish between genes that are related by function or evolution; for those 
genes, we put that letter in uppercase if it is usual to do so (LacZ, RecA, HoxA4). 
We use no hyphen to separate added letters or numbers from the rest of the name. 
Proteins are more of a problem. Many of them have names in their own right, 
assigned to them before the gene was named. Such protein names take many 
forms, although most of them traditionally begin with a lowercase letter (actin, 
hemoglobin, catalase), like the names of ordinary substances (cheese, nylon), 
unless they are acronyms (such as GFP, for Green Fluorescent Protein, or BMP4, 
for Bone Morphogenetic Protein #4). To force all such protein names into a uni- 
form style would do too much violence to established usages, and we shall simply 
write them in the traditional way (actin, GFP, and so on). For the corresponding 
gene names in all these cases, we shall nevertheless follow our standard rule: 
Actin, Hemoglobin, Catalase, Bmp4, Gfp. Occasionally in our book we need to 
highlight a protein name by setting it in italics for emphasis; the intention will 
generally be clear from the context. 
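
The case and hyphen rules above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function name and the small lookup table are hypothetical, the table holds just the three examples given in the text, and the italic versus roman distinction cannot be represented in a plain string.

```python
# Illustrative sketch only: the function and table names are hypothetical,
# not part of the book. Italic versus roman type cannot be shown in a plain
# string, so only the letter-case and hyphen rules are modeled.

# Names whose added letter traditionally stays in uppercase (examples from the text).
TRADITIONAL_ADDED_LETTERS = {
    "lacz": "LacZ",
    "reca": "RecA",
    "hoxa4": "HoxA4",
}


def unified_gene_name(species_name: str) -> str:
    """Return a gene name in the uniform style described above."""
    # No hyphen separates added letters or numbers from the rest of the name.
    compact = species_name.replace("-", "")
    traditional = TRADITIONAL_ADDED_LETTERS.get(compact.lower())
    if traditional is not None:
        return traditional
    # First letter in uppercase, the rest in lowercase.
    return compact[:1].upper() + compact[1:].lower()


if __name__ == "__main__":
    for name in ["unc-6", "CDC28", "cyclops", "HOXA4", "sevenless"]:
        print(name, "->", unified_gene_name(name))
```

Because the uppercase added letter is a matter of tradition rather than a computable rule, the sketch falls back on a lookup table for those names; any gene not listed there simply gets the default capitalization.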

For those who wish to know them, the table below shows some of the official 
conventions for individual species—conventions that we shall mostly violate in 
this book, in the manner shown. 


Organism | Species-specific gene | Species-specific protein | Gene in this book | Protein in this book
Mouse | | | HoxA4 | HoxA4
Mouse | | | Bmp4 | BMP4
Mouse | integrin α-1, Itga1 | integrin α1 | Integrin α1, Itga1 | integrin α1
Human | HOXA4 | HOXA4 | HoxA4 | HoxA4
Zebrafish | cyclops, cyc | Cyclops, Cyc | Cyclops, Cyc | Cyclops, Cyc
Caenorhabditis | unc-6 | UNC-6 | Unc6 | Unc6
Drosophila | sevenless, sev (named after recessive phenotype) | Sevenless, SEV | Sevenless, Sev | Sevenless, Sev
Drosophila | Deformed, Dfd (named after dominant mutant phenotype) | Deformed, DFD | Deformed, Dfd | Deformed, Dfd
Yeast: Saccharomyces cerevisiae (budding yeast) | CDC28 | Cdc28, Cdc28p | Cdc28 | Cdc28
Yeast: Schizosaccharomyces pombe (fission yeast) | Cdc2 | Cdc2, Cdc2p | Cdc2 | Cdc2


Molecular Biology of the Cell, Sixth Edition: The Problems Book 

by John Wilson and Tim Hunt (ISBN: 978-0-8153-4453-7) 

The Problems Book is designed to help students appreciate the ways in which 
experiments and simple calculations can lead to an understanding of how cells 
work. It provides problems to accompany Chapters 1-20 of Molecular Biology 
of the Cell. Each chapter of problems is divided into sections that correspond to 
those of the main textbook and review key terms, test for understanding basic 
concepts, pose research-based problems, and now include MCAT-style questions 
which help students to prepare for standardized medical school admission tests. 
Molecular Biology of the Cell, Sixth Edition: The Problems Book should be useful 
for homework assignments and as a basis for class discussion. It could even pro- 
vide ideas for exam questions. Solutions for all of the problems are provided in the 
book. Solutions for the end-of-chapter problems for Chapters 1-24 in the main 
textbook are also found in The Problems Book. 


RESOURCES FOR INSTRUCTORS AND STUDENTS 


The teaching and learning resources for instructors and students are available 
online. The instructor’s resources are password-protected and available only to 
adopting instructors. The student resources are available to everyone. We hope 
these resources will enhance student learning and make it easier for instructors to 
prepare dynamic lectures and activities for the classroom. 


Instructor Resources 


Instructor Resources are available on the Garland Science Instructor’s Resource 
Site, located at www.garlandscience.com/instructors. The website provides access 
not only to the teaching resources for this book but also to all other Garland Sci- 
ence textbooks. Adopting instructors can obtain access to the site from their sales 
representative or by emailing science@garland.com. 


Art of Molecular Biology of the Cell, Sixth Edition 

The images from the book are available in two convenient formats: PowerPoint® 
and JPEG. They have been optimized for display on a computer. Figures are 
searchable by figure number, by figure name, or by keywords used in the figure 
legend from the book. 


Figure-Integrated Lecture Outlines 

The section headings, concept headings, and figures from the text have been inte- 
grated into PowerPoint presentations. These will be useful for instructors who 
would like a head start creating lectures for their course. Like all of our PowerPoint 
presentations, the lecture outlines can be customized. For example, the content of 
these presentations can be combined with videos and questions from the book or 
Question Bank, in order to create unique lectures that facilitate interactive learn- 
ing. 


Animations and Videos 

The 174 animations and videos that are available to students are also available on 
the Instructor’s Website in two formats. The WMV-formatted movies are created 
for instructors who wish to use the movies in PowerPoint presentations on Win- 
dows® computers; the QuickTime-formatted movies are for use in PowerPoint 
for Apple computers or Keynote® presentations. The movies can easily be down- 
loaded using the “download” button on the movie preview page. The movies are 
correlated to each chapter and callouts are highlighted in color. 


Media Guide 
This document provides an overview of the multimedia available for students and 
instructors and contains the text of the voice-over narration for all of the movies. 


Question Bank 
Written by Hossein Amiri, University of California, Santa Cruz, this greatly 
expanded question bank includes a variety of question formats: multiple choice, 
short answer, fill-in-the-blank, true-false, and matching. There are 35-60 questions 
per chapter, and a large number of the multiple-choice questions will be 
suitable for use with personal response systems (that is, clickers). The Question 
Bank was created with the philosophy that a good exam should do much more 
than simply test students’ ability to memorize information; it should require them 
to reflect upon and integrate information as a part of a sound understanding. This 
resource provides a comprehensive sampling of questions that can be used either 
directly or as inspiration for instructors to write their own test questions. 


Diploma® Test Generator Software 

The questions from the Question Bank have been loaded into the Diploma Test 
Generator software. The software is easy to use and can scramble questions to cre- 
ate multiple tests. Questions are organized by chapter and type and can be addi- 
tionally categorized by the instructor according to difficulty or subject. Existing 
questions can be edited and new ones added. The Test Generator is compatible 
with several course management systems, including Blackboard®. 


Medical Topics Guide 

This document highlights medically relevant topics covered throughout Molecular 
Biology of the Cell and The Problems Book. It will be particularly useful for instruc- 
tors with a large number of premedical, health science, or nursing students. 


Blackboard and Learning Management System (LMS) Integration 

The movies, book images, and student assessments that accompany the book 
can be integrated into Blackboard or other LMSs. These resources are bundled 
into a “Common Cartridge” or “Upload Package” that facilitates bulk uploading 
of textbook resources into Blackboard and other LMSs. The LMS Common 
Cartridge can be obtained on a DVD from your sales representative or by emailing 
science@garland.com. 


Resources for Students 


The resources for students are available on the Molecular Biology of the Cell 
Student Website, located at www.garlandscience.com/MBOC6-students. 


Animations and Videos 

There are 174 movies, covering a wide range of cell biology topics, which review 
key concepts in the book and illuminate subcellular processes. The movies are 
correlated to each chapter and callouts are highlighted in color. 


Cell Explorer Slides 
This application teaches cell morphology through interactive micrographs that 
highlight important cellular structures. 


Flashcards 
Each chapter contains a set of flashcards, built into the website, that allow stu- 
dents to review key terms from the text. 


Glossary 
The complete glossary from the book is available on the website and can be 
searched and browsed. 
Detailed Contents 


Chapter 1 Cells and Genomes 


THE UNIVERSAL FEATURES OF CELLS ON EARTH 

All Cells Store Their Hereditary Information in the Same Linear 
Chemical Code: DNA 

All Cells Replicate Their Hereditary Information by Templated 
Polymerization 

All Cells Transcribe Portions of Their Hereditary Information into 
the Same Intermediary Form: RNA 

All Cells Use Proteins as Catalysts 

All Cells Translate RNA into Protein in the Same Way 

Each Protein Is Encoded by a Specific Gene 

Life Requires Free Energy 

All Cells Function as Biochemical Factories Dealing with the Same 
Basic Molecular Building Blocks 

All Cells Are Enclosed in a Plasma Membrane Across Which 
Nutrients and Waste Materials Must Pass 

A Living Cell Can Exist with Fewer Than 500 Genes 

Summary 


THE DIVERSITY OF GENOMES AND THE TREE OF LIFE 

Cells Can Be Powered by a Variety of Free-Energy Sources 

Some Cells Fix Nitrogen and Carbon Dioxide for Others 

The Greatest Biochemical Diversity Exists Among Prokaryotic Cells 

The Tree of Life Has Three Primary Branches: Bacteria, Archaea, 
and Eukaryotes 

Some Genes Evolve Rapidly; Others Are Highly Conserved 

Most Bacteria and Archaea Have 1000-6000 Genes 

New Genes Are Generated from Preexisting Genes 

Gene Duplications Give Rise to Families of Related Genes Within 
a Single Cell 

Genes Can Be Transferred Between Organisms, Both in the 
Laboratory and in Nature 

Sex Results in Horizontal Exchanges of Genetic Information 
Within a Species 

The Function of a Gene Can Often Be Deduced from Its Sequence 

More Than 200 Gene Families Are Common to All Three Primary 
Branches of the Tree of Life 

Mutations Reveal the Functions of Genes 

Molecular Biology Began with a Spotlight on E. coli 

Summary 


GENETIC INFORMATION IN EUKARYOTES 
Eukaryotic Cells May Have Originated as Predators 
Modern Eukaryotic Cells Evolved from a Symbiosis 
Eukaryotes Have Hybrid Genomes 
Eukaryotic Genomes Are Big 
Eukaryotic Genomes Are Rich in Regulatory DNA 
The Genome Defines the Program of Multicellular Development 
Many Eukaryotes Live as Solitary Cells 
A Yeast Serves as a Minimal Model Eukaryote 
The Expression Levels of All the Genes of An Organism 
Can Be Monitored Simultaneously 
Arabidopsis Has Been Chosen Out of 300,000 Species 
As a Model Plant 
The World of Animal Cells Is Represented By a Worm, a Fly, 
a Fish, a Mouse, and a Human 
Studies in Drosophila Provide a Key to Vertebrate Development 
The Vertebrate Genome Is a Product of Repeated Duplications 


The Frog and the Zebrafish Provide Accessible Models for 
Vertebrate Development 

The Mouse Is the Predominant Mammalian Model Organism 

Humans Report on Their Own Peculiarities 

We Are All Different in Detail 

To Understand Cells and Organisms Will Require Mathematics, 
Computers, and Quantitative Information 

Summary 

Problems 

References 


Chapter 2 Cell Chemistry and Bioenergetics 


THE CHEMICAL COMPONENTS OF A CELL 

Water Is Held Together by Hydrogen Bonds 

Four Types of Noncovalent Attractions Help Bring Molecules 
Together in Cells 

Some Polar Molecules Form Acids and Bases in Water 

A Cell Is Formed from Carbon Compounds 

Cells Contain Four Major Families of Small Organic Molecules 

The Chemistry of Cells Is Dominated by Macromolecules with 
Remarkable Properties 

Noncovalent Bonds Specify Both the Precise Shape of a 
Macromolecule and Its Binding to Other Molecules 

Summary 


CATALYSIS AND THE USE OF ENERGY BY CELLS 

Cell Metabolism Is Organized by Enzymes 

Biological Order Is Made Possible by the Release of Heat Energy 
from Cells 

Cells Obtain Energy by the Oxidation of Organic Molecules 

Oxidation and Reduction Involve Electron Transfers 

Enzymes Lower the Activation-Energy Barriers That Block 
Chemical Reactions 

Enzymes Can Drive Substrate Molecules Along Specific Reaction 
Pathways 

How Enzymes Find Their Substrates: The Enormous Rapidity of 
Molecular Motions 

The Free-Energy Change for a Reaction, ΔG, Determines Whether 
It Can Occur Spontaneously 

The Concentration of Reactants Influences the Free-Energy 
Change and a Reaction’s Direction 

The Standard Free-Energy Change, ΔG°, Makes It Possible 
to Compare the Energetics of Different Reactions 

The Equilibrium Constant and ΔG° Are Readily Derived from 
Each Other 

The Free-Energy Changes of Coupled Reactions Are Additive 

Activated Carrier Molecules Are Essential for Biosynthesis 

The Formation of an Activated Carrier Is Coupled to an 
Energetically Favorable Reaction 

ATP Is the Most Widely Used Activated Carrier Molecule 

Energy Stored in ATP Is Often Harnessed to Join Two Molecules 
Together 

NADH and NADPH Are Important Electron Carriers 

There Are Many Other Activated Carrier Molecules in Cells 

The Synthesis of Biological Polymers Is Driven by ATP Hydrolysis 

Summary 


HOW CELLS OBTAIN ENERGY FROM FOOD 
Glycolysis Is a Central ATP-Producing Pathway 
Fermentations Produce ATP in the Absence of Oxygen 
Glycolysis Illustrates How Enzymes Couple Oxidation to Energy 
Storage 

Organisms Store Food Molecules in Special Reservoirs 

Most Animal Cells Derive Their Energy from Fatty Acids Between 
Meals 


Sugars and Fats Are Both Degraded to Acetyl CoA in Mitochondria 


The Citric Acid Cycle Generates NADH by Oxidizing Acetyl 
Groups to CO₂ 

Electron Transport Drives the Synthesis of the Majority of the ATP 
in Most Cells 

Amino Acids and Nucleotides Are Part of the Nitrogen Cycle 

Metabolism Is Highly Organized and Regulated 

Summary 

Problems 

References 


Chapter 3 Proteins 


THE SHAPE AND STRUCTURE OF PROTEINS 

The Shape of a Protein Is Specified by Its Amino Acid Sequence 

Proteins Fold into a Conformation of Lowest Energy 

The α Helix and the β Sheet Are Common Folding Patterns 

Protein Domains Are Modular Units from Which Larger Proteins 
Are Built 

Few of the Many Possible Polypeptide Chains Will Be Useful 
to Cells 

Proteins Can Be Classified into Many Families 

Some Protein Domains Are Found in Many Different Proteins 

Certain Pairs of Domains Are Found Together in Many Proteins 

The Human Genome Encodes a Complex Set of Proteins, 
Revealing That Much Remains Unknown 

Larger Protein Molecules Often Contain More Than One 
Polypeptide Chain 

Some Globular Proteins Form Long Helical Filaments 

Many Protein Molecules Have Elongated, Fibrous Shapes 

Proteins Contain a Surprisingly Large Amount of Intrinsically 
Disordered Polypeptide Chain 

Covalent Cross-Linkages Stabilize Extracellular Proteins 

Protein Molecules Often Serve as Subunits for the Assembly 
of Large Structures 

Many Structures in Cells Are Capable of Self-Assembly 

Assembly Factors Often Aid the Formation of Complex Biological 
Structures 

Amyloid Fibrils Can Form from Many Proteins 

Amyloid Structures Can Perform Useful Functions in Cells 

Many Proteins Contain Low-complexity Domains that Can Form 
“Reversible Amyloids” 

Summary 


PROTEIN FUNCTION 

All Proteins Bind to Other Molecules 

The Surface Conformation of a Protein Determines Its Chemistry 

Sequence Comparisons Between Protein Family Members 
Highlight Crucial Ligand-Binding Sites 

Proteins Bind to Other Proteins Through Several Types of 
Interfaces 

Antibody Binding Sites Are Especially Versatile 

The Equilibrium Constant Measures Binding Strength 

Enzymes Are Powerful and Highly Specific Catalysts 

Substrate Binding Is the First Step in Enzyme Catalysis 

Enzymes Speed Reactions by Selectively Stabilizing Transition 
States 

Enzymes Can Use Simultaneous Acid and Base Catalysis 

Lysozyme Illustrates How an Enzyme Works 

Tightly Bound Small Molecules Add Extra Functions to Proteins 

Multienzyme Complexes Help to Increase the Rate of Cell 
Metabolism 

The Cell Regulates the Catalytic Activities of Its Enzymes 

Allosteric Enzymes Have Two or More Binding Sites That Interact 


Two Ligands Whose Binding Sites Are Coupled Must Reciprocally 


Affect Each Other’s Binding 

Symmetric Protein Assemblies Produce Cooperative Allosteric 
Transitions 

Many Changes in Proteins Are Driven by Protein Phosphorylation 

A Eukaryotic Cell Contains a Large Collection of Protein Kinases 
and Protein Phosphatases 

The Regulation of the Src Protein Kinase Reveals How a Protein 
Can Function as a Microprocessor 

Proteins That Bind and Hydrolyze GTP Are Ubiquitous Cell 
Regulators 

Regulatory Proteins GAP and GEF Control the Activity of GTP- 
Binding Proteins by Determining Whether GTP or GDP 
Is Bound 

Proteins Can Be Regulated by the Covalent Addition of Other 
Proteins 

An Elaborate Ubiquitin-Conjugating System Is Used to Mark 
Proteins 

Protein Complexes with Interchangeable Parts Make Efficient 
Use of Genetic Information 

A GTP-Binding Protein Shows How Large Protein Movements 
Can Be Generated 

Motor Proteins Produce Large Movements in Cells 

Membrane-Bound Transporters Harness Energy to Pump 
Molecules Through Membranes 

Proteins Often Form Large Complexes That Function as Protein 
Machines 

Scaffolds Concentrate Sets of Interacting Proteins 

Many Proteins Are Controlled by Covalent Modifications That 
Direct Them to Specific Sites Inside the Cell 


A Complex Network of Protein Interactions Underlies Cell Function 


Summary 
Problems 
References 


Chapter 4 DNA, Chromosomes, and Genomes 


THE STRUCTURE AND FUNCTION OF DNA 

A DNA Molecule Consists of Two Complementary Chains of 
Nucleotides 

The Structure of DNA Provides a Mechanism for Heredity 

In Eukaryotes, DNA Is Enclosed in a Cell Nucleus 

Summary 


CHROMOSOMAL DNA AND ITS PACKAGING IN THE 
CHROMATIN FIBER 

Eukaryotic DNA Is Packaged into a Set of Chromosomes 

Chromosomes Contain Long Strings of Genes 

The Nucleotide Sequence of the Human Genome Shows How 
Our Genes Are Arranged 

Each DNA Molecule That Forms a Linear Chromosome Must 
Contain a Centromere, Two Telomeres, and Replication 
Origins 

DNA Molecules Are Highly Condensed in Chromosomes 

Nucleosomes Are a Basic Unit of Eukaryotic Chromosome 
Structure 

The Structure of the Nucleosome Core Particle Reveals How 
DNA Is Packaged 

Nucleosomes Have a Dynamic Structure, and Are Frequently 
Subjected to Changes Catalyzed by ATP-Dependent 
Chromatin Remodeling Complexes 

Nucleosomes Are Usually Packed Together into a Compact 
Chromatin Fiber 

Summary 


CHROMATIN STRUCTURE AND FUNCTION 

Heterochromatin Is Highly Organized and Restricts Gene 
Expression 

The Heterochromatic State Is Self-Propagating 


The Core Histones Are Covalently Modified at Many Different Sites 


Chromatin Acquires Additional Variety Through the Site-Specific 
Insertion of a Small Set of Histone Variants 

Covalent Modifications and Histone Variants Act in Concert to 
Control Chromosome Functions 

A Complex of Reader and Writer Proteins Can Spread Specific 
Chromatin Modifications Along a Chromosome 

Barrier DNA Sequences Block the Spread of Reader-Writer
Complexes and thereby Separate Neighboring Chromatin
Domains

The Chromatin in Centromeres Reveals How Histone Variants 
Can Create Special Structures 

Some Chromatin Structures Can Be Directly Inherited 

Experiments with Frog Embryos Suggest that both Activating
and Repressive Chromatin Structures Can Be Inherited 
Epigenetically 

Chromatin Structures Are Important for Eukaryotic Chromosome 
Function 

Summary 


THE GLOBAL STRUCTURE OF CHROMOSOMES 

Chromosomes Are Folded into Large Loops of Chromatin 

Polytene Chromosomes Are Uniquely Useful for Visualizing 
Chromatin Structures 

There Are Multiple Forms of Chromatin 

Chromatin Loops Decondense When the Genes Within Them 
Are Expressed 

Chromatin Can Move to Specific Sites Within the Nucleus to 
Alter Gene Expression 

Networks of Macromolecules Form a Set of Distinct Biochemical 
Environments inside the Nucleus 

Mitotic Chromosomes Are Especially Highly Condensed 

Summary 


HOW GENOMES EVOLVE 

Genome Comparisons Reveal Functional DNA Sequences by 
their Conservation Throughout Evolution 

Genome Alterations Are Caused by Failures of the Normal 
Mechanisms for Copying and Maintaining DNA, as well as 
by Transposable DNA Elements 

The Genome Sequences of Two Species Differ in Proportion to 
the Length of Time Since They Have Separately Evolved 

Phylogenetic Trees Constructed from a Comparison of DNA 
Sequences Trace the Relationships of All Organisms 

A Comparison of Human and Mouse Chromosomes Shows 
How the Structures of Genomes Diverge 

The Size of a Vertebrate Genome Reflects the Relative Rates 
of DNA Addition and DNA Loss in a Lineage 

We Can Infer the Sequence of Some Ancient Genomes 

Multispecies Sequence Comparisons Identify Conserved DNA 
Sequences of Unknown Function 

Changes in Previously Conserved Sequences Can Help 
Decipher Critical Steps in Evolution 

Mutations in the DNA Sequences That Control Gene Expression 
Have Driven Many of the Evolutionary Changes in Vertebrates 

Gene Duplication Also Provides an Important Source of Genetic 
Novelty During Evolution 

Duplicated Genes Diverge 

The Evolution of the Globin Gene Family Shows How DNA 
Duplications Contribute to the Evolution of Organisms 

Genes Encoding New Proteins Can Be Created by the 
Recombination of Exons 

Neutral Mutations Often Spread to Become Fixed in a Population, 
with a Probability That Depends on Population Size 

A Great Deal Can Be Learned from Analyses of the Variation 
Among Humans 

Summary 

Problems 

References 


Chapter 5 DNA Replication, Repair, and 
Recombination 


THE MAINTENANCE OF DNA SEQUENCES 

Mutation Rates Are Extremely Low 

Low Mutation Rates Are Necessary for Life as We Know It 
Summary 


DNA REPLICATION MECHANISMS 

Base-Pairing Underlies DNA Replication and DNA Repair 

The DNA Replication Fork Is Asymmetrical 

The High Fidelity of DNA Replication Requires Several 
Proofreading Mechanisms 

Only DNA Replication in the 5’-to-3’ Direction Allows Efficient
Error Correction 

A Special Nucleotide-Polymerizing Enzyme Synthesizes Short 
RNA Primer Molecules on the Lagging Strand 

Special Proteins Help to Open Up the DNA Double Helix in Front 
of the Replication Fork 

A Sliding Ring Holds a Moving DNA Polymerase Onto the DNA 

The Proteins at a Replication Fork Cooperate to Form a
Replication Machine 

A Strand-Directed Mismatch Repair System Removes Replication 
Errors That Escape from the Replication Machine 

DNA Topoisomerases Prevent DNA Tangling During Replication 

DNA Replication Is Fundamentally Similar in Eukaryotes and 
Bacteria 

Summary 


THE INITIATION AND COMPLETION OF DNA REPLICATION 
IN CHROMOSOMES 

DNA Synthesis Begins at Replication Origins 

Bacterial Chromosomes Typically Have a Single Origin of DNA 
Replication 

Eukaryotic Chromosomes Contain Multiple Origins of Replication 

In Eukaryotes, DNA Replication Takes Place During Only One 
Part of the Cell Cycle 

Different Regions on the Same Chromosome Replicate at Distinct 
Times in S Phase 

A Large Multisubunit Complex Binds to Eukaryotic Origins of 
Replication 

Features of the Human Genome That Specify Origins of 
Replication Remain to Be Discovered 

New Nucleosomes Are Assembled Behind the Replication Fork 

Telomerase Replicates the Ends of Chromosomes 

Telomeres Are Packaged Into Specialized Structures That 
Protect the Ends of Chromosomes 

Telomere Length Is Regulated by Cells and Organisms 

Summary 


DNA REPAIR 

Without DNA Repair, Spontaneous DNA Damage Would Rapidly
Change DNA Sequences 

The DNA Double Helix Is Readily Repaired 

DNA Damage Can Be Removed by More Than One Pathway 

Coupling Nucleotide Excision Repair to Transcription Ensures 
That the Cell’s Most Important DNA Is Efficiently Repaired 

The Chemistry of the DNA Bases Facilitates Damage Detection 

Special Translesion DNA Polymerases Are Used in Emergencies 

Double-Strand Breaks Are Efficiently Repaired 

DNA Damage Delays Progression of the Cell Cycle 

Summary 


HOMOLOGOUS RECOMBINATION 

Homologous Recombination Has Common Features in All Cells 

DNA Base-Pairing Guides Homologous Recombination 

Homologous Recombination Can Flawlessly Repair Double- 
Strand Breaks in DNA 

Strand Exchange Is Carried Out by the RecA/Rad51 Protein 

Homologous Recombination Can Rescue Broken DNA 
Replication Forks 

Cells Carefully Regulate the Use of Homologous Recombination 
in DNA Repair 

Homologous Recombination Is Crucial for Meiosis 

Meiotic Recombination Begins with a Programmed Double-Strand 
Break 

Holliday Junctions Are Formed During Meiosis 

Homologous Recombination Produces Both Crossovers and 
Non-Crossovers During Meiosis 

Homologous Recombination Often Results in Gene Conversion 

Summary 


TRANSPOSITION AND CONSERVATIVE SITE-SPECIFIC 
RECOMBINATION 

Through Transposition, Mobile Genetic Elements Can Insert 
Into Any DNA Sequence 

DNA-Only Transposons Can Move by a Cut-and-Paste 
Mechanism 

Some Viruses Use a Transposition Mechanism to Move 
Themselves Into Host-Cell Chromosomes 

Retroviral-like Retrotransposons Resemble Retroviruses, but
Lack a Protein Coat 

A Large Fraction of the Human Genome Is Composed of 
Nonretroviral Retrotransposons 

Different Transposable Elements Predominate in Different 
Organisms 

Genome Sequences Reveal the Approximate Times at Which 
Transposable Elements Have Moved 

Conservative Site-Specific Recombination Can Reversibly
Rearrange DNA 

Conservative Site-Specific Recombination Can Be Used to 
Turn Genes On or Off 


Bacterial Conservative Site-Specific Recombinases Have Become
Powerful Tools for Cell and Developmental Biologists
Summary 
Problems 
References 


Chapter 6 How Cells Read the Genome: 
From DNA to Protein 


FROM DNA TO RNA 

RNA Molecules Are Single-Stranded 

Transcription Produces RNA Complementary to One Strand 
of DNA 

RNA Polymerases Carry Out Transcription 

Cells Produce Different Categories of RNA Molecules 

Signals Encoded in DNA Tell RNA Polymerase Where to Start 
and Stop 

Transcription Start and Stop Signals Are Heterogeneous in 
Nucleotide Sequence 

Transcription Initiation in Eukaryotes Requires Many Proteins 

RNA Polymerase II Requires a Set of General Transcription
Factors 

Polymerase II Also Requires Activator, Mediator, and Chromatin-
Modifying Proteins

Transcription Elongation in Eukaryotes Requires Accessory 
Proteins 

Transcription Creates Superhelical Tension 

Transcription Elongation in Eukaryotes Is Tightly Coupled to RNA 
Processing 

RNA Capping Is the First Modification of Eukaryotic Pre-mRNAs 

RNA Splicing Removes Intron Sequences from Newly 
Transcribed Pre-mRNAs 

Nucleotide Sequences Signal Where Splicing Occurs 

RNA Splicing Is Performed by the Spliceosome 

The Spliceosome Uses ATP Hydrolysis to Produce a Complex 
Series of RNA-RNA Rearrangements 

Other Properties of Pre-mRNA and Its Synthesis Help to Explain 
the Choice of Proper Splice Sites 

Chromatin Structure Affects RNA Splicing 

RNA Splicing Shows Remarkable Plasticity 

Spliceosome-Catalyzed RNA Splicing Probably Evolved from 
Self-splicing Mechanisms 

RNA-Processing Enzymes Generate the 3’ End of Eukaryotic 
mRNAs 

Mature Eukaryotic mRNAs Are Selectively Exported from the 
Nucleus 

Noncoding RNAs Are Also Synthesized and Processed in the 
Nucleus 

The Nucleolus Is a Ribosome-Producing Factory

The Nucleus Contains a Variety of Subnuclear Aggregates 

Summary 


FROM RNA TO PROTEIN 

An mRNA Sequence Is Decoded in Sets of Three Nucleotides 

tRNA Molecules Match Amino Acids to Codons in mRNA 

tRNAs Are Covalently Modified Before They Exit from the Nucleus 

Specific Enzymes Couple Each Amino Acid to Its Appropriate 
tRNA Molecule 

Editing by tRNA Synthetases Ensures Accuracy 

Amino Acids Are Added to the C-terminal End of a Growing 
Polypeptide Chain 

The RNA Message Is Decoded in Ribosomes 

Elongation Factors Drive Translation Forward and Improve Its 
Accuracy 

Many Biological Processes Overcome the Inherent Limitations of 
Complementary Base-Pairing 

Accuracy in Translation Requires an Expenditure of Free Energy 

The Ribosome Is a Ribozyme 

Nucleotide Sequences in mRNA Signal Where to Start Protein
Synthesis 

Stop Codons Mark the End of Translation 

Proteins Are Made on Polyribosomes

There Are Minor Variations in the Standard Genetic Code 

Inhibitors of Prokaryotic Protein Synthesis Are Useful as 
Antibiotics 

Quality Control Mechanisms Act to Prevent Translation of 
Damaged mRNAs 

Some Proteins Begin to Fold While Still Being Synthesized 

Molecular Chaperones Help Guide the Folding of Most Proteins 

Cells Utilize Several Types of Chaperones

Exposed Hydrophobic Regions Provide Critical Signals for 
Protein Quality Control 

The Proteasome Is a Compartmentalized Protease with 
Sequestered Active Sites 

Many Proteins Are Controlled by Regulated Destruction 

There Are Many Steps From DNA to Protein 

Summary 


THE RNA WORLD AND THE ORIGINS OF LIFE 

Single-Stranded RNA Molecules Can Fold into Highly Elaborate 
Structures 

RNA Can Both Store Information and Catalyze Chemical 
Reactions 

How Did Protein Synthesis Evolve? 

All Present-Day Cells Use DNA as Their Hereditary Material 

Summary 

Problems 

References 


Chapter 7 Control of Gene Expression 


AN OVERVIEW OF GENE CONTROL 

The Different Cell Types of a Multicellular Organism Contain 
the Same DNA 

Different Cell Types Synthesize Different Sets of RNAs and 
Proteins 

External Signals Can Cause a Cell to Change the Expression 
of Its Genes 

Gene Expression Can Be Regulated at Many of the Steps 
in the Pathway from DNA to RNA to Protein 

Summary 


CONTROL OF TRANSCRIPTION BY SEQUENCE-SPECIFIC 
DNA-BINDING PROTEINS 

The Sequence of Nucleotides in the DNA Double Helix Can Be 
Read by Proteins 

Transcription Regulators Contain Structural Motifs That Can 
Read DNA Sequences 

Dimerization of Transcription Regulators Increases Their Affinity 
and Specificity for DNA 

Transcription Regulators Bind Cooperatively to DNA 

Nucleosome Structure Promotes Cooperative Binding of 
Transcription Regulators 

Summary 


TRANSCRIPTION REGULATORS SWITCH GENES ON 
AND OFF 

The Tryptophan Repressor Switches Genes Off 

Repressors Turn Genes Off and Activators Turn Them On 

An Activator and a Repressor Control the Lac Operon 

DNA Looping Can Occur During Bacterial Gene Regulation 

Complex Switches Control Gene Transcription in Eukaryotes 

A Eukaryotic Gene Control Region Consists of a Promoter 
Plus Many cis-Regulatory Sequences 

Eukaryotic Transcription Regulators Work in Groups 

Activator Proteins Promote the Assembly of RNA Polymerase 
at the Start Point of Transcription 

Eukaryotic Transcription Activators Direct the Modification of 
Local Chromatin Structure 

Transcription Activators Can Promote Transcription by Releasing 
RNA Polymerase from Promoters 

Transcription Activators Work Synergistically 

Eukaryotic Transcription Repressors Can Inhibit Transcription 
in Several Ways 

Insulator DNA Sequences Prevent Eukaryotic Transcription 
Regulators from Influencing Distant Genes 

Summary 

DETAILED CONTENTS


MOLECULAR GENETIC MECHANISMS THAT CREATE AND 
MAINTAIN SPECIALIZED CELL TYPES 

Complex Genetic Switches That Regulate Drosophila 
Development Are Built Up from Smaller Molecules 


The Drosophila Eve Gene Is Regulated by Combinatorial Controls 


Transcription Regulators Are Brought Into Play by Extracellular 
Signals 

Combinatorial Gene Control Creates Many Different Cell Types 

Specialized Cell Types Can Be Experimentally Reprogrammed 
to Become Pluripotent Stem Cells 

Combinations of Master Transcription Regulators Specify Cell 
Types by Controlling the Expression of Many Genes 

Specialized Cells Must Rapidly Turn Sets of Genes On and Off 

Differentiated Cells Maintain Their Identity 


Transcription Circuits Allow the Cell to Carry Out Logic Operations 


Summary 


MECHANISMS THAT REINFORCE CELL MEMORY IN 
PLANTS AND ANIMALS 

Patterns of DNA Methylation Can Be Inherited When Vertebrate 
Cells Divide 

CG-Rich Islands Are Associated with Many Genes in Mammals 

Genomic Imprinting Is Based on DNA Methylation 

Chromosome-Wide Alterations in Chromatin Structure Can Be 
Inherited 

Epigenetic Mechanisms Ensure That Stable Patterns of Gene 
Expression Can Be Transmitted to Daughter Cells 

Summary 


POST-TRANSCRIPTIONAL CONTROLS 
Transcription Attenuation Causes the Premature Termination of 
Some RNA Molecules 


Riboswitches Probably Represent Ancient Forms of Gene Control

Alternative RNA Splicing Can Produce Different Forms of a Protein
from the Same Gene

The Definition of a Gene Has Been Modified Since the Discovery 
of Alternative RNA Splicing 

A Change in the Site of RNA Transcript Cleavage and Poly-A 
Addition Can Change the C-terminus of a Protein 

RNA Editing Can Change the Meaning of the RNA Message 

RNA Transport from the Nucleus Can Be Regulated 

Some mRNAs Are Localized to Specific Regions of the Cytosol 

The 5’ and 3’ Untranslated Regions of mRNAs Control Their 
Translation 

The Phosphorylation of an Initiation Factor Regulates Protein 
Synthesis Globally 

Initiation at AUG Codons Upstream of the Translation Start Can 
Regulate Eukaryotic Translation Initiation 

Internal Ribosome Entry Sites Provide Opportunities for 
Translational Control 

Changes in mRNA Stability Can Regulate Gene Expression 

Regulation of mRNA Stability Involves P-bodies and Stress
Granules 

Summary 


REGULATION OF GENE EXPRESSION BY NONCODING RNAs 


Small Noncoding RNA Transcripts Regulate Many Animal and 
Plant Genes Through RNA Interference 

miRNAs Regulate mRNA Translation and Stability 

RNA Interference Is Also Used as a Cell Defense Mechanism 

RNA Interference Can Direct Heterochromatin Formation 

piRNAs Protect the Germ Line from Transposable Elements

RNA Interference Has Become a Powerful Experimental Tool 

Bacteria Use Small Noncoding RNAs to Protect Themselves 
from Viruses 

Long Noncoding RNAs Have Diverse Functions in the Cell 

Summary 

Problems 


Chapter 8 Analyzing Cells, Molecules, and
Systems 


ISOLATING CELLS AND GROWING THEM IN CULTURE 

Cells Can Be Isolated from Tissues 

Cells Can Be Grown in Culture 

Eukaryotic Cell Lines Are a Widely Used Source of 
Homogeneous Cells 

Hybridoma Cell Lines Are Factories That Produce Monoclonal
Antibodies 
Summary 


PURIFYING PROTEINS 
Cells Can Be Separated into Their Component Fractions 


Cell Extracts Provide Accessible Systems to Study Cell Functions 


Proteins Can Be Separated by Chromatography 

Immunoprecipitation Is a Rapid Affinity Purification Method 

Genetically Engineered Tags Provide an Easy Way to Purify 
Proteins 

Purified Cell-free Systems Are Required for the Precise 
Dissection of Molecular Functions 

Summary 


ANALYZING PROTEINS 

Proteins Can Be Separated by SDS Polyacrylamide-Gel 
Electrophoresis 

Two-Dimensional Gel Electrophoresis Provides Greater Protein 
Separation 

Specific Proteins Can Be Detected by Blotting with Antibodies 

Hydrodynamic Measurements Reveal the Size and Shape of 
a Protein Complex 

Mass Spectrometry Provides a Highly Sensitive Method for 
Identifying Unknown Proteins 

Sets of Interacting Proteins Can Be Identified by Biochemical 
Methods 

Optical Methods Can Monitor Protein Interactions 

Protein Function Can Be Selectively Disrupted With Small 
Molecules 

Protein Structure Can Be Determined Using X-Ray Diffraction 

NMR Can Be Used to Determine Protein Structure in Solution 

Protein Sequence and Structure Provide Clues About Protein 
Function 

Summary 


ANALYZING AND MANIPULATING DNA 

Restriction Nucleases Cut Large DNA Molecules into Specific 
Fragments 

Gel Electrophoresis Separates DNA Molecules of Different Sizes 

Purified DNA Molecules Can Be Specifically Labeled with 
Radioisotopes or Chemical Markers in vitro 

Genes Can Be Cloned Using Bacteria 

An Entire Genome Can Be Represented in a DNA Library 

Genomic and cDNA Libraries Have Different Advantages and 
Drawbacks 

Hybridization Provides a Powerful, But Simple Way to Detect 
Specific Nucleotide Sequences 

Genes Can Be Cloned in vitro Using PCR 

PCR Is Also Used for Diagnostic and Forensic Applications 

Both DNA and RNA Can Be Rapidly Sequenced 

To Be Useful, Genome Sequences Must Be Annotated 

DNA Cloning Allows Any Protein to be Produced in Large 
Amounts 

Summary 


STUDYING GENE EXPRESSION AND FUNCTION 

Classical Genetics Begins by Disrupting a Cell Process by 
Random Mutagenesis 

Genetic Screens Identify Mutants with Specific Abnormalities 

Mutations Can Cause Loss or Gain of Protein Function 


Complementation Tests Reveal Whether Two Mutations Are in the
Same Gene or Different Genes

Gene Products Can Be Ordered in Pathways by Epistasis 
Analysis 

Mutations Responsible for a Phenotype Can Be Identified 
Through DNA Analysis 

Rapid and Cheap DNA Sequencing Has Revolutionized 
Human Genetic Studies 

Linked Blocks of Polymorphisms Have Been Passed Down 
from Our Ancestors 

Polymorphisms Can Aid the Search for Mutations Associated 
with Disease 

Genomics Is Accelerating the Discovery of Rare Mutations That 
Predispose Us to Serious Disease 

Reverse Genetics Begins with a Known Gene and Determines 
Which Cell Processes Require Its Function 

Animals and Plants Can Be Genetically Altered 


The Bacterial CRISPR System Has Been Adapted to Edit 
Genomes in a Wide Variety of Species 

Large Collections of Engineered Mutations Provide a Tool for 
Examining the Function of Every Gene in an Organism 

RNA Interference Is a Simple and Rapid Way to Test Gene 
Function 

Reporter Genes Reveal When and Where a Gene Is Expressed 

In situ Hybridization Can Reveal the Location of mRNAs and 
Noncoding RNAs 

Expression of Individual Genes Can Be Measured Using 
Quantitative RT-PCR 

Analysis of mRNAs by Microarray or RNA-seq Provides a
Snapshot of Gene Expression 

Genome-wide Chromatin Immunoprecipitation Identifies Sites 
on the Genome Occupied by Transcription Regulators 

Ribosome Profiling Reveals Which mRNAs Are Being Translated 
in the Cell 

Recombinant DNA Methods Have Revolutionized Human Health 

Transgenic Plants Are Important for Agriculture 

Summary 


MATHEMATICAL ANALYSIS OF CELL FUNCTIONS 

Regulatory Networks Depend on Molecular Interactions 

Differential Equations Help Us Predict Transient Behavior 

Both Promoter Activity and Protein Degradation Affect the Rate 
of Change of Protein Concentration 

The Time Required to Reach Steady State Depends on Protein 
Lifetime 

Quantitative Methods Are Similar for Transcription Repressors 
and Activators 

Negative Feedback Is a Powerful Strategy in Cell Regulation 

Delayed Negative Feedback Can Induce Oscillations 

DNA Binding By a Repressor or an Activator Can Be Cooperative 

Positive Feedback Is Important for Switchlike Responses 
and Bistability 

Robustness Is an Important Characteristic of Biological Networks 

Two Transcription Regulators That Bind to the Same Gene 
Promoter Can Exert Combinatorial Control 

An Incoherent Feed-forward Interaction Generates Pulses 

A Coherent Feed-forward Interaction Detects Persistent Inputs 

The Same Network Can Behave Differently in Different Cells Due 
to Stochastic Effects 

Several Computational Approaches Can Be Used to Model the 
Reactions in Cells 

Statistical Methods Are Critical For the Analysis of Biological Data 

Summary 

Problems 

References 


Chapter 9 Visualizing Cells 


LOOKING AT CELLS IN THE LIGHT MICROSCOPE 

The Light Microscope Can Resolve Details 0.2 µm Apart

Photon Noise Creates Additional Limits to Resolution When 
Light Levels Are Low 

Living Cells Are Seen Clearly in a Phase-Contrast or a 
Differential-Interference-Contrast Microscope 

Images Can Be Enhanced and Analyzed by Digital Techniques 

Intact Tissues Are Usually Fixed and Sectioned Before Microscopy 

Specific Molecules Can Be Located in Cells by Fluorescence 
Microscopy 

Antibodies Can Be Used to Detect Specific Molecules 

Imaging of Complex Three-Dimensional Objects Is Possible with 
the Optical Microscope 

The Confocal Microscope Produces Optical Sections by 
Excluding Out-of-Focus Light 

Individual Proteins Can Be Fluorescently Tagged in Living Cells 
and Organisms 

Protein Dynamics Can Be Followed in Living Cells 

Light-Emitting Indicators Can Measure Rapidly Changing
Intracellular Ion Concentrations

Single Molecules Can Be Visualized by Total Internal Reflection 
Fluorescence Microscopy 

Individual Molecules Can Be Touched, Imaged, and Moved Using 
Atomic Force Microscopy 

Superresolution Fluorescence Techniques Can Overcome
Diffraction-Limited Resolution 

Superresolution Can Also be Achieved Using Single-Molecule 
Localization Methods 

Summary 


LOOKING AT CELLS AND MOLECULES IN THE ELECTRON 
MICROSCOPE 

The Electron Microscope Resolves the Fine Structure of the Cell 

Biological Specimens Require Special Preparation for Electron 
Microscopy 

Specific Macromolecules Can Be Localized by Immunogold 
Electron Microscopy 

Different Views of a Single Object Can Be Combined to Give 
a Three-Dimensional Reconstruction 

Images of Surfaces Can Be Obtained by Scanning Electron 
Microscopy 

Negative Staining and Cryoelectron Microscopy Both Allow 
Macromolecules to Be Viewed at High Resolution 

Multiple Images Can Be Combined to Increase Resolution 

Summary 

Problems 

References 


Chapter 10 Membrane Structure 


THE LIPID BILAYER 

Phosphoglycerides, Sphingolipids, and Sterols Are the Major 
Lipids in Cell Membranes 

Phospholipids Spontaneously Form Bilayers 

The Lipid Bilayer Is a Two-dimensional Fluid 

The Fluidity of a Lipid Bilayer Depends on Its Composition 

Despite Their Fluidity, Lipid Bilayers Can Form Domains of 
Different Compositions 

Lipid Droplets Are Surrounded by a Phospholipid Monolayer 

The Asymmetry of the Lipid Bilayer Is Functionally Important 

Glycolipids Are Found on the Surface of All Eukaryotic Plasma 
Membranes 

Summary 


MEMBRANE PROTEINS 

Membrane Proteins Can Be Associated with the Lipid Bilayer 
in Various Ways 

Lipid Anchors Control the Membrane Localization of Some 
Signaling Proteins 

In Most Transmembrane Proteins, the Polypeptide Chain
Crosses the Lipid Bilayer in an α-Helical Conformation

Transmembrane α Helices Often Interact with One Another

Some β Barrels Form Large Channels

Many Membrane Proteins Are Glycosylated 

Membrane Proteins Can Be Solubilized and Purified in Detergents 

Bacteriorhodopsin Is a Light-driven Proton (H⁺) Pump That
Traverses the Lipid Bilayer as Seven α Helices

Membrane Proteins Often Function as Large Complexes 

Many Membrane Proteins Diffuse in the Plane of the Membrane 

Cells Can Confine Proteins and Lipids to Specific Domains 
Within a Membrane 

The Cortical Cytoskeleton Gives Membranes Mechanical 
Strength and Restricts Membrane Protein Diffusion 

Membrane-bending Proteins Deform Bilayers 

Summary 

Problems 

References 


Chapter 11 Membrane Transport of Small Molecules
and the Electrical Properties of Membranes


PRINCIPLES OF MEMBRANE TRANSPORT 

Protein-Free Lipid Bilayers Are Impermeable to Ions

There Are Two Main Classes of Membrane Transport Proteins: 
Transporters and Channels 

Active Transport Is Mediated by Transporters Coupled to an 
Energy Source 

Summary 


TRANSPORTERS AND ACTIVE MEMBRANE TRANSPORT 
Active Transport Can Be Driven by Ion-Concentration Gradients

Transporters in the Plasma Membrane Regulate Cytosolic pH

An Asymmetric Distribution of Transporters in Epithelial Cells 
Underlies the Transcellular Transport of Solutes 

There Are Three Classes of ATP-Driven Pumps 

A P-type ATPase Pumps Ca²⁺ into the Sarcoplasmic Reticulum
in Muscle Cells 

The Plasma Membrane Na⁺-K⁺ Pump Establishes Na⁺ and K⁺
Gradients Across the Plasma Membrane 

ABC Transporters Constitute the Largest Family of Membrane 
Transport Proteins 

Summary 


CHANNELS AND THE ELECTRICAL PROPERTIES OF 
MEMBRANES 

Aquaporins Are Permeable to Water But Impermeable to Ions

Ion Channels Are Ion-Selective and Fluctuate Between Open
and Closed States 

The Membrane Potential in Animal Cells Depends Mainly on K⁺
Leak Channels and the K⁺ Gradient Across the Plasma
Membrane 

The Resting Potential Decays Only Slowly When the Na⁺-K⁺
Pump Is Stopped 

The Three-Dimensional Structure of a Bacterial K⁺ Channel
Shows How an Ion Channel Can Work

Mechanosensitive Channels Protect Bacterial Cells Against 
Extreme Osmotic Pressures 

The Function of a Neuron Depends on Its Elongated Structure 

Voltage-Gated Cation Channels Generate Action Potentials in 
Electrically Excitable Cells 

The Use of Channelrhodopsins Has Revolutionized the Study 
of Neural Circuits 

Myelination Increases the Speed and Efficiency of Action Potential 
Propagation in Nerve Cells 

Patch-Clamp Recording Indicates That Individual Ion Channels
Open in an All-or-Nothing Fashion 

Voltage-Gated Cation Channels Are Evolutionarily and Structurally 
Related 

Different Neuron Types Display Characteristic Stable Firing 
Properties 

Transmitter-Gated Ion Channels Convert Chemical Signals into
Electrical Ones at Chemical Synapses 

Chemical Synapses Can Be Excitatory or Inhibitory 

The Acetylcholine Receptors at the Neuromuscular Junction Are 
Excitatory Transmitter-Gated Cation Channels 

Neurons Contain Many Types of Transmitter-Gated Channels 

Many Psychoactive Drugs Act at Synapses 

Neuromuscular Transmission Involves the Sequential Activation 
of Five Different Sets of Ion Channels

Single Neurons Are Complex Computation Devices 

Neuronal Computation Requires a Combination of at Least Three 
Kinds of K⁺ Channels

Long-Term Potentiation (LTP) in the Mammalian Hippocampus 
Depends on Ca²⁺ Entry Through NMDA-Receptor Channels

Summary 

Problems 


Chapter 12 Intracellular Compartments and
Protein Sorting 


THE COMPARTMENTALIZATION OF CELLS 

All Eukaryotic Cells Have the Same Basic Set of Membrane- 
enclosed Organelles 

Evolutionary Origins May Help Explain the Topological 
Relationships of Organelles 

Proteins Can Move Between Compartments in Different Ways 

Signal Sequences and Sorting Receptors Direct Proteins to the 
Correct Cell Address 

Most Organelles Cannot Be Constructed De Novo: They Require 
Information in the Organelle Itself 

Summary 


THE TRANSPORT OF MOLECULES BETWEEN THE 
NUCLEUS AND THE CYTOSOL 
Nuclear Pore Complexes Perforate the Nuclear Envelope 

Nuclear Localization Signals Direct Nuclear Proteins to the Nucleus

Nuclear Import Receptors Bind to Both Nuclear Localization
Signals and NPC Proteins

Nuclear Export Works Like Nuclear Import, But in Reverse

The Ran GTPase Imposes Directionality on Transport Through
NPCs

Transport Through NPCs Can Be Regulated by Controlling
Access to the Transport Machinery

During Mitosis the Nuclear Envelope Disassembles

Summary


THE TRANSPORT OF PROTEINS INTO MITOCHONDRIA AND
CHLOROPLASTS

Translocation into Mitochondria Depends on Signal Sequences
and Protein Translocators

Mitochondrial Precursor Proteins Are Imported as Unfolded
Polypeptide Chains

ATP Hydrolysis and a Membrane Potential Drive Protein Import
Into the Matrix Space

Bacteria and Mitochondria Use Similar Mechanisms to Insert
Porins into their Outer Membrane

Transport Into the Inner Mitochondrial Membrane and
Intermembrane Space Occurs Via Several Routes

Two Signal Sequences Direct Proteins to the Thylakoid Membrane
in Chloroplasts

Summary


PEROXISOMES

Peroxisomes Use Molecular Oxygen and Hydrogen Peroxide
to Perform Oxidation Reactions

A Short Signal Sequence Directs the Import of Proteins into
Peroxisomes

Summary


THE ENDOPLASMIC RETICULUM

The ER Is Structurally and Functionally Diverse

Signal Sequences Were First Discovered in Proteins Imported
into the Rough ER

A Signal-Recognition Particle (SRP) Directs the ER Signal
Sequence to a Specific Receptor in the Rough ER Membrane

The Polypeptide Chain Passes Through an Aqueous Channel
in the Translocator

Translocation Across the ER Membrane Does Not Always
Require Ongoing Polypeptide Chain Elongation

In Single-Pass Transmembrane Proteins, a Single Internal ER
Signal Sequence Remains in the Lipid Bilayer as a Membrane-
spanning α Helix

Combinations of Start-Transfer and Stop-Transfer Signals
Determine the Topology of Multipass Transmembrane Proteins

ER Tail-anchored Proteins Are Integrated into the ER Membrane
by a Special Mechanism

Translocated Polypeptide Chains Fold and Assemble in the
Lumen of the Rough ER

Most Proteins Synthesized in the Rough ER Are Glycosylated by
the Addition of a Common N-Linked Oligosaccharide

Oligosaccharides Are Used as Tags to Mark the State of Protein
Folding

Improperly Folded Proteins Are Exported from the ER and
Degraded in the Cytosol

Misfolded Proteins in the ER Activate an Unfolded Protein
Response

Some Membrane Proteins Acquire a Covalently Attached
Glycosylphosphatidylinositol (GPI) Anchor

The ER Assembles Most Lipid Bilayers
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Chapter 13 Intracellular Membrane Traffic 695 
THE MOLECULAR MECHANISMS OF MEMBRANE 

TRANSPORT AND THE MAINTENANCE OF 
There Are Various Types of Coated Vesicles 697 
Membrane-Bending Proteins Help Deform the Membrane During 
Vesicle Formation 

Cytoplasmic Proteins Regulate the Pinching-Off and Uncoating 
of Coated Vesicles 

Monomeric GTPases Control Coat Assembly 

Not All Transport Vesicles Are Spherical 

Rab Proteins Guide Transport Vesicles to Their Target Membrane 

Rab Cascades Can Change the Identity of an Organelle 

SNAREs Mediate Membrane Fusion 

Interacting SNAREs Need to Be Pried Apart Before They Can 
Function Again 

Summary 


TRANSPORT FROM THE ER THROUGH THE GOLGI 
APPARATUS 

Proteins Leave the ER in COPII-Coated Transport Vesicles 

Only Proteins That Are Properly Folded and Assembled Can 
Leave the ER 

Vesicular Tubular Clusters Mediate Transport from the ER to 
the Golgi Apparatus 

The Retrieval Pathway to the ER Uses Sorting Signals 

Many Proteins Are Selectively Retained in the Compartments 
in Which They Function 

The Golgi Apparatus Consists of an Ordered Series of 
Compartments 

Oligosaccharide Chains Are Processed in the Golgi Apparatus 

Proteoglycans Are Assembled in the Golgi Apparatus 

What Is the Purpose of Glycosylation? 

Transport Through the Golgi Apparatus May Occur by 
Cisternal Maturation 

Golgi Matrix Proteins Help Organize the Stack 

Summary 


TRANSPORT FROM THE TRANS GOLGI NETWORK TO 
LYSOSOMES 

Lysosomes Are the Principal Sites of Intracellular Digestion 

Lysosomes Are Heterogeneous 

Plant and Fungal Vacuoles Are Remarkably Versatile Lysosomes 

Multiple Pathways Deliver Materials to Lysosomes 

Autophagy Degrades Unwanted Proteins and Organelles 

A Mannose 6-Phosphate Receptor Sorts Lysosomal Hydrolases 
in the Trans Golgi Network 

Defects in the GlcNAc Phosphotransferase Cause a Lysosomal 
Storage Disease in Humans 

Some Lysosomes and Multivesicular Bodies Undergo 
Exocytosis 

Summary 


TRANSPORT INTO THE CELL FROM THE PLASMA 
MEMBRANE: ENDOCYTOSIS 

Pinocytic Vesicles Form from Coated Pits in the Plasma 
Membrane 

Not All Pinocytic Vesicles Are Clathrin-Coated 

Cells Use Receptor-Mediated Endocytosis to Import Selected 
Extracellular Macromolecules 

Specific Proteins Are Retrieved from Early Endosomes and 
Returned to the Plasma Membrane 

Plasma Membrane Signaling Receptors Are Down-Regulated 
by Degradation in Lysosomes 

Early Endosomes Mature into Late Endosomes 

ESCRT Protein Complexes Mediate the Formation of 
Intralumenal Vesicles in Multivesicular Bodies 

Recycling Endosomes Regulate Plasma Membrane Composition 

Specialized Phagocytic Cells Can Ingest Large Particles 

Summary 


TRANSPORT FROM THE TRANS GOLGI NETWORK TO 
THE CELL EXTERIOR: EXOCYTOSIS 

Many Proteins and Lipids Are Carried Automatically from the 
Trans Golgi Network (TGN) to the Cell Surface 

Secretory Vesicles Bud from the Trans Golgi Network 

Precursors of Secretory Proteins Are Proteolytically Processed 
During the Formation of Secretory Vesicles 

Secretory Vesicles Wait Near the Plasma Membrane Until 
Signaled to Release Their Contents 

For Rapid Exocytosis, Synaptic Vesicles Are Primed at the 
Presynaptic Plasma Membrane 

Synaptic Vesicles Can Form Directly from Endocytic Vesicles 
Secretory Vesicle Membrane Components Are Quickly Removed 
from the Plasma Membrane 

Some Regulated Exocytosis Events Serve to Enlarge the Plasma 
Membrane 

Polarized Cells Direct Proteins from the Trans Golgi Network 
to the Appropriate Domain of the Plasma Membrane 

Summary 
Chapter 14 Energy Conversion: Mitochondria 
and Chloroplasts 


THE MITOCHONDRION 

The Mitochondrion Has an Outer Membrane and an Inner 
Membrane 

The Inner Membrane Cristae Contain the Machinery for Electron 
Transport and ATP Synthesis 

The Citric Acid Cycle in the Matrix Produces NADH 

Mitochondria Have Many Essential Roles in Cellular Metabolism 

A Chemiosmotic Process Couples Oxidation Energy to ATP 
Production 

The Energy Derived from Oxidation Is Stored as an 
Electrochemical Gradient 

Summary 


THE PROTON PUMPS OF THE ELECTRON-TRANSPORT 
CHAIN 

The Redox Potential Is a Measure of Electron Affinities 

Electron Transfers Release Large Amounts of Energy 

Transition Metal Ions and Quinones Accept and Release 
Electrons Readily 

NADH Transfers Its Electrons to Oxygen Through Three 
Large Enzyme Complexes Embedded in the Inner 
Membrane 

The NADH Dehydrogenase Complex Contains Separate 
Modules for Electron Transport and Proton Pumping 

Cytochrome c Reductase Takes Up and Releases Protons on 
the Opposite Side of the Crista Membrane, Thereby 
Pumping Protons 

The Cytochrome c Oxidase Complex Pumps Protons and 
Reduces O2 Using a Catalytic Iron–Copper Center 

The Respiratory Chain Forms a Supercomplex in the Crista 
Membrane 

Protons Can Move Rapidly Through Proteins Along Predefined 
Pathways 

Summary 


ATP PRODUCTION IN MITOCHONDRIA 

The Large Negative Value of ΔG for ATP Hydrolysis Makes 
ATP Useful to the Cell 

The ATP Synthase Is a Nanomachine that Produces ATP by 
Rotary Catalysis 

Proton-driven Turbines Are of Ancient Origin 

Mitochondrial Cristae Help to Make ATP Synthesis Efficient 

Special Transport Proteins Exchange ATP and ADP Through 
the Inner Membrane 

Chemiosmotic Mechanisms First Arose in Bacteria 

Summary 


CHLOROPLASTS AND PHOTOSYNTHESIS 

Chloroplasts Resemble Mitochondria But Have a Separate 
Thylakoid Compartment 

Chloroplasts Capture Energy from Sunlight and Use It to Fix 
Carbon 

Carbon Fixation Uses ATP and NADPH to Convert CO2 into 
Sugars 

Sugars Generated by Carbon Fixation Can Be Stored as 
Starch or Consumed to Produce ATP 

The Thylakoid Membranes of Chloroplasts Contain the Protein 
Complexes Required for Photosynthesis and ATP Generation 

Chlorophyll-Protein Complexes Can Transfer Either Excitation 
Energy or Electrons 

A Photosystem Consists of an Antenna Complex and a Reaction 
Center 

The Thylakoid Membrane Contains Two Different Photosystems 
Working in Series 
By Providing an Inexhaustible Source of Reducing Power, 
Photosynthetic Bacteria Overcame a Major Evolutionary 
Obstacle 796 
The Photosynthetic Electron-Transport Chains of Cyanobacteria 
Produced Atmospheric Oxygen and Permitted New 


Life-Forms 796 
Summary 798 
Animal Mitochondria Contain the Simplest Genetic Systems 
Summary 809 
Problems 809 
References 811 
PRINCIPLES OF CELL SIGNALING 813 
Extracellular Signals Can Act Over Short or Long Distances 814 
Extracellular Signal Molecules Bind to Specific Receptors 815 
Each Cell Is Programmed to Respond to Specific Combinations 

of Extracellular Signals 816 


There Are Three Major Classes of Cell-Surface Receptor Proteins 818 
Cell-Surface Receptors Relay Signals Via Intracellular Signaling 


Molecules 819 
Intracellular Signals Must Be Specific and Precise in a Noisy 

Cytoplasm 820 
Intracellular Signaling Complexes Form at Activated Receptors 822 
Modular Interaction Domains Mediate Interactions Between 

Intracellular Signaling Proteins 822 
The Relationship Between Signal and Response Varies in Different 

Signaling Pathways 824 
The Speed of a Response Depends on the Turnover of Signaling 

Molecules 825 
Cells Can Respond Abruptly to a Gradually Increasing Signal 827 
Summary 831 
Trimeric G Proteins Relay Signals From GPCRs 832 
Cyclic-AMP-Dependent Protein Kinase (PKA) Mediates Most 
Some G Proteins Signal Via Phospholipids 
Ca2+ Functions as a Ubiquitous Intracellular Mediator 
Feedback Generates Ca2+ Waves and Oscillations 
Ca2+/Calmodulin-Dependent Protein Kinases Mediate 
Many Responses to Ca2+ Signals 
Some G Proteins Directly Regulate Ion Channels 
Smell and Vision Depend on GPCRs That Regulate Ion Channels 
Nitric Oxide Is a Gaseous Signaling Mediator That Passes 
Between Cells 
Second Messengers and Enzymatic Cascades Amplify Signals 
GPCR Desensitization Depends on Receptor Phosphorylation 
Summary 


SIGNALING THROUGH ENZYME-COUPLED RECEPTORS 

Activated Receptor Tyrosine Kinases (RTKs) Phosphorylate 
Themselves 

Phosphorylated Tyrosines on RTKs Serve as Docking Sites for 
Intracellular Signaling Proteins 

Proteins with SH2 Domains Bind to Phosphorylated Tyrosines 

The GTPase Ras Mediates Signaling by Most RTKs 

Ras Activates a MAP Kinase Signaling Module 

Scaffold Proteins Help Prevent Cross-talk Between Parallel 
MAP Kinase Modules 

Rho Family GTPases Functionally Couple Cell-Surface Receptors 
to the Cytoskeleton 

PI 3-Kinase Produces Lipid Docking Sites in the Plasma 
Membrane 

The PI-3-Kinase-Akt Signaling Pathway Stimulates Animal 
Cells to Survive and Grow 

RTKs and GPCRs Activate Overlapping Signaling Pathways 

Some Enzyme-Coupled Receptors Associate with Cytoplasmic 
Tyrosine Kinases 

Cytokine Receptors Activate the JAK-STAT Signaling Pathway 


Protein Tyrosine Phosphatases Reverse Tyrosine Phosphorylations 


Signal Proteins of the TGFβ Superfamily Act Through Receptor 
Serine/Threonine Kinases and Smads 
Summary 


ALTERNATIVE SIGNALING ROUTES IN GENE REGULATION 

The Receptor Notch Is a Latent Transcription Regulatory Protein 

Wnt Proteins Bind to Frizzled Receptors and Inhibit the 
Degradation of β-Catenin 

Hedgehog Proteins Bind to Patched, Relieving Its Inhibition of 
Smoothened 

Many Stressful and Inflammatory Stimuli Act Through an 
NFκB-Dependent Signaling Pathway 

Nuclear Receptors Are Ligand-Modulated Transcription 
Regulators 

Circadian Clocks Contain Negative Feedback Loops That 
Control Gene Expression 

Three Proteins in a Test Tube Can Reconstitute a Cyanobacterial 
Circadian Clock 

Summary 


SIGNALING IN PLANTS 

Multicellularity and Cell Communication Evolved Independently 
in Plants and Animals 

Receptor Serine/Threonine Kinases Are the Largest Class of 
Cell-Surface Receptors in Plants 

Ethylene Blocks the Degradation of Specific Transcription 
Regulatory Proteins in the Nucleus 

Regulated Positioning of Auxin Transporters Patterns Plant 
Growth 

Phytochromes Detect Red Light, and Cryptochromes Detect 
Blue Light 

Summary 
Chapter 16 The Cytoskeleton 


FUNCTION AND ORIGIN OF THE CYTOSKELETON 

Cytoskeletal Filaments Adapt to Form Dynamic or Stable 
Structures 

The Cytoskeleton Determines Cellular Organization and Polarity 

Filaments Assemble from Protein Subunits That Impart Specific 
Physical and Dynamic Properties 
Accessory Proteins and Motors Regulate Cytoskeletal Filaments 

Bacterial Cell Organization and Division Depend on Homologs 
of Eukaryotic Cytoskeletal Proteins 

Summary 


ACTIN AND ACTIN-BINDING PROTEINS 

Actin Subunits Assemble Head-to-Tail to Create Flexible, Polar 
Filaments 

Nucleation Is the Rate-Limiting Step in the Formation of Actin 
Filaments 

Actin Filaments Have Two Distinct Ends That Grow at Different 
Rates 

ATP Hydrolysis Within Actin Filaments Leads to Treadmilling at 
Steady State 

The Functions of Actin Filaments Are Inhibited by Both Polymer- 
stabilizing and Polymer-destabilizing Chemicals 

Actin-Binding Proteins Influence Filament Dynamics and 
Organization 

Monomer Availability Controls Actin Filament Assembly 

Actin-Nucleating Factors Accelerate Polymerization and 
Generate Branched or Straight Filaments 

Actin-Filament-Binding Proteins Alter Filament Dynamics 

Severing Proteins Regulate Actin Filament Depolymerization 

Higher-Order Actin Filament Arrays Influence Cellular 
Mechanical Properties and Signaling 

Bacteria Can Hijack the Host Actin Cytoskeleton 

Summary 


MYOSIN AND ACTIN 

Actin-Based Motor Proteins Are Members of the Myosin 
Superfamily 

Myosin Generates Force by Coupling ATP Hydrolysis to 
Conformational Changes 

Sliding of Myosin II Along Actin Filaments Causes Muscles 
to Contract 

A Sudden Rise in Cytosolic Ca2+ Concentration Initiates 
Muscle Contraction 

Heart Muscle Is a Precisely Engineered Machine 

Actin and Myosin Perform a Variety of Functions in Non-Muscle 
Cells 

Summary 


MICROTUBULES 

Microtubules Are Hollow Tubes Made of Protofilaments 

Microtubules Undergo Dynamic Instability 

Microtubule Functions Are Inhibited by Both Polymer-stabilizing 
and Polymer-destabilizing Drugs 

A Protein Complex Containing γ-Tubulin Nucleates Microtubules 

Microtubules Emanate from the Centrosome in Animal Cells 

Microtubule-Binding Proteins Modulate Filament Dynamics 
and Organization 

Microtubule Plus-End-Binding Proteins Modulate Microtubule 
Dynamics and Attachments 

Tubulin-Sequestering and Microtubule-Severing Proteins 
Destabilize Microtubules 

Two Types of Motor Proteins Move Along Microtubules 

Microtubules and Motors Move Organelles and Vesicles 

Construction of Complex Microtubule Assemblies Requires 
Microtubule Dynamics and Motor Proteins 

Motile Cilia and Flagella Are Built from Microtubules and Dyneins 

Primary Cilia Perform Important Signaling Functions in 
Animal Cells 

Summary 


INTERMEDIATE FILAMENTS AND SEPTINS 


Intermediate Filament Structure Depends on the Lateral Bundling 


and Twisting of Coiled-Coils 


Intermediate Filaments Impart Mechanical Stability to Animal Cells 


Linker Proteins Connect Cytoskeletal Filaments and Bridge the 
Nuclear Envelope 

Septins Form Filaments That Regulate Cell Polarity 

Summary 


CELL POLARIZATION AND MIGRATION 
Many Cells Can Crawl Across a Solid Substratum 
Actin Polymerization Drives Plasma Membrane Protrusion 


Lamellipodia Contain All of the Machinery Required for Cell Motility 


Myosin Contraction and Cell Adhesion Allow Cells to Pull 
Themselves Forward 
Cell Polarization Is Controlled by Members of the Rho Protein 
Family 

Extracellular Signals Can Activate the Three Rho Protein Family 
Members 

External Signals Can Dictate the Direction of Cell Migration 

Communication Among Cytoskeletal Elements Coordinates 
Whole-Cell Polarization and Locomotion 

Summary 
Chapter 17 The Cell Cycle 


OVERVIEW OF THE CELL CYCLE 

The Eukaryotic Cell Cycle Usually Consists of Four Phases 
Cell-Cycle Control Is Similar in All Eukaryotes 

Cell-Cycle Progression Can Be Studied in Various Ways 
Summary 


THE CELL-CYCLE CONTROL SYSTEM 

The Cell-Cycle Control System Triggers the Major Events of 
the Cell Cycle 

The Cell-Cycle Control System Depends on Cyclically Activated 
Cyclin-Dependent Protein Kinases (Cdks) 

Cdk Activity Can Be Suppressed By Inhibitory Phosphorylation 
and Cdk Inhibitor Proteins (CKIs) 

Regulated Proteolysis Triggers the Metaphase-to-Anaphase 
Transition 

Cell-Cycle Control Also Depends on Transcriptional Regulation 

The Cell-Cycle Control System Functions as a Network of 
Biochemical Switches 

Summary 


S PHASE 

S-Cdk Initiates DNA Replication Once Per Cycle 

Chromosome Duplication Requires Duplication of Chromatin 
Structure 

Cohesins Hold Sister Chromatids Together 

Summary 


MITOSIS 

M-Cdk Drives Entry Into Mitosis 

Dephosphorylation Activates M-Cdk at the Onset of Mitosis 

Condensin Helps Configure Duplicated Chromosomes for 
Separation 

The Mitotic Spindle Is a Microtubule-Based Machine 

Microtubule-Dependent Motor Proteins Govern Spindle 
Assembly and Function 

Multiple Mechanisms Collaborate in the Assembly of a Bipolar 
Mitotic Spindle 

Centrosome Duplication Occurs Early in the Cell Cycle 

M-Cdk Initiates Spindle Assembly in Prophase 

The Completion of Spindle Assembly in Animal Cells Requires 
Nuclear-Envelope Breakdown 

Microtubule Instability Increases Greatly in Mitosis 

Mitotic Chromosomes Promote Bipolar Spindle Assembly 

Kinetochores Attach Sister Chromatids to the Spindle 

Bi-orientation Is Achieved by Trial and Error 

Multiple Forces Act on Chromosomes in the Spindle 

The APC/C Triggers Sister-Chromatid Separation and the 
Completion of Mitosis 

Unattached Chromosomes Block Sister-Chromatid Separation: 
The Spindle Assembly Checkpoint 

Chromosomes Segregate in Anaphase A and B 

Segregated Chromosomes Are Packaged in Daughter Nuclei 
at Telophase 

Summary 


CYTOKINESIS 

Actin and Myosin II in the Contractile Ring Generate the Force 
for Cytokinesis 

Local Activation of RhoA Triggers Assembly and Contraction 
of the Contractile Ring 

The Microtubules of the Mitotic Spindle Determine the Plane 
of Animal Cell Division 

The Phragmoplast Guides Cytokinesis in Higher Plants 

Membrane-Enclosed Organelles Must Be Distributed to 
Daughter Cells During Cytokinesis 
Some Cells Reposition Their Spindle to Divide Asymmetrically 
Mitosis Can Occur Without Cytokinesis 

The G1 Phase Is a Stable State of Cdk Inactivity 

Summary 


MEIOSIS 

Meiosis Includes Two Rounds of Chromosome Segregation 

Duplicated Homologs Pair During Meiotic Prophase 

Homolog Pairing Culminates in the Formation of a Synaptonemal 
Complex 

Homolog Segregation Depends on Several Unique Features 
of Meiosis I 

Crossing-Over Is Highly Regulated 

Meiosis Frequently Goes Wrong 

Summary 


CONTROL OF CELL DIVISION AND CELL GROWTH 

Mitogens Stimulate Cell Division 

Cells Can Enter a Specialized Nondividing State 

Mitogens Stimulate G1-Cdk and G1/S-Cdk Activities 

DNA Damage Blocks Cell Division: The DNA Damage Response 

Many Human Cells Have a Built-In Limitation on the Number 
of Times They Can Divide 

Abnormal Proliferation Signals Cause Cell-Cycle Arrest or 
Apoptosis, Except in Cancer Cells 

Cell Proliferation Is Accompanied by Cell Growth 

Proliferating Cells Usually Coordinate Their Growth and Division 

Summary 

Problems 
Chapter 18 Cell Death 


Apoptosis Eliminates Unwanted Cells 

Apoptosis Depends on an Intracellular Proteolytic Cascade 
That Is Mediated by Caspases 

Cell-Surface Death Receptors Activate the Extrinsic Pathway 
of Apoptosis 

The Intrinsic Pathway of Apoptosis Depends on Mitochondria 

Bcl2 Proteins Regulate the Intrinsic Pathway of Apoptosis 

IAPs Help Control Caspases 

Extracellular Survival Factors Inhibit Apoptosis in Various Ways 

Phagocytes Remove the Apoptotic Cell 

Either Excessive or Insufficient Apoptosis Can Contribute to 
Disease 

Summary 
References 


Chapter 19 Cell Junctions and the Extracellular 
Matrix 


CELL-CELL JUNCTIONS 

Cadherins Form a Diverse Family of Adhesion Molecules 

Cadherins Mediate Homophilic Adhesion 

Cadherin-Dependent Cell-Cell Adhesion Guides the 
Organization of Developing Tissues 

Epithelial-Mesenchymal Transitions Depend on Control of 
Cadherins 

Catenins Link Classical Cadherins to the Actin Cytoskeleton 

Adherens Junctions Respond to Forces Generated by the Actin 
Cytoskeleton 

Tissue Remodeling Depends on the Coordination of Actin- 
Mediated Contraction With Cell-Cell Adhesion 

Desmosomes Give Epithelia Mechanical Strength 

Tight Junctions Form a Seal Between Cells and a Fence 
Between Plasma Membrane Domains 

Tight Junctions Contain Strands of Transmembrane Adhesion 
Proteins 

Scaffold Proteins Organize Junctional Protein Complexes 

Gap Junctions Couple Cells Both Electrically and Metabolically 

A Gap-Junction Connexon Is Made of Six Transmembrane 
Connexin Subunits 

In Plants, Plasmodesmata Perform Many of the Same Functions 
as Gap Junctions 

Selectins Mediate Transient Cell-Cell Adhesions in the 
Bloodstream 
Members of the Immunoglobulin Superfamily Mediate 
Ca2+-Independent Cell-Cell Adhesion 
Summary 


THE EXTRACELLULAR MATRIX OF ANIMALS 

The Extracellular Matrix Is Made and Oriented by the Cells 
Within It 

Glycosaminoglycan (GAG) Chains Occupy Large Amounts of 
Space and Form Hydrated Gels 

Hyaluronan Acts as a Space Filler During Tissue Morphogenesis 
and Repair 

Proteoglycans Are Composed of GAG Chains Covalently 
Linked to a Core Protein 

Collagens Are the Major Proteins of the Extracellular Matrix 

Secreted Fibril-Associated Collagens Help Organize the Fibrils 

Cells Help Organize the Collagen Fibrils They Secrete by 
Exerting Tension on the Matrix 

Elastin Gives Tissues Their Elasticity 

Fibronectin and Other Multidomain Glycoproteins Help 
Organize the Matrix 

Fibronectin Binds to Integrins 

Tension Exerted by Cells Regulates the Assembly of 
Fibronectin Fibrils 

The Basal Lamina Is a Specialized Form of Extracellular Matrix 

Laminin and Type IV Collagen Are Major Components of the 
Basal Lamina 

Basal Laminae Have Diverse Functions 

Cells Have to Be Able to Degrade Matrix, as Well as Make It 

Matrix Proteoglycans and Glycoproteins Regulate the 
Activities of Secreted Proteins 

Summary 


CELL-MATRIX JUNCTIONS 

Integrins Are Transmembrane Heterodimers That Link the 
Extracellular Matrix to the Cytoskeleton 

Integrin Defects Are Responsible for Many Genetic Diseases 

Integrins Can Switch Between an Active and an Inactive 
Conformation 

Integrins Cluster to Form Strong Adhesions 

Extracellular Matrix Attachments Act Through Integrins to 
Control Cell Proliferation and Survival 

Integrins Recruit Intracellular Signaling Proteins at Sites of 
Cell-Matrix Adhesion 

Cell-Matrix Adhesions Respond to Mechanical Forces 

Summary 


THE PLANT CELL WALL 

The Composition of the Cell Wall Depends on the Cell Type 

The Tensile Strength of the Cell Wall Allows Plant Cells to 
Develop Turgor Pressure 

The Primary Cell Wall Is Built from Cellulose Microfibrils 
Interwoven with a Network of Pectic Polysaccharides 

Oriented Cell Wall Deposition Controls Plant Cell Growth 

Microtubules Orient Cell Wall Deposition 

Summary 

Problems 
Chapter 20 Cancer 


CANCER AS A MICROEVOLUTIONARY PROCESS 

Cancer Cells Bypass Normal Proliferation Controls and 
Colonize Other Tissues 

Most Cancers Derive from a Single Abnormal Cell 

Cancer Cells Contain Somatic Mutations 

A Single Mutation Is Not Enough to Change a Normal Cell 
into a Cancer Cell 

Cancers Develop Gradually from Increasingly Aberrant Cells 

Tumor Progression Involves Successive Rounds of Random 
Inherited Change Followed by Natural Selection 

Human Cancer Cells Are Genetically Unstable 

Cancer Cells Display an Altered Control of Growth 

Cancer Cells Have an Altered Sugar Metabolism 

Cancer Cells Have an Abnormal Ability to Survive Stress and 
DNA Damage 

Human Cancer Cells Escape a Built-in Limit to Cell Proliferation 

The Tumor Microenvironment Influences Cancer Development 
Cancer Cells Must Survive and Proliferate in a Foreign 
Environment 

Many Properties Typically Contribute to Cancerous Growth 

Summary 


CANCER-CRITICAL GENES: HOW THEY ARE FOUND 
AND WHAT THEY DO 

The Identification of Gain-of-Function and Loss-of-Function 
Cancer Mutations Has Traditionally Required Different 
Methods 

Retroviruses Can Act as Vectors for Oncogenes That Alter Cell 
Behavior 

Different Searches for Oncogenes Converged on the Same 
Gene—Ras 

Genes Mutated in Cancer Can Be Made Overactive in Many 
Ways 

Studies of Rare Hereditary Cancer Syndromes First Identified 
Tumor Suppressor Genes 

Both Genetic and Epigenetic Mechanisms Can Inactivate 
Tumor Suppressor Genes 

Systematic Sequencing of Cancer Cell Genomes Has 
Transformed Our Understanding of the Disease 

Many Cancers Have an Extraordinarily Disrupted Genome 

Many Mutations in Tumor Cells Are Merely Passengers 

About One Percent of the Genes in the Human Genome Are 
Cancer-Critical 

Disruptions in a Handful of Key Pathways Are Common to 
Many Cancers 

Mutations in the PI3K/Akt/mTOR Pathway Drive Cancer Cells 
to Grow 

Mutations in the p53 Pathway Enable Cancer Cells to Survive 
and Proliferate Despite Stress and DNA Damage 

Genome Instability Takes Different Forms in Different Cancers 

Cancers of Specialized Tissues Use Many Different Routes to 
Target the Common Core Pathways of Cancer 

Studies Using Mice Help to Define the Functions of Cancer- 
Critical Genes 

Cancers Become More and More Heterogeneous as They 
Progress 

The Changes in Tumor Cells That Lead to Metastasis Are 
Still Largely a Mystery 

A Small Population of Cancer Stem Cells May Maintain Many 
Tumors 

The Cancer Stem-Cell Phenomenon Adds to the Difficulty 
of Curing Cancer 

Colorectal Cancers Evolve Slowly Via a Succession of Visible 
Changes 

A Few Key Genetic Lesions Are Common to a Large Fraction 
of Colorectal Cancers 


Some Colorectal Cancers Have Defects in DNA Mismatch Repair 


The Steps of Tumor Progression Can Often Be Correlated 
with Specific Mutations 
Summary 


CANCER PREVENTION AND TREATMENT: PRESENT AND 
FUTURE 

Epidemiology Reveals That Many Cases of Cancer Are 
Preventable 

Sensitive Assays Can Detect Those Cancer-Causing Agents 
that Damage DNA 

Fifty Percent of Cancers Could Be Prevented by Changes 
in Lifestyle 

Viruses and Other Infections Contribute to a Significant 
Proportion of Human Cancers 

Cancers of the Uterine Cervix Can Be Prevented by Vaccination 
Against Human Papillomavirus 

Infectious Agents Can Cause Cancer in a Variety of Ways 

The Search for Cancer Cures Is Difficult but Not Hopeless 

Traditional Therapies Exploit the Genetic Instability and Loss 
of Cell-Cycle Checkpoint Responses in Cancer Cells 

New Drugs Can Kill Cancer Cells Selectively by Targeting 
Specific Mutations 

PARP Inhibitors Kill Cancer Cells That Have Defects in Brca1 
or Brca2 Genes 

Small Molecules Can Be Designed to Inhibit Specific 
Oncogenic Proteins 
Many Cancers May Be Treatable by Enhancing the Immune 
Response Against the Specific Tumor 

Cancers Evolve Resistance to Therapies 

Combination Therapies May Succeed Where Treatments with 
One Drug at a Time Fail 

We Now Have the Tools to Devise Combination Therapies 
Tailored to the Individual Patient 

Summary 

Problems 

References 


Chapter 21 Development of Multicellular 
Organisms 


OVERVIEW OF DEVELOPMENT 

Conserved Mechanisms Establish the Basic Animal Body Plan 

The Developmental Potential of Cells Becomes Progressively 
Restricted 

Cell Memory Underlies Cell Decision-Making 

Several Model Organisms Have Been Crucial for Understanding 
Development 

Genes Involved in Cell-Cell Communication and Transcriptional 
Control Are Especially Important for Animal Development 

Regulatory DNA Seems Largely Responsible for the Differences 
Between Animal Species 

Small Numbers of Conserved Cell-Cell Signaling Pathways 
Coordinate Spatial Patterning 

Through Combinatorial Control and Cell Memory, Simple 
Signals Can Generate Complex Patterns 

Morphogens Are Long-Range Inductive Signals That Exert 
Graded Effects 

Lateral Inhibition Can Generate Patterns of Different Cell Types 

Short-Range Activation and Long-Range Inhibition Can 
Generate Complex Cellular Patterns 

Asymmetric Cell Division Can Also Generate Diversity 

Initial Patterns Are Established in Small Fields of Cells and 
Refined by Sequential Induction as the Embryo Grows 

Developmental Biology Provides Insights into Disease and 
Tissue Maintenance 

Summary 


MECHANISMS OF PATTERN FORMATION 

Different Animals Use Different Mechanisms to Establish Their 
Primary Axes of Polarization 

Studies in Drosophila Have Revealed the Genetic Control 
Mechanisms Underlying Development 

Egg-Polarity Genes Encode Macromolecules Deposited in the 
Egg to Organize the Axes of the Early Drosophila Embryo 

Three Groups of Genes Control Drosophila Segmentation Along 
the A-P Axis 

A Hierarchy of Gene Regulatory Interactions Subdivides the 
Drosophila Embryo 

Egg-Polarity, Gap, and Pair-Rule Genes Create a Transient 
Pattern That Is Remembered by Segment-Polarity and 
Hox Genes 

Hox Genes Permanently Pattern the A-P Axis 

Hox Proteins Give Each Segment Its Individuality 

Hox Genes Are Expressed According to Their Order in the 
Hox Complex 

Trithorax and Polycomb Group Proteins Enable the Hox 
Complexes to Maintain a Permanent Record of Positional 
Information 

The D-V Signaling Genes Create a Gradient of the Transcription 
Regulator Dorsal 

A Hierarchy of Inductive Interactions Subdivides the Vertebrate 
Embryo 

A Competition Between Secreted Signaling Proteins Patterns 
The surface of our planet is populated by living things: curious, intricately 
organized chemical factories that take in matter from their surroundings and use these 
raw materials to generate copies of themselves. These living organisms appear 
extraordinarily diverse. What could be more different than a tiger and a piece of 
seaweed, or a bacterium and a tree? Yet our ancestors, knowing nothing of cells or 
DNA, saw that all these things had something in common. They called that some- 
thing “life,” marveled at it, struggled to define it, and despaired of explaining what 
it was or how it worked in terms that relate to nonliving matter. 

The discoveries of the past century have not diminished the marvel—quite the 
contrary. But they have removed the central mystery regarding the nature of life. 
We can now see that all living things are made of cells: small, membrane-enclosed 
units filled with a concentrated aqueous solution of chemicals and endowed with 
the extraordinary ability to create copies of themselves by growing and then divid- 
ing in two. 

Because cells are the fundamental units of life, it is to cell biology—the study 
of the structure, function, and behavior of cells—that we must look for answers 
to the questions of what life is and how it works. With a deeper understanding of 
cells and their evolution, we can begin to tackle the grand historical problems of 
life on Earth: its mysterious origins, its stunning diversity, and its invasion of every 
conceivable habitat. Indeed, as emphasized long ago by the pioneering cell biolo- 
gist E. B. Wilson, “the key to every biological problem must finally be sought in the 
cell; for every living organism is, or at some time has been, a cell.” 

Despite their apparent diversity, living things are fundamentally similar inside. 
The whole of biology is thus a counterpoint between two themes: astonishing 
variety in individual particulars; astonishing constancy in fundamental mecha- 
nisms. In this first chapter, we begin by outlining the universal features common 
to all life on our planet. We then survey, briefly, the diversity of cells. And we see 
how, thanks to the common molecular code in which the specifications for all 
living organisms are written, it is possible to read, measure, and decipher these 
specifications to help us achieve a coherent understanding of all the forms of life, 
from the smallest to the greatest. 
Chapter 1: Cells and Genomes 





THE UNIVERSAL FEATURES OF CELLS ON EARTH 


It is estimated that there are more than 10 million—perhaps 100 million—living 
species on Earth today. Each species is different, and each reproduces itself faith- 
fully, yielding progeny that belong to the same species: the parent organism hands 
down information specifying, in extraordinary detail, the characteristics that the 
offspring shall have. This phenomenon of heredity is central to the definition of 
life: it distinguishes life from other processes, such as the growth of a crystal, or the 
burning of a candle, or the formation of waves on water, in which orderly struc- 
tures are generated but without the same type of link between the peculiarities of 
parents and the peculiarities of offspring. Like the candle flame, the living organ- 
ism must consume free energy to create and maintain its organization. But life 
employs the free energy to drive a hugely complex system of chemical processes 
that are specified by hereditary information. 

Most living organisms are single cells. Others, such as ourselves, are vast mul- 
ticellular cities in which groups of cells perform specialized functions linked by 
intricate systems of communication. But even for the aggregate of more than 10^13
cells that form a human body, the whole organism has been generated by cell 
divisions from a single cell. The single cell, therefore, is the vehicle for all of the 
hereditary information that defines each species (Figure 1-1). This cell includes 
the machinery to gather raw materials from the environment and to construct 
from them a new cell in its own image, complete with a new copy of its hereditary 
information. Each and every cell is truly amazing. 


Figure 1-1 The hereditary information in the fertilized egg cell determines
the nature of the whole multicellular organism. Although their starting cells
look superficially similar, a sea urchin egg gives rise to a sea urchin
(A and B), a mouse egg gives rise to a mouse (C and D), and an egg of the
seaweed Fucus gives rise to a Fucus seaweed (E and F). (A, courtesy of David
McClay; B, courtesy of M. Gibbs, Oxford Scientific Films; C, courtesy of
Patricia Calarco, from G. Martin, Science 209:768-776, 1980. With permission
from AAAS; D, courtesy of O. Newman, Oxford Scientific Films; E and F,
courtesy of Colin Brownlee.)


All Cells Store Their Hereditary Information in the Same Linear
Chemical Code: DNA


Computers have made us familiar with the concept of information as a
measurable quantity—a million bytes (to record a few hundred pages of text or
an image from a digital camera), 600 million bytes for the music on a CD, and
so on. Computers have also made us well aware that the same information can be
recorded in many different physical forms: the discs and tapes that we used 20
years ago for our electronic archives have become unreadable on present-day
machines. Living cells, like computers, store information, and it is estimated
that they have been
evolving and diversifying for over 3.5 billion years. It is scarcely to be expected 
that they would all store their information in the same form, or that the archives 
of one type of cell should be readable by the information-handling machinery of 
another. And yet it is so. All living cells on Earth store their hereditary informa- 
tion in the form of double-stranded molecules of DNA—long, unbranched, paired 
polymer chains, formed always of the same four types of monomers. These mono- 
mers, chemical compounds known as nucleotides, have nicknames drawn from 
a four-letter alphabet—A, T, C, G—and they are strung together in a long linear 
sequence that encodes the genetic information, just as the sequence of 1s and 0s
encodes the information in a computer file. We can take a piece of DNA from a 
human cell and insert it into a bacterium, or a piece of bacterial DNA and insert 
it into a human cell, and the information will be successfully read, interpreted, 
and copied. Using chemical methods, scientists have learned how to read out the 
complete sequence of monomers in any DNA molecule—extending for many mil- 
lions of nucleotides—and thereby decipher all of the hereditary information that 
each organism contains. 
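
The comparison with a computer file can be made concrete. With four possible
nucleotides, each position carries at most two bits, so a DNA sequence can be
packed into bytes and recovered exactly. The Python sketch below illustrates
this under stated assumptions: the bit assignment (A=00, C=01, G=10, T=11) and
the function names are arbitrary choices made for illustration, not a
convention used by cells or by any particular file format.

```python
# Illustrative 2-bit encoding of a DNA sequence (assumed mapping, not a standard).
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
LETTER = {bits: base for base, bits in CODE.items()}


def pack(sequence: str) -> bytes:
    """Pack a DNA string into bytes, four nucleotides per byte."""
    packed = bytearray()
    for start in range(0, len(sequence), 4):
        chunk = sequence[start:start + 4]
        value = 0
        for base in chunk:
            value = (value << 2) | CODE[base]
        # Left-align a short final chunk so unpacking can read from the top bits.
        value <<= 2 * (4 - len(chunk))
        packed.append(value)
    return bytes(packed)


def unpack(packed: bytes, length: int) -> str:
    """Recover the first `length` nucleotides from packed bytes."""
    bases = []
    for byte in packed:
        for shift in (6, 4, 2, 0):
            bases.append(LETTER[(byte >> shift) & 0b11])
    return "".join(bases[:length])


if __name__ == "__main__":
    dna = "GATTACAGT"
    stored = pack(dna)
    print(len(dna), "nucleotides ->", len(stored), "bytes")
    print(unpack(stored, len(dna)))
```

The original length has to be stored alongside the packed bytes, because the
final byte may contain padding when the sequence length is not a multiple of
four.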









All Cells Replicate Their Hereditary Information by Templated 
Polymerization 


The mechanisms that make life possible depend on the structure of the double- 
stranded DNA molecule. Each monomer in a single DNA strand—that is, each 
nucleotide—consists of two parts: a sugar (deoxyribose) with a phosphate group 
attached to it, and a base, which may be either adenine (A), guanine (G), cytosine 
(C), or thymine (T) (Figure 1-2). Each sugar is linked to the next via the phos- 
phate group, creating a polymer chain composed of a repetitive sugar-phosphate 
backbone with a series of bases protruding from it. The DNA polymer is extended 
by adding monomers at one end. For a single isolated strand, these monomers 
can, in principle, be added in any order, because each one links to the next in the 
same way, through the part of the molecule that is the same for all of them. In the 
living cell, however, DNA is not synthesized as a free strand in isolation, but on 
a template formed by a preexisting DNA strand. The bases protruding from the
existing strand bind to bases of the strand being synthesized, according to a strict
rule defined by the complementary structures of the bases: A binds to T, and C
binds to G. This base-pairing holds fresh monomers in place and thereby controls
the selection of which one of the four monomers shall be added to the growing
strand next. In this way, a double-stranded structure is created, consisting of
two exactly complementary sequences of As, Cs, Ts, and Gs. The two strands twist
around each other, forming a DNA double helix (Figure 1-2E).

Figure 1-2 DNA and its building blocks. (A) DNA is made from simple subunits,
called nucleotides, each consisting of a sugar-phosphate molecule with a
nitrogen-containing side group, or base, attached to it. The bases are of four
types (adenine, guanine, cytosine, and thymine), corresponding to four distinct
nucleotides, labeled A, G, C, and T. (B) A single strand of DNA consists of
nucleotides joined together by sugar-phosphate linkages. Note that the individual
sugar-phosphate units are asymmetric, giving the backbone of the strand a definite
directionality, or polarity. This directionality guides the molecular processes by
which the information in DNA is interpreted and copied in cells: the information
is always “read” in a consistent order, just as written English text is read from
left to right. (C) Through templated polymerization, the sequence of nucleotides
in an existing DNA strand controls the sequence in which nucleotides are joined
together in a new DNA strand; T in one strand pairs with A in the other, and G in
one strand with C in the other. The new strand has a nucleotide sequence
complementary to that of the old strand, and a backbone with opposite
directionality: corresponding to the GTAA... of the original strand, it has
...TTAC. (D) A normal DNA molecule consists of two such complementary strands.
The nucleotides within each strand are linked by strong (covalent) chemical bonds;
the complementary nucleotides on opposite strands are held together more weakly,
by hydrogen bonds. (E) The two strands twist around each other to form a double
helix—a robust structure that can accommodate any sequence of nucleotides without
altering its basic structure (see Movie 4.1).

[Figure 1-3 diagram labels: parent DNA double helix; template strands; new strands]

The bonds between the base pairs are weak compared with the sugar-phos- 
phate links, and this allows the two DNA strands to be pulled apart without break- 
age of their backbones. Each strand then can serve as a template, in the way just 
described, for the synthesis of a fresh DNA strand complementary to itself—a 
fresh copy, that is, of the hereditary information (Figure 1-3). In different types 
of cells, this process of DNA replication occurs at different rates, with different 
controls to start it or stop it, and different auxiliary molecules to help it along. But 
the basics are universal: DNA is the information store for heredity, and templated 
polymerization is the way in which this information is copied throughout the liv- 
ing world. 
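
The copying rule described above is simple enough to express as a short
program. The following Python sketch is only an illustration of the pairing
logic, not of the enzymatic machinery: the function names are invented here,
and both strands are written in the same reading direction, so the new strand
comes out as the reversed complement of its template (GTAA gives TTAC, as in
Figure 1-2C).

```python
# Pairing rules stated in the text: A pairs with T, and C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def synthesize_complement(template: str) -> str:
    """Return the new strand built on `template`.

    Template and product are written in the same reading direction.
    Because paired strands run in opposite directions, the product is
    the complement of the template read backwards.
    """
    return "".join(PAIR[base] for base in reversed(template))


def replicate(strand: str) -> tuple[tuple[str, str], tuple[str, str]]:
    """Separate a double helix and copy each strand on its template.

    The helix is represented by one strand; its partner is implied by
    the pairing rules. Returns two daughter helices, each given as an
    (old strand, new strand) pair.
    """
    partner = synthesize_complement(strand)
    daughter_1 = (strand, synthesize_complement(strand))
    daughter_2 = (partner, synthesize_complement(partner))
    return daughter_1, daughter_2


if __name__ == "__main__":
    print(synthesize_complement("GTAA"))
    print(replicate("GTAA"))
```

Each daughter helix contains one old strand and one newly made strand, and the
two daughters carry the same pair of sequences as the parent.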


All Cells Transcribe Portions of Their Hereditary Information into 
the Same Intermediary Form: RNA 


To carry out its information-bearing function, DNA must do more than copy itself. 
It must also express its information, by letting the information guide the synthesis 
of other molecules in the cell. This expression occurs by a mechanism that is the 
same in all living organisms, leading first and foremost to the production of two 
other key classes of polymers: RNAs and proteins. The process (discussed in detail 
in Chapters 6 and 7) begins with a templated polymerization called transcription, 
in which segments of the DNA sequence are used as templates for the synthe- 
sis of shorter molecules of the closely related polymer ribonucleic acid, or RNA. 
Later, in the more complex process of translation, many of these RNA molecules 
direct the synthesis of polymers of a radically different chemical class—the pro- 
teins (Figure 1-4). 

In RNA, the backbone is formed of a slightly different sugar from that of DNA—
ribose instead of deoxyribose—and one of the four bases is slightly different—ura- 
cil (U) in place of thymine (T). But the other three bases—A, C, and G—are the 
same, and all four bases pair with their complementary counterparts in DNA—the 
A, U, C, and G of RNA with the T, A, G, and C of DNA. During transcription, the 
RNA monomers are lined up and selected for polymerization on a template strand 
of DNA, just as DNA monomers are selected during replication. The outcome is a 
polymer molecule whose sequence of nucleotides faithfully represents a portion 
of the cell’s genetic information, even though it is written in a slightly different 
alphabet—consisting of RNA monomers instead of DNA monomers. 
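
Transcription follows the same templating logic as replication, with one change
of alphabet. As a minimal sketch (the function name is an assumption made for
illustration, and real transcription copies only selected segments of the DNA),
the pairing listed above can be written in Python as:

```python
# Each DNA template base selects the RNA base that pairs with it:
# T, A, G, C of DNA pair with A, U, C, G of RNA.
DNA_TO_RNA = {"T": "A", "A": "U", "G": "C", "C": "G"}


def transcribe(template: str) -> str:
    """Return the RNA made on a DNA template strand.

    Both sequences are written in the same reading direction, so the
    transcript is the complement of the template read backwards.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(template))


if __name__ == "__main__":
    print(transcribe("TTAC"))
```

For the template TTAC the transcript is GUAA, which matches the DNA strand
complementary to the template (GTAA) except that U replaces T.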

The same segment of DNA can be used repeatedly to guide the synthesis of
many identical RNA molecules. Thus, whereas the cell’s archive of genetic
information in the form of DNA is fixed and sacrosanct, these RNA transcripts are
mass-produced and disposable (Figure 1-5). As we shall see, these transcripts
function as intermediates in the transfer of genetic information. Most notably,
they serve as messenger RNA (mRNA) molecules that guide the synthesis of
proteins according to the genetic instructions stored in the DNA.

Figure 1-3 The copying of genetic information by DNA replication. In this
process, the two strands of a DNA double helix are pulled apart, and each serves
as a template for synthesis of a new complementary strand.

Figure 1-4 From DNA to protein. Genetic information is read out and put to use
through a two-step process. First, in transcription, segments of the DNA
sequence are used to guide the synthesis of molecules of RNA. Then, in
translation, the RNA molecules are used to guide the synthesis of molecules of
protein.

RNA molecules have distinctive structures that can also give them other spe- 
cialized chemical capabilities. Being single-stranded, their backbone is flexible, 
so that the polymer chain can bend back on itself to allow one part of the molecule 
to form weak bonds with another part of the same molecule. This occurs when 
segments of the sequence are locally complementary: a ...GGGG... segment, for 
example, will tend to associate with a ...CCCC... segment. These types of internal 
associations can cause an RNA chain to fold up into a specific shape that is dic- 
tated by its sequence (Figure 1-6). The shape of the RNA molecule, in turn, may 
enable it to recognize other molecules by binding to them selectively—and even, 
in certain cases, to catalyze chemical changes in the molecules that are bound. In 
fact, some chemical reactions catalyzed by RNA molecules are crucial for several 
of the most ancient and fundamental processes in living cells, and it has been sug- 
gested that an extensive catalysis by RNA played a central part in the early evolu- 
tion of life (discussed in Chapter 6). 
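
The idea that locally complementary segments can pair within a single RNA chain
can be illustrated with a small search. The Python sketch below is a deliberate
simplification: the function names are invented for illustration, only the
standard A-U and G-C pairs are considered, and no attempt is made to predict a
real three-dimensional fold.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def can_pair(segment_a: str, segment_b: str) -> bool:
    """True if two RNA segments, written in the same reading direction,
    are fully complementary when aligned in opposite directions."""
    if len(segment_a) != len(segment_b):
        return False
    return all(RNA_PAIR[a] == b for a, b in zip(segment_a, reversed(segment_b)))


def find_stems(rna: str, length: int = 4) -> list[tuple[int, int]]:
    """List start positions (i, j) of non-overlapping segments of the
    given length that could pair with each other within one chain."""
    stems = []
    for i in range(len(rna) - length + 1):
        for j in range(i + length, len(rna) - length + 1):
            if can_pair(rna[i:i + length], rna[j:j + length]):
                stems.append((i, j))
    return stems


if __name__ == "__main__":
    print(can_pair("GGGG", "CCCC"))
    print(find_stems("GGGGAAAACCCC"))
```

In the example chain, the GGGG segment at the start can fold back onto the CCCC
segment at the end, leaving the AAAA segment as an unpaired loop.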


All Cells Use Proteins as Catalysts 


Protein molecules, like DNA and RNA molecules, are long unbranched polymer 
chains, formed by stringing together monomeric building blocks drawn from a 
standard repertoire that is the same for all living cells. Like DNA and RNA, pro- 
teins carry information in the form of a linear sequence of symbols, in the same 
way as a human message written in an alphabetic script. There are many different 
protein molecules in each cell, and—leaving out the water—they form most of the 
cell’s mass. 

Figure 1-5 How genetic information is broadcast for use inside the cell.
Each cell contains a fixed set of DNA molecules—its archive of genetic
information. A given segment of this DNA guides the synthesis of many identical
RNA transcripts, which serve as working copies of the information stored in the
archive. Many different sets of RNA molecules can be made by transcribing
different parts of a cell’s DNA sequences, allowing different types of cells to
use the same information store differently.


Figure 1-6 The conformation of an RNA molecule. (A) Nucleotide pairing
between different regions of the same RNA polymer chain causes the molecule
to adopt a distinctive shape. (B) The three-dimensional structure of an actual
RNA molecule produced by hepatitis delta virus; this RNA can catalyze RNA strand
cleavage. The blue ribbon represents the sugar-phosphate backbone and the bars
represent base pairs (see Movie 6.1). (B, based on A.R. Ferré-D’Amare, K. Zhou,
and J.A. Doudna, Nature 395:567-574, 1998. With permission from Macmillan
Publishers Ltd.)

Figure 1-7 How a protein molecule acts as a catalyst for a chemical reaction.
(A) In a protein molecule, the polymer chain folds up into a specific shape
defined by its amino acid sequence. A groove in the surface of this particular
folded molecule, the enzyme lysozyme, forms a catalytic site. (B) A
polysaccharide molecule (red)—a polymer chain of sugar monomers—binds to the
catalytic site of lysozyme and is broken apart, as a result of a covalent
bond-breaking reaction catalyzed by the amino acids lining the groove (see
Movie 3.9). (PDB code: 1LYD.)





The monomers of protein, the amino acids, are quite different from those of 
DNA and RNA, and there are 20 types instead of 4. Each amino acid is built around 
the same core structure through which it can be linked in a standard way to any 
other amino acid in the set; attached to this core is a side group that gives each 
amino acid a distinctive chemical character. Each of the protein molecules is a 
polypeptide, created by joining its amino acids in a particular sequence. Through 
billions of years of evolution, this sequence has been selected to give the protein a 
useful function. Thus, by folding into a precise three-dimensional form with reac- 
tive sites on its surface (Figure 1-7A), these amino-acid polymers can bind with 
high specificity to other molecules and can act as enzymes to catalyze reactions 
that make or break covalent bonds. In this way they direct the vast majority of 
chemical processes in the cell (Figure 1-7B). 

Proteins have many other functions as well—maintaining structures, generat- 
ing movements, sensing signals, and so on—each protein molecule performing 
a specific function according to its own genetically specified sequence of amino 
acids. Proteins, above all, are the main molecules that put the cell’s genetic infor- 
mation into action. 

Thus, polynucleotides specify the amino acid sequences of proteins. Proteins, 
in turn, catalyze many chemical reactions, including those by which new DNA 
molecules are synthesized. From the most fundamental point of view, a living cell 
is a self-replicating collection of catalysts that takes in food, processes this food 
to derive both the building blocks and energy needed to make more catalysts, 
and discards the materials left over as waste (Figure 1-8A). A feedback loop that 
connects proteins and polynucleotides forms the basis for this autocatalytic, self- 
reproducing behavior of living organisms (Figure 1-8B). 


All Cells Translate RNA into Protein in the Same Way 


How the information in DNA specifies the production of proteins was a com- 
plete mystery in the 1950s when the double-stranded structure of DNA was first 
revealed as the basis of heredity. But in the intervening years, scientists have dis- 
covered the elegant mechanisms involved. The translation of genetic information 
from the 4-letter alphabet of polynucleotides into the 20-letter alphabet of pro- 
teins is a complex process. The rules of this translation seem in some respects 
neat and rational but in other respects strangely arbitrary, given that they are 
(with minor exceptions) identical in all living things. These arbitrary features, it 
is thought, reflect frozen accidents in the early history of life. They stem from the 
chance properties of the earliest organisms that were passed on by heredity and 
have become so deeply embedded in the constitution of all living cells that they 
cannot be changed without disastrous effects. 

It turns out that the information in the sequence of a messenger RNA molecule
is read out in groups of three nucleotides at a time: each triplet of nucleotides, or
codon, specifies (codes for) a single amino acid in a corresponding protein. Since
the number of distinct triplets that can be formed from four nucleotides is 4^3,
there are 64 possible codons, all of which occur in nature. However, there are only
20 naturally occurring amino acids. That means there are necessarily many cases 
in which several codons correspond to the same amino acid. This genetic code is 
read out by a special class of small RNA molecules, the transfer RNAs (tRNAs). 
Each type of tRNA becomes attached at one end to a specific amino acid, and 
displays at its other end a specific sequence of three nucleotides—an anticodon— 
that enables it to recognize, through base-pairing, a particular codon or subset of 
codons in mRNA. The intricate chemistry that enables these tRNAs to translate 
a specific sequence of A, C, G, and U nucleotides in an mRNA molecule into a 
specific sequence of amino acids in a protein molecule occurs on the ribosome, a 
large multimolecular machine composed of both protein and ribosomal RNA. All 
of these processes are described in detail in Chapter 6. 
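
The arithmetic of the code, 64 codons for 20 amino acids, can be checked
directly. The Python sketch below builds the standard genetic code as a lookup
table and reads an mRNA three nucleotides at a time. Several things here go
beyond this section and should be read as assumptions: the one-letter amino
acid abbreviations, the "*" symbol for the stop codons that are part of the
standard code, the function name, and the choice to begin reading at the first
nucleotide. The ribosome and tRNAs are not modeled at all.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per amino acid; "*" marks a stop codon.
# Codons are ordered with U, C, A, G at each of the three positions.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Read an mRNA three nucleotides at a time until a stop codon."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(len(CODON_TABLE), "codons")
    print(len(set(AMINO_ACIDS) - {"*"}), "amino acids")
    print(translate("AUGGCUUAA"))
```

Because 64 codons map onto 20 amino acids, the table is necessarily redundant:
leucine (L), for example, appears six times, whereas methionine (M) and
tryptophan (W) each appear once.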


Each Protein Is Encoded by a Specific Gene 


DNA molecules as a rule are very large, containing the specifications for thou- 
sands of proteins. Special sequences in the DNA serve as punctuation, defining 
where the information for each protein begins and ends. And individual segments 
of the long DNA sequence are transcribed into separate mRNA molecules, coding 
for different proteins. Each such DNA segment represents one gene. A complica- 
tion is that RNA molecules transcribed from the same DNA segment can often be 
processed in more than one way, so as to give rise to a set of alternative versions 
of a protein, especially in more complex cells such as those of plants and animals. 
In addition, some DNA segments—a smaller number—are transcribed into RNA 
molecules that are not translated but have catalytic, regulatory, or structural func- 
tions; such DNA segments also count as genes. A gene therefore is defined as the 
segment of DNA sequence corresponding to a single protein or set of alternative 
protein variants or to a single catalytic, regulatory, or structural RNA molecule. 

In all cells, the expression of individual genes is regulated: instead of manu- 
facturing its full repertoire of possible proteins at full tilt all the time, the cell 
adjusts the rate of transcription and translation of different genes independently, 
according to need. Stretches of regulatory DNA are interspersed among the seg- 
ments that code for protein, and these noncoding regions bind to special protein 
molecules that control the local rate of transcription. The quantity and organiza- 
tion of the regulatory DNA vary widely from one class of organisms to another, 
but the basic strategy is universal. In this way, the genome of the cell—that is, the 
totality of its genetic information as embodied in its complete DNA sequence— 
dictates not only the nature of the cell’s proteins, but also when and where they 
are to be made. 


Figure 1-8 Life as an autocatalytic 
process. (A) The cell as a self-replicating 
collection of catalysts. (B) Polynucleotides 
(the nucleic acids DNA and RNA, which are 
nucleotide polymers) provide the sequence 
information, while proteins (amino acid 
polymers) provide most of the catalytic 
functions that serve—through a complex 
set of chemical reactions—to bring about 
the synthesis of more polynucleotides and 
proteins of the same types. 


Life Requires Free Energy


A living cell is a dynamic chemical system, operating far from chemical equilib- 
rium. For a cell to grow or to make a new cell in its own image, it must take in 
free energy from the environment, as well as raw materials, to drive the necessary 
synthetic reactions. This consumption of free energy is fundamental to life. When 
it stops, a cell decays toward chemical equilibrium and soon dies. 

Genetic information is also fundamental to life, and free energy is required 
for the propagation of this information. For example, to specify one bit of infor- 
mation—that is, one yes/no choice between two equally probable alternatives— 
costs a defined amount of free energy that can be calculated. The quantitative 
relationship involves some deep reasoning and depends on a precise definition of 
the term “free energy,” as explained in Chapter 2. The basic idea, however, is not
difficult to understand intuitively. 
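
To give a feel for the scale involved, the short Python calculation below
evaluates one commonly quoted thermodynamic lower bound for a single binary
choice, k_B T ln 2. This is an assumption made here for illustration: the
section does not state the relationship, and the treatment in Chapter 2 may be
framed differently. The chosen temperature of 310 K (roughly body temperature)
is also an illustrative choice.

```python
import math

BOLTZMANN = 1.380649e-23   # J/K (exact SI value)
AVOGADRO = 6.02214076e23   # 1/mol (exact SI value)


def min_free_energy_per_bit(temperature_kelvin: float) -> float:
    """k_B * T * ln(2): assumed lower bound, in joules, for one binary choice."""
    return BOLTZMANN * temperature_kelvin * math.log(2)


if __name__ == "__main__":
    per_bit = min_free_energy_per_bit(310.0)
    print(f"{per_bit:.2e} J per bit")
    print(f"{per_bit * AVOGADRO / 1000:.2f} kJ per mole of bits")
```

On this assumption the cost is a few thousandths of an attojoule per bit, or
somewhat under 2 kJ per mole of binary choices. Real cellular processes spend
considerably more than any such minimum.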

Picture the molecules in a cell as a swarm of objects endowed with thermal 
energy, moving around violently at random, buffeted by collisions with one 
another. To specify genetic information—in the form of a DNA sequence, for 
example—molecules from this wild crowd must be captured, arranged in a spe- 
cific order defined by some preexisting template, and linked together in a fixed 
relationship. The bonds that hold the molecules in their proper places on the 
template and join them together must be strong enough to resist the disordering 
effect of thermal motion. The process is driven forward by consumption of free 
energy, which is needed to ensure that the correct bonds are made, and made 
robustly. In the simplest case, the molecules can be compared with spring-loaded 
traps, ready to snap into a more stable, lower-energy attached state when they 
meet their proper partners; as they snap together into the bonded arrangement, 
their available stored energy—their free energy—like the energy of the spring 
in the trap, is released and dissipated as heat. In a cell, the chemical processes 
underlying information transfer are more complex, but the same basic principle 
applies: free energy has to be spent on the creation of order. 

To replicate its genetic information faithfully, and indeed to make all its com- 
plex molecules according to the correct specifications, the cell therefore requires 
free energy, which has to be imported somehow from the surroundings. As we 
shall see in Chapter 2, the free energy required by animal cells is derived from 
chemical bonds in food molecules that the animals eat, while plants get their free 
energy from sunlight. 


All Cells Function as Biochemical Factories Dealing with the Same 
Basic Molecular Building Blocks 


Because all cells make DNA, RNA, and protein, all cells have to contain and 
manipulate a similar collection of small molecules, including simple sugars, 
nucleotides, and amino acids, as well as other substances that are universally 
required. All cells, for example, require the phosphorylated nucleotide ATP (ade- 
nosine triphosphate), not only as a building block for the synthesis of DNA and 
RNA, but also as a carrier of the free energy that is needed to drive a huge number 
of chemical reactions in the cell. 

Although all cells function as biochemical factories of a broadly similar type, 
many of the details of their small-molecule transactions differ. Some organisms, 
such as plants, require only the simplest of nutrients and harness the energy of 
sunlight to make all their own small organic molecules. Other organisms, such as 
animals, feed on living things and must obtain many of their organic molecules 
ready-made. We return to this point later. 


All Cells Are Enclosed in a Plasma Membrane Across Which 
Nutrients and Waste Materials Must Pass 


Another universal feature is that each cell is enclosed by a membrane—the
plasma membrane. This container acts as a selective barrier that enables the cell
to concentrate nutrients gathered from its environment and retain the products it
synthesizes for its own use, while excreting its waste products. Without a plasma
membrane, the cell could not maintain its integrity as a coordinated chemical
system.

Figure 1-9 Formation of a membrane by amphiphilic phospholipid
molecules. Phospholipids have a hydrophilic (water-loving, phosphate) head
group and a hydrophobic (water-avoiding, hydrocarbon) tail. At an interface
between oil and water, they arrange themselves as a single sheet with their
head groups facing the water and their tail groups facing the oil. But when
immersed in water, they aggregate to form bilayers enclosing aqueous
compartments, as indicated.

The molecules that form a membrane have the simple physicochemical 
property of being amphiphilic—that is, consisting of one part that is hydropho- 
bic (water-insoluble) and another part that is hydrophilic (water-soluble). Such 
molecules placed in water aggregate spontaneously, arranging their hydropho- 
bic portions to be as much in contact with one another as possible to hide them 
from the water, while keeping their hydrophilic portions exposed. Amphiphilic 
molecules of appropriate shape, such as the phospholipid molecules that com- 
prise most of the plasma membrane, spontaneously aggregate in water to create 
a bilayer that forms small closed vesicles (Figure 1-9). The phenomenon can be 
demonstrated in a test tube by simply mixing phospholipids and water together; 
under appropriate conditions, small vesicles form whose aqueous contents are 
isolated from the external medium. 

Although the chemical details vary, the hydrophobic tails of the predominant 
membrane molecules in all cells are hydrocarbon polymers (-CH2-CH2-CH2-),
and their spontaneous assembly into a bilayered vesicle is but one of many exam- 
ples of an important general principle: cells produce molecules whose chemical 
properties cause them to self-assemble into the structures that a cell needs. 

The cell boundary cannot be totally impermeable. If a cell is to grow and
reproduce, it must be able to import raw materials and export waste across its plasma
membrane. All cells therefore have specialized proteins embedded in their mem- 
brane that transport specific molecules from one side to the other. Some of these 
membrane transport proteins, like some of the proteins that catalyze the funda- 
mental small-molecule reactions inside the cell, have been so well preserved over 
the course of evolution that we can recognize the family resemblances between 
them in comparisons of even the most distantly related groups of living organ- 
isms. 

The transport proteins in the membrane largely determine which molecules 
enter the cell, and the catalytic proteins inside the cell determine the reactions 
that those molecules undergo. Thus, by specifying the proteins that the cell is to 
manufacture, the genetic information recorded in the DNA sequence dictates the 
entire chemistry of the cell; and not only its chemistry, but also its form and its 
behavior, for these too are chiefly constructed and controlled by the cell’s proteins. 


A Living Cell Can Exist with Fewer Than 500 Genes 


The basic principles of biological information transfer are simple enough, but how 
complex are real living cells? In particular, what are the minimum requirements? 
We can get a rough indication by considering a species that has one of the small- 
est known genomes—the bacterium Mycoplasma genitalium (Figure 1-10). This 
organism lives as a parasite in mammals, and its environment provides it with 
many of its small molecules ready-made. Nevertheless, it still has to make all the 
large molecules—DNA, RNAs, and proteins—required for the basic processes of 
heredity. It has about 530 genes, about 400 of which are essential. Its genome of 
580,070 nucleotide pairs represents 145,018 bytes of information—about as much 
as it takes to record the text of one chapter of this book. Cell biology may be com- 
plicated, but it is not impossibly so. 
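
The byte count quoted above follows from simple arithmetic, assuming that each
nucleotide pair is one choice among four possibilities and therefore carries
two bits. A minimal Python check (the function name is chosen here for
illustration):

```python
BITS_PER_NUCLEOTIDE_PAIR = 2   # four possibilities: A, T, C, or G


def genome_bytes(nucleotide_pairs: int) -> int:
    """Bytes needed to store a genome at 2 bits per nucleotide pair, rounded up."""
    bits = nucleotide_pairs * BITS_PER_NUCLEOTIDE_PAIR
    return (bits + 7) // 8


print(genome_bytes(580_070))  # prints 145018
```

The result, 145,018 bytes, agrees with the figure given in the text for the
Mycoplasma genitalium genome.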

The minimum number of genes for a viable cell in today’s environments is 
probably not less than 300, although there are only about 60 genes in the core set 
that is shared by all living species. 


Summary


The individual cell is the minimal self-reproducing unit of living matter, and it con- 
sists of a self-replicating collection of catalysts. Central to this reproduction is the 
transmission of genetic information to progeny cells. Every cell on our planet stores 
its genetic information in the same chemical form—as double-stranded DNA. The 
cell replicates its information by separating the paired DNA strands and using each 
as a template for polymerization to make a new DNA strand with a complemen- 
tary sequence of nucleotides. The same strategy of templated polymerization is used 
to transcribe portions of the information from DNA into molecules of the closely 
related polymer, RNA. These RNA molecules in turn guide the synthesis of protein 
molecules by the more complex machinery of translation, involving a large multi- 
molecular machine, the ribosome. Proteins are the principal catalysts for almost 
all the chemical reactions in the cell; their other functions include the selective 
import and export of small molecules across the plasma membrane that forms the 
cell’s boundary. The specific function of each protein depends on its amino acid 
sequence, which is specified by the nucleotide sequence of a corresponding segment 
of the DNA—the gene that codes for that protein. In this way, the genome of the 
cell determines its chemistry; and the chemistry of every living cell is fundamentally 
similar, because it must provide for the synthesis of DNA, RNA, and protein. The 
simplest known cells can survive with about 400 genes. 


THE DIVERSITY OF GENOMES AND THE TREE OF LIFE 


The success of living organisms based on DNA, RNA, and protein has been spec- 
tacular. Life has populated the oceans, covered the land, infiltrated the Earth’s 
crust, and molded the surface of our planet. Our oxygen-rich atmosphere, the 
deposits of coal and oil, the layers of iron ores, the cliffs of chalk and limestone 
and marble—all these are products, directly or indirectly, of past biological activ- 
ity on Earth. 

Living things are not confined to the familiar temperate realm of land, water, 
and sunlight inhabited by plants and plant-eating animals. They can be found in 
the darkest depths of the ocean, in hot volcanic mud, in pools beneath the fro- 
zen surface of the Antarctic, and buried kilometers deep in the Earth’s crust. The 
creatures that live in these extreme environments are generally unfamiliar, not 
only because they are inaccessible, but also because they are mostly microscopic. 
In more homely habitats, too, most organisms are too small for us to see without 
special equipment: they tend to go unnoticed, unless they cause a disease or rot 
the timbers of our houses. Yet microorganisms make up most of the total mass 
of living matter on our planet. Only recently, through new methods of molecular 
analysis and specifically through the analysis of DNA sequences, have we begun 
to get a picture of life on Earth that is not grossly distorted by our biased perspec- 
tive as large animals living on dry land. 

In this section, we consider the diversity of organisms and the relationships 
among them. Because the genetic information for every organism is written in 
the universal language of DNA sequences, and the DNA sequence of any given 
organism can be readily obtained by standard biochemical techniques, it is now 
possible to characterize, catalog, and compare any set of living organisms with 
reference to these sequences. From such comparisons we can estimate the place 
of each organism in the family tree of living species—the “tree of life.” But before 
describing what this approach reveals, we need first to consider the routes by 
which cells in different environments obtain the matter and energy they require to 
survive and proliferate, and the ways in which some classes of organisms depend 
on others for their basic chemical needs. 


Cells Can Be Powered by a Variety of Free-Energy Sources 


Living organisms obtain their free energy in different ways. Some, such as animals, 
fungi, and the many different bacteria that live in the human gut, get it by feed- 
ing on other living things or the organic chemicals they produce; such organisms 





(B) 0.2 µm


Figure 1-10 Mycoplasma genitalium.
(A) Scanning electron micrograph showing
the irregular shape of this small bacterium,
reflecting the lack of any rigid cell wall.
(B) Cross section (transmission electron
micrograph) of a Mycoplasma cell. Of the
530 genes of Mycoplasma genitalium,
43 code for transfer, ribosomal, and other
non-messenger RNAs. Functions are
known, or can be guessed, for 339 of the
genes coding for protein: of these, 154
are involved in replication, transcription,
translation, and related processes
involving DNA, RNA, and protein; 98 in the
membrane and surface structures of the
cell; 46 in the transport of nutrients and
other molecules across the membrane;
71 in energy conversion and the synthesis
and degradation of small molecules; and
12 in the regulation of cell division
and other processes. Note that these
categories are partly overlapping, so that
some genes feature twice. (A, from
S. Razin et al., Infect. Immun. 30:538-546,
1980. With permission from the American
Society for Microbiology; B, courtesy of
Roger Cole, in Medical Microbiology, 4th
ed. [S. Baron ed.]. Galveston: University of
Texas Medical Branch, 1996.)




are called organotrophic (from the Greek word trophe, meaning “food”). Others
derive their energy directly from the nonliving world. These primary energy con- 
verters fall into two classes: those that harvest the energy of sunlight, and those 
that capture their energy from energy-rich systems of inorganic chemicals in the 
environment (chemical systems that are far from chemical equilibrium). Organ- 
isms of the former class are called phototrophic (feeding on sunlight); those of the 
latter are called lithotrophic (feeding on rock). Organotrophic organisms could 
not exist without these primary energy converters, which are the most plentiful 
form of life. 

Phototrophic organisms include many types of bacteria, as well as algae and 
plants, on which we—and virtually all the living things that we ordinarily see 
around us—depend. Phototrophic organisms have changed the whole chemistry 
of our environment: the oxygen in the Earth’s atmosphere is a by-product of their 
biosynthetic activities. 

Lithotrophic organisms are not such an obvious feature of our world, because 
they are microscopic and mostly live in habitats that humans do not frequent— 
deep in the ocean, buried in the Earth’s crust, or in various other inhospitable 
environments. But they are a major part of the living world, and they are especially 
important in any consideration of the history of life on Earth. 

Some lithotrophs get energy from aerobic reactions, which use molecular oxygen
from the environment; since atmospheric O2 is ultimately the product of living
organisms, these aerobic lithotrophs are, in a sense, feeding on the products
of past life. There are, however, other lithotrophs that live anaerobically, in places 
where little or no molecular oxygen is present. These are circumstances similar 
to those that existed in the early days of life on Earth, before oxygen had accumu- 
lated. 

The most dramatic of these sites are the hot hydrothermal vents on the floor of 
the Pacific and Atlantic Oceans. They are located where the ocean floor is spread- 
ing as new portions of the Earth’s crust form by a gradual upwelling of material 
from the Earth’s interior (Figure 1-11). Downward-percolating seawater is heated 
and driven back upward as a submarine geyser, carrying with it a current of 
chemicals from the hot rocks below. A typical cocktail might include H2S, H2, CO, 
Mn2+, Fe2+, Ni2+, CH4, NH4+, and phosphorus-containing compounds. A dense

Figure 1-11 The geology of a hot
hydrothermal vent in the ocean floor. As 
indicated, water percolates down toward 
the hot molten rock upwelling from the 
Earth’s interior and is heated and driven 
back upward, carrying minerals leached 
from the hot rock. A temperature gradient 
is set up, from more than 350°C near the 
core of the vent, down to 2-8°C in the 
surrounding ocean. Minerals precipitate 
from the water as it cools, forming a 
chimney. Different classes of organisms, 
thriving at different temperatures, live in 
different neighborhoods of the chimney. A 
typical chimney might be a few meters tall, 
spewing out hot, mineral-rich water at a 
flow rate of 1-2 m/sec. 

population of microbes lives in the neighborhood of the vent, thriving on this austere
diet and harvesting free energy from reactions between the available chemicals.
Other organisms—clams, mussels, and giant marine worms—in turn live off
the microbes at the vent, forming an entire ecosystem analogous to the world of 
plants and animals that we belong to, but powered by geochemical energy instead 
of light (Figure 1-12). 


Some Cells Fix Nitrogen and Carbon Dioxide for Others


To make a living cell requires matter, as well as free energy. DNA, RNA, and 
protein are composed of just six elements: hydrogen, carbon, nitrogen, oxygen, 
sulfur, and phosphorus. These are all plentiful in the nonliving environment, in 
the Earth’s rocks, water, and atmosphere. But they are not present in chemical 
forms that allow easy incorporation into biological molecules. Atmospheric N2 
and CO2, in particular, are extremely unreactive. A large amount of free energy
is required to drive the reactions that use these inorganic molecules to make 
the organic compounds needed for further biosynthesis—that is, to fix nitrogen 
and carbon dioxide, so as to make N and C available to living organisms. Many 
types of living cells lack the biochemical machinery to achieve this fixation; they 
instead rely on other classes of cells to do the job for them. We animals depend on 
plants for our supplies of organic carbon and nitrogen compounds. Plants in turn, 
although they can fix carbon dioxide from the atmosphere, lack the ability to fix 
atmospheric nitrogen; they depend in part on nitrogen-fixing bacteria to supply 
their need for nitrogen compounds. Plants of the pea family, for example, harbor 
symbiotic nitrogen-fixing bacteria in nodules in their roots. 

Living cells therefore differ widely in some of the most basic aspects of their 
biochemistry. Not surprisingly, cells with complementary needs and capabilities 
have developed close associations. Some of these associations, as we see below, 
have evolved to the point where the partners have lost their separate identities 
altogether: they have joined forces to form a single composite cell. 


The Greatest Biochemical Diversity Exists Among Prokaryotic Cells 


From simple microscopy, it has long been clear that living organisms can be 
classified on the basis of cell structure into two groups: the eukaryotes and the 

Figure 1-12 Organisms living at a
depth of 2500 meters near a vent in
the ocean floor. Close to the vent, at
temperatures up to about 120°C, various
lithotrophic species of bacteria and archaea
(archaebacteria) live, directly fueled by
geochemical energy. A little further away,
where the temperature is lower, various
invertebrate animals live by feeding on
these microorganisms. Most remarkable
are these giant (2 meter) tube worms,
Riftia pachyptila, which, rather than feed
on the lithotrophic cells, live in symbiosis
with them: specialized organs in the
worms harbor huge numbers of symbiotic
sulfur-oxidizing bacteria. These bacteria
harness geochemical energy and supply
nourishment to their hosts, which have
no mouth, gut, or anus. The tube worms
are thought to have evolved from more
conventional animals, and to have become
secondarily adapted to life at hydrothermal
vents. (Courtesy of Monika Bright,
University of Vienna, Austria.)

prokaryotes. Eukaryotes keep their DNA in a distinct membrane-enclosed intracellular
compartment called the nucleus. (The name is from the Greek, meaning
“truly nucleated,” from the words eu, “well” or “truly,” and karyon, “kernel”
or “nucleus.”) Prokaryotes have no distinct nuclear compartment to house their 
DNA. Plants, fungi, and animals are eukaryotes; bacteria are prokaryotes, as are 
archaea—a separate class of prokaryotic cells, discussed below. 

Most prokaryotic cells are small and simple in outward appearance (Figure 
1-13), and they live mostly as independent individuals or in loosely organized 
communities, rather than as multicellular organisms. They are typically spherical 
or rod-shaped and measure a few micrometers in linear dimension. They often 
have a tough protective coat, called a cell wall, beneath which a plasma mem- 
brane encloses a single cytoplasmic compartment containing DNA, RNA, pro- 
teins, and the many small molecules needed for life. In the electron microscope, 
this cell interior appears as a matrix of varying texture without any discernible 
organized internal structure (Figure 1-14). 

Prokaryotic cells live in an enormous variety of ecological niches, and they are 
astonishingly varied in their biochemical capabilities—far more so than eukary- 
otic cells. Organotrophic species can utilize virtually any type of organic molecule 
as food, from sugars and amino acids to hydrocarbons and methane gas. Photo- 
trophic species (Figure 1-15) harvest light energy in a variety of ways, some of 
them generating oxygen as a by-product, others not. Lithotrophic species can feed 
on a plain diet of inorganic nutrients, getting their carbon from CO2, and relying
on H2S to fuel their energy needs (Figure 1-16)—or on H2, or Fe2+, or elemental
sulfur, or any of a host of other chemicals that occur in the environment. 


Figure 1-14 The structure of a bacterium. (A) The bacterium Vibrio
cholerae, showing its simple internal organization. Like many other species,
Vibrio has a helical appendage at one end—a flagellum—that rotates as a
propeller to drive the cell forward. It can infect the human small intestine to
cause cholera; the severe diarrhea that accompanies this disease kills more
than 100,000 people a year. (B) An electron micrograph of a longitudinal
section through the widely studied bacterium Escherichia coli (E. coli). The
cell’s DNA is concentrated in the lightly stained region. Part of our normal
intestinal flora, E. coli is related to Vibrio, and it has many flagella distributed
over its surface that are not visible in this section. (B, courtesy of
E. Kellenberger.)

Figure 1-13 Shapes and sizes of some
bacteria. Although most are small, as
shown, measuring a few micrometers
in linear dimension, there are also some
giant species. An extreme example (not
shown) is the cigar-shaped bacterium
Epulopiscium fishelsoni, which lives in the
gut of a surgeonfish and can be up to
600 µm long.

Much of this world of microscopic organisms is virtually unexplored. Traditional
methods of bacteriology have given us an acquaintance with those species
that can be isolated and cultured in the laboratory. But DNA sequence analysis of 
the populations of bacteria and archaea in samples from natural habitats—such 
as soil or ocean water, or even the human mouth—has opened our eyes to the fact 
that most species cannot be cultured by standard laboratory techniques. Accord- 
ing to one estimate, at least 99% of prokaryotic species remain to be characterized. 
These species have been detected only by their DNA, and it has not yet been
possible to grow the vast majority of them in laboratories.


The Tree of Life Has Three Primary Branches: Bacteria, Archaea, 
and Eukaryotes 


The classification of living things has traditionally depended on comparisons of 
their outward appearances: we can see that a fish has eyes, jaws, backbone, brain, 
and so on, just as we do, and that a worm does not; that a rosebush is cousin 
to an apple tree, but is less similar to a grass. As Darwin showed, we can read- 
ily interpret such close family resemblances in terms of evolution from common 
ancestors, and we can find the remains of many of these ancestors preserved in 
the fossil record. In this way, it has been possible to begin to draw a family tree of 
living organisms, showing the various lines of descent, as well as branch points 
in the history, where the ancestors of one group of species became different from 
those of another. 

When the disparities between organisms become very great, however, these 
methods begin to fail. How do we decide whether a fungus is closer kin to a plant 
or to an animal? When it comes to prokaryotes, the task becomes harder still: one 
microscopic rod or sphere looks much like another. Microbiologists have there- 
fore sought to classify prokaryotes in terms of their biochemistry and nutritional 
requirements. But this approach also has its pitfalls. Amid the bewildering variety 
of biochemical behaviors, it is difficult to know which differences truly reflect dif- 
ferences of evolutionary history. 

Genome analysis has now given us a simpler, more direct, and much more 
powerful way to determine evolutionary relationships. The complete DNA 
sequence of an organism defines its nature with almost perfect precision and in 
exhaustive detail. Moreover, this specification is in a digital form—a string of let- 
ters—that can be entered straightforwardly into a computer and compared with 
the corresponding information for any other living thing. Because DNA is subject 
to random changes that accumulate over long periods of time (as we shall see 
shortly), the number of differences between the DNA sequences of two organ- 
isms can provide a direct, objective, quantitative indication of the evolutionary 
distance between them. 
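
As a minimal sketch of this idea, the Python function below counts the fraction of aligned positions at which two sequences differ. Several things here are assumptions made for illustration: the three short sequences are invented, the sequences are taken to be already aligned and of equal length, and the raw fraction of differences stands in for evolutionary distance without any correction for repeated changes at the same site.

```python
def sequence_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


# Hypothetical aligned fragments from three organisms, invented for illustration.
species_x = "GTTCCGGGGGGAGTATGGTT"
species_y = "GTTCCGGGGGGAGTATGGTC"
species_z = "GCTCCGGAGGGAGCATGGTC"

d_xy = sequence_distance(species_x, species_y)  # 1 difference in 20 sites
d_xz = sequence_distance(species_x, species_z)  # 4 differences in 20 sites
d_yz = sequence_distance(species_y, species_z)  # 3 differences in 20 sites
print(d_xy, d_xz, d_yz)
```

On these invented data, x and y differ least, so they would be placed closest together in a family tree, with z on a more distant branch.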

This approach has shown that the organisms that were traditionally classed 
together as “bacteria” can be as widely divergent in their evolutionary origins as 
is any prokaryote from any eukaryote. It is now clear that the prokaryotes com- 
prise two distinct groups that diverged early in the history of life on Earth, before 
the eukaryotes diverged as a separate group. The two groups of prokaryotes are 
called the bacteria (or eubacteria) and the archaea (or archaebacteria). Detailed 
genome analyses have recently revealed that the first eukaryotic cell formed after a


Figure 1-15 The phototrophic bacterium 
Anabaena cylindrica viewed in the light 
microscope. The cells of this species form 
long, multicellular filaments. Most of the 
cells (labeled V) perform photosynthesis, 
while others become specialized for 
nitrogen fixation (labeled H) or develop into 
resistant spores (labeled S). (Courtesy of 
Dave G. Adams.) 

Figure 1-16 A lithotrophic bacterium.
Beggiatoa, which lives in sulfurous
environments, gets its energy by oxidizing
H2S and can fix carbon even in the dark.
Note the yellow deposits of sulfur inside the
cells. (Courtesy of Ralph W. Wolfe.)

particular type of ancient archaeal cell engulfed an ancient bacterium (see Figure
12-3). Thus, the living world today is considered to consist of three major divisions 
or domains: bacteria, archaea, and eukaryotes (Figure 1-17). 

Archaea are often found inhabiting environments that we humans avoid, such 
as bogs, sewage treatment plants, ocean depths, salt brines, and hot acid springs, 
although they are also widespread in less extreme and more homely environ- 
ments, from soils and lakes to the stomachs of cattle. In outward appearance they 
are not easily distinguished from bacteria. At a molecular level, archaea seem to 
resemble eukaryotes more closely in their machinery for handling genetic infor- 
mation (replication, transcription, and translation), but bacteria more closely in 
their apparatus for metabolism and energy conversion. We discuss below how 
this might be explained. 


Some Genes Evolve Rapidly; Others Are Highly Conserved


Both in the storage and in the copying of genetic information, random accidents 
and errors occur, altering the nucleotide sequence—that is, creating mutations. 
Therefore, when a cell divides, its two daughters are often not quite identical 
to one another or to their parent. On rare occasions, the error may represent a 
change for the better; more probably, it will cause no significant difference in 
the cell’s prospects. But in many cases, the error will cause serious damage—for 
example, by disrupting the coding sequence for a key protein. Changes due to 
mistakes of the first type will tend to be perpetuated, because the altered cell has 
an increased likelihood of reproducing itself. Changes due to mistakes of the sec- 
ond type—selectively neutral changes—may be perpetuated or not: in the com- 
petition for limited resources, it is a matter of chance whether the altered cell or 
its cousins will succeed. But changes that cause serious damage lead nowhere: 
the cell that suffers them dies, leaving no progeny. Through endless repetition of 
this cycle of error and trial—of mutation and natural selection—organisms evolve: 
their genetic specifications change, giving them new ways to exploit the environ- 
ment more effectively, to survive in competition with others, and to reproduce 
successfully. 
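
The three outcomes described above can be illustrated with a simple numerical sketch. The model is an assumption introduced here, not something taken from the text: a single population of asexually reproducing cells in which an altered cell type has a fixed reproductive rate relative to unaltered cells, with its expected frequency followed from one generation to the next. Because it tracks only expected values, it ignores the element of chance that decides the fate of selectively neutral changes. The starting frequency and fitness values are hypothetical.

```python
def next_frequency(p: float, relative_fitness: float) -> float:
    """Expected frequency of the altered cell type after one generation.

    Unaltered cells are assigned fitness 1; the case p == 1 with
    relative_fitness == 0 (no survivors at all) is not handled.
    """
    mean_fitness = p * relative_fitness + (1.0 - p)
    return p * relative_fitness / mean_fitness


def frequency_after(p0: float, relative_fitness: float, generations: int) -> float:
    """Apply next_frequency repeatedly, starting from frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, relative_fitness)
    return p


start = 0.01  # hypothetical starting frequency of the altered cell type
beneficial = frequency_after(start, 1.05, 200)  # change for the better
neutral = frequency_after(start, 1.00, 200)     # selectively neutral change
damaging = frequency_after(start, 0.00, 1)      # cell leaves no progeny
print(beneficial, neutral, damaging)
```

Under these assumptions the beneficial variant comes to dominate the population, the neutral variant stays at its starting frequency on average, and the damaging variant disappears within a single generation.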

Some parts of the genome will change more easily than others in the course of 
evolution. A segment of DNA that does not code for protein and has no significant 
regulatory role is free to change at a rate limited only by the frequency of ran- 
dom errors. In contrast, a gene that codes for a highly optimized essential protein 
or RNA molecule cannot alter so easily: when mistakes occur, the faulty cells are 
almost always eliminated. Genes of this latter sort are therefore highly conserved. 
Through 3.5 billion years or more of evolutionary history, many features of the 
genome have changed beyond all recognition, but the most highly conserved 
genes remain perfectly recognizable in all living species. 

Figure 1-17 The three major divisions
(domains) of the living world. Note that
the word bacteria was originally used to
refer to prokaryotes in general, but more
recently has been redefined to refer to
eubacteria specifically. The tree shown
here is based on comparisons of the
nucleotide sequence of a ribosomal RNA
(rRNA) subunit in the different species, and
the distances in the diagram represent
estimates of the numbers of evolutionary
changes that have occurred in this
molecule in each lineage (see Figure 1-18).
The parts of the tree shrouded in gray
cloud represent uncertainties about details
of the true pattern of species divergence
in the course of evolution: comparisons
of nucleotide or amino acid sequences
of molecules other than rRNA, as well as
other arguments, can lead to somewhat
different trees. As indicated, the nucleus of
the eukaryotic cell is now thought to have
emerged from a sub-branch within the
archaea, so that in the beginning the tree
of life had only two branches—bacteria
and archaea.

These latter genes are the ones we must examine if we wish to trace family relationships
between the most distantly related organisms in the tree of life. The initial
studies that led to the classification of the living world into the three domains
of bacteria, archaea, and eukaryotes were based chiefly on analysis of one of the 
rRNA components of the ribosome. Because the translation of RNA into protein is 
fundamental to all living cells, this component of the ribosome has been very well 
conserved since early in the history of life on Earth (Figure 1-18). 
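
To illustrate what it means for a sequence to be conserved across the three domains, the following Python sketch takes three aligned sequences and reports which sites are identical in all of them, together with the percent identity of each two-way comparison. The sequences are invented ten-nucleotide fragments, not real rRNA data, and the function names are hypothetical.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned sites at which two sequences are identical."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def conserved_sites(sequences: list[str]) -> list[int]:
    """Indices of alignment columns that are identical in every sequence."""
    return [i for i, column in enumerate(zip(*sequences)) if len(set(column)) == 1]


# Hypothetical aligned fragments, one per domain, invented for illustration.
alignment = {
    "archaeon": "GGTTACCTTG",
    "bacterium": "GGATACCTTG",
    "eukaryote": "GGTTACCATG",
}

sites = conserved_sites(list(alignment.values()))
pairwise = {
    (name_a, name_b): percent_identity(alignment[name_a], alignment[name_b])
    for name_a, name_b in combinations(alignment, 2)
}
print(sites)
print(pairwise)
```

In this toy alignment eight of the ten sites are identical in all three sequences, which is the kind of unmistakable similarity that makes such a molecule useful for comparing very distant relatives.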


Most Bacteria and Archaea Have 1000-6000 Genes 


Natural selection has generally favored those prokaryotic cells that can reproduce 
the fastest by taking up raw materials from their environment and replicating 
themselves most efficiently, at the maximal rate permitted by the available food 
supplies. Small size implies a large ratio of surface area to volume, thereby helping 
to maximize the uptake of nutrients across the plasma membrane and boosting a 
cell’s reproductive rate. 
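
The relationship between size and surface-to-volume ratio is easy to check numerically. The sketch below assumes idealized spherical cells, for which the ratio works out to 3 divided by the radius; the two radii are hypothetical values chosen only to contrast a small cell with a much larger one.

```python
import math


def surface_to_volume(radius_um: float) -> float:
    """Surface area divided by volume for a sphere, in units of 1/µm."""
    area = 4.0 * math.pi * radius_um ** 2
    volume = (4.0 / 3.0) * math.pi * radius_um ** 3
    return area / volume


small_cell = surface_to_volume(0.5)   # hypothetical small cell, radius 0.5 µm
large_cell = surface_to_volume(10.0)  # hypothetical large cell, radius 10 µm
print(small_cell, large_cell, small_cell / large_cell)
```

With these radii the smaller sphere has twenty times more membrane area per unit of volume than the larger one, so each unit of its cytoplasm is served by correspondingly more surface for nutrient uptake.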

Presumably for these reasons, most prokaryotic cells carry very little super- 
fluous baggage; their genomes are small, with genes packed closely together and 
minimal quantities of regulatory DNA between them. The small genome size has 
made it easy to use modern DNA sequencing techniques to determine complete 
genome sequences. We now have this information for thousands of species of 
bacteria and archaea, as well as for hundreds of species of eukaryotes. Most bacterial
and archaeal genomes contain between 10⁶ and 10⁷ nucleotide pairs, encoding
1000-6000 genes.
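
A rough calculation shows how tightly packed these genes are. The sketch below pairs the smallest genome size quoted above with the smallest gene count, and the largest with the largest; that pairing is an assumption made for illustration, since the text gives only the two ranges.

```python
def nucleotide_pairs_per_gene(genome_size_bp: int, gene_count: int) -> float:
    """Average stretch of genome available per gene, in nucleotide pairs."""
    return genome_size_bp / gene_count


# Assumed pairing of the range endpoints: one million nucleotide pairs with
# 1000 genes, and ten million nucleotide pairs with 6000 genes.
small_genome = nucleotide_pairs_per_gene(1_000_000, 1000)
large_genome = nucleotide_pairs_per_gene(10_000_000, 6000)
print(small_genome, round(large_genome))
```

Either way the average comes to somewhere between one and two thousand nucleotide pairs per gene, which leaves little room for DNA that does not code for anything.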

A complete DNA sequence reveals both the genes an organism possesses and 
the genes it lacks. When we compare the three domains of the living world, we 
can begin to see which genes are common to all of them and must therefore have 
been present in the cell that was ancestral to all present-day living things, and 
which genes are peculiar to a single branch in the tree of life. To explain the find- 
ings, however, we need to consider a little more closely how new genes arise and 
genomes evolve. 


New Genes Are Generated from Preexisting Genes 


The raw material of evolution is the DNA sequence that already exists: there is 
no natural mechanism for making long stretches of new random sequence. In 
this sense, no gene is ever entirely new. Innovation can, however, occur in several 
ways (Figure 1-19): 
1. Intragenic mutation: an existing gene can be randomly modified by changes 
in its DNA sequence, through various types of error that occur mainly in the 
process of DNA replication. 


2. Gene duplication: an existing gene can be accidentally duplicated so as to 
create a pair of initially identical genes within a single cell; these two genes 
may then diverge in the course of evolution. 


3. DNA segment shuffling: two or more existing genes can break and rejoin to 
make a hybrid gene consisting of DNA segments that originally belonged to 
separate genes. 


4. Horizontal (intercellular) transfer: a piece of DNA can be transferred from
the genome of one cell to that of another—even to that of another species.
This process is in contrast with the usual vertical transfer of genetic information
from parent to progeny.

Each of these types of change leaves a characteristic trace in the DNA sequence
of the organism, and there is clear evidence that all four processes have frequently

Figure 1-18 Genetic information
conserved since the days of the last
common ancestor of all living things.
A part of the gene for the smaller of the two
main rRNA components of the ribosome is
shown. (The complete molecule is about
1500-1900 nucleotides long, depending
on species.) Corresponding segments of
nucleotide sequence from an archaean
(Methanococcus jannaschii), a bacterium
(Escherichia coli), and a eukaryote (Homo
sapiens) are aligned. Sites where the
nucleotides are identical between species
are indicated by a vertical line; the human
sequence is repeated at the bottom of
the alignment so that all three two-way
comparisons can be seen. A dot halfway
along the E. coli sequence denotes a site
where a nucleotide has been either deleted
from the bacterial lineage in the course
of evolution or inserted in the other two
lineages. Note that the sequences from
these three organisms, representative of
the three domains of the living world, still
retain unmistakable similarities.

occurred. In later chapters, we discuss the underlying mechanisms, but for the
present we focus on the consequences. 
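
The four modes of innovation listed above can be mimicked as simple operations on strings of nucleotides. This is only a toy sketch: the function names, the sequences, and the choice of break points and insertion sites are all hypothetical, and real events involve molecular mechanisms that string slicing does not capture.

```python
def intragenic_mutation(gene: str, position: int, new_base: str) -> str:
    """Change a single nucleotide within an existing gene."""
    return gene[:position] + new_base + gene[position + 1:]


def gene_duplication(genome: str, gene: str) -> str:
    """Insert a second copy of a gene directly after its first occurrence."""
    end = genome.index(gene) + len(gene)
    return genome[:end] + gene + genome[end:]


def segment_shuffling(gene_a: str, gene_b: str, break_a: int, break_b: int) -> str:
    """Join the start of one gene to the end of another to make a hybrid gene."""
    return gene_a[:break_a] + gene_b[break_b:]


def horizontal_transfer(recipient: str, donor_fragment: str, site: int) -> str:
    """Insert a DNA fragment from another cell into the recipient genome."""
    return recipient[:site] + donor_fragment + recipient[site:]


# Hypothetical sequences, invented for illustration.
mutated = intragenic_mutation("ATGGCC", 3, "A")             # "ATGACC"
duplicated = gene_duplication("TTATGGCCTT", "ATGGCC")       # "TTATGGCCATGGCCTT"
hybrid = segment_shuffling("AAAAAA", "CCCCCC", 3, 3)        # "AAACCC"
transferred = horizontal_transfer("TTTT", "GGG", 2)         # "TTGGGTT"
print(mutated, duplicated, hybrid, transferred)
```

Each operation leaves a different signature in the resulting string: a single changed letter, a repeated block, a junction between two unrelated stretches, or an inserted block with no counterpart in the parent genome.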


Gene Duplications Give Rise to Families of Related Genes Within a 
Single Cell 


A cell duplicates its entire genome each time it divides into two daughter cells. 
However, accidents occasionally result in the inappropriate duplication of just 
part of the genome, with retention of original and duplicate segments in a single 
cell. Once a gene has been duplicated in this way, one of the two gene copies is 
free to mutate and become specialized to perform a different function within the 
same cell. Repeated rounds of this process of duplication and divergence, over 
many millions of years, have enabled one gene to give rise to a family of genes that 
may all be found within a single genome. Analysis of the DNA sequence of pro- 
karyotic genomes reveals many examples of such gene families: in the bacterium 
Bacillus subtilis, for example, 47% of the genes have one or more obvious relatives 
(Figure 1-20). 

When genes duplicate and diverge in this way, the individuals of one species 
become endowed with multiple variants of a primordial gene. This evolutionary 
process has to be distinguished from the genetic divergence that occurs when one 
species of organism splits into two separate lines of descent at a branch point in 
the family tree—when the human line of descent became separate from that of 
chimpanzees, for example. There, the genes gradually become different in the 
course of evolution, but they are likely to continue to have corresponding func- 
tions in the two sister species. Genes that are related by descent in this way—that 
is, genes in two separate species that derive from the same ancestral gene in the 
last common ancestor of those two species—are called orthologs. Related genes 
that have resulted from a gene duplication event within a single genome—and 

Figure 1-19 Four modes of genetic
innovation and their effects on the DNA
sequence of an organism. A special
form of horizontal transfer occurs when
two different types of cells enter into
a permanent symbiotic association.
Genes from one of the cells then may be
transferred to the genome of the other,
as we shall see below when we discuss
mitochondria and chloroplasts.

are likely to have diverged in their function—are called paralogs. Genes that are
related by descent in either way are called homologs, a general term used to cover 
both types of relationship (Figure 1-21). 
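
The distinction between the two kinds of homolog can be expressed as a rule on a gene family tree: find the most recent ancestral gene shared by the two genes, and ask whether that gene was split by a speciation or by a duplication. The Python sketch below applies this rule to a small invented tree. The class, the function names, and the gene labels are hypothetical, and the code assumes that the two genes are distinct and both present in the tree.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A gene in the family tree; internal nodes record the event that split them."""
    name: str
    event: str = "leaf"  # "speciation", "duplication", or "leaf"
    children: list["Node"] = field(default_factory=list)


def leaves(node: Node) -> set[str]:
    """Names of all present-day genes descended from this node."""
    if not node.children:
        return {node.name}
    found: set[str] = set()
    for child in node.children:
        found |= leaves(child)
    return found


def last_common_ancestor(node: Node, a: str, b: str) -> Node:
    """Deepest node in the tree whose descendants include both genes."""
    for child in node.children:
        if {a, b} <= leaves(child):
            return last_common_ancestor(child, a, b)
    return node


def relationship(root: Node, a: str, b: str) -> str:
    """Classify two distinct genes as orthologs or paralogs."""
    event = last_common_ancestor(root, a, b).event
    return {"speciation": "orthologs", "duplication": "paralogs"}[event]


# Hypothetical history: ancestral gene G is inherited by species A and B;
# later, the copy in species A duplicates to give G1 and G2.
tree = Node("G", "speciation", [
    Node("G in species A", "duplication", [Node("A_G1"), Node("A_G2")]),
    Node("B_G"),
])

print(relationship(tree, "A_G1", "A_G2"))  # paralogs
print(relationship(tree, "A_G1", "B_G"))   # orthologs
```

Both pairs are homologs, since every gene in the tree descends from the same ancestral gene; the tree only decides which kind of homology applies.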


Genes Can Be Transferred Between Organisms, Both in the 
Laboratory and in Nature 


Prokaryotes provide good examples of the horizontal transfer of genes from one 
species of cell to another. The most obvious tell-tale signs are sequences recogniz- 
able as being derived from viruses, those infecting bacteria being called bacte- 
riophages (Figure 1-22). Viruses are small packets of genetic material that have 
evolved as parasites on the reproductive and biosynthetic machinery of host cells. 
Although not themselves living cells, they often serve as vectors for gene transfer. 
A virus will replicate in one cell, emerge from it with a protective wrapping, and 
then enter and infect another cell, which may be of the same or a different species. 
Often, the infected cell will be killed by the massive proliferation of virus particles 
inside it; but sometimes, the viral DNA, instead of directly generating these par- 
ticles, may persist in its host for many cell generations as a relatively innocuous 
passenger, either as a separate intracellular fragment of DNA, known as a plasmid, 
or as a sequence inserted into the cell’s regular genome. In their travels, viruses 
can accidentally pick up fragments of DNA from the genome of one host cell and 
ferry them into another cell. Such transfers of genetic material are very common 
in prokaryotes. 

Horizontal transfers of genes between eukaryotic cells of different species 
are very rare, and they do not seem to have played a significant part in eukaryote 
evolution (although massive transfers from bacterial to eukaryotic genomes have 
occurred in the evolution of mitochondria and chloroplasts, as we discuss below). 


[Diagram for Figure 1-21: (A) an ancestral organism carrying gene G undergoes
speciation to give two separate species, A and B; genes GA and GB are orthologs.
(B) Gene G in an ancestral organism undergoes gene duplication and divergence,
giving genes G1 and G2 in a later organism; genes G1 and G2 are paralogs.]


Figure 1-20 Families of evolutionarily 
related genes in the genome of Bacillus 
subtilis. The largest gene family in this 
bacterium consists of 77 genes coding for 
varieties of ABC transporters—a class of 
membrane transport proteins found in all 
three domains of the living world. (Adapted 
from F. Kunst et al., Nature 390:249-256, 
1997. With permission from Macmillan 
Publishers Ltd.) 


Figure 1-21 Paralogous genes and 
orthologous genes: two types of 
gene homology based on different 
evolutionary pathways. (A) Orthologs. 
(B) Paralogs. 

In contrast, horizontal gene transfers occur much more frequently between different
species of prokaryotes. Many prokaryotes have a remarkable capacity to take
up even nonviral DNA molecules from their surroundings and thereby capture 
the genetic information these molecules carry. By this route, or by virus-mediated 
transfer, bacteria and archaea in the wild can acquire genes from neighboring 
cells relatively easily. Genes that confer resistance to an antibiotic or an ability 
to produce a toxin, for example, can be transferred from species to species and 
provide the recipient bacterium with a selective advantage. In this way, new and 
sometimes dangerous strains of bacteria have been observed to evolve in the bac- 
terial ecosystems that inhabit hospitals or the various niches in the human body. 
For example, horizontal gene transfer is responsible for the spread, over the past 
40 years, of penicillin-resistant strains of Neisseria gonorrhoeae, the bacterium 
that causes gonorrhea. On a longer time scale, the results can be even more pro- 
found; it has been estimated that at least 18% of all of the genes in the present-day 
genome of E. coli have been acquired by horizontal transfer from another species 
within the past 100 million years. 


Sex Results in Horizontal Exchanges of Genetic Information Within 
a Species 


Horizontal gene transfer among prokaryotes has a parallel in a phenomenon 
familiar to us all: sex. In addition to the usual vertical transfer of genetic mate- 
rial from parent to offspring, sexual reproduction causes a large-scale horizontal 
transfer of genetic information between two initially separate cell lineages—those 
of the father and the mother. A key feature of sex, of course, is that the genetic
exchange normally occurs only between individuals of the same species. But no
matter whether they occur within a species or between species, horizontal gene
transfers leave a characteristic imprint: they result in individuals who are related
more closely to one set of relatives with respect to some genes, and more closely to
another set of relatives with respect to others. By comparing the DNA sequences of
individual human genomes, an intelligent visitor from outer space could deduce
that humans reproduce sexually, even if it knew nothing about human behavior.

Figure 1-22 The viral transfer of DNA into a cell. (A) An electron micrograph
of particles of a bacterial virus, the T4 bacteriophage. The head of this virus
contains the viral DNA; the tail contains the apparatus for injecting the DNA into a
host bacterium. (B) A cross section of an E. coli bacterium with a T4 bacteriophage
latched onto its surface. The large dark objects inside the bacterium are the
heads of new T4 particles in the course of assembly. When they are mature,
the bacterium will burst open to release them. (C-E) The process of DNA injection
into the bacterium, as visualized in unstained, frozen samples by cryoelectron
microscopy. (C) Attachment begins. (D) Attached state during DNA injection.
(E) Virus head has emptied all of its DNA into the bacterium. (A, courtesy of James
Paulson; B, courtesy of Jonathan King and Erika Hartwig from G. Karp, Cell and
Molecular Biology, 2nd ed. New York: John Wiley & Sons, 1999. With permission from
John Wiley & Sons; C-E, courtesy of Ian Molineux, University of Texas at Austin and
Jun Liu, University of Texas Health Science Center, Houston.)

Sexual reproduction is widespread (although not universal), especially 
among eukaryotes. Even bacteria indulge from time to time in controlled sexual 
exchanges of DNA with other members of their own species. Natural selection 
has clearly favored organisms that can reproduce sexually, although evolutionary 
theorists dispute precisely what that selective advantage is. 


The Function of a Gene Can Often Be Deduced from Its Sequence 


Family relationships among genes are important not just for their historical inter- 
est, but because they simplify the task of deciphering gene functions. Once the 
sequence of a newly discovered gene has been determined, a scientist can tap a 
few keys on a computer to search the entire database of known gene sequences 
for genes related to it. In many cases, the function of one or more of these homo- 
logs will have been already determined experimentally. Since gene sequence 
determines gene function, one can frequently make a good guess at the function 
of the new gene: it is likely to be similar to that of the already known homologs. 

In this way, it is possible to decipher a great deal of the biology of an organism 
simply by analyzing the DNA sequence of its genome and using the information 
we already have about the functions of genes in other organisms that have been 
more intensively studied. 
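
The database search described above can be sketched in a few lines. This is only a toy illustration, not the method used by real search tools (programs such as BLAST rely on sequence alignment and statistical scoring). It assumes a small in-memory database whose gene names, sequences, and functional annotations are all hypothetical, and it scores similarity crudely as the fraction of shared three-nucleotide "words".

```python
# Toy homology search: guess a gene's function from its most similar
# annotated relative. All sequences and annotations are hypothetical.

def kmers(seq, k=3):
    """Return the set of all length-k substrings ("words") of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b, k=3):
    """Jaccard similarity of the k-mer sets of two sequences (0.0 to 1.0)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def guess_function(query, database, k=3, min_score=0.3):
    """Return (function, score) of the best match, or (None, score) if the
    best match is too weak to count as a likely homolog."""
    best_name, best_score = None, 0.0
    for name, (seq, _function) in database.items():
        score = similarity(query, seq, k)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is None or best_score < min_score:
        return None, best_score
    return database[best_name][1], best_score


# Hypothetical database: name -> (sequence, experimentally determined function)
DATABASE = {
    "geneA": ("ATGGCTAAGCTGACCGGTTAA", "membrane transporter (hypothetical)"),
    "geneB": ("ATGTTTCCCGGGAAATTTCCCTAG", "DNA repair enzyme (hypothetical)"),
}

# A "newly discovered" gene differing from geneA by a single nucleotide.
NEW_GENE = "ATGGCTAAGCTGACGGGTTAA"
```

Calling `guess_function(NEW_GENE, DATABASE)` returns the annotation of geneA, because the two sequences share most of their words. The `min_score` threshold is an arbitrary choice made here to stop unrelated sequences from receiving a function by accident.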


More Than 200 Gene Families Are Common to All Three Primary 
Branches of the Tree of Life 


Given the complete genome sequences of representative organisms from all three 
domains—archaea, bacteria, and eukaryotes—we can search systematically for 
homologies that span this enormous evolutionary divide. In this way we can begin 
to take stock of the common inheritance of all living things. There are consider- 
able difficulties in this enterprise. For example, individual species have often lost 
some of the ancestral genes; other genes have almost certainly been acquired by 
horizontal transfer from another species and therefore are not truly ancestral, 
even though shared. In fact, genome comparisons strongly suggest that both lin- 
eage-specific gene loss and horizontal gene transfer, in some cases between evo- 
lutionarily distant species, have been major factors of evolution, at least among 
prokaryotes. Finally, in the course of 2 or 3 billion years, some genes that were 
initially shared will have changed beyond recognition through mutation. 
Because of all these vagaries of the evolutionary process, it seems that only 
a small proportion of ancestral gene families has been universally retained in 
a recognizable form. Thus, out of 4873 protein-coding gene families defined by 
comparing the genomes of 50 species of bacteria, 13 archaea, and 3 unicellular 
eukaryotes, only 63 are truly ubiquitous (that is, represented in all the genomes 
analyzed). The great majority of these universal families include components of 
the translation and transcription systems. This is not likely to be a realistic approxi- 
mation of an ancestral gene set. A better—though still crude—idea of the latter can 
be obtained by tallying the gene families that have representatives in multiple, but 
not necessarily all, species from all three major domains. Such an analysis reveals 
264 ancient conserved families. Each family can be assigned a function (at least in 
terms of general biochemical activity, but usually with more precision). As shown 
in Table 1-1, the largest number of shared gene families are involved in transla- 
tion and in amino acid metabolism and transport. However, this set of highly con- 
served gene families represents only a very rough sketch of the common inheri- 
tance of all modern life. A more precise reconstruction of the gene complement 
of the last universal common ancestor will hopefully become feasible with further 
genome sequencing and more sophisticated forms of comparative analysis. 
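
The difference between a "truly ubiquitous" family and an "ancient conserved" one is just a difference in counting rule, which a short sketch can make concrete. The presence/absence data below are hypothetical, and so are the family names. The thresholds (at least two archaea, two bacteria, and one eukaryote) are chosen here to mirror the definition of "universal" given in the note to Table 1-1.

```python
from collections import Counter

# Hypothetical set of sequenced genomes and the domain each belongs to.
DOMAIN_OF = {
    "E. coli": "bacteria",
    "B. subtilis": "bacteria",
    "M. genitalium": "bacteria",
    "A. fulgidus": "archaea",
    "A. pernix": "archaea",
    "S. cerevisiae": "eukaryotes",
}

# Hypothetical data: gene family -> genomes containing at least one member.
FAMILIES = {
    "famA": set(DOMAIN_OF),                                   # present everywhere
    "famB": {"E. coli", "B. subtilis", "A. fulgidus", "A. pernix"},
    "famC": {"E. coli", "S. cerevisiae"},
    "famD": set(DOMAIN_OF) - {"M. genitalium"},               # lost in one lineage
}

# Minimum number of genomes per domain for a family to count as conserved.
MIN_PER_DOMAIN = {"bacteria": 2, "archaea": 2, "eukaryotes": 1}


def is_ubiquitous(genomes_with_family, all_genomes):
    """True if the family is represented in every genome analyzed."""
    return set(all_genomes) <= set(genomes_with_family)


def spans_all_domains(genomes_with_family, domain_of, min_per_domain):
    """True if the family has enough representatives in every domain."""
    counts = Counter(domain_of[g] for g in genomes_with_family)
    return all(counts[d] >= n for d, n in min_per_domain.items())


ubiquitous = sorted(f for f, g in FAMILIES.items()
                    if is_ubiquitous(g, DOMAIN_OF))
conserved = sorted(f for f, g in FAMILIES.items()
                   if spans_all_domains(g, DOMAIN_OF, MIN_PER_DOMAIN))
```

In this toy data set, famD fails the strict test because one bacterium has lost it, yet it still passes the looser test. That is the situation the text describes: lineage-specific gene loss makes the strictly ubiquitous set an underestimate of the ancestral one.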
TABLE 1-1

Gene family function                                              Number of families

Information processing

Cellular processes and signaling
  Cell-cycle control, mitosis, and meiosis
  Signal transduction mechanisms
  Cell wall/membrane biogenesis                                    2
  Intracellular trafficking and secretion

Metabolism
  Energy production and conversion
  Carbohydrate transport and metabolism                            16
  Amino acid transport and metabolism                              43
  Nucleotide transport and metabolism                              15
  Coenzyme transport and metabolism
  Inorganic ion transport and metabolism
  Secondary metabolite biosynthesis, transport, and catabolism

Poorly characterized
  General biochemical function predicted;                          24
  specific biological role unknown

For the purpose of this analysis, gene families are defined as “universal” if they are represented in the genomes of at least two diverse archaea 
(Archaeoglobus fulgidus and Aeropyrum pernix), two evolutionarily distant bacteria (Escherichia coli and Bacillus subtilis), and one eukaryote 
(yeast, Saccharomyces cerevisiae). (Data from R.L. Tatusov, E.V. Koonin and D.J. Lipman, Science 278:631-637, 1997; R.L. Tatusov et al., BMC 
Bioinformatics 4:41, 2003; and the COGs database at the US National Library of Medicine.) 
Mutations Reveal the Functions of Genes


Without additional information, no amount of gazing at genome sequences will 
reveal the functions of genes. We may recognize that gene B is like gene A, but 
how do we discover the function of gene A in the first place? And even if we know 
the function of gene A, how do we test whether the function of gene B is truly 
the same as the sequence similarity suggests? How do we connect the world of 
abstract genetic information with the world of real living organisms? 

The analysis of gene functions depends on two complementary approaches: 
genetics and biochemistry. Genetics starts with the study of mutants: we either 
find or make an organism in which a gene is altered, and then examine the effects 
on the organism’s structure and performance (Figure 1-23). Biochemistry more 
directly examines the functions of molecules: here we extract molecules from an 
organism and then study their chemical activities. By combining genetics and 
biochemistry, it is possible to find those molecules whose production depends on 
a given gene. At the same time, careful studies of the performance of the mutant 
organism show us what role those molecules have in the operation of the organ- 
ism as a whole. Thus, genetics and biochemistry used in combination with cell 
biology provide the best way to relate genes and molecules to the structure and 
function of an organism. 

Figure 1-23 A mutant phenotype reflecting the function of a gene. A normal
yeast (of the species Schizosaccharomyces pombe) is compared with a mutant in which
a change in a single gene has converted the cell from a cigar shape (left) to a T shape
(right). The mutant gene therefore has a function in the control of cell shape. But
how, in molecular terms, does the gene product perform that function? That is a
harder question, and it needs biochemical analysis to answer it. (Courtesy of Kenneth
Sawin and Paul Nurse.)

In recent years, DNA sequence information and the powerful tools of molecular
biology have accelerated progress. From sequence comparisons, we can often
identify particular subregions within a gene that have been preserved nearly
unchanged over the course of evolution. These conserved subregions are likely
to be the most important parts of the gene in terms of function. We can test their
individual contributions to the activity of the gene product by creating in the
laboratory mutations of specific sites within the gene, or by constructing artificial
hybrid genes that combine part of one gene with part of another. Organisms can
be engineered to make either the RNA or the protein specified by the gene in large
quantities to facilitate biochemical analysis. Specialists in molecular structure can
determine the three-dimensional conformation of the gene product, revealing
the exact position of every atom in it. Biochemists can determine how each of the
parts of the genetically specified molecule contributes to its chemical behavior.
Cell biologists can analyze the behavior of cells that are engineered to express a
mutant version of the gene.
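
The idea of picking out conserved subregions by sequence comparison can be illustrated with a minimal sketch. It assumes the homologous sequences have already been aligned to the same length (with "-" marking gaps) and simply reports runs of positions at which every sequence carries the same residue. The three protein fragments are hypothetical, and real analyses use more tolerant measures of conservation than strict identity.

```python
def conserved_regions(aligned, min_len=4):
    """Return (start, end) index pairs (end exclusive) for runs of at least
    min_len alignment columns in which all sequences are identical."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")

    regions, start = [], None
    for i in range(length):
        column = {s[i] for s in aligned}
        same = len(column) == 1 and "-" not in column
        if same and start is None:
            start = i                      # a conserved run begins
        elif not same and start is not None:
            if i - start >= min_len:       # keep only sufficiently long runs
                regions.append((start, i))
            start = None
    if start is not None and length - start >= min_len:
        regions.append((start, length))    # run extends to the last column
    return regions


# Hypothetical aligned protein fragments from three species.
ALIGNMENT = [
    "MKTAYGHKLLPQ",
    "MRTAYGHKIVPQ",
    "MKSAYGHKLAPE",
]
```

For this alignment the function reports a single conserved block covering the residues AYGHK. In an experiment of the kind described above, such a block would be a natural first target for site-specific mutation.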

There is, however, no one simple recipe for discovering a gene’s function, and 
no simple standard universal format for describing it. We may discover, for exam- 
ple, that the product of a given gene catalyzes a certain chemical reaction, and 
yet have no idea how or why that reaction is important to the organism. The func- 
tional characterization of each new family of gene products, unlike the descrip- 
tion of the gene sequences, presents a fresh challenge to the biologist’s ingenuity. 
Moreover, we will never fully understand the function of a gene until we learn its 
role in the life of the organism as a whole. To make ultimate sense of gene func- 
tions, therefore, we have to study whole organisms, not just molecules or cells. 


Molecular Biology Began with a Spotlight on E. coli 


Because living organisms are so complex, the more we learn about any particular 
species, the more attractive it becomes as an object for further study. Each dis- 
covery raises new questions and provides new tools with which to tackle general 
questions in the context of the chosen organism. For this reason, large communi- 
ties of biologists have become dedicated to studying different aspects of the same 
model organism. 

In the early days of molecular biology, the spotlight focused intensely on just 
one species: the Escherichia coli, or E. coli, bacterium (see Figures 1-13 and 1-14). 
This small, rod-shaped bacterial cell normally lives in the gut of humans and other 
vertebrates, but it can be grown easily in a simple nutrient broth in a culture bot- 
tle. It adapts to variable chemical conditions and reproduces rapidly, and it can 
evolve by mutation and selection at a remarkable speed. As with other bacteria, 
different strains of E. coli, though classified as members of a single species, dif- 
fer genetically to a much greater degree than do different varieties of a sexually 
reproducing organism such as a plant or animal. One E. coli strain may possess 
many hundreds of genes that are absent from another, and the two strains could 
have as little as 50% of their genes in common. The standard laboratory strain 
E. coli K-12 has a genome of approximately 4.6 million nucleotide pairs, contained 
in a single circular molecule of DNA that codes for about 4300 different kinds of 
proteins (Figure 1-24). 

In molecular terms, we know more about E. coli than about any other living 
organism. Most of our understanding of the fundamental mechanisms of life— 
for example, how cells replicate their DNA, or how they decode the instructions 
represented in the DNA to direct the synthesis of specific proteins—initially came 
from studies of E. coli. The basic genetic mechanisms have turned out to be highly 
conserved throughout evolution: these mechanisms are essentially the same in 
our own cells as in E. coli. 


Summary 


Prokaryotes (cells without a distinct nucleus) are biochemically the most diverse 
organisms and include species that can obtain all their energy and nutrients from 
inorganic chemical sources, such as the reactive mixtures of minerals released at 
hydrothermal vents on the ocean floor—the sort of diet that may have nourished the 
first living cells 3.5 billion years ago. DNA sequence comparisons reveal the family 
relationships of living organisms and show that the prokaryotes fall into two groups 
that diverged early in the course of evolution: the bacteria (or eubacteria) and the 
archaea. Together with the eukaryotes (cells with a membrane-enclosed nucleus), 
these constitute the three primary branches of the tree of life. 

Most bacteria and archaea are small unicellular organisms with compact 
genomes comprising 1000-6000 genes. Many of the genes within a single organism 
show strong family resemblances in their DNA sequences, implying that they originated
from the same ancestral gene through gene duplication and divergence. Family
resemblances (homologies) are also clear when gene sequences are compared
between different species, and more than 200 gene families have been so highly
conserved that they can be recognized as common to most species from all three
domains of the living world. Thus, given the DNA sequence of a newly discovered
gene, it is often possible to deduce the gene’s function from the known function of a
homologous gene in an intensively studied model organism, such as the bacterium
E. coli.

Figure 1-24 The genome of E. coli. (A) A cluster of E. coli cells. (B) A diagram of the genome of
E. coli strain K-12. The diagram is circular because the DNA of E. coli, like that of other prokaryotes,
forms a single, closed loop. Protein-coding genes are shown as yellow or orange bars, depending
on the DNA strand from which they are transcribed; genes encoding only RNA molecules are
indicated by green arrows. Some genes are transcribed from one strand of the DNA double helix (in
a clockwise direction in this diagram), others from the other strand (counterclockwise). (A, courtesy
of Dr. Tony Brain and David Parker/Photo Researchers; B, adapted from F.R. Blattner et al., Science
277:1453-1462, 1997.)


GENETIC INFORMATION IN EUKARYOTES 


Eukaryotic cells, in general, are bigger and more elaborate than prokaryotic cells, 
and their genomes are bigger and more elaborate, too. The greater size is accom- 
panied by radical differences in cell structure and function. Moreover, many 
classes of eukaryotic cells form multicellular organisms that attain levels of com- 
plexity unmatched by any prokaryote. 

Because they are so complex, eukaryotes confront molecular biologists with a 
special set of challenges that will concern us in the rest of this book. Increasingly, 
biologists attempt to meet these challenges through the analysis and manipula- 
tion of the genetic information within cells and organisms. It is therefore impor- 
tant at the outset to know something of the special features of the eukaryotic 
genome. We begin by briefly discussing how eukaryotic cells are organized, how 
this reflects their way of life, and how their genomes differ from those of prokaryotes.
This leads us to an outline of the strategy by which cell biologists, by exploiting
genetic and biochemical information, are attempting to discover how eukaryotic
organisms work.


Eukaryotic Cells May Have Originated as Predators 


By definition, eukaryotic cells keep their DNA in an internal compartment called 
the nucleus. The nuclear envelope, a double layer of membrane, surrounds the 
nucleus and separates the DNA from the cytoplasm. Eukaryotes also have other 
features that set them apart from prokaryotes (Figure 1-25). Their cells are, typi- 
cally, 10 times bigger in linear dimension and 1000 times larger in volume. They 
have an elaborate cytoskeleton—a system of protein filaments crisscrossing the 
cytoplasm and forming, together with the many proteins that attach to them, a 
system of girders, ropes, and motors that gives the cell mechanical strength, con- 
trols its shape, and drives and guides its movements (Movie 1.1). And the nuclear 
envelope is only one part of a set of internal membranes, each structurally similar 
to the plasma membrane and enclosing different types of spaces inside the cell, 
many of them involved in digestion and secretion. Lacking the tough cell wall of 
most bacteria, animal cells and the free-living eukaryotic cells called protozoa can 
change their shape rapidly and engulf other cells and small objects by phagocyto- 
sis (Figure 1-26). 
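
The two size factors quoted in this paragraph are consistent with simple geometry. As a rough check, assume that the two kinds of cell have similar shapes, and write L for a cell's linear dimension and V for its volume (symbols introduced here only for illustration). Volume then scales as the cube of linear dimension:

```latex
\[
\frac{V_{\text{eukaryote}}}{V_{\text{prokaryote}}}
  = \left(\frac{L_{\text{eukaryote}}}{L_{\text{prokaryote}}}\right)^{3}
  = 10^{3}
  = 1000
\]
```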

How all of the unique properties of eukaryotic cells evolved, and in what 
sequence, is still a mystery. One plausible view, however, is that they are all reflec- 
tions of the way of life of a primordial cell that was a predator, living by capturing 
other cells and eating them (Figure 1-27). Such a way of life requires a large cell 
with a flexible plasma membrane, as well as an elaborate cytoskeleton to support
and move this membrane. It may also require that the cell’s long, fragile DNA molecules
be sequestered in a separate nuclear compartment, to protect the genome
from damage by the movements of the cytoskeleton.

Figure 1-25 The major features of eukaryotic cells. The drawing depicts a typical animal cell, but almost all the same components are found in
plants and fungi as well as in single-celled eukaryotes such as yeasts and protozoa. Plant cells contain chloroplasts in addition to the components
shown here, and their plasma membrane is surrounded by a tough external wall formed of cellulose.

Figure 1-26 Phagocytosis. This series of stills from a movie shows a human white
blood cell (a neutrophil) engulfing a red blood cell (artificially colored red) that has
been treated with an antibody that marks it for destruction (see Movie 13.5). (Courtesy
of Stephen E. Malawista and Anne de Boisfleury Chevance.)


Modern Eukaryotic Cells Evolved from a Symbiosis 


A predatory way of life helps to explain another feature of eukaryotic cells. All 
such cells contain (or at one time did contain) mitochondria (Figure 1-28). These 
small bodies in the cytoplasm, enclosed by a double layer of membrane, take up 
oxygen and harness energy from the oxidation of food molecules—such as sug- 
ars—to produce most of the ATP that powers the cell’s activities. Mitochondria are 
similar in size to small bacteria, and, like bacteria, they have their own genome in 
the form of a circular DNA molecule, their own ribosomes that differ from those 
elsewhere in the eukaryotic cell, and their own transfer RNAs. It is now gener- 
ally accepted that mitochondria originated from free-living oxygen-metabolizing 
(aerobic) bacteria that were engulfed by an ancestral cell that could otherwise 
make no such use of oxygen (that is, was anaerobic). Escaping digestion, these 
bacteria evolved in symbiosis with the engulfing cell and its progeny, receiving
shelter and nourishment in return for the power generation they performed for
their hosts. This partnership between a primitive anaerobic predator cell and an
aerobic bacterial cell is thought to have been established about 1.5 billion years
ago, when the Earth’s atmosphere first became rich in oxygen.

Figure 1-27 A single-celled eukaryote that eats other cells. (A) Didinium is a carnivorous
protozoan, belonging to the group known as ciliates. It has a globular body, about 150 µm in
diameter, encircled by two fringes of cilia (sinuous, whiplike appendages that beat continually); its
front end is flattened except for a single protrusion, rather like a snout. (B) A Didinium engulfing its
prey. Didinium normally swims around in the water at high speed by means of the synchronous
beating of its cilia. When it encounters a suitable prey (yellow), usually another type of protozoan, it
releases numerous small paralyzing darts from its snout region. Then, the Didinium attaches to and
devours the other cell by phagocytosis, inverting like a hollow ball to engulf its victim, which can be
almost as large as itself. (Courtesy of D. Barlow.)

As indicated in Figure 1-29, recent genomic analyses suggest that the first 
eukaryotic cells formed after an archaeal cell engulfed an aerobic bacterium. This 
would explain why all eukaryotic cells today, including those that live as strict
anaerobes, show clear evidence that they once contained mitochondria.

Many eukaryotic cells—specifically, those of plants and algae—also contain 
another class of small membrane-enclosed organelles somewhat similar to mito- 
chondria—the chloroplasts (Figure 1-30). Chloroplasts perform photosynthesis, 
using the energy of sunlight to synthesize carbohydrates from atmospheric car- 
bon dioxide and water, and deliver the products to the host cell as food. Like mito- 
chondria, chloroplasts have their own genome. They almost certainly originated 
as symbiotic photosynthetic bacteria, acquired by eukaryotic cells that already 
possessed mitochondria (Figure 1-31). 

A eukaryotic cell equipped with chloroplasts has no need to chase after other 
cells as prey; it is nourished by the captive chloroplasts it has inherited from its 
ancestors. Correspondingly, plant cells, although they possess the cytoskele- 
tal equipment for movement, have lost the ability to change shape rapidly and 
to engulf other cells by phagocytosis. Instead, they create around themselves a 
tough, protective cell wall. If the first eukaryotic cells were predators on other 
organisms, we can view plant cells as cells that have made the transition from 
hunting to farming. 

Fungi represent yet another eukaryotic way of life. Fungal cells, like animal 
cells, possess mitochondria but not chloroplasts; but in contrast with animal cells 
and protozoa, they have a tough outer wall that limits their ability to move rapidly
or to swallow up other cells. Fungi, it seems, have turned from hunters into scavengers:
other cells secrete nutrient molecules or release them upon death, and
fungi feed on these leavings, performing whatever digestion is necessary extracellularly
by secreting digestive enzymes to the exterior.

Figure 1-28 A mitochondrion. (A) A cross section, as seen in the electron microscope.
(B) A drawing of a mitochondrion with part of it cut away to show the three-dimensional
structure (Movie 1.2). (C) A schematic eukaryotic cell, with the interior
space of a mitochondrion, containing the mitochondrial DNA and ribosomes, colored.
Note the smooth outer membrane and the convoluted inner membrane, which houses
the proteins that generate ATP from the oxidation of food molecules. (A, courtesy
of Daniel S. Friend.)

Figure 1-29 The origin of mitochondria. An ancestral anaerobic predator cell (an
archaeon) is thought to have engulfed the bacterial ancestor of mitochondria, initiating
a symbiotic relationship. Clear evidence of a dual bacterial and archaeal inheritance
can be discerned today in the genomes of all eukaryotes.


Eukaryotes Have Hybrid Genomes 


The genetic information of eukaryotic cells has a hybrid origin—from the ances- 
tral anaerobic archaeal cell, and from the bacteria that it adopted as symbionts. 
Most of this information is stored in the nucleus, but a small amount remains 
inside the mitochondria and, for plant and algal cells, in the chloroplasts. When 
mitochondrial DNA and the chloroplast DNA are separated from the nuclear DNA 
and individually analyzed and sequenced, the mitochondrial and chloroplast 
genomes are found to be degenerate, cut-down versions of the corresponding 
bacterial genomes. In a human cell, for example, the mitochondrial genome con- 
sists of only 16,569 nucleotide pairs, and codes for only 13 proteins, 2 ribosomal 
RNA components, and 22 transfer RNAs. 

Figure 1-30 Chloroplasts. These organelles capture the energy of sunlight
in plant cells and some single-celled eukaryotes. (A) A single cell isolated
from a leaf of a flowering plant, seen in the light microscope, showing the
green chloroplasts (Movie 1.3 and see Movie 14.9). (B) A drawing of one of the
chloroplasts, showing the highly folded system of internal membranes containing
the chlorophyll molecules by which light is absorbed. (A, courtesy of Preeti Dahiya.)

Many of the genes that are missing from the mitochondria and chloroplasts
have not been lost; instead, they have moved from the symbiont genome into the 
DNA of the host cell nucleus. The nuclear DNA of humans contains many genes 
coding for proteins that serve essential functions inside the mitochondria; in 
plants, the nuclear DNA also contains many genes specifying proteins required in 
chloroplasts. In both cases, the DNA sequences of these nuclear genes show clear 
evidence of their origin from the bacterial ancestor of the respective organelle. 


Eukaryotic Genomes Are Big 


Natural selection has evidently favored mitochondria with small genomes. By con- 
trast, the nuclear genomes of most eukaryotes seem to have been free to enlarge. 
Perhaps the eukaryotic way of life has made large size an advantage: predators 
typically need to be bigger than their prey, and cell size generally increases in pro- 
portion to genome size. Whatever the reason, aided by a massive accumulation of 
DNA segments derived from parasitic transposable elements (discussed in Chap- 
ter 5), the genomes of most eukaryotes have become orders of magnitude larger 
than those of bacteria and archaea (Figure 1-32). 

The freedom to be extravagant with DNA has had profound implications. 
Eukaryotes not only have more genes than prokaryotes; they also have vastly more 
DNA that does not code for protein. The human genome contains 1000 times as 
many nucleotide pairs as the genome of a typical bacterium, perhaps 10 times as
many genes, and a great deal more noncoding DNA (~98.5% of the genome for a
human does not code for proteins, as opposed to 11% of the genome for the bacterium
E. coli). The estimated genome sizes and gene numbers for some eukaryotes
are compiled for easy comparison with E. coli in Table 1-2; we shall discuss how
each of these eukaryotes serves as a model organism shortly.

Figure 1-31 The origin of chloroplasts. An early eukaryotic cell, already possessing
mitochondria, engulfed a photosynthetic bacterium (a cyanobacterium) and retained
it in symbiosis. Present-day chloroplasts are thought to trace their ancestry back to
a single species of cyanobacterium that was adopted as an internal symbiont (an
endosymbiont) over a billion years ago.

Figure 1-32 Genome sizes compared. Genome size is measured in nucleotide
pairs of DNA per haploid genome, that is, per single copy of the genome. (The cells
of sexually reproducing organisms such as ourselves are generally diploid: they contain
two copies of the genome, one inherited from the mother, the other from the father.)
Closely related organisms can vary widely in the quantity of DNA in their genomes,
even though they contain similar numbers of functionally distinct genes. (Data from
W.H. Li, Molecular Evolution, pp. 380-383. Sunderland, MA: Sinauer, 1997.)

TABLE 1-2

Organism                                Genome size*            Approximate
                                        (nucleotide pairs)      number of genes

Escherichia coli (bacterium)            4.6 x 10^6              4300
Saccharomyces cerevisiae (yeast)        13 x 10^6               6600
Caenorhabditis elegans (roundworm)      130 x 10^6              21,000
Arabidopsis thaliana (plant)                                    29,000
Drosophila melanogaster (fruit fly)                             15,000
Danio rerio (zebrafish)                                         32,000
Mus musculus (mouse)                                            30,000
Homo sapiens (human)                                            30,000

*Genome size includes an estimate for the amount of highly repeated DNA sequence not in
genome databases.
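
The figures quoted for E. coli and humans can be combined in a back-of-envelope calculation to show how differently the two genomes are used. The sketch below takes the E. coli genome size, the gene numbers, and the noncoding fractions from the text. The human genome size of 3.2 x 10^9 nucleotide pairs is an assumed round value introduced here for illustration, and all variable and function names are likewise hypothetical.

```python
# Back-of-envelope comparison of the E. coli and human genomes.

ECOLI_GENOME_BP = 4.6e6      # nucleotide pairs (from the text)
ECOLI_GENES = 4300           # from the text
ECOLI_NONCODING = 0.11       # fraction of genome not coding for protein

HUMAN_GENOME_BP = 3.2e9      # ASSUMPTION: round value, not taken from the text
HUMAN_GENES = 30_000         # from the text
HUMAN_NONCODING = 0.985      # fraction of genome not coding for protein


def coding_bp(genome_bp, noncoding_fraction):
    """Nucleotide pairs of the genome that code for protein."""
    return genome_bp * (1.0 - noncoding_fraction)


def summarize(genome_bp, genes, noncoding_fraction):
    """Per-gene averages for one genome."""
    coding = coding_bp(genome_bp, noncoding_fraction)
    return {
        "coding_bp": coding,
        "coding_bp_per_gene": coding / genes,    # average coding DNA per gene
        "genome_bp_per_gene": genome_bp / genes, # average total DNA per gene
    }


ecoli = summarize(ECOLI_GENOME_BP, ECOLI_GENES, ECOLI_NONCODING)
human = summarize(HUMAN_GENOME_BP, HUMAN_GENES, HUMAN_NONCODING)

genome_ratio = HUMAN_GENOME_BP / ECOLI_GENOME_BP        # total DNA, human/E. coli
gene_ratio = HUMAN_GENES / ECOLI_GENES                  # gene number, human/E. coli
coding_ratio = human["coding_bp"] / ecoli["coding_bp"]  # coding DNA, human/E. coli
```

Under these assumptions the human genome is roughly 700 times larger than that of E. coli but holds only about 7 times as many genes and about 12 times as much protein-coding DNA. The average amount of coding DNA per gene is of the same order in both organisms, so nearly all of the extra DNA is noncoding.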


Eukaryotic Genomes Are Rich in Regulatory DNA 


Much of our noncoding DNA is almost certainly dispensable junk, retained like 
a mass of old papers because, when there is little pressure to keep an archive 
small, it is easier to retain everything than to sort out the valuable information 
and discard the rest. Certain exceptional eukaryotic species, such as the puffer 
fish, bear witness to the profligacy of their relatives; they have somehow managed 
to rid themselves of large quantities of noncoding DNA. Yet they appear similar in 
structure, behavior, and fitness to related species that have vastly more such DNA 
(see Figure 4-71). 

Even in compact eukaryotic genomes such as that of puffer fish, there is more 
noncoding DNA than coding DNA, and at least some of the noncoding DNA cer- 
tainly has important functions. In particular, it regulates the expression of adja- 
cent genes. With this regulatory DNA, eukaryotes have evolved distinctive ways of 
controlling when and where a gene is brought into play. This sophisticated gene 
regulation is crucial for the formation of complex multicellular organisms. 


The Genome Defines the Program of Multicellular Development


The cells in an individual animal or plant are extraordinarily varied. Fat cells, skin 
cells, bone cells, nerve cells—they seem as dissimilar as any cells could be (Figure 
1-33). Yet all these cell types are the descendants of a single fertilized egg cell, and 
all (with minor exceptions) contain identical copies of the genome of the species. 

The differences result from the way in which the cells make selective use of 
their genetic instructions according to the cues they get from their surroundings 
in the developing embryo. The DNA is not just a shopping list specifying the mol- 
ecules that every cell must have, and the cell is not an assembly of all the items 
on the list. Rather, the cell behaves as a multipurpose machine, with sensors to 
receive environmental signals and with highly developed abilities to call different 
sets of genes into action according to the sequences of signals to which the cell 
has been exposed. The genome in each cell is big enough to accommodate the 
information that specifies an entire multicellular organism, but in any individual 
cell only part of that information is used. 

Figure 1-33 Cell types can vary 
enormously in size and shape. An 
animal nerve cell is compared here with a 
neutrophil, a type of white blood cell. Both 
are drawn to scale. 

A large number of genes in the eukaryotic genome code for proteins that reg- 
ulate the activities of other genes. Most of these transcription regulators act by 
binding, directly or indirectly, to the regulatory DNA adjacent to the genes that are 
to be controlled, or by interfering with the abilities of other proteins to do so. The 
expanded genome of eukaryotes therefore not only specifies the hardware of the 
cell, but also stores the software that controls how that hardware is used (Figure 
1-34). 

Cells do not just passively receive signals; rather, they actively exchange sig- 
nals with their neighbors. Thus, in a developing multicellular organism, the same 
control system governs each cell, but with different consequences depending on 
the messages exchanged. The outcome, astonishingly, is a precisely patterned 
array of cells in different states, each displaying a character appropriate to its posi- 
tion in the multicellular structure. 


Many Eukaryotes Live as Solitary Cells 


Many species of eukaryotic cells lead a solitary life—some as hunters (the pro- 
tozoa), some as photosynthesizers (the unicellular algae), some as scavengers 
(the unicellular fungi, or yeasts). Figure 1-35 conveys something of the astonish- 
ing variety of the single-celled eukaryotes. The anatomy of protozoa, especially, 
is often elaborate and includes such structures as sensory bristles, photorecep- 
tors, sinuously beating cilia, leglike appendages, mouth parts, stinging darts, and 
musclelike contractile bundles. Although they are single cells, protozoa can be 
as intricate, as versatile, and as complex in their behavior as many multicellular 
organisms (see Figure 1-27, Movie 1.4, and Movie 1.5). 

In terms of their ancestry and DNA sequences, the unicellular eukaryotes are 
far more diverse than the multicellular animals, plants, and fungi, which arose as 
three comparatively late branches of the eukaryotic pedigree (see Figure 1-17). As 
with prokaryotes, humans have tended to neglect them because they are micro- 
scopic. Only now, with the help of genome analysis, are we beginning to under- 
stand their positions in the tree of life, and to put into context the glimpses these 
strange creatures can offer us of our distant evolutionary past. 


A Yeast Serves as a Minimal Model Eukaryote 


The molecular and genetic complexity of eukaryotes is daunting. Even more than 
for prokaryotes, biologists need to concentrate their limited resources on a few 
selected model organisms to unravel this complexity. 


Figure 1-34 Genetic control of the 
program of multicellular development. 
The role of a regulatory gene is 
demonstrated in the snapdragon 
Antirrhinum. In this example, a mutation 
in a single gene coding for a regulatory 
protein causes leafy shoots to develop 
in place of flowers: because a regulatory 
protein has been changed, the cells adopt 
characters that would be appropriate to a 
different location in the normal plant. The 
mutant is on the left, the normal plant on 
the right. (Courtesy of Enrico Coen and 
Rosemary Carpenter.) 

To analyze the internal workings of the eukaryotic cell without the additional 
problems of multicellular development, it makes sense to use a species that is 
unicellular and as simple as possible. The popular choice for this role of minimal 
model eukaryote has been the yeast Saccharomyces cerevisiae (Figure 1-36)—the 
same species that is used by brewers of beer and bakers of bread. 

S. cerevisiae is a small, single-celled member of the kingdom of fungi and thus, 
according to modern views, is at least as closely related to animals as it is to plants. 
It is robust and easy to grow in a simple nutrient medium. Like other fungi, it has a 
tough cell wall, is relatively immobile, and possesses mitochondria but not chlo- 
roplasts. When nutrients are plentiful, it grows and divides almost as rapidly as a 
bacterium. It can reproduce either vegetatively (that is, by simple cell division), or 
sexually: two yeast cells that are haploid (possessing a single copy of the genome) 
can fuse to create a cell that is diploid (containing a double genome); and the dip- 
loid cell can undergo meiosis (a reduction division) to produce cells that are once 
again haploid (Figure 1-37). In contrast with higher plants and animals, the yeast 
can divide indefinitely in either the haploid or the diploid state, and the process 
leading from one state to the other can be induced at will by changing the growth 
conditions. 

Figure 1-35 An assortment of protozoa: 
a small sample of an extremely diverse 
class of organisms. The drawings are 
done to different scales, but in each case 
the scale bar represents 10 µm. The 
organisms in (A), (C), and (G) are ciliates; 
(B) is a heliozoan; (D) is an amoeba; 
(E) is a dinoflagellate; and (F) is a euglenoid. 
(From M.A. Sleigh, Biology of Protozoa. 
Cambridge, UK: Cambridge University 
Press, 1973.) 


Figure 1-36 The yeast Saccharomyces 
cerevisiae. (A) A scanning electron 
micrograph of a cluster of the cells. This 
species is also known as budding yeast; 
it proliferates by forming a protrusion or 
bud that enlarges and then separates 
from the rest of the original cell. Many cells 
with buds are visible in this micrograph. 
(B) A transmission electron micrograph of 
a cross section of a yeast cell, showing 
its nucleus, mitochondrion, and thick cell 
wall. (A, courtesy of Ira Herskowitz and Eric 
Schabatach.) 


Figure 1-37 The reproductive cycles of the yeast S. cerevisiae. 
Depending on environmental conditions and on details of the genotype, 
cells of this species can exist in either a diploid (2n) state, with a double 
chromosome set, or a haploid (n) state, with a single chromosome set. The 
diploid form can either proliferate by ordinary cell-division cycles or undergo 
meiosis to produce haploid cells. The haploid form can either proliferate by 
ordinary cell-division cycles or undergo sexual fusion with another haploid 
cell to become diploid. Meiosis is triggered by starvation and gives rise to 
spores—haploid cells in a dormant state, resistant to harsh environmental 
conditions. 


In addition to these features, the yeast has a further property that makes it a 
convenient organism for genetic studies: its genome, by eukaryotic standards, 
is exceptionally small. Nevertheless, it suffices for all the basic tasks that every 
eukaryotic cell must perform. Mutants are available for essentially every gene, 
and studies on yeasts (using both S. cerevisiae and other species) have provided 
a key to many crucial processes, including the eukaryotic cell-division cycle—the 
critical chain of events by which the nucleus and all the other components of a cell 
are duplicated and parceled out to create two daughter cells from one. The control 
system that governs this process has been so well conserved over the course of 
evolution that many of its components can function interchangeably in yeast and 
human cells: if a mutant yeast lacking an essential yeast cell-division-cycle gene 
is supplied with a copy of the homologous cell-division-cycle gene from a human, 
the yeast is cured of its defect and becomes able to divide normally. 


The Expression Levels of All the Genes of An Organism 
Can Be Monitored Simultaneously 


The complete genome sequence of S. cerevisiae, determined in 1997, consists 
of approximately 13,117,000 nucleotide pairs, including the small contribution 
(78,520 nucleotide pairs) of the mitochondrial DNA. This total is only about 2.5 
times as much DNA as there is in E. coli, and it codes for only 1.5 times as many 
distinct proteins (about 6600 in all). The way of life of S. cerevisiae is similar in 
many ways to that of a bacterium, and it seems that this yeast has likewise been 
subject to selection pressures that have kept its genome compact. 

Knowledge of the complete genome sequence of any organism—be it a yeast 
or a human—opens up new perspectives on the workings of the cell: things that 
once seemed impossibly complex now seem within our grasp. Using techniques 
described in Chapter 8, it is now possible, for example, to monitor, simultane- 
ously, the amount of mRNA transcript that is produced from every gene in the 
yeast genome under any chosen conditions, and to see how this whole pattern of 
gene activity changes when conditions change. The analysis can be repeated with 
mRNA prepared from mutant cells lacking a chosen gene—any gene that we care 
to test. In principle, this approach provides a way to reveal the entire system of 
control relationships that govern gene expression—not only in yeast cells, but in 
any organism whose genome sequence is known. 
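The comparison described above, between mRNA levels in normal cells and in cells lacking a chosen gene, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names, the expression values, the pseudocount, and the twofold threshold are all hypothetical, and real measurements would also need normalization and replicate statistics.

```python
import math

def log2_fold_changes(reference, perturbed, pseudocount=1.0):
    """Return {gene: log2 ratio of perturbed to reference mRNA level}.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes that are silent in one of the two samples.
    """
    changes = {}
    for gene, ref_level in reference.items():
        if gene in perturbed:
            ratio = (perturbed[gene] + pseudocount) / (ref_level + pseudocount)
            changes[gene] = math.log2(ratio)
    return changes

def responsive_genes(changes, threshold=1.0):
    """Genes whose level changed at least 2**threshold-fold, up or down."""
    return sorted(g for g, c in changes.items() if abs(c) >= threshold)

# Hypothetical mRNA levels (arbitrary units) for three made-up genes.
wild_type = {"geneA": 99.0, "geneB": 49.0, "geneC": 7.0}
mutant = {"geneA": 24.0, "geneB": 49.0, "geneC": 31.0}

changes = log2_fold_changes(wild_type, mutant)
affected = responsive_genes(changes)
print(changes)   # geneA falls fourfold, geneB is unchanged, geneC rises fourfold
print(affected)  # ['geneA', 'geneC']
```

Repeating such a comparison for mutants in many different genes gives, for each deleted gene, the set of genes whose expression depends on it, which is the raw material for mapping control relationships.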


Arabidopsis Has Been Chosen Out of 300,000 Species 
As a Model Plant 


The large multicellular organisms that we see around us—the flowers and trees 
and animals—seem fantastically varied, but they are much closer to one another 
in their evolutionary origins, and more similar in their basic cell biology, than 
the great host of microscopic single-celled organisms. Thus, while bacteria and 
archaea are separated by perhaps 3.5 billion years of evolution, vertebrates and 
insects are separated by about 700 million years, fish and mammals by about 450 
million years, and the different species of flowering plants by only about 150 mil- 
lion years. 

Because of the close evolutionary relationship between all flowering plants, 
we can, once again, get insight into the cell and molecular biology of this whole 
class of organisms by focusing on just one or a few species for detailed analysis. 
Out of the several hundred thousand species of flowering plants on Earth today, 
molecular biologists have chosen to concentrate their efforts on a small weed, 
the common Thale cress Arabidopsis thaliana (Figure 1-38), which can be grown 
indoors in large numbers and produces thousands of offspring per plant after 
8-10 weeks. Arabidopsis has a total genome size of approximately 220 million 
nucleotide pairs, about 17 times the size of yeast’s (see Table 1-2). 


The World of Animal Cells Is Represented By a Worm, a Fly, a 
Fish, a Mouse, and a Human 


Multicellular animals account for the majority of all named species of living 
organisms, and for the largest part of the biological research effort. Five species 
have emerged as the foremost model organisms for molecular genetic studies. In 
order of increasing size, they are the nematode worm Caenorhabditis elegans, the 
fly Drosophila melanogaster, the zebrafish Danio rerio, the mouse Mus musculus, 
and the human, Homo sapiens. Each has had its genome sequenced. 

Caenorhabditis elegans (Figure 1-39) is a small, harmless relative of the eel- 
worm that attacks crops. With a life cycle of only a few days, an ability to survive in 
a freezer indefinitely in a state of suspended animation, a simple body plan, and 
an unusual life cycle that is well suited for genetic studies (described in Chapter 
21), it is an ideal model organism. C. elegans develops with clockwork precision 
from a fertilized egg cell into an adult worm with exactly 959 body cells (plus a 
variable number of egg and sperm cells)—an unusual degree of regularity for an 
animal. We now have a minutely detailed description of the sequence of events by 
which this occurs, as the cells divide, move, and change their character according 
to strict and predictable rules. The genome of 130 million nucleotide pairs codes 
for about 21,000 proteins, and many mutants and other tools are available for the 
testing of gene functions. Although the worm has a body plan very different from 
our own, the conservation of biological mechanisms has been sufficient for the 
worm to be a model for many of the developmental and cell-biological processes 
that occur in the human body. Thus, for example, studies of the worm have been 
critical for helping us to understand the programs of cell division and cell death 
that determine the number of cells in the body—a topic of great importance for 
both developmental biology and cancer research. 


Studies in Drosophila Provide a Key to Vertebrate Development 


The fruit fly Drosophila melanogaster (Figure 1-40) has been used as a model 
genetic organism for longer than any other; in fact, the foundations of classical 
genetics were built to a large extent on studies of this insect. Over 80 years ago, it 
provided, for example, definitive proof that genes—the abstract units of heredi- 
tary information—are carried on chromosomes, concrete physical objects whose 
behavior had been closely followed in the eukaryotic cell with the light micro- 
scope, but whose function was at first unknown. The proof depended on one of 
the many features that make Drosophila peculiarly convenient for genetics—the 
giant chromosomes, with characteristic banded appearance, that are visible in 
some of its cells (Figure 1-41). Specific changes in the hereditary information, 
manifest in families of mutant flies, were found to correlate exactly with the loss 
or alteration of specific giant-chromosome bands. 

Figure 1-38 Arabidopsis thaliana, the 
plant chosen as the primary model 
for studying plant molecular genetics. 
(Courtesy of Toni Hayden and the John 
Innes Foundation.) 

Figure 1-39 Caenorhabditis elegans, the first multicellular organism to have its 
complete genome sequence determined. This small nematode, about 1 mm long, lives in 
the soil. Most individuals are hermaphrodites, producing both eggs and sperm. (Courtesy of 
Maria Gallegos, University of Wisconsin, Madison.) 

In more recent times, Drosophila, more than any other organism, has shown us 
how to trace the chain of cause and effect from the genetic instructions encoded 
in the chromosomal DNA to the structure of the adult multicellular body. Dro- 
sophila mutants with body parts strangely misplaced or mispatterned provided 
the key to the identification and characterization of the genes required to make 
a properly structured body, with gut, limbs, eyes, and all the other parts in their 
correct places. Once these Drosophila genes were sequenced, the genomes of ver- 
tebrates could be scanned for homologs. These were found, and their functions 
in vertebrates were then tested by analyzing mice in which the genes had been 
mutated. The results, as we see later in the book, reveal an astonishing degree of 
similarity in the molecular mechanisms that govern insect and vertebrate devel- 
opment (discussed in Chapter 21). 

The majority of all named species of living organisms are insects. Even if Dro- 
sophila had nothing in common with vertebrates, but only with insects, it would 
still be an important model organism. But if understanding the molecular genet- 
ics of vertebrates is the goal, why not simply tackle the problem head-on? Why 
sidle up to it obliquely, through studies in Drosophila? 

Drosophila requires only 9 days to progress from a fertilized egg to an adult; it 
is vastly easier and cheaper to breed than any vertebrate, and its genome is much 
smaller—about 200 million nucleotide pairs, compared with 3200 million for a 
human. This genome codes for about 15,000 proteins, and mutants can now be 
obtained for essentially any gene. But there is also another, deeper reason why 
genetic mechanisms that are hard to discover in a vertebrate are often read- 
ily revealed in the fly. This relates, as we now explain, to the frequency of gene 
duplication, which is substantially greater in vertebrate genomes than in the fly 
genome and has probably been crucial in making vertebrates the complex and 
subtle creatures that they are. 


The Vertebrate Genome Is a Product of Repeated Duplications 


Almost every gene in the vertebrate genome has paralogs—other genes in the 
same genome that are unmistakably related and must have arisen by gene dupli- 
cation. In many cases, a whole cluster of genes is closely related to similar clusters 
present elsewhere in the genome, suggesting that genes have been duplicated in 
linked groups rather than as isolated individuals. According to one hypothesis, at 
an early stage in the evolution of the vertebrates, the entire genome underwent 
duplication twice in succession, giving rise to four copies of every gene. 

The precise course of vertebrate genome evolution remains uncertain, because 
many further evolutionary changes have occurred since these ancient events. 


Figure 1-40 Drosophila melanogaster. 
Molecular genetic studies on this fly have 
provided the main key to understanding 
how all animals develop from a fertilized 
egg into an adult. (From E.B. Lewis, 
Science 221:cover, 1983. With permission 
from AAAS.) 

Figure 1-41 Giant chromosomes from 
salivary gland cells of Drosophila. 
Because many rounds of DNA replication 
have occurred without an intervening cell 
division, each of the chromosomes in 
these unusual cells contains over 1000 
identical DNA molecules, all aligned in 
register. This makes them easy to see in 
the light microscope, where they display 
a characteristic and reproducible banding 
pattern. Specific bands can be identified as 
the locations of specific genes: a mutant 
fly with a region of the banding pattern 
missing shows a phenotype reflecting loss 
of the genes in that region. Genes that are 
being transcribed at a high rate correspond 
to bands with a “puffed” appearance. 
The bands stained dark brown in the 
micrograph are sites where a particular 
regulatory protein is bound to the DNA. 
(Courtesy of B. Zink and R. Paro, from 
R. Paro, Trends Genet. 6:416-421, 1990. 
With permission from Elsevier.) 


Genes that were once identical have diverged; many of the gene copies have been 
lost through disruptive mutations; some have undergone further rounds of local 
duplication; and the genome, in each branch of the vertebrate family tree, has 
suffered repeated rearrangements, breaking up most of the original gene order- 
ings. Comparison of the gene order in two related organisms, such as the human 
and the mouse, reveals that—on the time scale of vertebrate evolution—chro- 
mosomes frequently fuse and fragment to move large blocks of DNA sequence 
around. Indeed, it is possible, as discussed in Chapter 4, that the present state 
of affairs is the result of many separate duplications of fragments of the genome, 
rather than duplications of the genome as a whole. 

There is, however, no doubt that such whole-genome duplications do occur 
from time to time in evolution, for we can see recent instances in which dupli- 
cated chromosome sets are still clearly identifiable as such. The frog genus Xeno- 
pus, for example, comprises a set of closely similar species related to one another 
by repeated duplications or triplications of the whole genome. Among these frogs 
are X. tropicalis, with an ordinary diploid genome; the common laboratory spe- 
cies X. laevis, with a duplicated genome and twice as much DNA per cell; and 
X. ruwenzoriensis, with a sixfold reduplication of the original genome and six 
times as much DNA per cell (108 chromosomes, compared with 36 in X. laevis, for 
example). These species are estimated to have diverged from one another within 
the past 120 million years (Figure 1-42). 


The Frog and the Zebrafish Provide Accessible Models for 
Vertebrate Development 


Frogs have long been used to study the early steps of embryonic development 
in vertebrates, because their eggs are big, easy to manipulate, and fertilized out- 
side of the animal, so that the subsequent development of the early embryo is 
easily followed (Figure 1-43). Xenopus laevis, in particular, continues to be an 
important model organism, even though it is poorly suited for genetic analysis 
(Movie 1.6 and see Movie 21.1). 

The zebrafish Danio rerio has similar advantages, but without this drawback. 
Its genome is compact—only half as big as that of a mouse or a human—and it 
has a generation time of only about three months. Many mutants are known, and 
genetic engineering is relatively easy. The zebrafish has the added virtue that it is 
transparent for the first two weeks of its life, so that one can watch the behavior 
of individual cells in the living organism (see Movie 21.2). All this has made it an 
increasingly important model vertebrate (Figure 1-44). 


The Mouse Is the Predominant Mammalian Model Organism 


Mammals have typically two times as many genes as Drosophila, a genome that 
is 16 times larger, and millions or billions of times as many cells in their adult 
bodies. In terms of genome size and function, cell biology, and molecular mech- 
anisms, mammals are nevertheless a highly uniform group of organisms. Even 
anatomically, the differences among mammals are chiefly a matter of size and 
proportions; it is hard to think of a human body part that does not have a counter- 
part in elephants and mice, and vice versa. Evolution plays freely with quantita- 
tive features, but it does not readily change the logic of the structure. 
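The genome-size comparisons quoted in this section can be checked with simple arithmetic. The short Python sketch below uses only the approximate sizes stated in the text; the dictionary and function names are illustrative choices, not part of any standard tool.

```python
# Approximate genome sizes in nucleotide pairs, as quoted in this section.
GENOME_SIZE = {
    "S. cerevisiae": 13_117_000,
    "C. elegans": 130_000_000,
    "D. melanogaster": 200_000_000,
    "A. thaliana": 220_000_000,
    "H. sapiens": 3_200_000_000,
}

def size_ratio(species, reference):
    """How many times larger the genome of `species` is than that of `reference`."""
    return GENOME_SIZE[species] / GENOME_SIZE[reference]

arabidopsis_vs_yeast = size_ratio("A. thaliana", "S. cerevisiae")
human_vs_fly = size_ratio("H. sapiens", "D. melanogaster")

print(f"Arabidopsis / yeast: {arabidopsis_vs_yeast:.1f}")  # 16.8, i.e. about 17
print(f"Human / Drosophila: {human_vs_fly:.0f}")           # 16
```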


Figure 1-42 Two species of the frog genus Xenopus. X. tropicalis, above, 
has an ordinary diploid genome; X. laevis, below, has twice as much DNA per 
cell. From the banding patterns of their chromosomes and the arrangement 
of genes along them, as well as from comparisons of gene sequences, it is 
clear that the large-genome species have evolved through duplications of 
the whole genome. These duplications are thought to have occurred in the 
aftermath of matings between frogs of slightly divergent Xenopus species. 
(Courtesy of E. Amaya, M. Offield, and R. Grainger, Trends Genet. 14:253-255, 
1998. With permission from Elsevier.) 

Figure 1-43 Stages in the normal development of a frog. These drawings 
show the development of a Rana pipiens tadpole from a fertilized egg. The 
entire process takes place outside of the mother, making the mechanisms 
involved readily accessible for experimental studies. (From W. Shumway, 
Anat. Rec. 78:139-147, 1940.) 


For a more exact measure of how closely mammalian species resemble one 
another genetically, we can compare the nucleotide sequences of corresponding 
(orthologous) genes, or the amino acid sequences of the proteins that these genes 
encode. The results for individual genes and proteins vary widely. But typically, if 
we line up the amino acid sequence of a human protein with that of the ortholo- 
gous protein from, say, an elephant, about 85% of the amino acids are identical. 
A similar comparison between human and bird shows an amino acid identity of 
about 70%—twice as many differences, because the bird and the mammalian lin- 
eages have had twice as long to diverge as those of the elephant and the human 
(Figure 1-45). 
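The percentage of identical amino acids is easy to compute once two sequences have been lined up. The following Python sketch assumes the alignment has already been done (both strings have the same length, with "-" marking gaps) and skips gapped positions; that treatment of gaps is one common convention, not the only one. The two 20-residue strings are toy sequences invented for illustration, not real orthologs.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percentage of aligned, ungapped positions at which the residues match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # ignore positions where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * identical / compared

# Toy aligned fragments differing at 3 of 20 positions (hypothetical data).
protein_1 = "MVLSPADKTNVKAAWGKVGA"
protein_2 = "MVLSAADKTSVKAAWGQVGA"

identity = percent_identity(protein_1, protein_2)
print(identity)  # 85.0
```

On this measure, 85% identity corresponds to 15 differences per 100 residues and 70% identity to 30, which is the doubling referred to in the text.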

The mouse, being small, hardy, and a rapid breeder, has become the foremost 
model organism for experimental studies of vertebrate molecular genetics. Many 
naturally occurring mutations are known, often mimicking the effects of corre- 
sponding mutations in humans (Figure 1-46). Methods have been developed, 
moreover, to test the function of any chosen mouse gene, or of any noncoding 
portion of the mouse genome, by artificially creating mutations in it, as we explain 
later in the book. 

Just one made-to-order mutant mouse can provide a wealth of information for 
the cell biologist. It reveals the effects of the chosen mutation in a host of different 
contexts, simultaneously testing the action of the gene in all the different kinds of 
cells in the body that could in principle be affected. 


Humans Report on Their Own Peculiarities 


As humans, we have a special interest in the human genome. We want to know the 
full set of parts from which we are made, and to discover how they work. But even 
if you were a mouse, preoccupied with the molecular biology of mice, humans 
would be attractive as model genetic organisms, because of one special property: 
through medical examinations and self-reporting, we catalog our own genetic 
(and other) disorders. The human population is enormous, consisting today 
of some 7 billion individuals, and this self-documenting property means that a 
huge database of information exists on human mutations. The human genome 
sequence of more than 3 billion nucleotide pairs has been determined for 
thousands of different people, making it easier than ever before to identify at a 
molecular level the precise genetic change responsible for any given human mutant 
phenotype. 

Figure 1-44 Zebrafish as a model for 
studies of vertebrate development. These 
small, hardy tropical fish are convenient 
for genetic studies. Additionally, they have 
transparent embryos that develop outside 
of the mother, so that one can clearly 
observe cells moving and changing their 
character in the living organism throughout 
its development. (A) Adult fish. (B) An 
embryo 24 hours after fertilization. (A, with 
permission from Steve Baskauf; B, from 
M. Rhinn et al., Neural Dev. 4:12, 2009.) 

By drawing together the insights from humans, mice, fish, flies, worms, yeasts, 
plants, and bacteria—using gene sequence similarities to map out the 
correspondences between one model organism and another—we are enriching our 
understanding of them all. 

Figure 1-45 Times of divergence of 
different vertebrates. The scale on the left 
shows the estimated date and geological 
era of the last common ancestor of each 
specified pair of animals. Each time 
estimate is based on comparisons of the 
amino acid sequences of orthologous 
proteins; the longer the animals of a pair 
have had to evolve independently, the 
smaller the percentage of amino acids 
that remain identical. The time scale 
has been calibrated to match the fossil 
evidence showing that the last common 
ancestor of mammals and birds lived 
310 million years ago. 

The figures on the right give data on 
sequence divergence for one particular 
protein—the α chain of hemoglobin. Note 
that although there is a clear general trend 
of increasing divergence with increasing 
time for this protein, there are irregularities 
that are thought to reflect the action of 
natural selection driving especially rapid 
changes of hemoglobin sequence when 
the organisms experienced special 
physiological demands. Some proteins, 
subject to stricter functional constraints, 
evolve much more slowly than hemoglobin, 
others as much as five times faster. All this 
gives rise to substantial uncertainties in 
estimates of divergence times, and some 
experts believe that the major groups of 
mammals diverged from one another as 
much as 60 million years more recently 
than shown here. (Adapted from S. Kumar 
and S.B. Hedges, Nature 392:917-920, 
1998. With permission from Macmillan 
Publishers Ltd.) 


Figure 1-46 Human and mouse: similar 
genes and similar development. The 
human baby and the mouse shown 
here have similar white patches on their 
foreheads because both have mutations in 
the same gene (called Kit), required for the 
development and maintenance of pigment 
cells. (Courtesy of R.A. Fleischman.) 


We Are All Different in Detail 


What precisely do we mean when we speak of the human genome? Whose 
genome? On average, any two people taken at random differ in about one or two 
in every 1000 nucleotide pairs in their DNA sequence. The genome of the human 
species is, properly speaking, a very complex thing, embracing the entire pool of 
variant genes found in the human population. Knowledge of this variation is help- 
ing us to understand, for example, why some people are prone to one disease, 
others to another; why some respond well to a drug, others badly. It is also provid- 
ing clues to our history—the population movements and minglings of our ances- 
tors, the infections they suffered, the diets they ate. All these things have left traces 
in the variant forms of genes that survive today in the human communities that 
populate the globe. 
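To put the figure of one or two differences per 1000 nucleotide pairs in perspective, it can be scaled up to the whole genome. The Python sketch below assumes a genome of about 3200 million nucleotide pairs, the approximate size quoted earlier in the chapter; the names are illustrative.

```python
HUMAN_GENOME_BP = 3_200_000_000  # approximate size quoted in the text

def expected_differences(genome_bp, differences_per_thousand):
    """Expected number of differing nucleotide pairs between two genomes."""
    return genome_bp * differences_per_thousand // 1000

low_estimate = expected_differences(HUMAN_GENOME_BP, 1)
high_estimate = expected_differences(HUMAN_GENOME_BP, 2)

print(f"{low_estimate:,} to {high_estimate:,} differences")
# 3,200,000 to 6,400,000 differences
```

On these assumptions, two people chosen at random differ at several million positions, even though more than 99.8% of their sequence is the same.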


To Understand Cells and Organisms Will Require Mathematics, 
Computers, and Quantitative Information 


Empowered by knowledge of complete genome sequences, we can list the genes, 
proteins, and RNA molecules in a cell, and we have methods that allow us to begin 
to depict the complex web of interactions between them. But how are we to turn 
all this information into an understanding of how cells work? Even for a single cell 
type belonging to a single species of organism, the current deluge of data seems 
overwhelming. The sort of informal reasoning on which biologists usually rely 
seems totally inadequate in the face of such complexity. 

In fact, the difficulty is more than just a matter of information overload. Bio- 
logical systems are, for example, full of feedback loops, and the behavior of even 
the simplest of systems with feedback is remarkably difficult to predict by intu- 
ition alone (Figure 1-47); small changes in parameters can cause radical changes 
in outcome. To go from a circuit diagram to a prediction of the behavior of the 
system, we need detailed quantitative information, and to draw deductions from 
that information we need mathematics and computers. 
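A small numerical experiment shows how strongly the outcome can depend on a parameter. The Python sketch below models a single gene whose protein product stimulates its own synthesis. The form of the equation (a basal synthesis rate plus a Hill-type activation term, minus first-order degradation), the forward Euler integration, and every parameter value are assumptions chosen for illustration; they are not measurements from any real gene.

```python
def production_rate(x, basal, beta, K, n):
    """Synthesis rate: basal expression plus Hill-type self-activation."""
    return basal + beta * x**n / (K**n + x**n)

def simulate(x0, basal, beta=1.0, K=0.5, n=4, gamma=1.0, dt=0.01, t_end=100.0):
    """Integrate dx/dt = production_rate(x) - gamma * x by forward Euler.

    x is the protein concentration (arbitrary units); returns x at t_end.
    """
    x = x0
    for _ in range(int(t_end / dt)):
        x += dt * (production_rate(x, basal, beta, K, n) - gamma * x)
    return x

# Low basal synthesis: the final state depends on the starting state (a switch).
switch_from_low = simulate(0.0, basal=0.05)
switch_from_high = simulate(1.0, basal=0.05)

# Higher basal synthesis: both starting states end at the same concentration.
single_from_low = simulate(0.0, basal=0.3)
single_from_high = simulate(1.0, basal=0.3)

print(switch_from_low, switch_from_high)
print(single_from_low, single_from_high)
```

With the lower basal rate, the circuit remembers where it started: it settles near a low concentration from a low start and near a high one from a high start. Raising that single parameter removes the low state, so that every starting condition leads to the same outcome.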

Such tools for quantitative reasoning are essential, but they are not all-power- 
ful. You might think that, knowing how each protein influences each other pro- 
tein, and how the expression of each gene is regulated by the products of others, 
we should soon be able to calculate how the cell as a whole will behave, just as 
an astronomer can calculate the orbits of the planets, or a chemical engineer can 
calculate the flows through a chemical plant. But any attempt to perform this feat 
for anything close to an entire living cell rapidly reveals the limits of our present 
knowledge. The information we have, plentiful as it is, is full of gaps and uncer- 
tainties. Moreover, it is largely qualitative rather than quantitative. Most often, cell 
biologists studying the cell’s control systems sum up their knowledge in simple 
schematic diagrams—this book is full of them—rather than in numbers, graphs, 
and differential equations. 

To progress from qualitative descriptions and intuitive reasoning to quantita- 
tive descriptions and mathematical deduction is one of the biggest challenges for 
contemporary cell biology. So far, the challenge has been met only for a few very 
simple fragments of the machinery of living cells—subsystems involving a hand- 
ful of different proteins, or two or three cross-regulatory genes, where theory and 
experiment go closely hand in hand. We discuss some of these examples later in 
the book and devote the entire final section of Chapter 8 to the role of quantitation 
in cell biology. 

Knowledge and understanding bring the power to intervene—with humans, 
to avoid or prevent disease; with plants, to create better crops; with bacteria, to 
turn them to our own uses. All these biological enterprises are linked, because the 
genetic information of all living organisms is written in the same language. The 
new-found ability of molecular biologists to read and decipher this language has 
already begun to transform our relationship to the living world. The account of 
cell biology in the subsequent chapters will, we hope, equip the reader to under- 
stand, and possibly to contribute to, the great scientific adventure of the twen- 
ty-first century. 
Figure 1-47 A very simple regulatory 
circuit—a single gene regulating its 
own expression by the binding of its 
protein product to its own regulatory 
DNA. Simple schematic diagrams such 
as this are found throughout this book. 
They are often used to summarize what 
we know, but they leave many questions 
unanswered. When the protein binds, does 
it inhibit or stimulate transcription from the 
gene? How steeply does the transcription 
rate depend on the protein concentration? 
How long, on average, does a molecule of 
the protein remain bound to the DNA? How 
long does it take to make each molecule 
of mRNA or protein, and how quickly does 
each type of molecule get degraded? 

As explained in Chapter 8, mathematical 
modeling shows that we need quantitative 
answers to all these and other questions 
before we can predict the behavior of 
even this single-gene system. For different 
parameter values, the system may settle 
to a unique steady state; or it may behave 
as a switch, capable of existing in one or 
another of a set of alternative states; or it 
may oscillate; or it may show large random 
fluctuations. 
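
The questions raised in Figure 1-47 can be made concrete with a small numerical experiment. The Python sketch below is only an illustration under stated assumptions: it follows the protein concentration p with a single rate equation, dp/dt = basal + beta × f(p) − gamma × p, in which f is a Hill function (a common but by no means unique way to describe how steeply transcription depends on protein concentration). All names and parameter values (basal, beta, gamma, K, n) are hypothetical and are not taken from any measured system.

```python
def hill_activation(p, K, n):
    """Assumed response when the protein stimulates its own gene."""
    return p**n / (K**n + p**n)


def hill_repression(p, K, n):
    """Assumed response when the protein inhibits its own gene."""
    return K**n / (K**n + p**n)


def simulate(p0, response, basal=0.02, beta=1.0, gamma=1.0,
             K=0.5, n=4, dt=0.01, steps=5000):
    """Integrate dp/dt = basal + beta*response(p) - gamma*p by Euler steps.

    p0 is the starting protein concentration (arbitrary units);
    the function returns the concentration after steps*dt time units.
    """
    p = p0
    for _ in range(steps):
        rate = basal + beta * response(p, K, n) - gamma * p
        p += rate * dt
    return p


if __name__ == "__main__":
    for name, response in (("activation", hill_activation),
                           ("repression", hill_repression)):
        from_low = simulate(0.1, response)
        from_high = simulate(1.0, response)
        print(f"{name}: start 0.1 -> {from_low:.3f}, start 1.0 -> {from_high:.3f}")
```

With these assumed values, the self-activating gene ends in one of two clearly different concentrations depending on where it starts (switch-like behavior), whereas the self-repressing gene reaches the same steady state from either start. Because the sketch ignores time delays and random events, it cannot reproduce the oscillations or the large fluctuations mentioned in the figure legend.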


CHAPTER 1 END-OF-CHAPTER PROBLEMS 


Summary 


Eukaryotic cells, by definition, keep their DNA in a separate membrane-enclosed 
compartment, the nucleus. They have, in addition, a cytoskeleton for support and 
movement, elaborate intracellular compartments for digestion and secretion, the 
capacity (in many species) to engulf other cells, and a metabolism that depends on 
the oxidation of organic molecules by mitochondria. These properties suggest that 
eukaryotes may have originated as predators on other cells. Mitochondria—and, 
in plants, chloroplasts—contain their own genetic material, and they evidently 
evolved from bacteria that were taken up into the cytoplasm of ancient cells and 
survived as symbionts. 

Eukaryotic cells typically have 3-30 times as many genes as prokaryotes, and 
often thousands of times more noncoding DNA. The noncoding DNA allows for 
great complexity in the regulation of gene expression, as required for the construc- 
tion of complex multicellular organisms. Many eukaryotes are, however, unicel- 
lular—among them the yeast Saccharomyces cerevisiae, which serves as a simple 
model organism for eukaryotic cell biology, revealing the molecular basis of many 
fundamental processes that have been strikingly conserved during a billion years of 
evolution. A small number of other organisms have also been chosen for intensive 
study: a worm, a fly, a fish, and the mouse serve as “model organisms” for multicel- 
lular animals; and a small weed of the mustard family serves as a model for plants. 

Powerful new technologies such as genome sequencing are producing striking 
advances in our knowledge of human beings, and they are helping to advance our 
understanding of human health and disease. But living systems are incredibly com- 
plex, and mammalian genomes contain multiple closely related homologs of most 
genes. This genetic redundancy has allowed diversification and specialization of 
genes for new purposes, but it also makes biological mechanisms harder to deci- 
pher. For this reason, simpler model organisms have played a key part in revealing 
universal genetic mechanisms of animal development, and research using these 
systems remains critical for driving scientific and medical advances. 


WHAT WE DON’T KNOW 


• What new approaches might provide a clearer view of the anaerobic 
archaeon that is thought to have formed the nucleus of the first 
eukaryotic cell? How did its symbiosis with an aerobic bacterium lead to the 
mitochondrion? Somewhere on Earth, are there cells not yet identified that 
can fill in the details of how eukaryotic cells originated? 

• DNA sequencing has revealed a rich and previously undiscovered world 
of microbial cells, the vast majority of which fail to grow in a laboratory. 
How might these cells be made more accessible for detailed study? 

• What new model cells or organisms should be developed for scientists to 
study? Why might a concerted focus on these models speed progress 
toward understanding a critical aspect of cell function that is poorly 
understood? 

• How did the first cell membranes arise? 


PROBLEMS 


Which statements are true? Explain why or why not. 


1-1 Each member of the human hemoglobin gene 
family, which consists of seven genes arranged in two clus- 
ters on different chromosomes, is an ortholog to all of the 
other members. 


1-2 Horizontal gene transfer is more prevalent in sin- 
gle-celled organisms than in multicellular organisms. 


1-3 Most of the DNA sequences in a bacterial genome 
code for proteins, whereas most of the DNA sequences in 
the human genome do not. 


Discuss the following problems. 


1-4 Since it was deciphered four decades ago, some 
have claimed that the genetic code must be a frozen acci- 
dent, while others have argued that it was shaped by nat- 
ural selection. A striking feature of the genetic code is its 
inherent resistance to the effects of mutation. For example, 
a change in the third position of a codon often specifies the 
same amino acid or one with similar chemical properties. 
The natural code resists mutation more effectively (is less 
susceptible to error) than most other possible versions, as 
illustrated in Figure Q1-1. Only one in a million computer-generated 
“random” codes is more error-resistant than 
the natural genetic code. Does the extraordinary mutation 
resistance of the genetic code argue in favor of its origin as 
a frozen accident or as a result of natural selection? Explain 
your reasoning. 
Figure Q1-1 Susceptibility to mutation of the natural code shown 
relative to that of millions of computer-generated alternative genetic 
codes (Problem 1-4). Susceptibility measures the average change in 
amino acid properties caused by random mutations in a genetic code. 
A small value indicates that mutations tend to cause minor changes. 
(Data courtesy of Steve Freeland.) 
1-5 You have begun to characterize a sample obtained 
from the depths of the oceans on Europa, one of Jupi- 
ter’s moons. Much to your surprise, the sample contains 
a life-form that grows well in a rich broth. Your prelimi- 
nary analysis shows that it is cellular and contains DNA, 
RNA, and protein. When you show your results to a col- 
league, she suggests that your sample was contaminated 
with an organism from Earth. What approaches might 
you try to distinguish between contamination and a 
novel cellular life-form based on DNA, RNA, and protein? 


1-6 It is not so difficult to imagine what it means to feed 
on the organic molecules that living things produce. That 
is, after all, what we do. But what does it mean to “feed” on 
sunlight, as phototrophs do? Or, even stranger, to “feed” on 
rocks, as lithotrophs do? Where is the “food,” for example, 
in the mixture of chemicals (H2S, H2, CO, Mn+, Fe2+, Ni2+, 
CH4, and NH4+) that spews from a hydrothermal vent? 


1-7 How many possible different trees (branching pat- 
terns) can in theory be drawn to display the evolution of 
bacteria, archaea, and eukaryotes, assuming that they all 
arose from a common ancestor? 


1-8 The genes for ribosomal RNA are highly conserved 
(relatively few sequence changes) in all organisms on 
Earth; thus, they have evolved very slowly over time. Were 
ribosomal RNA genes “born” perfect? 


1-9 Genes participating in informational processes 
such as replication, transcription, and translation are 
transferred between species much less often than are 
genes involved in metabolism. The basis for this inequality 
is unclear at present, but one suggestion is that it relates 
to the underlying complexity of the two types of processes. 
Informational processes tend to involve large aggregates 
of different gene products, whereas metabolic reactions 
are usually catalyzed by enzymes composed of a single 
protein. Why would the complexity of the underlying pro- 
cess—informational or metabolic—have any effect on the 
rate of horizontal gene transfer? 


1-10 Animal cells have neither cell walls nor chloro- 
plasts, whereas plant cells have both. Fungal cells are 
somewhere in between; they have cell walls but lack chlo- 
roplasts. Are fungal cells more likely to be animal cells that 
gained the ability to make cell walls, or plant cells that lost 
their chloroplasts? This question represented a difficult 
issue for early investigators who sought to assign evolu- 
tionary relationships based solely on cell characteristics 
and morphology. How do you suppose that this question 
was eventually decided? 
Figure Q1-2 Phylogenetic tree for hemoglobin genes from a variety 
of species (Problem 1-11). The legumes are highlighted in green. The 
lengths of lines that connect the present-day species represent the 
evolutionary distances that separate them. 


1-11 When plant hemoglobin genes were first discov- 
ered in legumes, it was so surprising to find a gene typi- 
cal of animal blood that it was hypothesized that the plant 
gene arose by horizontal transfer from an animal. Many 
more hemoglobin genes have now been sequenced, and 
a phylogenetic tree based on some of these sequences is 
shown in Figure Q1-2. 


A. Does this tree support or refute the hypothesis that 
the plant hemoglobins arose by horizontal gene transfer? 

B. Supposing that the plant hemoglobin genes were 
originally derived from a parasitic nematode, for example, 
what would you expect the phylogenetic tree to look like? 


1-12 Rates of evolution appear to vary in different lin- 
eages. For example, the rate of evolution in the rat lineage 
is significantly higher than in the human lineage. These 
rate differences are apparent whether one looks at changes 
in nucleotide sequences that encode proteins and are sub- 
ject to selective pressure or at changes in noncoding nucle- 
otide sequences, which are not under obvious selection 
pressure. Can you offer one or more possible explanations 
for the slower rate of evolutionary change in the human 
lineage versus the rat lineage? 
Cell Chemistry and 
Bioenergetics 


It is at first sight difficult to accept the idea that living creatures are merely chemi- 
cal systems. Their incredible diversity of form, their seemingly purposeful behav- 
ior, and their ability to grow and reproduce all seem to set them apart from the 
world of solids, liquids, and gases that chemistry normally describes. Indeed, 
until the nineteenth century animals were believed to contain a Vital Force—an 
“animus” —that was responsible for their distinctive properties. 

We now know that there is nothing in living organisms that disobeys chemical 
or physical laws. However, the chemistry of life is indeed special. First, it is based 
overwhelmingly on carbon compounds, the study of which is known as organic 
chemistry. Second, cells are 70% water, and life depends largely on chemical reac- 
tions that take place in aqueous solution. Third, and most important, cell chem- 
istry is enormously complex: even the simplest cell is vastly more complicated 
in its chemistry than any other chemical system known. In particular, although 
cells contain a variety of small carbon-containing molecules, most of the carbon 
atoms present are incorporated into enormous polymeric molecules—chains of 
chemical subunits linked end-to-end. It is the unique properties of these macro- 
molecules that enable cells and organisms to grow and reproduce—as well as to 
do all the other things that are characteristic of life. 


THE CHEMICAL COMPONENTS OF A CELL 


Living organisms are made of only a small selection of the 92 naturally occurring 
elements, four of which—carbon (C), hydrogen (H), nitrogen (N), and oxygen 
(O)—make up 96.5% of an organism’s weight (Figure 2-1). The atoms of these ele- 
ments are linked together by covalent bonds to form molecules (see Panel 2-1, pp. 
90-91). Because covalent bonds are typically 100 times stronger than the thermal 
energies within a cell, they resist being pulled apart by thermal motions, and they 
are normally broken only during specific chemical reactions with other atoms and 
molecules. Two different molecules can be held together by noncovalent bonds, 


Figure 2-1 The main elements in cells, 
highlighted in the periodic table. When 
ordered by their atomic number and 
arranged in this manner, elements fall 
into vertical columns that show similar 
properties. Atoms in the same vertical 
column must gain (or lose) the same 
number of electrons to attain a filled outer 
shell, and they thus behave similarly in 
bond or ion formation. Thus, for example, 
Mg and Ca tend to give away the two 
electrons in their outer shells. C, N, and O 
occur in the same horizontal row, and tend 
to complete their second shells by sharing 
electrons. 

The four elements highlighted in red 
constitute 99% of the total number of 
atoms present in the human body. An 
additional seven elements, highlighted in 
blue, together represent about 0.9% of 
the total. The elements shown in green are 
required in trace amounts by humans. It 
remains unclear whether those elements 
shown in yellow are essential in humans. 
The chemistry of life, it seems, is therefore 
predominantly the chemistry of lighter 
elements. The atomic weights shown here 
are those of the most common isotope of 
each element. 




[Figure 2-2 diagram: energy content in kJ/mole on a logarithmic scale from 1 to 10,000, marking average thermal motions, noncovalent bond breakage in water, ATP hydrolysis in cell, green light, C-C bond breakage, and complete glucose oxidation.] 


which are much weaker (Figure 2-2). We shall see later that noncovalent bonds 
are important in the many situations where molecules have to associate and dis- 
sociate readily to carry out their biological functions. 


Water Is Held Together by Hydrogen Bonds 


The reactions inside a cell occur in an aqueous environment. Life on Earth began 
in the ocean, and the conditions in that primeval environment put a permanent 
stamp on the chemistry of living things. Life therefore hinges on the chemical 
properties of water, which are reviewed in Panel 2-2, pp. 92-93. 

In each water molecule (H2O) the two H atoms are linked to the O atom by 
covalent bonds. The two bonds are highly polar because the O is strongly attrac- 
tive for electrons, whereas the H is only weakly attractive. Consequently, there is 
an unequal distribution of electrons in a water molecule, with a preponderance 
of positive charge on the two H atoms and of negative charge on the O. When 
a positively charged region of one water molecule (that is, one of its H atoms) 
approaches a negatively charged region (that is, the O) of a second water mole- 
cule, the electrical attraction between them can result in a hydrogen bond. These 
bonds are much weaker than covalent bonds and are easily broken by the ran- 
dom thermal motions that reflect the heat energy of the molecules. Thus, each 
bond lasts only a short time. But the combined effect of many weak bonds can be 
profound. For example, each water molecule can form hydrogen bonds through 
its two H atoms to two other water molecules, producing a network in which 
hydrogen bonds are being continually broken and formed. It is only because of 
the hydrogen bonds that link water molecules together that water is a liquid at 
room temperature—with a high boiling point and high surface tension—rather 
than a gas. 

Molecules, such as alcohols, that contain polar bonds and that can form 
hydrogen bonds with water dissolve readily in water. Molecules carrying charges 
(ions) likewise interact favorably with water. Such molecules are termed hydro- 
philic, meaning that they are water-loving. Many of the molecules in the aqueous 
environment of a cell necessarily fall into this category, including sugars, DNA, 
RNA, and most proteins. Hydrophobic (water-hating) molecules, by contrast, are 
uncharged and form few or no hydrogen bonds, and so do not dissolve in water. 
Hydrocarbons are an important example. In these molecules all of the H atoms are 
covalently linked to C atoms by a largely nonpolar bond; thus they cannot form 
effective hydrogen bonds to other molecules (see Panel 2-1, p. 90). This makes the 
hydrocarbon as a whole hydrophobic—a property that is exploited in cells, whose 
membranes are constructed from molecules that have long hydrocarbon tails, as 
we see in Chapter 10. 


Four Types of Noncovalent Attractions Help Bring Molecules 
Together in Cells 


Much of biology depends on the specific binding of different molecules caused by 
three types of noncovalent bonds: electrostatic attractions (ionic bonds), hydro- 
gen bonds, and van der Waals attractions; and on a fourth factor that can push 
molecules together: the hydrophobic force. The properties of the four types of 
noncovalent attractions are presented in Panel 2-3 (pp. 94-95). Although each 


Figure 2-2 Some energies important 
for cells. A crucial property of any bond— 
covalent or noncovalent—is its strength. 
Bond strength is measured by the amount 
of energy that must be supplied to break 
it, expressed in units of either kilojoules 
per mole (kJ/mole) or kilocalories per mole 
(kcal/mole). Thus if 100 kJ of energy must 
be supplied to break 6 × 10^23 bonds of 
a specific type (that is, 1 mole of these 
bonds), then the strength of that bond is 
100 kJ/mole. Note that, in this diagram, 
energies are compared on a logarithmic 
scale. Typical strengths and lengths of the 
main classes of chemical bonds are given 
in Table 2-1. 

One joule (J) is the amount of energy 
required to move an object a distance of 
one meter against a force of one Newton. 
This measure of energy is derived from the 
SI units (Système International d’Unités) 
universally employed by physical scientists. 
A second unit of energy, often used by 
cell biologists, is the kilocalorie (kcal); one 
calorie is the amount of energy needed to 
raise the temperature of 1 gram of water by 
1°C. One kJ is equal to 0.239 kcal (1 kcal 
= 4.18 kJ). 
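
The unit relationships in this legend are easy to check numerically. The short Python sketch below is an illustration only: the function names are hypothetical, the conversion factors are the ones quoted above, and the value used for the number of bonds in a mole (6.022 × 10^23, which the legend rounds to 6 × 10^23) is the standard Avogadro constant rather than a figure from this chapter.

```python
AVOGADRO = 6.022e23   # bonds per mole (assumed standard value)
KCAL_PER_KJ = 0.239   # conversion factor quoted in the legend
KJ_PER_KCAL = 4.18    # conversion factor quoted in the legend


def kj_to_kcal(kj):
    """Convert an energy in kJ (or kJ/mole) to kcal (or kcal/mole)."""
    return kj * KCAL_PER_KJ


def kcal_to_kj(kcal):
    """Convert an energy in kcal (or kcal/mole) to kJ (or kJ/mole)."""
    return kcal * KJ_PER_KCAL


def joules_per_bond(kj_per_mole):
    """Energy needed to break one single bond, in joules."""
    return kj_per_mole * 1000.0 / AVOGADRO


if __name__ == "__main__":
    strength = 100.0  # kJ/mole, the example used in the legend
    print(kj_to_kcal(strength), "kcal/mole")
    print(joules_per_bond(strength), "J per bond")
```

Dividing a molar bond strength by the number of bonds in a mole shows how small the energy of a single bond is on an everyday scale, which is why bond strengths are normally quoted per mole.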
Figure 2-3 Schematic indicating how two macromolecules with 
complementary surfaces can bind tightly to one another through 
noncovalent interactions. Noncovalent chemical bonds have less than 
1/20 the strength of a covalent bond. They are able to produce tight 
binding only when many of them are formed simultaneously. Although only 
electrostatic attractions are illustrated here, in reality all four noncovalent 
forces often contribute to holding two macromolecules together (Movie 2.1). 


individual noncovalent attraction would be much too weak to be effective in the 
face of thermal motions, their energies can sum to create a strong force between 
two separate molecules. Thus sets of noncovalent attractions often allow the com- 
plementary surfaces of two macromolecules to hold those two macromolecules 
together (Figure 2-3). 

Table 2-1 compares noncovalent bond strengths to that of a typical covalent 
bond, both in the presence and in the absence of water. Note that, by forming 
competing interactions with the involved molecules, water greatly reduces the 
strength of both electrostatic attractions and hydrogen bonds. 

The structure of a typical hydrogen bond is illustrated in Figure 2-4. This bond 
represents a special form of polar interaction in which an electropositive hydro- 
gen atom is shared by two electronegative atoms. Its hydrogen can be viewed as a 
proton that has partially dissociated from a donor atom, allowing it to be shared 
by a second acceptor atom. Unlike a typical electrostatic interaction, this bond is 
highly directional—being strongest when a straight line can be drawn between all 
three of the involved atoms. 

The fourth effect that often brings molecules together in water is not, strictly 
speaking, a bond at all. However, a very important hydrophobic force is caused by 
a pushing of nonpolar surfaces out of the hydrogen-bonded water network, where 
they would otherwise physically interfere with the highly favorable interactions 
between water molecules. Bringing any two nonpolar surfaces together reduces 
their contact with water; in this sense, the force is nonspecific. Nevertheless, we 
shall see in Chapter 3 that hydrophobic forces are central to the proper folding of 
protein molecules. 


Some Polar Molecules Form Acids and Bases in Water 


One of the simplest kinds of chemical reaction, and one that has profound signif- 
icance in cells, takes place when a molecule containing a highly polar covalent 
bond between a hydrogen and another atom dissolves in water. The hydrogen 
atom in such a molecule has given up its electron almost entirely to the companion 
atom, and so exists as an almost naked positively charged hydrogen nucleus—in 


TABLE 2-1 

                                      STRENGTH IN kJ/mole** 
BOND TYPE                             in vacuum      in water 
covalent                              …              377 (90) 
noncovalent: ionic*                   …              12.6 (3) 
             hydrogen                 16.7 (4)       4.2 (1) 
             van der Waals 
             attraction (per atom)    0.4 (0.1)      0.4 (0.1) 

*An ionic bond is an electrostatic attraction between two fully charged atoms. **Values in 
parentheses are kcal/mole. 1 kJ = 0.239 kcal and 1 kcal = 4.18 kJ. 
Figure 2-4 Hydrogen bonds. (A) Ball-and-stick 
model of a typical hydrogen bond. 
The distance between the hydrogen and 
the oxygen atom here is less than the sum 
of their van der Waals radii, indicating a 
partial sharing of electrons. (B) The most 
common hydrogen bonds in cells. 
other words, a proton (H+). When the polar molecule becomes surrounded by 
water molecules, the proton will be attracted to the partial negative charge on the 
O atom of an adjacent water molecule. This proton can easily dissociate from its 
original partner and associate instead with the oxygen atom of the water mole- 
cule, generating a hydronium ion (H3O+) (Figure 2-5A). The reverse reaction also 
takes place very readily, so in the aqueous solution protons are constantly flitting 
to and fro between one molecule and another. 

Substances that release protons when they dissolve in water, thus forming 
H3O+, are termed acids. The higher the concentration of H3O+, the more acidic 
the solution. H3O+ is present even in pure water, at a concentration of 10^-7 M, as 
a result of the movement of protons from one water molecule to another (Figure 
2-5B). By convention, the H3O+ concentration is usually referred to as the H+ con- 
centration, even though most protons in an aqueous solution are present as H3O+. 
To avoid the use of unwieldy numbers, the concentration of H3O+ is expressed 
using a logarithmic scale called the pH scale. Pure water has a pH of 7.0 and is said 
to be neutral—that is, neither acidic (pH <7) nor basic (pH >7). 
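
The logarithmic scale just described can be written out explicitly. The Python sketch below assumes the standard definition pH = −log10 of the H3O+ concentration in moles per liter, which the text implies but does not spell out; the function names are hypothetical and chosen only for illustration.

```python
import math


def ph_from_concentration(h3o_molar):
    """Return the pH for a given H3O+ concentration in mol/liter."""
    return -math.log10(h3o_molar)


def concentration_from_ph(ph):
    """Return the H3O+ concentration in mol/liter for a given pH."""
    return 10.0 ** (-ph)


def classify(ph, tolerance=1e-9):
    """Label a solution as acidic, neutral, or basic from its pH."""
    if abs(ph - 7.0) <= tolerance:
        return "neutral"
    return "acidic" if ph < 7.0 else "basic"


if __name__ == "__main__":
    for conc in (1e-3, 1e-7, 1e-10):
        ph = ph_from_concentration(conc)
        print(conc, "M ->", "pH", round(ph, 2), classify(ph))
```

Each step of one pH unit corresponds to a tenfold change in H3O+ concentration, which is why the scale avoids unwieldy numbers.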

Acids are characterized as being strong or weak, depending on how readily 
they give up their protons to water. Strong acids, such as hydrochloric acid (HCl), 
lose their protons quickly. Acetic acid, on the other hand, is a weak acid because 
it holds on to its proton more tightly when dissolved in water. Many of the acids 
important in the cell—such as molecules containing a carboxyl (COOH) group— 
are weak acids (see Panel 2-2, pp. 92-93). 

Because the proton of a hydronium ion can be passed readily to many types of 
molecules in cells, altering their character, the concentration of H3O+ inside a cell 
(the acidity) must be closely regulated. Acids—especially weak acids—will give up 
their protons more readily if the concentration of H3O+ in solution is low and will 
tend to receive them back if the concentration in solution is high. 
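
How strongly a weak acid responds to the surrounding acidity can be estimated with the Henderson-Hasselbalch relationship, which is standard chemistry but is not derived in this chapter. The Python sketch below is an illustration only: it assumes a simple one-proton acid in dilute solution, and the pKa of 4.76 used for acetic acid is a commonly quoted textbook value, not a number given in the text above.

```python
ACETIC_ACID_PKA = 4.76  # assumed standard value for acetic acid


def fraction_dissociated(ph, pka):
    """Fraction of a one-proton weak acid that has given up its proton.

    Follows from pH = pKa + log10([A-]/[HA]).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    for ph in (2.0, 4.76, 7.0):
        frac = fraction_dissociated(ph, ACETIC_ACID_PKA)
        print(f"pH {ph}: {frac:.4f} of acetic acid is present as acetate")
```

At low pH (high H3O+ concentration) the acid largely keeps its proton, when the pH equals the pKa half of the molecules have lost it, and near pH 7 almost all of the acetic acid is present as acetate.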

The opposite of an acid is a base. Any molecule capable of accepting a proton 
from a water molecule is called a base. Sodium hydroxide (NaOH) is basic (the 
term alkaline is also used) because it dissociates readily in aqueous solution to 
form Na+ ions and OH- ions. Because of this property, NaOH is called a strong 
base. More important in living cells, however, are the weak bases—those that 
have a weak tendency to reversibly accept a proton from water. Many biologically 
important molecules contain an amino (NH2) group. This group is a weak base 
that can generate OH- by taking a proton from water: -NH2 + H2O → -NH3+ + OH- 
(see Panel 2-2, pp. 92-93). 

Because an OH- ion combines with a H3O+ ion to form two water molecules, 
an increase in the OH- concentration forces a decrease in the concentration of 
H3O+, and vice versa. A pure solution of water contains an equal concentration 
(10^-7 M) of both ions, rendering it neutral. The interior of a cell is also kept close 
to neutrality by the presence of buffers: weak acids and bases that can release or 
take up protons near pH 7, keeping the environment of the cell relatively constant 
under a variety of conditions. 
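
The reciprocal relationship between the two ions can be put in numbers. The Python sketch below assumes that the product of the H3O+ and OH- concentrations stays fixed at 10^-14 (the value implied by 10^-7 M of each ion in pure water, and the usual figure for water at about 25°C); the constant and function names are hypothetical.

```python
WATER_ION_PRODUCT = 1e-14  # [H3O+] x [OH-] in (mol/liter)^2, assumed constant


def hydroxide_concentration(h3o_molar):
    """Return the OH- concentration that accompanies a given H3O+ concentration."""
    return WATER_ION_PRODUCT / h3o_molar


if __name__ == "__main__":
    for h3o in (1e-3, 1e-7, 1e-11):
        print(f"[H3O+] = {h3o:.0e} M  ->  [OH-] = {hydroxide_concentration(h3o):.0e} M")
```

Raising the concentration of one ion by a given factor lowers the other by the same factor, and the two are equal only at neutrality.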


Figure 2-5 Protons readily move in 
aqueous solutions. (A) The reaction that 
takes place when a molecule of acetic 
acid dissolves in water. At pH 7, nearly all 
of the acetic acid is present as acetate 
ion. (B) Water molecules are continuously 
exchanging protons with each other to 
form hydronium and hydroxyl ions. These 
ions in turn rapidly recombine to form water 
molecules. 
A Cell Is Formed from Carbon Compounds 


Having reviewed the ways atoms combine into molecules and how these mole- 
cules behave in an aqueous environment, we now examine the main classes of 
small molecules found in cells. We shall see that a few categories of molecules, 
formed from a handful of different elements, give rise to all the extraordinary rich- 
ness of form and behavior shown by living things. 

If we disregard water and inorganic ions such as potassium, nearly all the 
molecules in a cell are based on carbon. Carbon is outstanding among all the 
elements in its ability to form large molecules; silicon is a poor second. Because 
carbon is small and has four electrons and four vacancies in its outermost shell, a 
carbon atom can form four covalent bonds with other atoms. Most important, one 
carbon atom can join to other carbon atoms through highly stable covalent C-C 
bonds to form chains and rings and hence generate large and complex molecules 
with no obvious upper limit to their size. The carbon compounds made by cells 
are called organic molecules. In contrast, all other molecules, including water, are 
said to be inorganic. 

Certain combinations of atoms, such as the methyl (-CH3), hydroxyl (-OH), 
carboxyl (-COOH), carbonyl (-C=O), phosphate (-PO3^2-), sulfhydryl (-SH), and 
amino (-NH2) groups, occur repeatedly in the molecules made by cells. Each such 
chemical group has distinct chemical and physical properties that influence the 
behavior of the molecule in which the group occurs. The most common chemical 
groups and some of their properties are summarized in Panel 2-1, pp. 90-91. 


Cells Contain Four Major Families of Small Organic Molecules 


The small organic molecules of the cell are carbon-based compounds that have 
molecular weights in the range of 100-1000 and contain up to 30 or so carbon 
atoms. They are usually found free in solution and have many different fates. Some 
are used as monomer subunits to construct giant polymeric macromolecules— 
proteins, nucleic acids, and large polysaccharides. Others act as energy sources 
and are broken down and transformed into other small molecules in a maze of 
intracellular metabolic pathways. Many small molecules have more than one role 
in the cell—for example, acting both as a potential subunit for a macromolecule 
and as an energy source. Small organic molecules are much less abundant than 
the organic macromolecules, accounting for only about one-tenth of the total 
mass of organic matter in a cell. As a rough guess, there may be a thousand differ- 
ent kinds of these small molecules in a typical cell. 

All organic molecules are synthesized from and are broken down into the 
same set of simple compounds. As a consequence, the compounds in a cell are 
chemically related and most can be classified into a few distinct families. Broadly 
speaking, cells contain four major families of small organic molecules: the sugars, 
the fatty acids, the nucleotides, and the amino acids (Figure 2-6). Although many 
compounds present in cells do not fit into these categories, these four families 
of small organic molecules, together with the macromolecules made by linking 
them into long chains, account for a large fraction of the cell mass. 

Amino acids and the proteins that they form will be the subject of Chapter 
3. A summary of the structures and properties of the remaining three families 
(sugars, fatty acids, and nucleotides) is presented in Panels 2-4, 2-5, and 2-6, 
respectively (see pages 96-101). 


The Chemistry of Cells Is Dominated by Macromolecules with 
Remarkable Properties 


By weight, macromolecules are the most abundant carbon-containing molecules 
in a living cell (Figure 2-7). They are the principal building blocks from which 
a cell is constructed and also the components that confer the most distinctive 
properties of living things. The macromolecules in cells are polymers that are 
constructed by covalently linking small organic molecules (called monomers) into 
Figure 2-6 The four main families of small organic molecules in 
cells. These small molecules form the monomeric building blocks, or 
subunits, for most of the macromolecules and other assemblies of the 
cell. Some, such as the sugars and the fatty acids, are also energy 
sources. Their structures are outlined here and shown in more detail in 
the Panels at the end of this chapter and in Chapter 3. 
long chains (Figure 2-8). They have remarkable properties that could not have 
been predicted from their simple constituents. 

Proteins are abundant and spectacularly versatile, performing thousands of
distinct functions in cells. Many proteins serve as enzymes, the catalysts that
facilitate the many covalent bond-making and bond-breaking reactions that the cell
needs. Enzymes catalyze all of the reactions whereby cells extract energy from
food molecules, for example, and an enzyme called ribulose bisphosphate
carboxylase helps to convert CO2 to sugars in photosynthetic organisms, producing
most of the organic matter needed for life on Earth. Other proteins are used to
build structural components, such as tubulin, a protein that self-assembles to
make the cell’s long microtubules, or histones, proteins that compact the DNA in
chromosomes. Yet other proteins act as molecular motors to produce force and
movement, as for myosin in muscle. Proteins perform many other functions, and
we shall examine the molecular basis for many of them later in this book.

Figure 2-7 The distribution of molecules in cells. The approximate composition of a bacterial cell
is shown by weight. The composition of an animal cell is similar, even though its volume is roughly
1000 times greater. Note that macromolecules dominate. The major inorganic ions include Na+, K+,
Mg2+, Ca2+, and Cl-.

Although the chemical reactions for adding subunits to each polymer are dif- 
ferent in detail for proteins, nucleic acids, and polysaccharides, they share import- 
ant features. Each polymer grows by the addition of a monomer onto the end of a 
growing chain in a condensation reaction, in which one molecule of water is lost 
with each subunit added (Figure 2-9). The stepwise polymerization of monomers 
into a long chain is a simple way to manufacture a large, complex molecule, since 
the subunits are added by the same reaction performed over and over again by the 
same set of enzymes. Apart from some of the polysaccharides, most macromol- 
ecules are made from a limited set of monomers that are slightly different from 
one another—for example, the 20 different amino acids from which proteins are 
made. It is critical to life that the polymer chain is not assembled at random from 
these subunits; instead the subunits are added in a precise order, or sequence. The 
elaborate mechanisms that allow enzymes to accomplish this task are described 
in detail in Chapters 5 and 6. 
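
The bookkeeping behind condensation polymerization can be sketched in a few lines of Python. This is an illustrative sketch rather than anything taken from the text: the function name `polymer_mass` and the approximate masses used (about 18.0 daltons for water, about 75.1 daltons for the amino acid glycine) are assumptions chosen for the example. Joining n monomers into a linear chain forms n - 1 bonds, so n - 1 water molecules are lost.

```python
WATER_MASS_DA = 18.0  # approximate mass of one H2O molecule, in daltons (assumed value)

def polymer_mass(monomer_masses):
    """Mass of a linear polymer built by condensation.

    Each bond formed between adjacent subunits releases one water
    molecule, so a chain of n monomers loses (n - 1) waters.
    """
    n = len(monomer_masses)
    if n == 0:
        return 0.0
    return sum(monomer_masses) - (n - 1) * WATER_MASS_DA

# Hypothetical example: a chain of three glycine subunits (~75.1 Da each).
triglycine = polymer_mass([75.1, 75.1, 75.1])
print(f"{triglycine:.1f} Da")
```

Hydrolysis is the same calculation run in reverse: breaking each bond adds one water molecule back.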


Noncovalent Bonds Specify Both the Precise Shape of a 
Macromolecule and Its Binding to Other Molecules 


Most of the covalent bonds in a macromolecule allow rotation of the atoms they 
join, giving the polymer chain great flexibility. In principle, this allows a macro- 
molecule to adopt an almost unlimited number of shapes, or conformations, as 
random thermal energy causes the polymer chain to writhe and rotate. However, 
the shapes of most biological macromolecules are highly constrained because of 
the many weak noncovalent bonds that form between different parts of the same 
molecule. If these noncovalent bonds are formed in sufficient numbers, the poly- 
mer chain can strongly prefer one particular conformation, determined by the 
linear sequence of monomers in its chain. Most protein molecules and many of 
the small RNA molecules found in cells fold tightly into a highly preferred confor- 
mation in this way (Figure 2-10). 

The four types of noncovalent interactions important in biological molecules 
were presented earlier, and they are discussed further in Panel 2-3 (pp. 94-95). In 
addition to folding biological macromolecules into unique shapes, they can also 
add up to create a strong attraction between two different molecules (see Figure 
2-3). This form of molecular interaction provides for great specificity, inasmuch 
as the close multipoint contacts required for strong binding make it possible for a 
macromolecule to select out—through binding—just one of the many thousands 
of other types of molecules present inside a cell. Moreover, because the strength of 
the binding depends on the number of noncovalent bonds that are formed, inter- 
actions of almost any affinity are possible—allowing rapid dissociation where 
appropriate. 

As we discuss next, binding of this type underlies all biological catalysis,
making it possible for proteins to function as enzymes. In addition, noncovalent
interactions allow macromolecules to be used as building blocks for the formation of
larger structures, thereby forming intricate machines with multiple moving parts
that perform such complex tasks as DNA replication and protein synthesis
(Figure 2-11).

Figure 2-8 Three families of macromolecules. Each is a polymer
formed from small molecules (called monomers) linked together by covalent
bonds.

Figure 2-9 Condensation and hydrolysis as opposite reactions. The macromolecules of the cell
are polymers that are formed from subunits (or monomers) by a condensation reaction, and they
are broken down by hydrolysis. The condensation reactions are all energetically unfavorable; thus
polymer formation requires an energy input, as will be described in the text.

Figure 2-10 The folding of proteins and RNA molecules into a particularly
stable three-dimensional shape, or conformation. If the noncovalent bonds
maintaining the stable conformation are disrupted, the molecule becomes a flexible
chain that loses its biological activity.





Summary 


Living organisms are autonomous, self-propagating chemical systems. They are 
formed from a distinctive and restricted set of small carbon-based molecules that 
are essentially the same for every living species. Each of these small molecules is 
composed of a small set of atoms linked to each other in a precise configuration 
through covalent bonds. The main categories are sugars, fatty acids, amino acids, 
and nucleotides. Sugars are a primary source of chemical energy for cells and can be 
incorporated into polysaccharides for energy storage. Fatty acids are also import- 
ant for energy storage, but their most critical function is in the formation of cell 
membranes. Long chains of amino acids form the remarkably diverse and versatile 
macromolecules known as proteins. Nucleotides play a central part in energy trans- 
fer, while also serving as the subunits for the informational macromolecules, RNA 
and DNA. 

Most of the dry mass of a cell consists of macromolecules that have been
produced as linear polymers of amino acids (proteins) or nucleotides (DNA and RNA),
covalently linked to each other in an exact order. Most of the protein molecules
and many of the RNAs fold into a unique conformation that is determined by their
sequence of subunits. This folding process creates unique surfaces, and it depends
on a large set of weak attractions produced by noncovalent forces between atoms.
These forces are of four types: electrostatic attractions, hydrogen bonds, van der
Waals attractions, and an interaction between nonpolar groups caused by their
hydrophobic expulsion from water. The same set of weak forces governs the specific
binding of other molecules to macromolecules, making possible the myriad
associations between biological molecules that produce the structure and the chemistry of
a cell.

Figure 2-11 Small molecules become covalently linked to form macromolecules, which in turn assemble through noncovalent interactions
to form large complexes. Small molecules, proteins, and a ribosome are drawn approximately to scale. Ribosomes are a central part of the
machinery that the cell uses to make proteins: each ribosome is formed as a complex of about 90 macromolecules (protein and RNA molecules).


CATALYSIS AND THE USE OF ENERGY BY CELLS 


One property of living things above all makes them seem almost miraculously dif- 
ferent from nonliving matter: they create and maintain order, in a universe that is 
tending always to greater disorder (Figure 2-12). To create this order, the cells in 
a living organism must perform a never-ending stream of chemical reactions. In 
some of these reactions, small organic molecules—amino acids, sugars, nucle- 
otides, and lipids—are being taken apart or modified to supply the many other 
small molecules that the cell requires. In other reactions, small molecules are 
being used to construct an enormously diverse range of proteins, nucleic acids, 
and other macromolecules that endow living systems with all of their most dis- 
tinctive properties. Each cell can be viewed as a tiny chemical factory, performing 
many millions of reactions every second. 


Cell Metabolism Is Organized by Enzymes 


The chemical reactions that a cell carries out would normally occur only at much 
higher temperatures than those existing inside cells. For this reason, each reac- 
tion requires a specific boost in chemical reactivity. This requirement is crucial, 
because it allows the cell to control its chemistry. The control is exerted through 
specialized biological catalysts. These are almost always proteins called enzymes, 
although RNA catalysts also exist, called ribozymes. Each enzyme accelerates, or 
catalyzes, just one of the many possible kinds of reactions that a particular mol- 
ecule might undergo. Enzyme-catalyzed reactions are connected in series, so 
that the product of one reaction becomes the starting material, or substrate, for 
the next (Figure 2-13). Long linear reaction pathways are in turn linked to one 
another, forming a maze of interconnected reactions that enable the cell to sur- 
vive, grow, and reproduce. 

Two opposing streams of chemical reactions occur in cells: (1) the catabolic
pathways break down foodstuffs into smaller molecules, thereby generating both
a useful form of energy for the cell and some of the small molecules that the cell
needs as building blocks, and (2) the anabolic, or biosynthetic, pathways use the
small molecules and the energy harnessed by catabolism to drive the synthesis of
the many other molecules that form the cell. Together these two sets of reactions
constitute the metabolism of the cell (Figure 2-14).

Figure 2-12 Biological structures are highly ordered. Well-defined, ornate, and beautiful spatial patterns can be found at every level of
organization in living organisms. In order of increasing size: (A) protein molecules in the coat of a virus (a parasite that, although not technically alive,
contains the same types of molecules as those found in living cells); (B) the regular array of microtubules seen in a cross section of a sperm tail;
(C) surface contours of a pollen grain (a single cell); (D) cross section of a fern stem, showing the patterned arrangement of cells; and (E) a spiral
arrangement of leaves in a succulent plant. (A, courtesy of Robert Grant, Stéphane Crainic, and James M. Hogle; B, courtesy of Lewis Tilney;
C, courtesy of Colin MacFarlane and Chris Jeffree; D, courtesy of Jim Haseloff.)

Figure 2-13 How a set of enzyme-catalyzed reactions generates a metabolic pathway. Each enzyme catalyzes a particular
chemical reaction, leaving the enzyme unchanged. In this example, a set of enzymes acting in series converts molecule A to
molecule F, forming a metabolic pathway. (For a diagram of many of the reactions in a human cell, abbreviated as shown, see
Figure 2-63.)

The details of cell metabolism form the traditional subject of biochemistry and 
most of them need not concern us here. But the general principles by which cells 
obtain energy from their environment and use it to create order are central to cell 
biology. We begin with a discussion of why a constant input of energy is needed 
to sustain all living things. 


Biological Order Is Made Possible by the Release of Heat Energy 
from Cells 


The universal tendency of things to become disordered is a fundamental law of 
physics—the second law of thermodynamics—which states that in the universe, or 
in any isolated system (a collection of matter that is completely isolated from the 
rest of the universe), the degree of disorder always increases. This law has such 
profound implications for life that we will restate it in several ways. 

For example, we can present the second law in terms of probability by stating 
that systems will change spontaneously toward those arrangements that have the 
greatest probability. If we consider a box of 100 coins all lying heads up, a series 
of accidents that disturbs the box will tend to move the arrangement toward a 
mixture of 50 heads and 50 tails. The reason is simple: there is a huge number 
of possible arrangements of the individual coins in the mixture that can achieve 
the 50-50 result, but only one possible arrangement that keeps all of the coins 
oriented heads up. Because the 50-50 mixture is therefore the most probable, we 
say that it is more “disordered.” For the same reason, it is a common experience 
that one’s living space will become increasingly disordered without intentional 
effort: the movement toward disorder is a spontaneous process, requiring a peri- 
odic effort to reverse it (Figure 2-15). 
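
The counting argument behind the coin example can be checked directly. The short Python sketch below is an illustration added here, not part of the original discussion; the function name `arrangements` is hypothetical. It counts how many distinct arrangements of 100 coins give a particular number of heads, using the standard binomial coefficient.

```python
from math import comb

def arrangements(n_coins, n_heads):
    """Number of distinct ways to have exactly n_heads heads among n_coins."""
    return comb(n_coins, n_heads)

all_heads = arrangements(100, 100)   # only one way to keep every coin heads up
half_heads = arrangements(100, 50)   # on the order of 1e29 ways to get 50-50

print(all_heads)
print(f"{half_heads:.3e}")
```

The enormous gap between the two counts is what makes the mixed state so much more probable than the all-heads state once the coins are disturbed at random.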

The amount of disorder in a system can be quantified and expressed as the 
entropy of the system: the greater the disorder, the greater the entropy. Thus, 
another way to express the second law of thermodynamics is to say that systems 
will change spontaneously toward arrangements with greater entropy. 

Living cells—by surviving, growing, and forming complex organisms—are 
generating order and thus might appear to defy the second law of thermodynam- 
ics. How is this possible? The answer is that a cell is not an isolated system: it takes 
in energy from its environment in the form of food, or as photons from the sun (or 
even, as in some chemosynthetic bacteria, from inorganic molecules alone). It 
then uses this energy to generate order within itself. In the course of the chemical 
reactions that generate order, the cell converts part of the energy it uses into heat. 
The heat is discharged into the cell’s environment and disorders the surroundings. 
As a result, the total entropy—that of the cell plus its surroundings—increases, as 
demanded by the second law of thermodynamics. 

To understand the principles governing these energy conversions, think
of a cell surrounded by a sea of matter representing the rest of the universe. As
the cell lives and grows, it creates internal order. But it constantly releases heat
energy as it synthesizes molecules and assembles them into cell structures. Heat
is energy in its most disordered form: the random jostling of molecules. When
the cell releases heat to the sea, it increases the intensity of molecular motions
there (thermal motion), thereby increasing the randomness, or disorder, of the
sea. The second law of thermodynamics is satisfied because the increase in the
amount of order inside the cell is always more than compensated for by an even
greater decrease in order (increase in entropy) in the surrounding sea of matter
(Figure 2-16).

Figure 2-14 Schematic representation of
the relationship between catabolic and
anabolic pathways in metabolism. As
suggested in this diagram, a major portion
of the energy stored in the chemical bonds
of food molecules is dissipated as heat. In
addition, the mass of food required by any
organism that derives all of its energy from
catabolism is much greater than the mass
of the molecules that it can produce by
anabolism.

Where does the heat that the cell releases come from? Here we encoun- 
ter another important law of thermodynamics. The first law of thermodynamics 
states that energy can be converted from one form to another, but that it cannot 
be created or destroyed. Figure 2-17 illustrates some interconversions between 
different forms of energy. The amount of energy in different forms will change 
as a result of the chemical reactions inside the cell, but the first law tells us that 
the total amount of energy must always be the same. For example, an animal cell 
takes in foodstuffs and converts some of the energy present in the chemical bonds 
between the atoms of these food molecules (chemical-bond energy) into the ran- 
dom thermal motion of molecules (heat energy). 

The cell cannot derive any benefit from the heat energy it releases unless the
heat-generating reactions inside the cell are directly linked to the processes that
generate molecular order. It is the tight coupling of heat production to an increase
in order that distinguishes the metabolism of a cell from the wasteful burning of
fuel in a fire. Later, we illustrate how this coupling occurs. For now, it is sufficient
to recognize that a direct linkage of the “controlled burning” of food molecules to
the generation of biological order is required for cells to create and maintain an
island of order in a universe tending toward chaos.

Figure 2-15 An everyday illustration of
the spontaneous drive toward disorder.
Reversing this tendency toward disorder
requires an intentional effort and an input of
energy: it is not spontaneous. In fact, from
the second law of thermodynamics, we
can be certain that the human intervention
required will release enough heat to the
environment to more than compensate for
the reordering of the items in this room.

Figure 2-16 A simple thermodynamic
analysis of a living cell. In the diagram on
the left, the molecules of both the cell and
the rest of the universe (the sea of matter)
are depicted in a relatively disordered
state. In the diagram on the right, the cell
has taken in energy from food molecules
and released heat through reactions that
order the molecules the cell contains. The
heat released increases the disorder in the
environment around the cell (depicted by
jagged arrows and distorted molecules,
indicating increased molecular motions
caused by heat). As a result, the second
law of thermodynamics, which states
that the amount of disorder in the universe
must always increase, is satisfied as
the cell grows and divides. For a detailed
discussion, see Panel 2-7 (pp. 102-103).


Cells Obtain Energy by the Oxidation of Organic Molecules 


All animal and plant cells are powered by energy stored in the chemical bonds 
of organic molecules, whether they are sugars that a plant has photosynthesized 
as food for itself or the mixture of large and small molecules that an animal has 
eaten. Organisms must extract this energy in usable form to live, grow, and repro- 
duce. In both plants and animals, energy is extracted from food molecules by a 
process of gradual oxidation, or controlled burning. 

The Earth’s atmosphere contains a great deal of oxygen, and in the presence of
oxygen the most energetically stable form of carbon is CO2 and that of hydrogen
is H2O. A cell is therefore able to obtain energy from sugars or other organic
molecules by allowing their carbon and hydrogen atoms to combine with oxygen to
produce CO2 and H2O, respectively, a process called aerobic respiration.

Figure 2-17 Some interconversions
between different forms of energy.
All energy forms are, in principle,
interconvertible. In all these processes the
total amount of energy is conserved. Thus,
for example, from the height and weight
of the brick in (1), we can predict exactly
how much heat will be released when it hits
the floor. In (2), note that the large amount
of chemical-bond energy released when
water is formed is initially converted to
very rapid thermal motions in the two new
water molecules; but collisions with other
molecules almost instantaneously spread
this kinetic energy evenly throughout the
surroundings (heat transfer), making the
new molecules indistinguishable from all
the rest.

Photosynthesis (discussed in detail in Chapter 14) and respiration are com- 
plementary processes (Figure 2-18). This means that the transactions between 
plants and animals are not all one way. Plants, animals, and microorganisms have 
existed together on this planet for so long that many of them have become an 
essential part of the others’ environments. The oxygen released by photosynthe- 
sis is consumed in the combustion of organic molecules during aerobic respira- 
tion. And some of the CO2 molecules that are fixed today into organic molecules 
by photosynthesis in a green leaf were yesterday released into the atmosphere 
by the respiration of an animal—or by the respiration of a fungus or bacterium 
decomposing dead organic matter. We therefore see that carbon utilization forms 
a huge cycle that involves the biosphere (all of the living organisms on Earth) as a 
whole (Figure 2-19). Similarly, atoms of nitrogen, phosphorus, and sulfur move 
between the living and nonliving worlds in cycles that involve plants, animals, 
fungi, and bacteria. 


Oxidation and Reduction Involve Electron Transfers 


The cell does not oxidize organic molecules in one step, as occurs when organic 
material is burned in a fire. Through the use of enzyme catalysts, metabolism takes 
these molecules through a large number of reactions that only rarely involve the 
direct addition of oxygen. Before we consider some of these reactions and their 
purpose, we discuss what is meant by the process of oxidation.

Figure 2-18 Photosynthesis and
respiration as complementary processes
in the living world. Photosynthesis
converts the electromagnetic energy in
sunlight into chemical-bond energy in
sugars and other organic molecules.
Plants, algae, and cyanobacteria obtain
the carbon atoms that they need for this
purpose from atmospheric CO2 and the
hydrogen from water, releasing O2 gas
as a by-product. The organic molecules
produced by photosynthesis in turn serve
as food for other organisms. Many of these
organisms carry out aerobic respiration,
a process that uses O2 to form CO2 from
the same carbon atoms that had been
taken up as CO2 and converted into sugars
by photosynthesis. In the process, the
organisms that respire obtain the
chemical-bond energy that they need to survive.

The first cells on the Earth are
thought to have been capable of neither
photosynthesis nor respiration (discussed
in Chapter 14). However, photosynthesis
must have preceded respiration on the
Earth, since there is strong evidence that
billions of years of photosynthesis were
required before O2 had been released in
sufficient quantity to create an atmosphere
rich in this gas. (The Earth’s atmosphere
currently contains 20% O2.)

Figure 2-19 The carbon cycle. Individual carbon atoms are incorporated into organic molecules of
the living world by the photosynthetic activity of bacteria, algae, and plants. They pass to animals,
microorganisms, and organic material in soil and oceans in cyclic paths. CO2 is restored to the
atmosphere when organic molecules are oxidized by cells or burned by humans as fuels.

Oxidation refers to more than the addition of oxygen atoms; the term applies
more generally to any reaction in which electrons are transferred from one atom
to another. Oxidation in this sense refers to the removal of electrons, and
reduction (the converse of oxidation) means the addition of electrons. Thus, Fe2+ is
oxidized if it loses an electron to become Fe3+, and a chlorine atom is reduced
if it gains an electron to become Cl-. Since the number of electrons is conserved
(no loss or gain) in a chemical reaction, oxidation and reduction always occur
simultaneously: that is, if one molecule gains an electron in a reaction
(reduction), a second molecule loses the electron (oxidation). When a sugar molecule is
oxidized to CO2 and H2O, for example, the O2 molecules involved in forming H2O
gain electrons and thus are said to have been reduced.

The terms “oxidation” and “reduction” apply even when there is only a partial 
shift of electrons between atoms linked by a covalent bond (Figure 2-20). When 
a carbon atom becomes covalently bonded to an atom with a strong affinity for 
electrons, such as oxygen, chlorine, or sulfur, for example, it gives up more than 
its equal share of electrons and forms a polar covalent bond. Because the positive 
charge of the carbon nucleus is now somewhat greater than the negative charge of 
its electrons, the atom acquires a partial positive charge and is said to be oxidized. 
Conversely, a carbon atom in a C-H linkage has slightly more than its share of 
electrons, and so it is said to be reduced. 

When a molecule in a cell picks up an electron (e-), it often picks up a proton
(H+) at the same time (protons being freely available in water). The net effect in
this case is to add a hydrogen atom to the molecule.


A + e- + H+ → AH


Even though a proton plus an electron is involved (instead of just an electron), 
such hydrogenation reactions are reductions, and the reverse, dehydrogenation 
reactions are oxidations. It is especially easy to tell whether an organic molecule 
is being oxidized or reduced: reduction is occurring if its number of C-H bonds 
increases, whereas oxidation is occurring if its number of C-H bonds decreases 
(see Figure 2-20B). 
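
The C-H counting rule above can be written as a small lookup. The Python sketch below is illustrative only: the dictionary `CH_BONDS` and the function `classify` are hypothetical names, and the compounds are the one-carbon series from methane to carbon dioxide described for Figure 2-20B.

```python
# Number of C-H bonds on the single carbon atom of each compound
# in the methane-to-carbon-dioxide series.
CH_BONDS = {
    "methane": 4,
    "methanol": 3,
    "formaldehyde": 2,
    "formic acid": 1,
    "carbon dioxide": 0,
}

def classify(reactant, product):
    """Label a conversion by the change in its number of C-H bonds."""
    change = CH_BONDS[product] - CH_BONDS[reactant]
    if change < 0:
        return "oxidation"      # fewer C-H bonds
    if change > 0:
        return "reduction"      # more C-H bonds
    return "no net change"

print(classify("methane", "methanol"))
print(classify("carbon dioxide", "formic acid"))
```

Each step from methane toward carbon dioxide removes one C-H bond and so counts as an oxidation; any step in the opposite direction counts as a reduction.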

Cells use enzymes to catalyze the oxidation of organic molecules in small
steps, through a sequence of reactions that allows useful energy to be harvested.
We now need to explain how enzymes work and some of the constraints under
which they operate.

Figure 2-20 Oxidation and reduction. (A) When two atoms form a polar
covalent bond, the atom ending up with a greater share of electrons is said
to be reduced, while the other atom acquires a lesser share of electrons and
is said to be oxidized. The reduced atom has acquired a partial negative
charge (δ-) as the positive charge on the atomic nucleus is now more than
equaled by the total charge of the electrons surrounding it, and conversely,
the oxidized atom has acquired a partial positive charge (δ+). (B) The single
carbon atom of methane can be converted to that of carbon dioxide by
the successive replacement of its covalently bonded hydrogen atoms
with oxygen atoms. With each step, electrons are shifted away from the
carbon (as indicated by the blue shading), and the carbon atom becomes
progressively more oxidized. Each of these steps is energetically favorable
under the conditions present inside a cell.
[Figure labels: (A) formation of a polar covalent bond, with one atom oxidized (partial positive charge) and the other reduced (partial negative charge); (B) methanol, formaldehyde, formic acid, carbon dioxide.]







Enzymes Lower the Activation-Energy Barriers That Block 
Chemical Reactions 


Consider the reaction

paper + O2 → smoke + ashes + heat + CO2 + H2O


Once ignited, the paper burns readily, releasing to the atmosphere both energy 
as heat and water and carbon dioxide as gases. The reaction is irreversible, since 
the smoke and ashes never spontaneously retrieve these entities from the heated 
atmosphere and reconstitute themselves into paper. When the paper burns, its 
chemical energy is dissipated as heat—not lost from the universe, since energy 
can never be created or destroyed, but irretrievably dispersed in the chaotic ran- 
dom thermal motions of molecules. At the same time, the atoms and molecules of 
the paper become dispersed and disordered. In the language of thermodynamics, 
there has been a loss of free energy; that is, of energy that can be harnessed to do 
work or drive chemical reactions. This loss reflects a reduction of orderliness in 
the way the energy and molecules were stored in the paper. 

We shall discuss free energy in more detail shortly, but the general principle 
is clear enough intuitively: chemical reactions proceed spontaneously only in 
the direction that leads to a loss of free energy. In other words, the spontaneous 
direction for any reaction is the direction that goes “downhill,” where a “downhill” 
reaction is one that is energetically favorable. 

Although the most energetically favorable form of carbon under ordinary
conditions is CO2, and that of hydrogen is H2O, a living organism does not disappear
in a puff of smoke, and the paper book in your hands does not burst into flames. 
This is because the molecules both in the living organism and in the book are in a 
relatively stable state, and they cannot be changed to a state of lower energy with- 
out an input of energy: in other words, a molecule requires activation energy—a 
kick over an energy barrier—before it can undergo a chemical reaction that leaves 
it in a more stable state (Figure 2-21). In the case of a burning book, the activation
energy can be provided by the heat of a lighted match. For the molecules in the 
watery solution inside a cell, the kick is delivered by an unusually energetic ran- 
dom collision with surrounding molecules—collisions that become more violent 
as the temperature is raised. 

Figure 2-21 The important principle
of activation energy. (A) Compound Y
(a reactant) is in a relatively stable state,
and energy is required to convert it to
compound X (a product), even though X is
at a lower overall energy level than Y. This
conversion will not take place, therefore,
unless compound Y can acquire enough
activation energy (energy a minus energy
b) from its surroundings to undergo the
reaction that converts it into compound X.
This energy may be provided by means of
an unusually energetic collision with other
molecules. For the reverse reaction,
X → Y, the activation energy will be
much larger (energy a minus energy c);
this reaction will therefore occur much
more rarely. Activation energies are
always positive; note, however, that the
total energy change for the energetically
favorable reaction Y → X is energy c
minus energy b, a negative number.
(B) Energy barriers for specific reactions
can be lowered by catalysts, as indicated
by the line marked d. Enzymes are
particularly effective catalysts because they
greatly reduce the activation energy for the
reactions they perform.

The chemistry in a living cell is tightly controlled, because the kick over energy
barriers is greatly aided by a specialized class of proteins, the enzymes. Each
enzyme binds tightly to one or more molecules, called substrates, and holds
them in a way that greatly reduces the activation energy of a particular chemical
reaction that the bound substrates can undergo. A substance that can lower the
activation energy of a reaction is termed a catalyst; catalysts increase the rate of
chemical reactions because they allow a much larger proportion of the random
collisions with surrounding molecules to kick the substrates over the energy
barrier, as illustrated in Figure 2-22. Enzymes are among the most effective catalysts
known: some are capable of speeding up reactions by factors of 10^14 or more.
Enzymes thereby allow reactions that would not otherwise occur to proceed
rapidly at normal temperatures.
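
To get a feel for how much a barrier must drop to produce such a speed-up, the relationship can be sketched numerically. The Python example below rests on assumptions that are not stated in the text: it takes the reaction rate to be proportional to exp(-Ea/RT) (the Arrhenius form), assumes the catalyst changes only the barrier height Ea, and uses a temperature of 310 K. The function names are hypothetical.

```python
from math import exp, log

R = 8.314  # gas constant, J/(mol*K)

def rate_enhancement(delta_ea_j_per_mol, temperature_k=310.0):
    """Factor by which a rate rises when the barrier drops by delta_ea.

    Assumes the rate is proportional to exp(-Ea / (R*T)) and that the
    catalyst changes nothing but the barrier height.
    """
    return exp(delta_ea_j_per_mol / (R * temperature_k))

def barrier_drop_for(factor, temperature_k=310.0):
    """Barrier reduction (J/mol) that gives the requested speed-up."""
    return R * temperature_k * log(factor)

drop = barrier_drop_for(1e14)
print(f"{drop / 1000:.0f} kJ/mol")  # prints "83 kJ/mol"
```

Under these assumptions, the exponential dependence means that a barrier reduction of under a hundred kilojoules per mole is enough to account for a rate increase of fourteen orders of magnitude.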


Enzymes Can Drive Substrate Molecules Along Specific Reaction 
Pathways 


An enzyme cannot change the equilibrium point for a reaction. The reason is sim- 
ple: when an enzyme (or any catalyst) lowers the activation energy for the reaction 
Y → X, of necessity it also lowers the activation energy for the reaction X → Y by
exactly the same amount (see Figure 2-21). The forward and backward reactions 
will therefore be accelerated by the same factor by an enzyme, and the equilib- 
rium point for the reaction will be unchanged (Figure 2-23). Thus no matter how 
much an enzyme speeds up a reaction, it cannot change its direction. 
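
This point can be illustrated with a toy simulation. The Python sketch below is an assumption-laden illustration, not a model from the text: it treats Y ⇌ X as simple first-order kinetics with made-up rate constants, represents the "enzyme" as multiplying both the forward and backward rate constants by 1000, and integrates with basic Euler steps. All names are hypothetical.

```python
def simulate(k_forward, k_reverse, y0=1.0, x0=0.0, dt=1e-4, steps=200_000):
    """Integrate Y <-> X with simple Euler steps and return (y, x).

    k_forward is the rate constant for Y -> X, k_reverse for X -> Y.
    """
    y, x = y0, x0
    for _ in range(steps):
        net = (k_forward * y - k_reverse * x) * dt  # net Y -> X in this step
        y -= net
        x += net
    return y, x

# Hypothetical rate constants; the "enzyme" scales both by the same factor.
uncatalyzed = simulate(k_forward=0.3, k_reverse=0.1)
catalyzed = simulate(k_forward=300.0, k_reverse=100.0)

print(uncatalyzed[1] / uncatalyzed[0])  # ratio X/Y, close to 3
print(catalyzed[1] / catalyzed[0])      # same ratio, reached far sooner
```

Both runs settle at the same ratio of X to Y, set by k_forward / k_reverse; the catalyzed run simply reaches it much faster.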

Despite the above limitation, enzymes steer all of the reactions in cells through 
specific reaction paths. This is because enzymes are both highly selective and 
very precise, usually catalyzing only one particular reaction. In other words, each 
enzyme selectively lowers the activation energy of only one of the several possible 
chemical reactions that its bound substrate molecules could undergo. In this way, 
sets of enzymes can direct each of the many different molecules in a cell along a 
particular reaction pathway (Figure 2-24). 

The success of living organisms is attributable to a cell’s ability to make 
enzymes of many types, each with precisely specified properties. Each enzyme 





Figure 2-23 Enzymes cannot change the equilibrium point for reactions. Enzymes, like all 
catalysts, speed up the forward and backward rates of a reaction by the same factor. Therefore, for 
both the catalyzed and the uncatalyzed reactions shown here, the number of molecules undergoing 
the transition X → Y is equal to the number of molecules undergoing the transition Y → X when the 
ratio of Y molecules to X molecules is 3 to 1. In other words, the two reactions reach equilibrium at 
exactly the same point. 


Figure 2-22 Lowering the activation 
energy greatly increases the probability 
of a reaction. At any given instant, a 
population of identical substrate molecules 
will have a range of energies, distributed as 
shown on the graph. The varying energies 
come from collisions with surrounding 
molecules, which make the substrate 
molecules jiggle, vibrate, and spin. For a 
molecule to undergo a chemical reaction, 
the energy of the molecule must exceed 
the activation-energy barrier for that 
reaction (dashed lines). For most biological 
reactions, this almost never happens 
without enzyme catalysis. Even with 
enzyme catalysis, the substrate molecules 
must experience a particularly energetic 
collision to react (red shaded area). Raising 
the temperature will also increase the 
number of molecules with sufficient energy 
to overcome the activation energy needed 
for a reaction; but in marked contrast to 
enzyme catalysis, this effect is nonselective, 
speeding up all reactions (Movie 2.2). 
Figure 2-24 Directing substrate molecules through a specific reaction 
pathway by enzyme catalysis. A substrate molecule in a cell (green ball) 
is converted into a different molecule (blue ball) by means of a series of 
enzyme-catalyzed reactions. As indicated (yellow box), several reactions 
are energetically favorable at each step, but only one is catalyzed by each 
enzyme. Sets of enzymes thereby determine the exact reaction pathway that 
is followed by each molecule inside the cell. 


has a unique shape containing an active site, a pocket or groove in the enzyme 
into which only particular substrates will fit (Figure 2-25). Like all other catalysts, 
enzyme molecules themselves remain unchanged after participating in a reaction 
and therefore can function over and over again. In Chapter 3, we discuss further 
how enzymes work. 


How Enzymes Find Their Substrates: The Enormous Rapidity of 
Molecular Motions 


An enzyme will often catalyze the reaction of thousands of substrate molecules 
every second. This means that it must be able to bind a new substrate molecule 
in a fraction of a millisecond. But both enzymes and their substrates are present 
in relatively small numbers in a cell. How do they find each other so fast? Rapid 
binding is possible because the motions caused by heat energy are enormously 
fast at the molecular level. These molecular motions can be classified broadly into 
three kinds: (1) the movement of a molecule from one place to another (transla- 
tional motion), (2) the rapid back-and-forth movement of covalently linked atoms 
with respect to one another (vibrations), and (3) rotations. All of these motions 
help to bring the surfaces of interacting molecules together. 

The rates of molecular motions can be measured by a variety of spectroscopic 
techniques. A large globular protein is constantly tumbling, rotating about its axis 
about a million times per second. Molecules are also in constant translational 
motion, which causes them to explore the space inside the cell very efficiently by 
wandering through it—a process called diffusion. In this way, every molecule in 
a cell collides with a huge number of other molecules each second. As the mol- 
ecules in a liquid collide and bounce off one another, an individual molecule 
moves first one way and then another, its path constituting a random walk (Figure 
2-26). In such a walk, the average net distance that each molecule travels (as the 
“crow flies”) from its starting point is proportional to the square root of the time 
involved: that is, if it takes a molecule 1 second on average to travel 1 μm, it takes 
4 seconds to travel 2 μm, 100 seconds to travel 10 μm, and so on. 
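
The square-root relationship means that the average time needed grows with the square of the net distance. As a minimal sketch of that arithmetic (the function name and the default reference values are illustrative, taken from the example of 1 second per 1 μm above):

```python
def diffusion_time(distance_um, ref_distance_um=1.0, ref_time_s=1.0):
    """Average time to cover a net distance by a random walk.

    Net distance grows with the square root of time, so the time
    grows with the square of the distance relative to a reference.
    """
    return ref_time_s * (distance_um / ref_distance_um) ** 2

for d in (1, 2, 10):
    print(f"{d} um takes about {diffusion_time(d):g} s")
```

Doubling the distance therefore quadruples the time, which is why diffusion is fast over the dimensions of a cell but slow over much longer distances.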

The inside of a cell is very crowded (Figure 2-27). Nevertheless, experiments 
in which fluorescent dyes and other labeled molecules are injected into cells show 
that small organic molecules diffuse through the watery gel of the cytosol nearly 
Figure 2-25 How enzymes work. Each enzyme has an active site to which one or more substrate 
molecules bind, forming an enzyme-substrate complex. A reaction occurs at the active site, 
producing an enzyme-product complex. The product is then released, allowing the enzyme to bind 
further substrate molecules. 
as rapidly as they do through water. A small organic molecule, for example, takes 
only about one-fifth of a second on average to diffuse a distance of 10 μm. Diffusion 
is therefore an efficient way for small molecules to move the limited distances 
in the cell (a typical animal cell is 15 μm in diameter). 

Since enzymes move more slowly than substrates in cells, we can think of them 
as sitting still. The rate of encounter of each enzyme molecule with its substrate 
will depend on the concentration of the substrate molecule. For example, some 
abundant substrates are present at a concentration of 0.5 mM. Since pure water 
is 55.5 M, there is only about one such substrate molecule in the cell for every 
10⁵ water molecules. Nevertheless, the active site on an enzyme molecule that 
binds this substrate will be bombarded by about 500,000 random collisions with 
the substrate molecule per second. (For a substrate concentration tenfold lower, 
the number of collisions drops to 50,000 per second, and so on.) A random col- 
lision between the active site of an enzyme and the matching surface of its sub- 
strate molecule often leads immediately to the formation of an enzyme-substrate 
complex. A reaction in which a covalent bond is broken or formed can now occur 
extremely rapidly. When one appreciates how quickly molecules move and react, 
the observed rates of enzymatic catalysis do not seem so amazing. 
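
The numbers in this paragraph follow from simple ratios. The sketch below reproduces them, assuming (as the text implies) that the collision rate scales linearly with substrate concentration; the function and constant names are illustrative only.

```python
WATER_MOLAR = 55.5                # concentration of pure water, mol/L (from the text)
REF_SUBSTRATE_MOLAR = 0.5e-3      # 0.5 mM substrate (from the text)
REF_COLLISIONS_PER_S = 500_000    # collisions per second at 0.5 mM (from the text)

def water_per_substrate(substrate_molar):
    """Number of water molecules per substrate molecule."""
    return WATER_MOLAR / substrate_molar

def collisions_per_second(substrate_molar):
    """Collisions with the active site, assumed proportional to concentration."""
    return REF_COLLISIONS_PER_S * substrate_molar / REF_SUBSTRATE_MOLAR

print(f"water molecules per substrate molecule: {water_per_substrate(0.5e-3):,.0f}")
print(f"collisions per second at 0.05 mM: {collisions_per_second(0.05e-3):,.0f}")
```

The first ratio comes out at roughly 10⁵, and a tenfold lower substrate concentration gives a tenfold lower collision rate.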

Two molecules that are held together by noncovalent bonds can also disso- 
ciate. The multiple weak noncovalent bonds that they form with each other will 
persist until random thermal motion causes the two molecules to separate. In 
general, the stronger the binding of the enzyme and substrate, the slower their 
rate of dissociation. In contrast, whenever two colliding molecules have poorly 
matching surfaces, they form few noncovalent bonds and the total energy of asso- 
ciation will be negligible compared with that of thermal motion. In this case, the 
two molecules dissociate as rapidly as they come together, preventing incorrect 
and unwanted associations between mismatched molecules, such as between an 
enzyme and the wrong substrate. 


The Free-Energy Change for a Reaction, ΔG, Determines Whether 
It Can Occur Spontaneously 


Although enzymes speed up reactions, they cannot by themselves force ener- 
getically unfavorable reactions to occur. In terms of a water analogy, enzymes 
by themselves cannot make water run uphill. Cells, however, must do just that in 
order to grow and divide: they must build highly ordered and energy-rich mole- 
cules from small and simple ones. We shall see that this is done through enzymes 
that directly couple energetically favorable reactions, which release energy and 
produce heat, to energetically unfavorable reactions, which produce biological 
order. 

What do cell biologists mean by the term “energetically favorable,” and how 
can this be quantified? According to the second law of thermodynamics, the universe 
tends toward maximum disorder (largest entropy or greatest probability). 
Thus, a chemical reaction can proceed spontaneously only if it results in a net 
increase in the disorder of the universe (see Figure 2-16). This disorder of the uni- 
verse can be expressed most conveniently in terms of the free energy of a system, a 
concept we touched on earlier. 

Free energy, G, is an expression of the energy available to do work: for example, 
the work of driving chemical reactions. The value of G is of interest only when 
a system undergoes a change, denoted ΔG (delta G). The change in G is critical 
because, as explained in Panel 2-7 (pp. 102-103), ΔG is a direct measure of the 


Figure 2-27 The structure of the cytoplasm. The drawing is approximately 
to scale and emphasizes the crowding in the cytoplasm. Only the 
macromolecules are shown: RNAs are shown in blue, ribosomes in green, 
and proteins in red. Enzymes and other macromolecules diffuse relatively 
slowly in the cytoplasm, in part because they interact with many other 
macromolecules; small molecules, by contrast, diffuse nearly as rapidly as 
they do in water (Movie 2.4). (Adapted from D.S. Goodsell, Trends Biochem. 
Sci. 16:203-206, 1991. With permission from Elsevier.) 
Figure 2-26 A random walk. Molecules 
in solution move in a random fashion as a 
result of the continual buffeting they receive 
in collisions with other molecules. This 
movement allows small molecules 
to diffuse rapidly from one part of the 
cell to another, as described in the text 
(Movie 2.3). 







amount of disorder created in the universe when a reaction takes place. Energetically 
favorable reactions, by definition, are those that decrease free energy; in 
other words, they have a negative ΔG and disorder the universe (Figure 2-28). 

An example of an energetically favorable reaction on a macroscopic scale is 
the “reaction” by which a compressed spring relaxes to an expanded state, releasing 
its stored elastic energy as heat to its surroundings; an example on a microscopic 
scale is salt dissolving in water. Conversely, energetically unfavorable reactions 
with a positive ΔG, such as the joining of two amino acids to form a peptide 
bond, by themselves create order in the universe. Therefore, these reactions can 
take place only if they are coupled to a second reaction with a negative ΔG so large 
that the ΔG of the overall process is negative (Figure 2-29). 


The Concentration of Reactants Influences the Free-Energy 
Change and a Reaction’s Direction 


As we have just described, a reaction Y ⇌ X will go in the direction Y → X when 
the associated free-energy change, ΔG, is negative, just as a tensed spring left to 
itself will relax and lose its stored energy to its surroundings as heat. For a chemical 
reaction, however, ΔG depends not only on the energy stored in each individual 
molecule, but also on the concentrations of the molecules in the reaction 
mixture. Remember that ΔG reflects the degree to which a reaction creates a more 
disordered (in other words, a more probable) state of the universe. Recalling our 
coin analogy, it is very likely that a coin will flip from a head to a tail orientation if 
a jiggling box contains 90 heads and 10 tails, but this is a less probable event if the 
box has 10 heads and 90 tails. 

The same is true for a chemical reaction. For a reversible reaction Y ⇌ X, a 
large excess of Y over X will tend to drive the reaction in the direction Y → X. 
Therefore, as the ratio of Y to X increases, the ΔG becomes more negative for the 
transition Y → X (and more positive for the transition X → Y). 

The amount of concentration difference that is needed to compensate for a 
given decrease in chemical-bond energy (and accompanying heat release) is not 
intuitively obvious. In the late nineteenth century, the relationship was deter- 
mined through a thermodynamic analysis that makes it possible to separate 
the concentration-dependent and the concentration-independent parts of the 
free-energy change, as we describe next. 


The Standard Free-Energy Change, ΔG°, Makes It Possible to 
Compare the Energetics of Different Reactions 


Because ΔG depends on the concentrations of the molecules in the reaction mixture 
at any given time, it is not a particularly useful value for comparing the relative 
energies of different types of reactions. To place reactions on a comparable 
basis, we need to turn to the standard free-energy change of a reaction, ΔG°. 
The ΔG° is the change in free energy under a standard condition, defined as that 
where the concentrations of all the reactants are set to the same fixed value of 1 
mole/liter. Defined in this way, ΔG° depends only on the intrinsic characters of 
the reacting molecules. 
For the simple reaction Y → X at 37°C, ΔG is related to ΔG° as follows: 


$$\Delta G = \Delta G^\circ + RT \ln \frac{[X]}{[Y]}$$


where ΔG is in kilojoules per mole, [Y] and [X] denote the concentrations of Y and 
X in moles/liter, ln is the natural logarithm, and RT is the product of the gas constant, 
R, and the absolute temperature, T. At 37°C, RT = 2.58 kJ mole⁻¹. (A mole is 
6 × 10²³ molecules of a substance.) 
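
As a worked illustration of the relation ΔG = ΔG° + RT ln([X]/[Y]), the sketch below computes RT at 37°C from the gas constant and then evaluates ΔG for a few concentration ratios. The function names are illustrative, and the ΔG° of -17.8 kJ/mole is simply an example value.

```python
import math

R_KJ = 8.314e-3   # gas constant in kJ/(mol*K)

def rt_kj_per_mol(temp_celsius=37.0):
    """Product of the gas constant and the absolute temperature."""
    return R_KJ * (temp_celsius + 273.15)

def delta_g(delta_g_standard, conc_x, conc_y, temp_celsius=37.0):
    """Free-energy change (kJ/mol) for Y -> X at the given concentrations."""
    return delta_g_standard + rt_kj_per_mol(temp_celsius) * math.log(conc_x / conc_y)

print(f"RT at 37 C: {rt_kj_per_mol():.2f} kJ/mol")
for ratio in (0.001, 1.0, 1000.0):
    # ratio is [X]/[Y]; only the ratio matters in this expression
    print(f"[X]/[Y] = {ratio:g}: dG = {delta_g(-17.8, ratio, 1.0):.1f} kJ/mol")
```

When the two concentrations are equal the logarithm is zero and ΔG equals ΔG°; a large excess of product makes ΔG less negative.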

A large body of thermodynamic data has been collected that has made it possible 
to determine the standard free-energy change, ΔG°, for the important metabolic 
reactions of a cell. Given these ΔG° values, combined with additional information 
about metabolite concentrations and reaction pathways, it is possible to 
quantitatively predict the course of most biological reactions. 
Figure 2-28 The distinction between energetically favorable and energetically 
unfavorable reactions. Energetically favorable reaction: the free energy of Y 
is greater than the free energy of X. Therefore ΔG < 0, and the disorder of the 
universe increases during the reaction Y → X. Energetically unfavorable reaction: 
if the reaction X → Y occurred, ΔG would be > 0, and the universe would 
become more ordered. 






Figure 2-29 How reaction coupling is used to drive energetically unfavorable 
reactions. The energetically unfavorable reaction X → Y (positive ΔG) is driven by 
the energetically favorable reaction C → D (negative ΔG), because the net 
free-energy change for the pair of coupled reactions is less than zero. 
[Figure 2-30, panel text] When X and Y are at equal concentrations, [Y] = [X], 
the formation of X is energetically favored. In other words, the ΔG of Y → X is 
negative and the ΔG of X → Y is positive. But because of thermal bombardments, 
there will always be some X converting to Y. 

Thus, for each individual molecule, conversion of Y to X will occur often. 
Conversion of X to Y will occur less often than the transition Y → X, because it 
requires a more energetic collision. Therefore the ratio of X to Y molecules 
will increase with time. 

Eventually, there will be a large enough excess of X over Y to just 
compensate for the slow rate of X → Y, such that the number of Y molecules 
being converted to X molecules each second is exactly equal to the number 
of X molecules being converted to Y molecules each second. At this point, 
the reaction will be at equilibrium. 

At equilibrium, there is no net change in the ratio of Y to X, and the 
ΔG for both forward and backward reactions is zero. 


The Equilibrium Constant and ΔG° Are Readily Derived from 
Each Other 


Inspection of the above equation reveals that the ΔG equals the value of ΔG° 
when the concentrations of Y and X are equal. But as any favorable reaction proceeds, 
the concentrations of the products will increase as the concentration of the 
substrates decreases. This change in relative concentrations will cause [X]/[Y] to 
become increasingly large, making the initially favorable ΔG less and less negative 
(the logarithm of a number x is positive for x > 1, negative for x < 1, and zero for 
x = 1). Eventually, when ΔG = 0, a chemical equilibrium will be attained; here there 
is no net change in free energy to drive the reaction in either direction, inasmuch 
as the concentration effect just balances the push given to the reaction by ΔG°. 
As a result, the ratio of product to substrate reaches a constant value at chemical 
equilibrium (Figure 2-30). 
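
The approach to a constant ratio can be illustrated with a small simulation. This is only a sketch under stated assumptions: it treats both directions of Y ⇌ X as simple first-order processes, uses hypothetical rate constants, and steps forward in time with a basic Euler scheme.

```python
def approach_equilibrium(k_forward, k_backward, y0=1.0, x0=0.0,
                         dt=0.01, steps=2000):
    """Follow Y <-> X over time, assuming first-order kinetics in both directions."""
    y, x = y0, x0
    for _ in range(steps):
        # Net amount of Y converted to X during this small time step.
        net = (k_forward * y - k_backward * x) * dt
        y -= net
        x += net
    return x, y

# Hypothetical rate constants (per second): Y -> X is three times faster than X -> Y.
K_FORWARD, K_BACKWARD = 3.0, 1.0

x_end, y_end = approach_equilibrium(K_FORWARD, K_BACKWARD)
print(f"[X]/[Y] after the simulation: {x_end / y_end:.4f}")
print(f"k_forward / k_backward:       {K_FORWARD / K_BACKWARD:.4f}")
```

Starting from pure Y, the ratio [X]/[Y] rises until the forward and backward fluxes balance, after which it no longer changes.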
We can define the equilibrium constant, K, for the reaction Y → X as 

$$K = \frac{[X]}{[Y]}$$

where [X] is the concentration of the product and [Y] is the concentration of the 
reactant at equilibrium. Remembering that ΔG = ΔG° + RT ln [X]/[Y], and that 
ΔG = 0 at equilibrium, we see that 

$$\Delta G^\circ = -RT \ln \frac{[X]}{[Y]} = -RT \ln K$$

At 37°C, where RT = 2.58, the equilibrium equation is therefore: 


$$\Delta G^\circ = -2.58 \ln K$$


Figure 2-30 Chemical equilibrium. When 
a reaction reaches equilibrium, the forward 
and backward fluxes of reacting molecules 
are equal and opposite. 
Converting this equation from the natural logarithm (ln) to the more commonly 
used base 10 logarithm (log), we get 


$$\Delta G^\circ = -5.94 \log K$$


The above equation reveals how the equilibrium ratio of X to Y (expressed as 
the equilibrium constant, K) depends on the intrinsic character of the molecules 
(as expressed in the value of ΔG° in kilojoules per mole). Note that for every 5.94 
kJ/mole difference in free energy at 37°C, the equilibrium constant changes by 
a factor of 10 (Table 2-2). Thus, the more energetically favorable a reaction, the 
more product will accumulate if the reaction proceeds to equilibrium. 
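
The conversion between ΔG° and K is easy to evaluate directly. The following sketch uses RT = 2.58 kJ/mole at 37°C, as in the text; the function names are illustrative.

```python
import math

RT_37C = 2.58   # kJ/mol at 37 degrees C (value used in the text)

def equilibrium_constant(delta_g_standard_kj):
    """K = [X]/[Y] at equilibrium, from dG0 = -RT ln K."""
    return math.exp(-delta_g_standard_kj / RT_37C)

def standard_free_energy(k):
    """dG0 in kJ/mol, from dG0 = -RT ln K."""
    return -RT_37C * math.log(k)

print(f"dG0 for K = 10:        {standard_free_energy(10):.2f} kJ/mol")
print(f"K for dG0 = -17.8:     {equilibrium_constant(-17.8):.0f}")
print(f"K for dG0 = 0:         {equilibrium_constant(0.0):.0f}")
```

Each tenfold change in K corresponds to a change of about 5.94 kJ/mole in ΔG°, and a ΔG° of -17.8 kJ/mole gives a K close to 1000.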

More generally, for a reaction that has multiple reactants and products, such 
as A + B → C + D, 

$$K = \frac{[C][D]}{[A][B]}$$

The concentrations of the two reactants and the two products are multiplied 
because the rate of the forward reaction depends on the collision of A and B and 
the rate of the backward reaction depends on the collision of C and D. Thus, at 
37°C, 

$$\Delta G^\circ = -5.94 \log \frac{[C][D]}{[A][B]}$$

where ΔG° is in kilojoules per mole, and [A], [B], [C], and [D] denote the concentrations 
of the reactants and products in moles/liter. 





The Free-Energy Changes of Coupled Reactions Are Additive 


We have pointed out that unfavorable reactions can be coupled to favorable ones 
to drive the unfavorable ones forward (see Figure 2-29). In thermodynamic terms, 
this is possible because the overall free-energy change for a set of coupled reac- 
tions is the sum of the free-energy changes in each of its component steps. Con- 
sider, as a simple example, two sequential reactions 


X → Y and Y → Z 


whose ΔG° values are +5 and -13 kJ/mole, respectively. If these two reactions 
occur sequentially, the ΔG° for the coupled reaction will be -8 kJ/mole. This 
means that, with appropriate conditions, the unfavorable reaction X → Y can be 
driven by the favorable reaction Y → Z, provided that this second reaction follows 
the first. For example, several of the reactions in the long pathway that converts 
sugars into CO₂ and H₂O have positive ΔG° values. But the pathway nevertheless 
proceeds because the total ΔG° for the series of sequential reactions has a large 
negative value. 
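
The additivity can be checked with the example values of +5 and -13 kJ/mole. The sketch below also shows a consequence of the relation ΔG° = -RT ln K: when standard free-energy changes add, the corresponding equilibrium constants multiply. The function names are illustrative, and RT is taken as 2.58 kJ/mole (37°C).

```python
import math

RT_37C = 2.58   # kJ/mol at 37 degrees C

def equilibrium_constant(delta_g_standard_kj):
    """K from dG0 = -RT ln K."""
    return math.exp(-delta_g_standard_kj / RT_37C)

def coupled_delta_g(*step_values_kj):
    """Overall standard free-energy change of sequential steps (their sum)."""
    return sum(step_values_kj)

STEP_X_TO_Y = +5.0    # kJ/mol, unfavorable step X -> Y
STEP_Y_TO_Z = -13.0   # kJ/mol, favorable step Y -> Z

overall = coupled_delta_g(STEP_X_TO_Y, STEP_Y_TO_Z)
k_overall = equilibrium_constant(overall)
k_product = equilibrium_constant(STEP_X_TO_Y) * equilibrium_constant(STEP_Y_TO_Z)

print(f"overall dG0: {overall:+.1f} kJ/mol")
print(f"K overall: {k_overall:.2f}, K1 * K2: {k_product:.2f}")
```

The first step on its own has K below 1, but the overall conversion of X to Z has K well above 1.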

Forming a sequential pathway is not adequate for many purposes. Often the 
desired pathway is simply X — Y, without further conversion of Y to some other 
product. Fortunately, there are other more general ways of using enzymes to cou- 
ple reactions together. These often involve the activated carrier molecules that we 
discuss next. 


Activated Carrier Molecules Are Essential for Biosynthesis 


The energy released by the oxidation of food molecules must be stored temporar- 
ily before it can be channeled into the construction of the many other molecules 
needed by the cell. In most cases, the energy is stored as chemical-bond energy 
in a small set of activated “carrier molecules,” which contain one or more energy-rich 
covalent bonds. These molecules diffuse rapidly throughout the cell and 
thereby carry their bond energy from sites of energy generation to the sites where 
the energy will be used for biosynthesis and other cell activities (Figure 2-31). 
The activated carriers store energy in an easily exchangeable form, either as 
a readily transferable chemical group or as electrons held at a high energy level, 
and they can serve a dual role as a source of both energy and chemical groups in 
biosynthetic reactions. For historical reasons, these molecules are also sometimes 
referred to as coenzymes. The most important of the activated carrier molecules 
TABLE 2-2 


Values of the equilibrium constant were calculated for the simple chemical 
reaction Y ⇌ X using the equation given in the text. The ΔG° given here 
is in kilojoules per mole at 37°C, with kilocalories per mole in parentheses. 
One kilojoule (kJ) is equal to 0.239 kilocalories (kcal) (1 kcal = 4.18 kJ). As 
explained in the text, ΔG° represents the free-energy difference under standard 
conditions (where all components are present at a concentration of 1.0 mole/liter). 
From this table, we see that if there is a favorable standard free-energy 
change (ΔG°) of -17.8 kJ/mole (-4.3 kcal/mole) for the transition Y → X, 
there will be 1000 times more molecules in state X than in state Y at equilibrium 
(K = 1000). 
are ATP and two molecules that are closely related to each other, NADH and 
NADPH. Cells use such activated carrier molecules like money to pay for reac- 
tions that otherwise could not take place. 


The Formation of an Activated Carrier Is Coupled to an 
Energetically Favorable Reaction 


Coupling mechanisms require enzymes and are fundamental to all the energy 
transactions of the cell. The nature of a coupled reaction is illustrated by a 
mechanical analogy in Figure 2-32, in which an energetically favorable chemi- 
cal reaction is represented by rocks falling from a cliff. The energy of falling rocks 
would normally be entirely wasted in the form of heat generated by friction when 
the rocks hit the ground (see the falling-brick diagram in Figure 2-17). By careful 
design, however, part of this energy could be used instead to drive a paddle wheel 
that lifts a bucket of water (Figure 2-32B). Because the rocks can now reach the 
ground only after moving the paddle wheel, we say that the energetically favor- 
able reaction of rock falling has been directly coupled to the energetically unfavor- 
able reaction of lifting the bucket of water. Note that because part of the energy is 
used to do work in Figure 2-32B, the rocks hit the ground with less velocity than in 
Figure 2-32A, and correspondingly less energy is dissipated as heat. 

Similar processes occur in cells, where enzymes play the role of the paddle 
wheel. By mechanisms that we discuss later in this chapter, enzymes couple an 





[Figure 2-32 panel labels: (A) kinetic energy of falling rocks is transformed into 
heat energy only; (B) part of the kinetic energy is used to lift a bucket of water, 
and a correspondingly smaller amount is transformed into heat] 

Figure 2-31 Energy transfer and the 
role of activated carriers in metabolism. 
By serving as energy shuttles, activated 
carrier molecules perform their function 

as go-betweens that link the breakdown 
of food molecules and the release of 
energy (catabolism) to the energy-requiring 
biosynthesis of small and large organic 
molecules (anabolism). 


Figure 2-32 A mechanical model 
illustrating the principle of coupled 
chemical reactions. The spontaneous 
reaction shown in (A) could serve as an 
analogy for the direct oxidation of glucose 
to CO₂ and H₂O, which produces heat 
only. In (B), the same reaction is coupled 
to a second reaction; this second reaction 
is analogous to the synthesis of activated 
carrier molecules. The energy produced in 
(B) is in a more useful form than in (A) and 
can be used to drive a variety of otherwise 
energetically unfavorable reactions (C). 
energetically favorable reaction, such as the oxidation of foodstuffs, to an energetically 
unfavorable reaction, such as the generation of an activated carrier molecule. 
In this example, the amount of heat released by the oxidation reaction is 
reduced by exactly the amount of energy stored in the energy-rich covalent bonds 
of the activated carrier molecule. And the activated carrier molecule picks up a 
packet of energy of a size sufficient to power a chemical reaction elsewhere in the 
cell. 


ATP Is the Most Widely Used Activated Carrier Molecule 


The most important and versatile of the activated carriers in cells is ATP (ade- 
nosine triphosphate). Just as the energy stored in the raised bucket of water in 
Figure 2-32B can drive a wide variety of hydraulic machines, ATP is a convenient 
and versatile store, or currency, of energy used to drive a variety of chemical reac- 
tions in cells. ATP is synthesized in an energetically unfavorable phosphorylation 
reaction in which a phosphate group is added to ADP (adenosine diphosphate). 
When required, ATP gives up its energy packet through its energetically favorable 
hydrolysis to ADP and inorganic phosphate (Figure 2-33). The regenerated ADP 
is then available to be used for another round of the phosphorylation reaction that 
forms ATP. 

The energetically favorable reaction of ATP hydrolysis is coupled to many oth- 
erwise unfavorable reactions through which other molecules are synthesized. 
Many of these coupled reactions involve the transfer of the terminal phosphate in 
ATP to another molecule, as illustrated by the phosphorylation reaction in Figure 
2-34. 

As the most abundant activated carrier in cells, ATP is the principal energy 
currency. To give just two examples, it supplies energy for many of the pumps 
that transport substances into and out of the cell (discussed in Chapter 11), and it 
powers the molecular motors that enable muscle cells to contract and nerve cells 
to transport materials from one end of their long axons to another (discussed in 
Chapter 16). 


Energy Stored in ATP Is Often Harnessed to Join Two Molecules 
Together 


We have previously discussed one way in which an energetically favorable reac- 
tion can be coupled to an energetically unfavorable reaction, X — Y, so as to 
enable it to occur. In that scheme, a second enzyme catalyzes the energetically 
favorable reaction Y — Z, pulling all of the X to Y in the process. But when the 
required product is Y and not Z, this mechanism is not useful. 
Figure 2-33 The hydrolysis of ATP to 
ADP and inorganic phosphate. The two 
outermost phosphates in ATP are held to 
the rest of the molecule by high-energy 
phosphoanhydride bonds and are readily 
transferred. As indicated, water can be 
added to ATP to form ADP and inorganic 
phosphate (Pᵢ). Hydrolysis of the terminal 
phosphate of ATP yields between 46 and 
54 kJ/mole of usable energy, depending 
on the intracellular conditions. The large 
negative ΔG of this reaction arises from 
several factors: release of the terminal 
phosphate group removes an unfavorable 
repulsion between adjacent negative 
charges, and the inorganic phosphate ion 
(Pᵢ) released is stabilized by resonance and 
by favorable hydrogen-bond formation with 
water. 




[Figure 2-34 labels: hydroxyl group on another molecule; phosphoanhydride bond; 
phosphate transfer (ΔG < 0); ADP; phosphoester bond] 




A typical biosynthetic reaction is one in which two molecules, A and B, are 
joined together to produce A-B in the energetically unfavorable condensation 
reaction 

A-H + B-OH → A-B + H₂O 


There is an indirect pathway that allows A-H and B-OH to form A-B, in which 
a coupling to ATP hydrolysis makes the reaction go. Here, energy from ATP hydro- 
lysis is first used to convert B-OH to a higher-energy intermediate compound, 
which then reacts directly with A-H to give A-B. The simplest possible mechanism 
involves the transfer of a phosphate from ATP to B-OH to make B-O-PO₃, in 
which case the reaction pathway contains only two steps: 


1. B-OH + ATP → B-O-PO₃ + ADP 
2. A-H + B-O-PO₃ → A-B + Pᵢ 
Net result: B-OH + ATP + A-H → A-B + ADP + Pᵢ 


The condensation reaction, which by itself is energetically unfavorable, is forced 
to occur by being directly coupled to ATP hydrolysis in an enzyme-catalyzed reac- 
tion pathway (Figure 2-35A). 
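
The bookkeeping behind the net result can be sketched as follows: add up the two steps and cancel any species that appears on both sides, which here is the phosphorylated intermediate. The helper name and the plain-text species labels are illustrative only.

```python
from collections import Counter

def net_reaction(steps):
    """Sum a list of (reactants, products) steps and cancel shared species."""
    left, right = Counter(), Counter()
    for reactants, products in steps:
        left.update(reactants)
        right.update(products)
    shared = left & right            # species present on both sides
    return left - shared, right - shared

steps = [
    (["B-OH", "ATP"], ["B-O-PO3", "ADP"]),   # step 1: phosphate transfer
    (["A-H", "B-O-PO3"], ["A-B", "Pi"]),     # step 2: condensation
]

reactants, products = net_reaction(steps)
print(" + ".join(sorted(reactants)), "->", " + ".join(sorted(products)))
```

The intermediate B-O-PO₃ is produced in the first step and consumed in the second, so it does not appear in the overall reaction.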

A biosynthetic reaction of exactly this type synthesizes the amino acid gluta- 
mine (Figure 2-35B). We will see shortly that similar (but more complex) mecha- 
nisms are also used to produce nearly all of the large molecules of the cell. 
Figure 2-34 An example of a phosphate 
transfer reaction. Because an energy-rich 
phosphoanhydride bond in ATP 
is converted to a phosphoester bond, 
this reaction is energetically favorable, 
having a large negative ΔG. Reactions of 
this type are involved in the synthesis of 
phospholipids and in the initial steps of 
reactions that catabolize sugars. 


Figure 2-35 An example of an 
energetically unfavorable biosynthetic 
reaction driven by ATP hydrolysis. (A) 
Schematic illustration of the formation of A-B 
in the condensation reaction described in 
the text. (B) The biosynthesis of the common 
amino acid glutamine from glutamic acid and 
ammonia. Glutamic acid is first converted to 
a high-energy phosphorylated intermediate 
(corresponding to the compound B-O-PO₃ 
described in the text), which then reacts 
with ammonia (corresponding to A-H) to 
form glutamine. In this example, both steps 
occur on the surface of the same enzyme, 
glutamine synthetase. The high-energy 
bonds are shaded red; here, as elsewhere 
throughout the book, the symbol Pᵢ = 
HPO₄²⁻, and a yellow “circled P” = PO₃²⁻. 
NADH and NADPH Are Important Electron Carriers 


Other important activated carrier molecules participate in oxidation-reduction 
reactions and are commonly part of coupled reactions in cells. These activated 
carriers are specialized to carry electrons held at a high energy level (sometimes 
called “high-energy” electrons) and hydrogen atoms. The most important of these 
electron carriers are NAD⁺ (nicotinamide adenine dinucleotide) and the closely 
related molecule NADP⁺ (nicotinamide adenine dinucleotide phosphate). Each 
picks up a “packet of energy” corresponding to two electrons plus a proton (H⁺), 
and they are thereby converted to NADH (reduced nicotinamide adenine dinucleotide) 
and NADPH (reduced nicotinamide adenine dinucleotide phosphate), 
respectively (Figure 2-36). These molecules can therefore be regarded as carriers 
of hydride ions (the H⁺ plus two electrons, or H⁻). 

Like ATP, NADPH is an activated carrier that participates in many important 
biosynthetic reactions that would otherwise be energetically unfavorable. The 
NADPH is produced according to the general scheme shown in Figure 2-36A. 
During a special set of energy-yielding catabolic reactions, two hydrogen atoms 
are removed from a substrate molecule. Both electrons but just one proton (that is, 
a hydride ion, H⁻) are added to the nicotinamide ring of NADP+ to form NADPH;
the second proton (H+) is released into solution. This is a typical
oxidation-reduction reaction, in which the substrate is oxidized and NADP+ is
reduced.

NADPH readily gives up the hydride ion it carries in a subsequent
oxidation-reduction reaction, because the nicotinamide ring can achieve a more
stable arrangement of electrons without it. In this subsequent reaction, which
regenerates NADP+, it is the NADPH that is oxidized and the substrate that is
reduced. The NADPH is an effective donor of its hydride ion to other molecules for
the same reason that ATP readily transfers a phosphate: in both cases the transfer
is accompanied by a large negative free-energy change. One example of the use of
NADPH in biosynthesis is shown in Figure 2-37.

Figure 2-36 NADPH, an important carrier of electrons. (A) NADPH is produced in
reactions of the general type shown on the left, in which two hydrogen atoms are
removed from a substrate. The oxidized form of the carrier molecule, NADP+,
receives one hydrogen atom plus an electron (a hydride ion); the proton (H+) from
the other H atom is released into solution. Because NADPH holds its hydride ion in
a high-energy linkage, the hydride ion can easily be transferred to other molecules,
as shown on the right. (B) and (C) The structures of NADP+ and NADPH. The part of
the NADP+ molecule known as the nicotinamide ring accepts the hydride ion, H⁻,
forming NADPH. The molecules NAD+ and NADH are identical in structure to NADP+
and NADPH, respectively, except that they lack the indicated phosphate group.

The extra phosphate group on NADPH has no effect on the electron-trans- 
fer properties of NADPH compared with NADH, being far away from the region 
involved in electron transfer (see Figure 2-36C). It does, however, give a molecule 
of NADPH a slightly different shape from that of NADH, making it possible for 
NADPH and NADH to bind as substrates to completely different sets of enzymes. 
Thus, the two types of carriers are used to transfer electrons (or hydride ions) 
between two different sets of molecules. 

Why should there be this division of labor? The answer lies in the need to 
regulate two sets of electron-transfer reactions independently. NADPH operates 
chiefly with enzymes that catalyze anabolic reactions, supplying the high-energy 
electrons needed to synthesize energy-rich biological molecules. NADH, by con- 
trast, has a special role as an intermediate in the catabolic system of reactions that 
generate ATP through the oxidation of food molecules, as we will discuss shortly. 
The genesis of NADH from NAD+, and of NADPH from NADP+, occur by different
pathways and are independently regulated, so that the cell can adjust the supply
of electrons for these two contrasting purposes. Inside the cell the ratio of NAD+
to NADH is kept high, whereas the ratio of NADP+ to NADPH is kept low. This
provides plenty of NAD+ to act as an oxidizing agent and plenty of NADPH to act as
a reducing agent (Figure 2-37B), as required for their special roles in catabolism
and anabolism, respectively.
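The effect of these two ratios on electron affinity can be illustrated with the Nernst equation, which relates the redox potential of a couple to the ratio of its oxidized and reduced forms. The sketch below is only an illustration: the standard potential of about -0.32 V (taken here to be the same for both couples), the temperature, and the two concentration ratios are assumed values chosen to represent "high" and "low," not figures given in this chapter.

```python
import math

R = 8.314         # gas constant, J/(mol*K)
F = 96485.0       # Faraday constant, C/mol
T = 310.0         # assumed temperature, K (37 degrees C)
N_ELECTRONS = 2   # a hydride ion carries two electrons

E_STANDARD = -0.32  # assumed standard redox potential (V), used for both couples


def redox_potential(ratio_ox_to_red, e_standard=E_STANDARD):
    """Nernst equation: E = E0 + (RT/nF) * ln([oxidized]/[reduced])."""
    return e_standard + (R * T) / (N_ELECTRONS * F) * math.log(ratio_ox_to_red)


# Hypothetical ratios, chosen only to illustrate "kept high" and "kept low".
nad_ratio = 1000.0   # NAD+/NADH
nadp_ratio = 0.01    # NADP+/NADPH

e_nad = redox_potential(nad_ratio)
e_nadp = redox_potential(nadp_ratio)

print(f"NAD+/NADH couple:   E = {e_nad:+.3f} V")
print(f"NADP+/NADPH couple: E = {e_nadp:+.3f} V")
```

Under these assumptions the NADP+/NADPH couple ends up with the more negative potential, which corresponds to NADPH being the stronger electron donor, while the NAD+/NADH couple ends up with the less negative potential, which corresponds to NAD+ being the better electron acceptor.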


There Are Many Other Activated Carrier Molecules in Cells 


Other activated carriers also pick up and carry a chemical group in an easily
transferred, high-energy linkage. For example, coenzyme A carries a readily
transferable acetyl group in a thioester linkage, and in this activated form is
known as acetyl CoA (acetyl coenzyme A). Acetyl CoA (Figure 2-38) is used to add
two-carbon units in the biosynthesis of larger molecules.

Figure 2-37 NADPH as a reducing agent. (A) The final stage in the biosynthetic route leading to
cholesterol. As in many other biosynthetic reactions, the reduction of the C=C bond is achieved by
the transfer of a hydride ion from the carrier molecule NADPH, plus a proton (H+) from the solution.
(B) Keeping NADPH levels high and NADH levels low alters their affinities for electrons (see
Panel 14-1, p. 765). This causes NADPH to be a much stronger electron donor (reducing
agent) than NADH, and NAD+ therefore to be a much better electron acceptor (oxidizing
agent) than NADP+, as indicated.

In acetyl CoA, as in other carrier molecules, the transferable group makes up 
only a small part of the molecule. The rest consists of a large organic portion that 
serves as a convenient “handle,” facilitating the recognition of the carrier mole- 
cule by specific enzymes. As with acetyl CoA, this handle portion very often con- 
tains a nucleotide (usually adenosine), a curious fact that may be a relic from an 
early stage of evolution. It is currently thought that the main catalysts for early 
life-forms—before DNA or proteins—were RNA molecules (or their close rela- 
tives), as described in Chapter 6. It is tempting to speculate that many of the car- 
rier molecules that we find today originated in this earlier RNA world, where their 
nucleotide portions could have been useful for binding them to RNA enzymes 
(ribozymes). 

Thus, ATP transfers phosphate, NADPH transfers electrons and hydrogen, and
acetyl CoA transfers two-carbon acetyl groups. FADH2 (reduced flavin adenine
dinucleotide) is used like NADH in electron and proton transfers (Figure 2-39).
The reactions of other activated carrier molecules involve the transfer of a methyl,
carboxyl, or glucose group for biosyntheses (Table 2-3). These activated carriers
are generated in reactions that are coupled to ATP hydrolysis, as in the example
in Figure 2-40. Therefore, the energy that enables their groups to be used for
biosynthesis ultimately comes from the catabolic reactions that generate ATP.
Similar processes occur in the synthesis of the very large molecules of the cell
(the nucleic acids, proteins, and polysaccharides) that we discuss next.

TABLE 2-3

NADH, NADPH, FADH2        Electrons and hydrogens

Figure 2-38 The structure of the important activated carrier molecule
acetyl CoA. A ball-and-stick model is shown above the structure. The sulfur
atom (yellow) forms a thioester bond to acetate. Because this is a high-energy
linkage, releasing a large amount of free energy when it is hydrolyzed, the acetate
molecule can be readily transferred to other molecules.

Figure 2-39 FADH2 is a carrier of hydrogens and high-energy electrons,
like NADH and NADPH. (A) Structure of FADH2, with its hydrogen-carrying atoms
highlighted in yellow. (B) The formation of FADH2 from FAD.


The Synthesis of Biological Polymers Is Driven by ATP Hydrolysis 


As discussed previously, the macromolecules of the cell constitute most of its dry 
mass (see Figure 2-7). These molecules are made from subunits (or monomers) 
that are linked together in a condensation reaction, in which the constituents of a 
water molecule (OH plus H) are removed from the two reactants. Consequently, 
the reverse reaction—the breakdown of all three types of polymers—occurs by the 
enzyme-catalyzed addition of water (hydrolysis). This hydrolysis reaction is ener- 
getically favorable, whereas the biosynthetic reactions require an energy input 
(see Figure 2-9). 

The nucleic acids (DNA and RNA), proteins, and polysaccharides are all poly- 
mers that are produced by the repeated addition of a monomer onto one end of 
a growing chain. The synthesis reactions for these three types of macromolecules 
are outlined in Figure 2-41. As indicated, the condensation step in each case 
depends on energy from nucleoside triphosphate hydrolysis. And yet, except for 
the nucleic acids, there are no phosphate groups left in the final product mole- 
cules. How are the reactions that release the energy of ATP hydrolysis coupled to 
polymer synthesis? 

For each type of macromolecule, an enzyme-catalyzed pathway exists which
resembles that discussed previously for the synthesis of the amino acid glutamine
(see Figure 2-35). The principle is exactly the same, in that the -OH group that will
be removed in the condensation reaction is first activated by becoming involved
in a high-energy linkage to a second molecule. However, the actual mechanisms
used to link ATP hydrolysis to the synthesis of proteins and polysaccharides are
more complex than that used for glutamine synthesis, since a series of
high-energy intermediates is required to generate the final high-energy bond that
is broken during the condensation step (discussed in Chapter 6 for protein
synthesis).

Figure 2-40 A carboxyl group-transfer reaction using an activated carrier molecule. Carboxylated biotin is used by the enzyme pyruvate
carboxylase to transfer a carboxyl group in the production of oxaloacetate, a molecule needed for the citric acid cycle. The acceptor molecule for
this group-transfer reaction is pyruvate. Other enzymes use biotin, a B-complex vitamin, to transfer carboxyl groups to other acceptor molecules.
Note that synthesis of carboxylated biotin requires energy that is derived from ATP, a general feature of many activated carriers.

Figure 2-41 The synthesis of polysaccharides, proteins, and nucleic acids. Synthesis
of each kind of biological polymer involves the loss of water in a condensation reaction.
Not shown is the consumption of high-energy nucleoside triphosphates that is required to
activate each monomer before its addition. In contrast, the reverse reaction, the breakdown
of all three types of polymers, occurs by the simple addition of water (hydrolysis).
Panel titles: (A) POLYSACCHARIDES, (B) NUCLEIC ACIDS.

Figure 2-42 An alternative pathway of ATP hydrolysis, in which pyrophosphate
is first formed and then hydrolyzed. This route releases about twice as much free
energy (approximately -100 kJ/mole) as the reaction shown earlier in Figure 2-33,
and it forms AMP instead of ADP. (A) In the two successive hydrolysis reactions,
oxygen atoms from the participating water molecules are retained in the products, as
indicated, whereas the hydrogen atoms dissociate to form free hydrogen ions
(H+, not shown). (B) Summary of overall reaction.

Each activated carrier has limits in its ability to drive a biosynthetic reaction.
The ΔG for the hydrolysis of ATP to ADP and inorganic phosphate (Pi) depends
on the concentrations of all of the reactants, but under the usual conditions in a
cell it is between -46 and -54 kJ/mole. In principle, this hydrolysis reaction could
drive an unfavorable reaction with a ΔG of, perhaps, +40 kJ/mole, provided that a
suitable reaction path is available. For some biosynthetic reactions, however, even
-50 kJ/mole does not provide enough of a driving force. In these cases, the path
of ATP hydrolysis can be altered so that it initially produces AMP and
pyrophosphate (PPi), which is itself then hydrolyzed in a subsequent step
(Figure 2-42). The whole process makes available a total free-energy change of
about -100 kJ/mole. An important type of biosynthetic reaction that is driven in
this way is the synthesis of nucleic acids (polynucleotides) from nucleoside
triphosphates, as illustrated on the right side of Figure 2-43.
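The energy bookkeeping in this argument can be sketched in a few lines of Python. Because the free-energy changes of coupled reactions add, a biosynthetic step can be driven only if the sum is negative. The sketch uses the figures quoted above; the +60 kJ/mole step is a hypothetical example added for illustration, and the calculation addresses thermodynamics only, not whether a suitable reaction path exists.

```python
# Free-energy changes in kJ/mole, using the figures quoted in the text.
ATP_TO_ADP_RANGE = (-54.0, -46.0)  # ATP -> ADP + Pi under usual cell conditions
ATP_TO_AMP_TOTAL = -100.0          # ATP -> AMP + PPi, then PPi -> 2 Pi (approximate)


def net_delta_g(unfavorable_dg, driving_dg):
    """Free-energy changes of coupled reactions add."""
    return unfavorable_dg + driving_dg


def can_drive(unfavorable_dg, driving_dg):
    """The coupled process is favorable only if the summed delta-G is negative."""
    return net_delta_g(unfavorable_dg, driving_dg) < 0


# Take the least favorable end of the quoted range as a conservative case.
least_favorable_adp_route = max(ATP_TO_ADP_RANGE)  # -46 kJ/mole

# +40 is the example from the text; +60 is a hypothetical, more demanding step.
for step_dg in (40.0, 60.0):
    via_adp = can_drive(step_dg, least_favorable_adp_route)
    via_amp = can_drive(step_dg, ATP_TO_AMP_TOTAL)
    print(f"step of +{step_dg:.0f} kJ/mole: "
          f"ADP route favorable={via_adp}, AMP + PPi route favorable={via_amp}")
```

With these numbers, the +40 kJ/mole step can be driven by either route, whereas the hypothetical +60 kJ/mole step requires the pyrophosphate route.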

Note that the repetitive condensation reactions that produce macromolecules 
can be oriented in one of two ways, giving rise to either the head polymerization 
or the tail polymerization of monomers. In so-called head polymerization, the 
reactive bond required for the condensation reaction is carried on the end of the
growing polymer, and it must therefore be regenerated each time that a monomer
is added. In this case, each monomer brings with it the reactive bond that will be
used in adding the next monomer in the series. In tail polymerization, the reactive
bond carried by each monomer is instead used immediately for its own addition
(Figure 2-44).

Figure 2-43 Synthesis of a polynucleotide, RNA or DNA, is a multistep process driven by ATP
hydrolysis. In the first step, a nucleoside monophosphate is activated by the sequential transfer of
the terminal phosphate groups from two ATP molecules. The high-energy intermediate formed (a
nucleoside triphosphate) exists free in solution until it reacts with the growing end of an RNA or a
DNA chain with release of pyrophosphate. Hydrolysis of the latter to inorganic phosphate is highly
favorable and helps to drive the overall reaction in the direction of polynucleotide synthesis. For
details, see Chapter 5.

Figure 2-44 The orientation of the active intermediates in the repetitive
condensation reactions that form biological polymers. The head growth of
polymers is compared with its alternative, tail growth. As indicated, these two
mechanisms are used to produce different types of biological macromolecules.

We shall see in later chapters that both of these types of polymerization are
used. The synthesis of polynucleotides and some simple polysaccharides occurs
by tail polymerization, for example, whereas the synthesis of proteins occurs by a
head polymerization process.
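The difference between the two orientations can be captured in a toy model that tracks only where the reactive bond sits after each addition. This is a bookkeeping sketch, not a description of any real enzyme mechanism; the names `Chain`, `add_head`, and `add_tail` are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Chain:
    """A growing polymer; end_activated records whether its end carries a reactive bond."""
    units: list = field(default_factory=list)
    end_activated: bool = False


def add_head(chain, monomer):
    """Head polymerization: the reactive bond on the chain end is consumed,
    and the incoming monomer brings the bond needed for the next addition."""
    if chain.units and not chain.end_activated:
        raise ValueError("chain end carries no reactive bond; cannot extend")
    chain.units.append(monomer)
    chain.end_activated = True   # regenerated by the monomer just added
    return chain


def add_tail(chain, monomer):
    """Tail polymerization: the incoming monomer's reactive bond is used
    immediately for its own addition, so the chain end is left unactivated."""
    chain.units.append(monomer)
    chain.end_activated = False
    return chain


protein = Chain()
for amino_acid in ["Met", "Ala", "Gly"]:
    add_head(protein, amino_acid)

rna = Chain()
for nucleotide in ["A", "U", "G"]:
    add_tail(rna, nucleotide)

print(protein.units, "end activated:", protein.end_activated)
print(rna.units, "end activated:", rna.end_activated)
```

In this model, a chain grown by head polymerization always ends with an activated group, whereas a chain grown by tail polymerization never does, because each activated monomer spends its reactive bond on its own attachment.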


Summary 


Living cells need to create and maintain order within themselves to survive and 
grow. This is thermodynamically possible only because of a continual input of 
energy, part of which must be released from the cells to their environment as heat 
that disorders the surroundings. The only chemical reactions possible are those that 
increase the total amount of disorder in the universe. The free-energy change for a
reaction, ΔG, measures this disorder, and it must be less than zero for a reaction
to proceed spontaneously. This ΔG depends both on the intrinsic properties of the
reacting molecules and their concentrations, and it can be calculated from these
concentrations if either the equilibrium constant (K) for the reaction or its standard
free-energy change, ΔG°, is known.

The energy needed for life comes ultimately from the electromagnetic radiation 
of the sun, which drives the formation of organic molecules in photosynthetic organ- 
isms such as green plants. Animals obtain their energy by eating organic molecules 
and oxidizing them in a series of enzyme-catalyzed reactions that are coupled to the 
formation of ATP—a common currency of energy in all cells. 

To make possible the continual generation of order in cells, energetically favor- 
able reactions, such as the hydrolysis of ATP, are coupled to energetically unfavor- 
able reactions. In the biosynthesis of macromolecules, ATP is used to form reactive 
phosphorylated intermediates. Because the energetically unfavorable reaction of 
biosynthesis now becomes energetically favorable, ATP hydrolysis is said to drive 
the reaction. Polymeric molecules such as proteins, nucleic acids, and polysaccha- 
rides are assembled from small activated precursor molecules by repetitive conden- 
sation reactions that are driven in this way. Other reactive molecules, called either 
activated carriers or coenzymes, transfer other chemical groups in the course of 
biosynthesis: NADPH transfers hydrogen as a proton plus two electrons (a hydride 
ion), for example, whereas acetyl CoA transfers an acetyl group. 


HOW CELLS OBTAIN ENERGY FROM FOOD 


The constant supply of energy that cells need to generate and maintain the bio- 
logical order that keeps them alive comes from the chemical-bond energy in food 
molecules. 

The proteins, lipids, and polysaccharides that make up most of the food we eat 
must be broken down into smaller molecules before our cells can use them, either
as a source of energy or as building blocks for other molecules. Enzymatic
digestion breaks down the large polymeric molecules in food into their monomer
subunits: proteins into amino acids, polysaccharides into sugars, and fats into fatty
acids and glycerol. After digestion, the small organic molecules derived from food
enter the cytosol of cells, where their gradual oxidation begins.

Sugars are particularly important fuel molecules, and they are oxidized in 
small controlled steps to carbon dioxide (CO2) and water (Figure 2-45). In this 
section, we trace the major steps in the breakdown, or catabolism, of sugars and 
show how they produce ATP, NADH, and other activated carrier molecules in ani- 
mal cells. A very similar pathway also operates in plants, fungi, and many bacteria. 
As we shall see, the oxidation of fatty acids is equally important for cells. Other 
molecules, such as proteins, can also serve as energy sources when they are fun- 
neled through appropriate enzymatic pathways. 


Glycolysis Is a Central ATP-Producing Pathway 


The major process for oxidizing sugars is the sequence of reactions known as
glycolysis, from the Greek glukus, “sweet,” and lusis, “rupture.” Glycolysis
produces ATP without the involvement of molecular oxygen (O2 gas). It occurs in the
cytosol of most cells, including many anaerobic microorganisms. Glycolysis
probably evolved early in the history of life, before photosynthetic organisms
introduced oxygen into the atmosphere. During glycolysis, a glucose molecule with six
carbon atoms is converted into two molecules of pyruvate, each of which contains
three carbon atoms. For each glucose molecule, two molecules of ATP are
hydrolyzed to provide energy to drive the early steps, but four molecules of ATP are
produced in the later steps. At the end of glycolysis, there is consequently a net gain
of two molecules of ATP for each glucose molecule broken down. Two molecules
of the activated carrier NADH are also produced.
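The accounting in this paragraph can be written out as a short tally. The sketch below uses only the per-glucose numbers stated above; the function name `glycolysis_yield` is a hypothetical label introduced for illustration.

```python
CARBONS_IN_GLUCOSE = 6
CARBONS_IN_PYRUVATE = 3
ATP_INVESTED_PER_GLUCOSE = 2   # hydrolyzed to drive the early steps
ATP_PRODUCED_PER_GLUCOSE = 4   # generated in the later steps
NADH_PER_GLUCOSE = 2


def glycolysis_yield(n_glucose):
    """Return the products of glycolysis for n_glucose molecules of glucose."""
    pyruvate = n_glucose * CARBONS_IN_GLUCOSE // CARBONS_IN_PYRUVATE
    net_atp = n_glucose * (ATP_PRODUCED_PER_GLUCOSE - ATP_INVESTED_PER_GLUCOSE)
    nadh = n_glucose * NADH_PER_GLUCOSE
    return {"pyruvate": pyruvate, "net_atp": net_atp, "nadh": nadh}


per_glucose = glycolysis_yield(1)
print(per_glucose)

# All six carbons of glucose are accounted for in the two pyruvates.
assert per_glucose["pyruvate"] * CARBONS_IN_PYRUVATE == CARBONS_IN_GLUCOSE
```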

The glycolytic pathway is outlined in Figure 2-46 and shown in more detail in 
Panel 2-8 (pp. 104-105) and Movie 2.5. Glycolysis involves a sequence of 10 sep- 
arate reactions, each producing a different sugar intermediate and each catalyzed 
by a different enzyme. Like most enzymes, these have names ending in ase—such 
as isomerase and dehydrogenase—to indicate the type of reaction they catalyze. 

Although no molecular oxygen is used in glycolysis, oxidation occurs, in that
electrons are removed by NAD+ (producing NADH) from some of the carbons
derived from the glucose molecule. The stepwise nature of the process releases
the energy of oxidation in small packets, so that much of it can be stored in
activated carrier molecules rather than all of it being released as heat (see Figure
2-45). Thus, some of the energy released by oxidation drives the direct synthesis
of ATP molecules from ADP and Pi, and some remains with the electrons in the
electron carrier NADH.

Two molecules of NADH are formed per molecule of glucose in the course of
glycolysis. In aerobic organisms, these NADH molecules donate their electrons to
the electron-transport chain described in Chapter 14, and the NAD+ formed from
the NADH is used again for glycolysis (see step 6 in Panel 2-8, pp. 104-105).

Figure 2-45 Schematic representation of the controlled stepwise oxidation of
sugar in a cell, compared with ordinary burning. (A) If the sugar were oxidized to
CO2 and H2O in a single step, it would release an amount of energy much larger
than could be captured for useful purposes. (B) In the cell, enzymes catalyze
oxidation via a series of small steps in which free energy is transferred in
conveniently sized packets to carrier molecules, most often ATP and NADH. At each
step, an enzyme controls the reaction by reducing the activation-energy barrier
that has to be surmounted before the specific reaction can occur. The total free
energy released is exactly the same in (A) and (B).


Fermentations Produce ATP in the Absence of Oxygen 


For most animal and plant cells, glycolysis is only a prelude to the final stage of 
the breakdown of food molecules. In these cells, the pyruvate formed by glycolysis 
is rapidly transported into the mitochondria, where it is converted into CO2 plus
acetyl CoA, whose acetyl group is then completely oxidized to CO2 and H2O.

In contrast, for many anaerobic organisms—which do not utilize molecular 
oxygen and can grow and divide without it—glycolysis is the principal source of 
the cell’s ATP. Certain animal tissues, such as skeletal muscle, can also continue 
to function when molecular oxygen is limited. In these anaerobic conditions, the 
pyruvate and the NADH electrons stay in the cytosol. The pyruvate is converted 
into products excreted from the cell: for example, into ethanol and CO2 in the
yeasts used in brewing and breadmaking, or into lactate in muscle. In this process,
the NADH gives up its electrons and is converted back into NAD+. This
regeneration of NAD+ is required to maintain the reactions of glycolysis (Figure 2-47).

Energy-yielding pathways like these, in which organic molecules both donate
and accept electrons (and which are often, as in these cases, anaerobic), are called
fermentations. Studies of the commercially important fermentations carried out
by yeasts inspired much of early biochemistry. Work in the nineteenth century
led in 1896 to the then startling recognition that these processes could be studied
outside living organisms, in cell extracts. This revolutionary discovery eventually
made it possible to dissect out and study each of the individual reactions in the
fermentation process. The piecing together of the complete glycolytic pathway in
the 1930s was a major triumph of biochemistry, and it was quickly followed by the
recognition of the central role of ATP in cell processes.

Figure 2-46 An outline of glycolysis. Each of the 10 steps shown is catalyzed
by a different enzyme. Note that step 4 cleaves a six-carbon sugar into two
three-carbon sugars, so that the number of molecules at every stage after this
doubles. As indicated, step 6 begins the energy-generation phase of glycolysis.
Because two molecules of ATP are hydrolyzed in the early, energy-investment phase,
glycolysis results in the net synthesis of 2 ATP and 2 NADH molecules per molecule
of glucose (see also Panel 2-8).


Glycolysis Illustrates How Enzymes Couple Oxidation to Energy Storage


The formation of ATP during glycolysis provides a particularly clear
demonstration of how enzymes couple energetically unfavorable reactions with favorable
ones, thereby driving the many chemical reactions that make life possible. Two
central reactions in glycolysis (steps 6 and 7) convert the three-carbon sugar
intermediate glyceraldehyde 3-phosphate (an aldehyde) into 3-phosphoglycerate (a
carboxylic acid; see Panel 2-8, pp. 104-105), thus oxidizing an aldehyde group to a
carboxylic acid group. The overall reaction releases enough free energy to convert
a molecule of ADP to ATP and to transfer two electrons (and a proton) from the
aldehyde to NAD+ to form NADH, while still liberating enough heat to the
environment to make the overall reaction energetically favorable (ΔG° for the overall
reaction is -12.5 kJ/mole).

Figure 2-48 outlines this remarkable feat of energy harvesting. The chemical
reactions are precisely guided by two enzymes to which the sugar intermediates
are tightly bound. As detailed in Figure 2-48, the first enzyme (glyceraldehyde
3-phosphate dehydrogenase) forms a short-lived covalent bond to the aldehyde
through a reactive -SH group on the enzyme, and catalyzes its oxidation by NAD+
in this attached state. The reactive enzyme-substrate bond is then displaced by
an inorganic phosphate ion to produce a high-energy phosphate intermediate,
which is released from the enzyme. This intermediate binds to the second enzyme
(phosphoglycerate kinase), which catalyzes the energetically favorable transfer of
the high-energy phosphate just created to ADP, forming ATP and completing the
process of oxidizing an aldehyde to a carboxylic acid. Note that the C-H bond
oxidation energy in step 6 drives the formation of both NADH and a high-energy
phosphate bond. The breakage of the high-energy bond then drives ATP
formation.

Figure 2-47 Two pathways for the anaerobic breakdown of pyruvate.
(A) When there is inadequate oxygen, for example, in a muscle cell undergoing
vigorous contraction, the pyruvate produced by glycolysis is converted to
lactate as shown. This reaction regenerates the NAD+ consumed in step 6 of
glycolysis, but the whole pathway yields much less energy overall than complete
oxidation. (B) In some organisms that can grow anaerobically, such as yeasts,
pyruvate is converted via acetaldehyde into carbon dioxide and ethanol. Again,
this pathway regenerates NAD+ from NADH, as required to enable glycolysis to
continue. Both (A) and (B) are examples of fermentations.

A short-lived covalent bond is formed between glyceraldehyde 3-phosphate and
the -SH group of a cysteine side chain of the enzyme glyceraldehyde 3-phosphate
dehydrogenase. The enzyme also binds noncovalently to NAD+.

Glyceraldehyde 3-phosphate is oxidized as the enzyme removes a hydrogen atom
(yellow) and transfers it, along with an electron, to NAD+, forming NADH (see
Figure 2-37). Part of the energy released by the oxidation of the aldehyde is thus
stored in NADH, and part is stored in the high-energy thioester bond that links
glyceraldehyde 3-phosphate to the enzyme.

A molecule of inorganic phosphate displaces the high-energy thioester bond to
create 1,3-bisphosphoglycerate, which contains a high-energy phosphate bond.

Figure 2-48 Energy storage in steps 6 and 7 of glycolysis. (A) In step 6, the
enzyme glyceraldehyde 3-phosphate dehydrogenase couples the energetically
favorable oxidation of an aldehyde to the energetically unfavorable formation
of a high-energy phosphate bond. At the same time, it enables energy to be
stored in NADH. The formation of the high-energy phosphate bond is driven by
the oxidation reaction, and the enzyme thereby acts like the “paddle wheel”
coupler in Figure 2-32B. In step 7, the newly formed high-energy phosphate bond
in 1,3-bisphosphoglycerate is transferred to ADP, forming a molecule of ATP and
leaving a free carboxylic acid group on the oxidized sugar. The part of the molecule
that undergoes a change is shaded in blue; the rest of the molecule remains
unchanged throughout all these reactions. (B) Summary of the overall chemical
change produced by reactions 6 and 7.

We have shown this particular oxidation process in some detail because it pro- 
vides a clear example of enzyme-mediated energy storage through coupled reac- 
tions (Figure 2-49). Steps 6 and 7 are the only reactions in glycolysis that create a 
high-energy phosphate linkage directly from inorganic phosphate. As such, they 
account for the net yield of two ATP molecules and two NADH molecules per mol- 
ecule of glucose (see Panel 2-8, pp. 104-105). 

As we have just seen, ATP can be formed readily from ADP when a reaction 
intermediate is formed with a phosphate bond of higher energy than the terminal 
phosphate bond in ATP. Phosphate bonds can be ordered in energy by comparing 
the standard free-energy change (ΔG°) for the breakage of each bond by
hydrolysis. Figure 2-50 compares the high-energy phosphoanhydride bonds in ATP with
the energy of some other phosphate bonds, several of which are generated during 
glycolysis. 
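This ordering can be turned into a simple rule of thumb: under standard conditions, the free-energy change for transferring a phosphate group equals the ΔG° of hydrolysis of the donor minus that of the phosphorylated product. The sketch below illustrates the rule. The numerical values are rounded figures of the kind commonly quoted in biochemistry texts and are assumptions made for illustration, not values read from this section; actual cellular ΔG values also depend on concentrations.

```python
# Approximate standard free energies of hydrolysis, kJ/mole (assumed, rounded values).
HYDROLYSIS_DG0 = {
    "phosphoenolpyruvate": -62.0,
    "1,3-bisphosphoglycerate": -49.0,
    "ATP": -30.5,                  # ATP -> ADP + Pi
    "glucose 6-phosphate": -14.0,
}


def transfer_dg0(donor, phosphorylated_product):
    """Standard free-energy change for moving a phosphate from donor to an
    acceptor, where phosphorylated_product is what the acceptor becomes."""
    return HYDROLYSIS_DG0[donor] - HYDROLYSIS_DG0[phosphorylated_product]


# Step 7 of glycolysis: 1,3-bisphosphoglycerate + ADP -> 3-phosphoglycerate + ATP
step7 = transfer_dg0("1,3-bisphosphoglycerate", "ATP")

# The reverse direction: ATP donating a phosphate to re-form 1,3-bisphosphoglycerate
reverse = transfer_dg0("ATP", "1,3-bisphosphoglycerate")

print(f"1,3-BPG -> ATP: {step7:+.1f} kJ/mole (favorable: {step7 < 0})")
print(f"ATP -> 1,3-BPG: {reverse:+.1f} kJ/mole (favorable: {reverse < 0})")

# Bonds ordered from highest to lowest phosphate-transfer potential.
ranking = sorted(HYDROLYSIS_DG0, key=HYDROLYSIS_DG0.get)
print(ranking)
```

Any compound earlier in the ranking can, in principle, donate its phosphate to form a compound later in the ranking under standard conditions, which is why ATP sits usefully in the middle of the scale.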


Organisms Store Food Molecules in Special Reservoirs 


All organisms need to maintain a high ATP/ADP ratio to maintain biological order 
in their cells. Yet animals have only periodic access to food, and plants need to 
survive overnight without sunlight, when they are unable to produce sugar from 
photosynthesis. For this reason, both plants and animals convert sugars and fats 
to special forms for storage (Figure 2-51). 

To compensate for long periods of fasting, animals store fatty acids as fat 
droplets composed of water-insoluble triacylglycerols (also called triglycerides). 
The triacylglycerols in animals are mostly stored in the cytoplasm of specialized 
fat cells called adipocytes. For shorter-term storage, sugar is stored as glucose
subunits in the large branched polysaccharide glycogen, which is present as
small granules in the cytoplasm of many cells, including liver and muscle. The
synthesis and degradation of glycogen are rapidly regulated according to need.
When cells need more ATP than they can generate from the food molecules taken
in from the bloodstream, they break down glycogen in a reaction that produces
glucose 1-phosphate, which is rapidly converted to glucose 6-phosphate for
glycolysis (Figure 2-52).

TOTAL ENERGY CHANGE for step 6 followed by step 7 is a favorable -12.5 kJ/mole

Figure 2-49 Schematic view of the coupled reactions that form NADH and
ATP in steps 6 and 7 of glycolysis. The C-H bond oxidation energy drives the
formation of both NADH and a high-energy phosphate bond. The breakage of the
high-energy bond then drives ATP formation.

Figure 2-50 Phosphate bonds have different energies. Examples of different types of phosphate bonds with
their sites of hydrolysis are shown in the molecules depicted on the left. Those starting with a gray carbon atom
show only part of a molecule. Examples of molecules containing such bonds are given on the right, with the
standard free-energy change for hydrolysis in kilojoules. The transfer of a phosphate group from one molecule
to another is energetically favorable if the free-energy change (ΔG) for hydrolysis of the phosphate bond of
the first molecule is more negative than that for hydrolysis of the phosphate bond in the second. Thus, under
standard conditions, a phosphate group is readily transferred from 1,3-bisphosphoglycerate to ADP to form
ATP. (Standard conditions often do not pertain to living cells, where the relative concentrations of reactants and
products will influence the actual change in free energy.) The hydrolysis reaction can be viewed as the transfer
of the phosphate group to water.

Quantitatively, fat is far more important than glycogen as an energy store for 
animals, presumably because it provides for more efficient storage. The oxidation 
of a gram of fat releases about twice as much energy as the oxidation of a gram 
of glycogen. Moreover, glycogen differs from fat in binding a great deal of water, 
producing a sixfold difference in the actual mass of glycogen required to store the 
same amount of energy as fat. An average adult human stores enough glycogen 
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Figure 2-51 The storage of sugars and fats in animal and plant cells. (A) The structures of starch and glycogen, the 

storage form of sugars in plants and animals, respectively. Both are storage polymers of the sugar glucose and differ only in the 
frequency of branch points. There are many more branches in glycogen than in starch. (B) An electron micrograph of glycogen 
granules in the cytoplasm of a liver cell. (C) A thin section of a chloroplast from a plant cell, showing the starch granules and lipid 
(fat) droplets that have accumulated as a result of the biosyntheses occurring there. (D) Fat droplets (stained red) beginning to 
accumulate in developing fat cells of an animal. (B, courtesy of Robert Fletterick and Daniel S. Friend; C, courtesy of K. Plaskitt; 
D, courtesy of Ronald M. Evans and Peter Totonoz.) 


for only about a day of normal activities, but enough fat to last for nearly a month. 
If our main fuel reservoir had to be carried as glycogen instead of fat, body weight 
would increase by an average of about 60 pounds. 
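
The arithmetic behind these estimates can be sketched in a few lines of Python. The sixfold mass ratio and the 60-pound figure come from the text; the function names are illustrative, and the size of the fat reserve that the calculation implies is an inference from those two numbers rather than a value stated here.

```python
# Mass ratio from the text: storing a given amount of energy as hydrated
# glycogen takes about six times the mass needed to store it as fat.
GLYCOGEN_TO_FAT_MASS_RATIO = 6.0


def extra_mass_if_glycogen(fat_mass: float) -> float:
    """Extra mass carried if a fat reserve of `fat_mass` were replaced by
    glycogen holding the same energy (result is in the units of the input)."""
    return fat_mass * (GLYCOGEN_TO_FAT_MASS_RATIO - 1.0)


def fat_mass_for_penalty(extra_mass: float) -> float:
    """Invert the relation: the fat reserve implied by a given weight penalty."""
    return extra_mass / (GLYCOGEN_TO_FAT_MASS_RATIO - 1.0)


if __name__ == "__main__":
    implied_fat_lb = fat_mass_for_penalty(60.0)  # 60 pounds, from the text
    glycogen_lb = implied_fat_lb * GLYCOGEN_TO_FAT_MASS_RATIO
    print(f"Implied fat reserve: {implied_fat_lb:.0f} lb")
    print(f"Same energy stored as glycogen: {glycogen_lb:.0f} lb")
```

Taken at face value, a 60-pound penalty at a sixfold mass ratio corresponds to a fat reserve of roughly 12 pounds, a modest figure for an average adult.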

The sugar and ATP needed by plant cells are largely produced in separate 
organelles: sugars in chloroplasts (the organelles specialized for photosynthesis), 
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and ATP in mitochondria. Although plants produce abundant amounts of both 
ATP and NADPH in their chloroplasts, this organelle is isolated from the rest of its 
plant cell by a membrane that is impermeable to both types of activated carrier 
molecules. Moreover, the plant contains many cells—such as those in the roots— 
that lack chloroplasts and therefore cannot produce their own sugars. Thus, sug- 
ars are exported from chloroplasts to the mitochondria present in all cells of the 
plant. Most of the ATP needed for general plant cell metabolism is synthesized in 
these mitochondria, using exactly the same pathways for the oxidative breakdown 
of sugars as in nonphotosynthetic organisms; this ATP is then passed to the rest of 
the cell (see Figure 14-42). 

During periods of excess photosynthetic capacity during the day, chloroplasts 
convert some of the sugars that they make into fats and into starch, a polymer of 
glucose analogous to the glycogen of animals. The fats in plants are triacylglycerols 
(triglycerides), just like the fats in animals, and differ only in the types of 
fatty acids that predominate. Fat and starch are both stored inside the chloroplast 
until needed for energy-yielding oxidation during periods of darkness (see Figure 
2-51C). 

The embryos inside plant seeds must live on stored sources of energy for a 
prolonged period, until they germinate and produce leaves that can harvest the 
energy in sunlight. For this reason plant seeds often contain especially large 
amounts of fats and starch—which makes them a major food source for animals, 
including ourselves (Figure 2-53). 


Most Animal Cells Derive Their Energy from Fatty Acids Between 
Meals 


After a meal, most of the energy that an animal needs is derived from sugars 
obtained from food. Excess sugars, if any, are used to replenish depleted glycogen 
stores, or to synthesize fats as a food store. But soon the fat stored in adipose tissue 
is called into play, and by the morning after an overnight fast, fatty acid oxidation 
generates most of the ATP we need. 

Low glucose levels in the blood trigger the breakdown of fats for energy pro- 
duction. As illustrated in Figure 2-54, the triacylglycerols stored in fat droplets 
in adipocytes are hydrolyzed to produce fatty acids and glycerol, and the fatty 
acids released are transferred to cells in the body through the bloodstream. While 
animals readily convert sugars to fats, they cannot convert fatty acids to sugars. 
Instead, the fatty acids are oxidized directly. 


Sugars and Fats Are Both Degraded to Acetyl CoA in 
Mitochondria 


In aerobic metabolism, the pyruvate that was produced by glycolysis from sugars 
in the cytosol is transported into the mitochondria of eukaryotic cells. There, it is 
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Figure 2-53 Some plant seeds that 
serve as important foods for humans. 
Corn, nuts, and peas all contain rich stores 
of starch and fat that provide the young 
plant embryo in the seed with energy and 
building blocks for biosynthesis. (Courtesy 
of the John Innes Foundation.) 
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rapidly decarboxylated by a giant complex of three enzymes, called the pyruvate 
dehydrogenase complex. The products of pyruvate decarboxylation are a molecule 
of CO2 (a waste product), a molecule of NADH, and acetyl CoA (see Panel 2-9). 

The fatty acids imported from the bloodstream are moved into mitochondria, 
where all of their oxidation takes place (Figure 2-55). Each molecule of fatty acid 
(as the activated molecule fatty acyl CoA) is broken down completely by a cycle of 
reactions that trims two carbons at a time from its carboxyl end, generating one 
molecule of acetyl CoA for each turn of the cycle. A molecule of NADH and a molecule 
of FADH2 are also produced in this process (Figure 2-56). 
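
The bookkeeping of this cycle can be sketched as a short Python function. It assumes a saturated fatty acid with an even number of carbons; the function name and the 16-carbon example (palmitate) are illustrative choices, not taken from the text. Note that the final turn cleaves a four-carbon chain into two molecules of acetyl CoA, so the number of acetyl CoA molecules exceeds the number of turns by one.

```python
def beta_oxidation_yield(n_carbons: int) -> dict:
    """Tally the products of fully oxidizing a saturated, even-chain fatty acyl CoA.

    Each turn of the cycle removes two carbons as acetyl CoA and yields one
    NADH and one FADH2. The last turn leaves a two-carbon fragment that is
    itself acetyl CoA.
    """
    if n_carbons < 4 or n_carbons % 2 != 0:
        raise ValueError("this sketch covers even-numbered chains of 4 or more carbons")

    turns = 0
    acetyl_coa = 0
    remaining = n_carbons
    while remaining > 2:
        remaining -= 2    # trim two carbons from the carboxyl end
        acetyl_coa += 1   # released as one acetyl CoA
        turns += 1
    acetyl_coa += 1       # the final two-carbon fragment is acetyl CoA

    return {"turns": turns, "acetyl_CoA": acetyl_coa, "NADH": turns, "FADH2": turns}


if __name__ == "__main__":
    # Palmitate (16 carbons) is used here only as a familiar example.
    print(beta_oxidation_yield(16))
```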

Sugars and fats are the major energy sources for most nonphotosynthetic 
organisms, including humans. However, most of the useful energy that can be 
extracted from the oxidation of both types of foodstuffs remains stored in the ace- 
tyl CoA molecules that are produced by the two types of reactions just described. 
The citric acid cycle of reactions, in which the acetyl group (-COCH3) in acetyl 
CoA is oxidized to CO2 and H2O, is therefore central to the energy metabolism 
of aerobic organisms. In eukaryotes, these reactions all take place in mitochon- 
dria. We should therefore not be surprised to discover that the mitochondrion is 
the place where most of the ATP is produced in animal cells. In contrast, aerobic 
bacteria carry out all of their reactions, including the citric acid cycle, in a single 
compartment, the cytosol. 


The Citric Acid Cycle Generates NADH by Oxidizing Acetyl Groups 
to CO2 


In the nineteenth century, biologists noticed that in the absence of air cells pro- 
duce lactic acid (for example, in muscle) or ethanol (for example, in yeast), while 
in its presence they consume O2 and produce CO2 and H2O. Efforts to define the 
pathways of aerobic metabolism eventually focused on the oxidation of pyru- 
vate and led in 1937 to the discovery of the citric acid cycle, also known as the 
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Figure 2-54 How stored fats are 
mobilized for energy production in 
animals. Low glucose levels in the blood 
trigger the hydrolysis of the triacylglycerol 
molecules in fat droplets to free fatty 
acids and glycerol. These fatty acids enter 
the bloodstream, where they bind to the 
abundant blood protein, serum albumin. 
Special fatty acid transporters in the 
plasma membrane of cells that oxidize fatty 
acids, such as muscle cells, then pass 
these fatty acids into the cytosol, from 
which they are moved into mitochondria for 
energy production. 


Figure 2-55 Pathways for 
the production of acetyl CoA 
from sugars and fats. The 
mitochondrion in eukaryotic cells 
is where acetyl CoA is produced 
from both types of major food 
molecules. It is therefore the 
place where most of the cell’s 
oxidation reactions occur and 
where most of its ATP is made. 
Amino acids (not shown) can 
also enter the mitochondria, to 
be converted there into acetyl 
CoA or another intermediate of 
the citric acid cycle. The structure 
and function of mitochondria are 
discussed in detail in Chapter 14. 
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tricarboxylic acid cycle or the Krebs cycle. The citric acid cycle accounts for about 
two-thirds of the total oxidation of carbon compounds in most cells, and its major 
end products are CO2 and high-energy electrons in the form of NADH. The CO2 
is released as a waste product, while the high-energy electrons from NADH are 
passed to a membrane-bound electron-transport chain (discussed in Chapter 
14), eventually combining with O2 to produce H2O. The citric acid cycle itself does 
not use gaseous O2 (it uses oxygen atoms from H2O). But the cycle does require O2 
in subsequent reactions to keep it going. This is because there is no other efficient 
way for the NADH to get rid of its electrons and thus regenerate the NAD+ that is 
needed. 

The citric acid cycle takes place inside mitochondria in eukaryotic cells. It 
results in the complete oxidation of the carbon atoms of the acetyl groups in ace- 
tyl CoA, converting them into CO2. But the acetyl group is not oxidized directly. 
Instead, this group is transferred from acetyl CoA to a larger, four-carbon mole- 
cule, oxaloacetate, to form the six-carbon tricarboxylic acid, citric acid, for which 
the subsequent cycle of reactions is named. The citric acid molecule is then grad- 
ually oxidized, allowing the energy of this oxidation to be harnessed to produce 
energy-rich activated carrier molecules. The chain of eight reactions forms a cycle 
because at the end the oxaloacetate is regenerated and enters a new turn of the 
cycle, as shown in outline in Figure 2-57. 

We have thus far discussed only one of the three types of activated carrier 
molecules that are produced by the citric acid cycle: NADH, the reduced form of 
the NAD+/NADH electron carrier system (see Figure 2-36). In addition to three 
molecules of NADH, each turn of the cycle also produces one molecule of FADH2 
(reduced flavin adenine dinucleotide) from FAD (see Figure 2-39), and one mol- 
ecule of the ribonucleoside triphosphate GTP from GDP. The structure of GTP is 
illustrated in Figure 2-58. GTP is a close relative of ATP, and the transfer of its 
terminal phosphate group to ADP produces one ATP molecule in each cycle. As 
we discuss shortly, the energy that is stored in the readily transferred electrons of 
NADH and FADH2 will be utilized subsequently for ATP production through the 
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Figure 2-56 The oxidation of fatty acids 
to acetyl CoA. (A) Electron micrograph 
of a lipid droplet in the cytoplasm. (B) The 
structure of fats. Fats are triacylglycerols. 
The glycerol portion, to which three fatty 
acids are linked through ester bonds, 
is shown in blue. Fats are insoluble in 
water and form large lipid droplets in the 
specialized fat cells (adipocytes) in which 
they are stored. (C) The fatty acid oxidation 
cycle. The cycle is catalyzed by a series of 
four enzymes in mitochondria. Each turn of 
the cycle shortens the fatty acid chain by 
two carbons (shown in red) and generates 
one molecule of acetyl CoA and one 
molecule each of NADH and FADH2. 
(A, courtesy of Daniel S. Friend.) 
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process of oxidative phosphorylation, the only step in the oxidative catabolism of 
foodstuffs that directly requires gaseous oxygen (O2) from the atmosphere. 

Panel 2-9 (pp. 106-107) and Movie 2.6 present the complete citric acid cycle. 
Water, rather than molecular oxygen, supplies the extra oxygen atoms required 
to make CO2 from the acetyl groups entering the citric acid cycle. As illustrated in 
the panel, three molecules of water are split in each cycle, and the oxygen atoms 
of some of them are ultimately used to make CO2. 

In addition to pyruvate and fatty acids, some amino acids pass from the cytosol 
into mitochondria, where they are also converted into acetyl CoA or one of the 
other intermediates of the citric acid cycle. Thus, in the eukaryotic cell, the mito- 
chondrion is the center toward which all energy-yielding processes lead, whether 
they begin with sugars, fats, or proteins. 

Both the citric acid cycle and glycolysis also function as starting points for 
important biosynthetic reactions by producing vital carbon-containing intermediates, 
such as oxaloacetate and α-ketoglutarate. Some of these substances produced 
by catabolism are transferred back from the mitochondria to the cytosol, 
where they serve in anabolic reactions as precursors for the synthesis of many 
essential molecules, such as amino acids (Figure 2-59). 


Electron Transport Drives the Synthesis of the Majority of the ATP 
in Most Cells 


Most chemical energy is released in the last stage in the degradation of a food 
molecule. In this final process, NADH and FADH2 transfer the electrons that they 
gained when oxidizing food-derived organic molecules to the electron-transport 
chain, which is embedded in the inner membrane of the mitochondrion (see 
Figure 14-10). As the electrons pass along this long chain of specialized electron 
acceptor and donor molecules, they fall to successively lower energy states. The 
energy that the electrons release in this process pumps H+ ions (protons) across 
the membrane—from the innermost mitochondrial compartment (the matrix) 
to the intermembrane space (and then to the cytosol)—generating a gradient of 
H+ ions (Figure 2-60). This gradient serves as a major source of energy for cells, 
being tapped like a battery to drive a variety of energy-requiring reactions. The 
most prominent of these reactions is the generation of ATP by the phosphoryla- 
tion of ADP. 


Figure 2-57 Simple overview of the citric 
acid cycle. The reaction of acetyl CoA with 
oxaloacetate starts the cycle by producing 
citrate (citric acid). In each turn of the 
cycle, two molecules of CO2 are produced 
as waste products, plus three molecules 
of NADH, one molecule of GTP, and one 
molecule of FADH2. The number of carbon 
atoms in each intermediate is shown in a 
yellow box. For details, see Panel 2-9 
(pp. 106-107). 
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O Figure 2-58 The structure of GTP. GIP 
and GDP are close relatives of ATP and 
ADP, respectively. 
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At the end of this series of electron transfers, the electrons are passed to mole- 
cules of oxygen gas (O2) that have diffused into the mitochondrion, which simultaneously 
combine with protons (H+) from the surrounding solution to produce 
water. The electrons have now reached a low energy level, and all the available 
energy has been extracted from the oxidized food molecule. This process, termed 
oxidative phosphorylation (Figure 2-61), also occurs in the plasma membrane 
of bacteria. As one of the most remarkable achievements of cell evolution, it is a 
central topic of Chapter 14. 

In total, the complete oxidation of a molecule of glucose to H2O and CO2 is 
used by the cell to produce about 30 molecules of ATP. In contrast, only 2 mol- 
ecules of ATP are produced per molecule of glucose by glycolysis alone. 
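
One way to see where a figure of about 30 comes from is to tally the activated carriers made per molecule of glucose and convert them to ATP. The sketch below is illustrative only. The per-turn yields of the citric acid cycle and the one NADH per pyruvate are those given above, and the two NADH made in glycolysis come from the earlier discussion of that pathway. The conversion factors, however, are commonly quoted approximations that are assumed here and not stated in this chapter: about 2.5 ATP per mitochondrial NADH, and about 1.5 ATP per FADH2 or per cytosolic NADH.

```python
# Assumed conversion factors (approximate; not given in the text).
ATP_PER_MITO_NADH = 2.5
ATP_PER_FADH2 = 1.5
ATP_PER_CYTOSOLIC_NADH = 1.5  # assumes its electrons enter the chain at the FADH2 level


def atp_per_glucose() -> float:
    """Approximate ATP yield from the complete oxidation of one glucose."""
    # Glycolysis (cytosol): net 2 ATP and 2 NADH per glucose.
    glycolysis_atp = 2
    cytosolic_nadh = 2

    # Each glucose gives two pyruvate; each pyruvate gives one NADH
    # and one acetyl CoA.
    pyruvate_nadh = 2

    # Two acetyl CoA mean two turns of the citric acid cycle,
    # each producing 3 NADH, 1 FADH2, and 1 GTP (counted as 1 ATP).
    turns = 2
    cycle_nadh = 3 * turns
    cycle_fadh2 = 1 * turns
    cycle_gtp = 1 * turns

    substrate_level = glycolysis_atp + cycle_gtp
    oxidative = (
        cytosolic_nadh * ATP_PER_CYTOSOLIC_NADH
        + (pyruvate_nadh + cycle_nadh) * ATP_PER_MITO_NADH
        + cycle_fadh2 * ATP_PER_FADH2
    )
    return substrate_level + oxidative


if __name__ == "__main__":
    print(f"Approximate ATP per glucose: {atp_per_glucose():.0f}")
```

Under these assumptions only 4 of the ATP molecules are made directly; the rest depend on oxidative phosphorylation. Different assumed conversion factors shift the total by a few ATP, which is why the yield is quoted as approximate.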


Amino Acids and Nucleotides Are Part of the Nitrogen Cycle 


So far we have concentrated mainly on carbohydrate metabolism and have not yet 
considered the metabolism of nitrogen or sulfur. These two elements are import- 
ant constituents of biological macromolecules. Nitrogen and sulfur atoms pass 
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Figure 2—59 Glycolysis and the citric 
acid cycle provide the precursors 
needed to synthesize many important 
biological molecules. The amino acids, 
nucleotides, lipids, sugars, and other 
molecules — shown here as products—in 
turn serve as the precursors for the many 
macromolecules of the cell. Each black 
arrow in this diagram denotes a single 
enzyme-catalyzed reaction; the red arrows 
generally represent pathways with many 
steps that are required to produce the 
indicated products. 
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from compound to compound and between organisms and their environment in 
a series of reversible cycles. 

Although molecular nitrogen is abundant in the Earth’s atmosphere, nitrogen 
is chemically unreactive as a gas. Only a few living species are able to incorpo- 
rate it into organic molecules, a process called nitrogen fixation. Nitrogen fixa- 
tion occurs in certain microorganisms and by some geophysical processes, such 
as lightning discharge. It is essential to the biosphere as a whole, for without it life 
could not exist on this planet. Only a small fraction of the nitrogenous compounds 
in today’s organisms, however, is due to fresh products of nitrogen fixation from 
the atmosphere. Most organic nitrogen has been in circulation for some time, 
passing from one living organism to another. Thus, present-day nitrogen-fixing 
reactions can be said to perform a “topping-up” function for the total nitrogen 
supply. 

Vertebrates receive virtually all of their nitrogen from their dietary intake of 
proteins and nucleic acids. In the body, these macromolecules are broken down 
to amino acids and the components of nucleotides, and the nitrogen they contain 
is used to produce new proteins and nucleic acids—or other molecules. About half 
of the 20 amino acids found in proteins are essential amino acids for vertebrates 
(Figure 2-62), which means that they cannot be synthesized from other ingredi- 
ents of the diet. The other amino acids can be so synthesized, using a variety of 
raw materials, including intermediates of the citric acid cycle. The essential amino 
acids are made by plants and other organisms, usually by long and energetically 
expensive pathways that have been lost in the course of vertebrate evolution. 

The nucleotides needed to make RNA and DNA can be synthesized using spe- 
cialized biosynthetic pathways. All of the nitrogens in the purine and pyrimidine 
bases (as well as some of the carbons) are derived from the plentiful amino acids 
glutamine, aspartic acid, and glycine, whereas the ribose and deoxyribose sugars 
are derived from glucose. There are no “essential nucleotides” that must be pro- 
vided in the diet. 

Amino acids not used in biosynthesis can be oxidized to generate metabolic 
energy. Most of their carbon and hydrogen atoms eventually form CO2 or H2O, 
whereas their nitrogen atoms are shuttled through various forms and eventually 
appear as urea, which is excreted. Each amino acid is processed differently, and a 
whole constellation of enzymatic reactions exists for their catabolism. 

Sulfur is abundant on Earth in its most oxidized form, sulfate (SO₄²⁻). To be 
useful for life, sulfate must be reduced to sulfide (S²⁻), the oxidation state of sulfur 
required for the synthesis of essential biological molecules, including the amino 
acids methionine and cysteine, coenzyme A (see Figure 2-39), and the iron-sulfur 
centers essential for electron transport (see Figure 14-16). The sulfur-reduction 
process begins in bacteria, fungi, and plants, where a special group of enzymes 
use ATP and reducing power to create a sulfate assimilation pathway. Humans 
and other animals cannot reduce sulfate and must therefore acquire the sulfur 
they need for their metabolism in the food that they eat. 
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Figure 2-60 The generation of an 

H+ gradient across a membrane by 
electron-transport reactions. An electron 
held in a high-energy state (derived, for 
example, from the oxidation of a metabolite) 
is passed sequentially by carriers A, B, and 
C to a lower energy state. In this diagram, 
carrier B is arranged in the membrane 
in such a way that it takes up H+ from 
one side and releases it to the other as 
the electron passes. The result is an H+ 
gradient. As discussed in Chapter 14, this 
gradient is an important form of energy that 
is harnessed by other membrane proteins 
to drive the formation of ATP (for an actual 
example, see Figure 14-21). 


Figure 2-61 The final stages of oxidation 
of food molecules. Molecules of NADH 
and FADH2 (FADH2 is not shown) are 
produced by the citric acid cycle. These 
activated carriers donate high-energy 
electrons that are eventually used to reduce 
oxygen gas to water. A major portion of 
the energy released during the transfer of 
these electrons along an electron-transfer 
chain in the mitochondrial inner membrane 
(or in the plasma membrane of bacteria) is 
harnessed to drive the synthesis of ATP— 
hence the name oxidative phosphorylation 
(discussed in Chapter 14). 
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Metabolism Is Highly Organized and Regulated 


One gets a sense of the intricacy of a cell as a chemical machine from the relation 
of glycolysis and the citric acid cycle to the other metabolic pathways sketched 
out in Figure 2-63. This chart represents only some of the enzymatic pathways in 
a human cell. It is obvious that our discussion of cell metabolism has dealt with 
only a tiny fraction of the broad field of cell chemistry. 

All these reactions occur in a cell that is less than 0.1 mm in diameter, and each 
requires a different enzyme. As is clear from Figure 2-63, the same molecule can 
often be part of many different pathways. Pyruvate, for example, is a substrate for 
half a dozen or more different enzymes, each of which modifies it chemically in 
a different way. One enzyme converts pyruvate to acetyl CoA, another to oxalo- 
acetate; a third enzyme changes pyruvate to the amino acid alanine, a fourth to 
lactate, and so on. All of these different pathways compete for the same pyruvate 
molecule, and similar competitions for thousands of other small molecules go on 
at the same time. 

The situation is further complicated in a multicellular organism. Different cell 
types will in general require somewhat different sets of enzymes. And different 
tissues make distinct contributions to the chemistry of the organism as a whole. 
In addition to differences in specialized products such as hormones or antibod- 
ies, there are significant differences in the “common” metabolic pathways among 
various types of cells in the same organism. 

Although virtually all cells contain the enzymes of glycolysis, the citric acid 
cycle, lipid synthesis and breakdown, and amino acid metabolism, the levels 
of these processes required in different tissues are not the same. For example, 
nerve cells, which are probably the most fastidious cells in the body, maintain 
almost no reserves of glycogen or fatty acids and rely almost entirely on a constant 
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Figure 2-62 The nine essential amino 
acids. These cannot be synthesized by 
human cells and so must be supplied in 
the diet. 


Figure 2-63 Glycolysis and the citric acid cycle are at the center of an elaborate set of metabolic pathways in human cells. Some 2000 
metabolic reactions are shown schematically with the reactions of glycolysis and the citric acid cycle in red. Many other reactions either lead into 
these two central pathways — delivering small molecules to be catabolized with production of energy—or they lead outward and thereby supply 
carbon compounds for the purpose of biosynthesis. (Adapted with permission from Kanehisa Laboratories.) 
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supply of glucose from the bloodstream. In contrast, liver cells supply glucose to 
actively contracting muscle cells and recycle the lactic acid produced by muscle 
cells back into glucose. All types of cells have their distinctive metabolic traits, and 
they cooperate extensively in the normal state, as well as in response to stress and 
starvation. One might think that the whole system would need to be so finely bal- 
anced that any minor upset, such as a temporary change in dietary intake, would 
be disastrous. 

In fact, the metabolic balance of a cell is amazingly stable. Whenever the bal- 
ance is perturbed, the cell reacts so as to restore the initial state. The cell can adapt 
and continue to function during starvation or disease. Mutations of many kinds 
can damage or even eliminate particular reaction pathways, and yet—provided 
that certain minimum requirements are met—the cell survives. It does so because 
an elaborate network of control mechanisms regulates and coordinates the rates 
of all of its reactions. These controls rest, ultimately, on the remarkable abilities 
of proteins to change their shape and their chemistry in response to changes in 
their immediate environment. The principles that underlie how large molecules 
such as proteins are built and the chemistry behind their regulation will be our 
next concern. 


Summary 


Glucose and other food molecules are broken down by controlled stepwise oxidation 
to provide chemical energy in the form of ATP and NADH. There are three main sets 
of reactions that act in series, the products of each being the starting material for the 
next: glycolysis (which occurs in the cytosol), the citric acid cycle (in the mitochon- 
drial matrix), and oxidative phosphorylation (on the inner mitochondrial mem- 
brane). The intermediate products of glycolysis and the citric acid cycle are used 
both as sources of metabolic energy and to produce many of the small molecules 
used as the raw materials for biosynthesis. Cells store sugar molecules as glycogen 
in animals and starch in plants; both plants and animals also use fats extensively 
as a food store. These storage materials in turn serve as a major source of food for 
humans, along with the proteins that comprise the majority of the dry mass of most 
of the cells in the foods we eat. 


PROBLEMS 


WHAT WE DON’T KNOW 


• Did chemiosmosis precede 
fermentation as the source of 
biological energy, or did some form of 
fermentation come first, as had been 
assumed for many years? 


• What is the minimum number of 
components required to make a living 
cell from scratch? How might we find 
out? 


• Are other life chemistries possible 
besides the single one known on Earth 
(and described in this chapter)? When 
screening for life on other planets, 
what type of chemical signatures 
should we search for? 


• Is the shared chemistry inside all 
living cells a clue for deciphering the 
environment on Earth where the first 
cells originated? For example, what 
might we conclude from the universally 
shared high K+/Na+ ratio, neutral pH, 
and central role of phosphates? 


Which statements are true? Explain why or why not. 
2-1 A 10⁻⁸ M solution of HCl has a pH of 8. 


2-2 Most of the interactions between macromolecules 
could be mediated just as well by covalent bonds as by 
noncovalent bonds. 


2-3 Animals and plants use oxidation to extract energy 
from food molecules. 


2-4 If an oxidation occurs in a reaction, it must be 
accompanied by a reduction. 


2-5 Linking the energetically unfavorable reaction A 
→ B to a second, favorable reaction B → C will shift the 
equilibrium constant for the first reaction. 


2-6 The criterion for whether a reaction proceeds 
spontaneously is ΔG not ΔG°, because ΔG takes into 
account the concentrations of the substrates and products. 


2-7 The oxygen consumed during the oxidation of glucose 
in animal cells is returned as CO2 to the atmosphere. 


Discuss the following problems. 


2-8 The organic chemistry of living cells is said to be 
special for two reasons: it occurs in an aqueous environ- 
ment and it accomplishes some very complex reactions. 
But do you suppose it is really all that much different from 
the organic chemistry carried out in the top laboratories in 
the world? Why or why not? 


2-9 The molecular weight of ethanol (CH3CH2OH) is 
46 and its density is 0.789 g/cm³. 

A. What is the molarity of ethanol in beer that is 5% 
ethanol by volume? [Alcohol content of beer varies from 
about 4% (lite beer) to 8% (stout beer).] 

B. The legal limit for a driver’s blood alcohol content 
varies, but 80 mg of ethanol per 100 mL of blood (usually 
referred to as a blood alcohol level of 0.08) is typical. What 
is the molarity of ethanol in a person at this legal limit? 

C. How many 12-oz (355-mL) bottles of 5% beer could 
a 70-kg person drink and remain under the legal limit? A 
70-kg person contains about 40 liters of water. Ignore the 
metabolism of ethanol, and assume that the water content 
of the person remains constant. 


D. Ethanol is metabolized at a constant rate of about 
120 mg per hour per kg body weight, regardless of its con- 
centration. If a 70-kg person were at twice the legal limit 
(160 mg/100 mL), how long would it take for their blood 
alcohol level to fall below the legal limit? 


2-10 A histidine side chain is known to play an import- 
ant role in the catalytic mechanism of an enzyme; how- 
ever, it is not clear whether histidine is required in its pro- 
tonated (charged) or unprotonated (uncharged) state. To 
answer this question you measure enzyme activity over a 
range of pH, with the results shown in Figure Q2-1. Which 
form of histidine is required for enzyme activity? 


Figure Q2-1 Enzyme 
activity as a function 
of pH (Problem 2-10). 
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activity (% of maximum) 


2-11 The three molecules in Figure Q2-2 contain the 
seven most common reactive groups in biology. Most mol- 
ecules in the cell are built from these functional groups. 
Indicate and name the functional groups in these mole- 
cules. 


Figure Q2-2 Three molecules that illustrate the 
seven most common functional groups in biology 
(Problem 2-11). 1,3-Bisphosphoglycerate and 
pyruvate are intermediates in glycolysis and 
cysteine is an amino acid. 
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2-12 


“Diffusion” sounds slow—and over everyday dis- 
tances it is—but on the scale of a cell it is very fast. The aver- 
age instantaneous velocity of a particle in solution—that is, 
the velocity between the very frequent collisions—is 


v = (kT/m)^(1/2) 


where k = 1.38 × 10⁻¹⁶ g cm²/K sec², T = temperature in K 
(37°C is 310 K), and m = mass in g/molecule. 

Calculate the instantaneous velocity of a water 
molecule (molecular mass = 18 daltons), a glucose mol- 
ecule (molecular mass = 180 daltons), and a myoglobin 
molecule (molecular mass = 15,000 daltons) at 37°C. Just 
for fun, convert these numbers into kilometers/hour. 
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Before you do any calculations, try to guess whether the 
molecules are moving at a slow crawl (<1 km/hr), an easy 
walk (5 km/hr), or a record-setting sprint (40 km/hr). 
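
For readers who would like to check their guesses for Problem 2-12 numerically, the following Python sketch evaluates the square root of kT/m in the centimeter-gram-second units given in the problem and converts the result to kilometers per hour. Avogadro's number, used to convert daltons to grams per molecule, and the function name are assumptions added here.

```python
import math

K_BOLTZMANN = 1.38e-16  # g cm^2 / (K sec^2), as given in the problem
AVOGADRO = 6.022e23     # molecules per mole (assumed; not stated in the problem)
CM_PER_SEC_TO_KM_PER_HR = 3600.0 / 1.0e5  # 3600 sec per hr, 1e5 cm per km


def instantaneous_velocity_km_per_hr(mass_daltons: float,
                                     temp_kelvin: float = 310.0) -> float:
    """Velocity between collisions, v = sqrt(kT/m), in km/hr."""
    mass_g = mass_daltons / AVOGADRO  # daltons -> grams per molecule
    v_cm_per_sec = math.sqrt(K_BOLTZMANN * temp_kelvin / mass_g)
    return v_cm_per_sec * CM_PER_SEC_TO_KM_PER_HR


if __name__ == "__main__":
    for name, mass in [("water", 18.0), ("glucose", 180.0), ("myoglobin", 15000.0)]:
        print(f"{name}: {instantaneous_velocity_km_per_hr(mass):.0f} km/hr")
```

Because velocity scales with the inverse square root of mass, a tenfold heavier molecule moves only about three times more slowly.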


2-13 Polymerization of tubulin subunits into microtu- 
bules occurs with an increase in the orderliness of the sub- 
units. Yet tubulin polymerization occurs with an increase 
in entropy (decrease in order). How can that be? 


2-14 A 70-kg adult human (154 lb) could meet his or 
her entire energy needs for one day by eating 3 moles of 
glucose (540 g). (We do not recommend this.) Each mol- 
ecule of glucose generates 30 molecules of ATP when it is 
oxidized to CO2. The concentration of ATP is maintained in 
cells at about 2 mM, and a 70-kg adult has about 25 liters 
of intracellular fluid. Given that the ATP concentration 
remains constant in cells, calculate how many times per 
day, on average, each ATP molecule in the body is hydro- 
lyzed and resynthesized. 
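
The reasoning in Problem 2-14 reduces to dividing the ATP made per day by the size of the standing ATP pool. A minimal Python sketch of that setup follows; the function and parameter names are illustrative, and the numbers passed in are the ones given in the problem.

```python
def atp_turnovers_per_day(glucose_mol_per_day: float,
                          atp_per_glucose: float,
                          atp_conc_molar: float,
                          fluid_liters: float) -> float:
    """Average number of times each ATP molecule is hydrolyzed and remade per day."""
    atp_made_mol = glucose_mol_per_day * atp_per_glucose  # moles of ATP synthesized per day
    atp_pool_mol = atp_conc_molar * fluid_liters          # moles of ATP present at any instant
    return atp_made_mol / atp_pool_mol


if __name__ == "__main__":
    # 3 moles of glucose, 30 ATP per glucose, 2 mM ATP, 25 liters of fluid.
    print(f"{atp_turnovers_per_day(3.0, 30.0, 2.0e-3, 25.0):.0f} turnovers per day")
```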


2-15 Assuming that there are 5 × 10¹³ cells in the human 
body and that ATP is turning over at a rate of 10⁹ ATP 
molecules per minute in each cell, how many watts is the 
human body consuming? (A watt is a joule per second.) 
Assume that hydrolysis of ATP yields 50 kJ/mole. 


2-16 Does a Snickers™ candy bar (65 g, 1360 kJ) pro- 
vide enough energy to climb from Zermatt (elevation 1660 
m) to the top of the Matterhorn (4478 m, Figure Q2-3), 
or might you need to stop at Hörnli Hut (3260 m) to eat 
another one? Imagine that you and your gear have a mass 
of 75 kg, and that all of your work is done against gravity 
(that is, you are just climbing straight up). Remember from 
your introductory physics course that 


work (J) = mass (kg) × g (m/sec²) × height gained (m) 


where g is acceleration due to gravity (9.8 m/sec²). One 
joule is 1 kg m²/sec². 

What assumptions made here will greatly under- 
estimate how much candy you need? 


Figure Q2-3 The Matterhorn (Problem 2-16). (Courtesy of 
Zermatt Tourism.) 


2-17 In the absence of oxygen, cells consume glucose 
at a high, steady rate. When oxygen is added, glucose con- 
sumption drops precipitously and is then maintained at 
the lower rate. Why is glucose consumed at a high rate in 
the absence of oxygen and at a low rate in its presence? 


PANEL 2-1: Chemical Bonds and Groups Commonly Encountered in Biological Molecules 


CARBON SKELETONS 


Carbon has a unique role in the cell because of its 
ability to form strong covalent bonds with other 
carbon atoms. Thus carbon atoms can join to form 
chains and branched trees. 

[Diagrams: a carbon chain and a branched tree, each also written in 
abbreviated form.] 


COVALENT BONDS 


A covalent bond forms when two atoms come very close 
together and share one or more of their electrons. In a single 
bond, one electron from each of the two atoms is shared; in 
a double bond, a total of four electrons are shared. 

Each atom forms a fixed number of covalent bonds in a 
defined spatial arrangement. For example, carbon forms four 
single bonds arranged tetrahedrally, whereas nitrogen forms 
three single bonds and oxygen forms two single bonds arranged 
as shown below. 

[Diagram: bond geometry of carbon, nitrogen, and oxygen.] 

Double bonds exist and have a different spatial arrangement. 

Atoms joined by two or more covalent bonds cannot rotate freely 
around the bond axis. This restriction is a major influence on the 
three-dimensional shape of many macromolecules. 


ALTERNATING DOUBLE BONDS 


The carbon chain can include double bonds. If these are on 
alternate carbon atoms, the bonding electrons move within the 
molecule, stabilizing the structure by a phenomenon called 
resonance. 

[Diagram: two alternative bonding arrangements; the truth is 
somewhere between these two structures.] 

Alternating double bonds in a ring can generate a very stable 
structure. 

[Diagram: a ring with alternating double bonds and the abbreviated 
form in which it is often written.] 


HYDROCARBONS 


Carbon and hydrogen combine together to make stable 
compounds (or chemical groups) called hydrocarbons. These are 
nonpolar, do not form hydrogen bonds, and are generally 
insoluble in water. 

[Diagram: part of the hydrocarbon “tail” of a fatty acid molecule.] 





C–O CHEMICAL GROUPS 


Many biological compounds contain a carbon bonded to an oxygen. 
For example, 

The –OH is called a hydroxyl group. 

The C=O is called a carbonyl group. 

The –COOH is called a carboxyl group. In water this loses an 
H+ ion to become –COO–. 

Esters are formed by a condensation reaction between an acid 
and an alcohol. 

[Diagram: acid + alcohol forming an ester.] 


C–N CHEMICAL GROUPS 


Amines and amides are two important examples of compounds 
containing a carbon linked to a nitrogen. 

Amines in water combine with an H+ ion to become positively 
charged. 

[Diagram: an amine accepting H+ to form the positively charged 
ion.] 

Amides are formed by combining an acid and an amine. Unlike 
amines, amides are uncharged in water. An example is the 
peptide bond that joins amino acids in a protein. 

[Diagram: acid + amine forming an amide.] 

Nitrogen also occurs in several ring compounds, including 
important constituents of nucleic acids: purines and pyrimidines. 

[Diagram: cytosine (a pyrimidine).] 

SULFHYDRYL GROUP 


The –C–SH is called a sulfhydryl group. In the amino acid cysteine, 
the sulfhydryl group may exist in the reduced form, –C–SH, or more 
rarely in an oxidized, cross-bridging form, –C–S–S–C–. 


PHOSPHATES 


Inorganic phosphate is a stable ion formed from phosphoric acid, 
H3PO4. It is also written as Pi. 

Phosphate esters can form between a phosphate and a free hydroxyl 
group. Phosphate groups are often attached to proteins in this way. 

[Reaction diagram: a hydroxyl group (–C–OH) and a phosphate condense 
to give a phosphate ester plus H2O; the ester is also written in 
abbreviated form.] 

The combination of a phosphate and a carboxyl group, or two or more 
phosphate groups, gives an acid anhydride. Because compounds of this 
kind are easily hydrolyzed in the cell, they are sometimes said to 
contain a “high-energy” bond. 

[Diagrams: a high-energy acyl phosphate bond (carboxylic-phosphoric 
acid anhydride), found in some metabolites; and a phosphoanhydride, 
a high-energy bond found in molecules such as ATP. Each forms with 
the release of H2O and is also written in abbreviated form.] 





PANEL 2-2: Water and Its Influence on the Behavior of Biological Molecules 


WATER 


Two atoms, connected by a covalent bond, may exert different attractions for 
the electrons of the bond. In such cases the bond is polar, with one end 
slightly negatively charged (δ–) and the other slightly positively charged (δ+). 


electropositive 
region 


electronegative 
region 


Although a water molecule has an overall neutral charge (having the same 
number of electrons and protons), the electrons are asymmetrically distributed, 
which makes the molecule polar. The oxygen nucleus draws electrons away 
from the hydrogen nuclei, leaving these nuclei with a small net positive charge. 
The excess of electron density on the oxygen atom creates weakly negative 
regions at the other two corners of an imaginary tetrahedron. 


HYDROGEN BONDS 


Because they are polarized, two 
adjacent H2O molecules can form 
a linkage known as a hydrogen 
bond. Hydrogen bonds have 
only about 1/20 the strength 
of a covalent bond. 

Hydrogen bonds are strongest when 
the three atoms lie in a straight line. 


HYDROPHILIC MOLECULES 


Substances that dissolve readily in water are termed hydrophilic. They are 
composed of ions or polar molecules that attract water molecules through 
electrical charge effects. Water molecules surround each ion or polar molecule 
on the surface of a solid substance and carry it into solution. 


Ionic substances such as sodium chloride dissolve because water 
molecules are attracted to the positive (Na+) or negative (Cl–) 
charge of each ion. 

Polar substances such as urea dissolve because their molecules 
form hydrogen bonds with the surrounding water molecules. 


WATER STRUCTURE 


Molecules of water join together transiently 
in a hydrogen-bonded lattice. Even at 37°C, 
15% of the water molecules are joined to 
four others in a short-lived assembly known 
as a “flickering cluster.” 


The cohesive nature of water is 
responsible for many of its unusual 
properties, such as high surface tension, 
specific heat, and heat of vaporization. 


[Diagram: bond lengths in water. Hydrogen bond: 0.17 nm; covalent 
bond: 0.10 nm.] 


HYDROPHOBIC MOLECULES 


Molecules that contain a preponderance 
of nonpolar bonds are usually insoluble in 
water and are termed hydrophobic. This is 
true, especially, of hydrocarbons, which 
contain many C–H bonds. Water molecules 
are not attracted to such molecules and so 
have little tendency to surround them and 
carry them into solution. 


WATER AS A SOLVENT 


Many substances, such as household sugar, dissolve in water. That is, their 
molecules separate from each other, each becoming surrounded by water molecules. 


[Diagram: a sugar crystal dissolving, with each sugar molecule 
becoming surrounded by water molecules.] 


ACIDS 


Substances that release hydrogen ions into solution 
are called acids. 


HCl → H+ + Cl– 
hydrochloric acid (strong acid) → hydrogen ion + chloride ion 


Many of the acids important in the cell are only partially 
dissociated, and they are therefore weak acids—for example, 
the carboxyl group (-COOH), which dissociates to give a 
hydrogen ion in solution. 


–COOH ⇌ –COO– + H+ 
(weak acid) 


Note that this is a reversible reaction. 








pH 


The acidity of a solution is defined by the concentration 
of H+ ions it possesses. For convenience we use the pH 
scale, where 

pH = –log10[H+] 

For pure water 

[H+] = 10⁻⁷ moles/liter 

[Diagram: the pH scale, relating H+ concentration (moles/liter) to 
pH and marking the alkaline range.] 
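
The pH definition can be checked with a few lines of Python. This is only an illustrative sketch: the function names are invented for the example, and concentrations are assumed to be given in moles/liter.

```python
import math


def ph_from_concentration(h_conc_molar: float) -> float:
    """Return pH = -log10([H+]) for an H+ concentration in moles/liter."""
    if h_conc_molar <= 0:
        raise ValueError("H+ concentration must be positive")
    return -math.log10(h_conc_molar)


def concentration_from_ph(ph: float) -> float:
    """Invert the pH scale: [H+] = 10**(-pH), in moles/liter."""
    return 10.0 ** (-ph)


# Pure water, using the concentration quoted in the panel.
print(ph_from_concentration(1e-7))

# Each pH unit corresponds to a tenfold change in [H+].
print(concentration_from_ph(6.0) / concentration_from_ph(7.0))
```

Because the scale is logarithmic, a solution at pH 6 holds about ten times as many H+ ions per liter as one at pH 7.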


When a substance dissolves in a 

liquid, the mixture is termed a solution. 
The dissolved substance (in this case 
sugar) is the solute, and the liquid that 
does the dissolving (in this case water) 
is the solvent. Water is an excellent 
solvent for many substances because 

of its polar bonds. 


HYDROGEN ION EXCHANGE 


Positively charged hydrogen ions (H+) can spontaneously 
move from one water molecule to another, thereby creating 
two ionic species. 

H2O + H2O ⇌ H3O+ + OH– 

hydronium ion, H3O+ (water acting as a weak base) 
hydroxyl ion, OH– (water acting as a weak acid) 

often written as: H2O ⇌ H+ + OH– 
(hydrogen ion + hydroxyl ion) 

Since the process is rapidly reversible, hydrogen ions are 
continually shuttling between water molecules. Pure water 
contains a steady-state concentration of hydrogen ions and 
hydroxyl ions (both 10⁻⁷ M). 


BASES 


Substances that reduce the number of hydrogen ions in 
solution are called bases. Some bases, such as ammonia, 
combine directly with hydrogen ions. 

NH3 + H+ → NH4+ 
ammonia + hydrogen ion → ammonium ion 

Other bases, such as sodium hydroxide, reduce the number of 
H+ ions indirectly, by making OH– ions that then combine 
directly with H+ ions to make H2O. 

NaOH → Na+ + OH– 
sodium hydroxide (strong base) → sodium ion + hydroxyl ion 

Many bases found in cells are partially associated with H+ ions 
and are termed weak bases. This is true of compounds that 
contain an amino group (–NH2), which has a weak tendency to 
reversibly accept an H+ ion from water, increasing the quantity 
of free OH– ions. 

–NH2 + H+ ⇌ –NH3+ 











PANEL 2-3: The Principal Types of Weak Noncovalent Bonds that Hold Macromolecules Together 


WEAK NONCOVALENT CHEMICAL BONDS 


Organic molecules can interact with other molecules through three 
types of short-range attractive forces known as noncovalent bonds: 
van der Waals attractions, electrostatic attractions, and hydrogen 
bonds. The repulsion of hydrophobic groups from water is also 
important for the folding of biological macromolecules. 


[Diagram: a weak noncovalent bond.] 

Weak noncovalent chemical bonds have less than 1/20 the strength 
of a strong covalent bond. They are strong enough to provide 
tight binding only when many of them are formed simultaneously. 


HYDROGEN BONDS 


As already described for water (see Panel 2-2), 
hydrogen bonds form when a hydrogen atom is 
“sandwiched” between two electron-attracting atoms 
(usually oxygen or nitrogen). 


Hydrogen bonds are strongest when the three atoms are 
in a straight line. 

Examples in macromolecules: 

Amino acids in a polypeptide chain can be hydrogen-bonded 
together. These bonds stabilize the structure of folded proteins. 

Two bases, G and C, are hydrogen-bonded in a 
DNA double helix. 


VAN DER WAALS ATTRACTIONS 


If two atoms are too close together they repel each other 
very strongly. For this reason, an atom can often be 
treated as a sphere with a fixed radius. The characteristic 
“size” for each atom is specified by a unique van der 
Waals radius. The contact distance between any two 
noncovalently bonded atoms is the sum of their van der 
Waals radii. 

[Diagram: four atoms drawn as spheres with van der Waals radii of 
0.14 nm, 0.12 nm, 0.2 nm, and 0.15 nm.] 


At very short distances any two atoms show a weak 
bonding interaction due to their fluctuating electrical 
charges. The two atoms will be attracted to each other 
in this way until the distance between their nuclei is 
approximately equal to the sum of their van der Waals 
radii. Although they are individually very weak, van der 
Waals attractions can become important when two 
macromolecular surfaces fit very close together, 
because many atoms are involved. 

Note that when two atoms form a covalent bond, the 
centers of the two atoms (the two atomic nuclei) are 
much closer together than the sum of the two van der 
Waals radii. Thus, 

two non-bonded carbon atoms: 0.4 nm 
single-bonded carbon atoms: 0.15 nm 
double-bonded carbon atoms: 0.13 nm 
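
The contact-distance rule can be sketched in a few lines of Python. The four radii are the values shown in the panel, but the element assigned to each radius is an assumption made for this example, as are the function and variable names. Only the carbon value is confirmed by the text, which gives 0.4 nm for two non-bonded carbon atoms.

```python
# Van der Waals radii in nm. The element labels are an assumption
# for this sketch; only the carbon value is confirmed by the text.
VDW_RADIUS_NM = {"H": 0.12, "C": 0.20, "N": 0.15, "O": 0.14}

# Carbon-carbon covalent bond lengths in nm, as quoted in the panel.
CC_BOND_LENGTH_NM = {"single": 0.15, "double": 0.13}


def contact_distance_nm(atom_a: str, atom_b: str) -> float:
    """Closest approach of two non-bonded atoms: the sum of their radii."""
    return VDW_RADIUS_NM[atom_a] + VDW_RADIUS_NM[atom_b]


cc_contact = contact_distance_nm("C", "C")
print(f"non-bonded C...C contact distance: {cc_contact:.2f} nm")

for kind, length in CC_BOND_LENGTH_NM.items():
    ratio = length / cc_contact
    print(f"{kind}-bonded C-C: {length:.2f} nm ({ratio:.0%} of the contact distance)")
```

The comparison makes the panel's point concrete: covalently bonded nuclei sit at well under half the van der Waals contact distance.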


HYDROGEN BONDS IN WATER 


Any molecules that can form hydrogen bonds to each other 
can alternatively form hydrogen bonds to water molecules. 
Because of this competition with water molecules, the 
hydrogen bonds formed between two molecules dissolved 
in water are relatively weak. 


ELECTROSTATIC ATTRACTIONS 


Attractive forces occur both between fully charged 
groups (ionic bond) and between the partially charged 
groups on polar molecules. 

The force of attraction between the two charges, δ+ 
and δ–, falls off rapidly as the distance between the 
charges increases. 

In the absence of water, electrostatic forces are very strong. 
They are responsible for the strength of such minerals as 
marble and agate, and for crystal formation in common 
table salt, NaCl. 

[Diagram: a crystal of salt, NaCl.] 


HYDROPHOBIC FORCES 


Water forces hydrophobic groups together, 
because doing so minimizes their disruptive 
effects on the hydrogen-bonded water 
network. Hydrophobic groups held 
together in this way are sometimes said 
to be held together by “hydrophobic 
bonds,” even though the apparent attraction 
is actually caused by a repulsion from the 
water. 


ELECTROSTATIC ATTRACTIONS IN 
AQUEOUS SOLUTIONS 


Charged groups are shielded by their 
interactions with water molecules. 
Electrostatic attractions are therefore 
quite weak in water. 

Similarly, ions in solution can cluster around 
charged groups and further weaken 
these attractions. 

Despite being weakened by water and salt, 
electrostatic attractions are very important in 
biological systems. For example, an enzyme that 
binds a positively charged substrate will often 
have a negatively charged amino acid side chain 
at the appropriate place. 


enzyme 





PANEL 2-4: An Outline of Some of the Types of Sugars Commonly Found in Cells 


MONOSACCHARIDES 


Monosaccharides usually have the general formula (CH2O)n, where n can be 3, 4, 5, 6, 7, or 8, and have two or more hydroxyl groups. 
They either contain an aldehyde group (–CHO) and are called aldoses, or a ketone group (C=O) and are called ketoses. 
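
As a small illustration of the general formula, the sketch below expands (CH2O)n into an explicit molecular formula and an approximate molecular mass. The atomic masses are standard values that do not appear in the text, and the function names are invented for the example.

```python
# Standard atomic masses in daltons (an assumption: not given in the text).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def monosaccharide_formula(n: int) -> dict:
    """Atom counts for (CH2O)n, for the range of n stated in the panel."""
    if not 3 <= n <= 8:
        raise ValueError("n is expected to be between 3 and 8")
    return {"C": n, "H": 2 * n, "O": n}


def formula_string(counts: dict) -> str:
    """Render atom counts as a formula such as C6H12O6."""
    return "".join(f"{element}{count}" for element, count in counts.items())


def molecular_mass(counts: dict) -> float:
    """Sum the atomic masses to get the molecular mass in daltons."""
    return sum(ATOMIC_MASS[element] * count for element, count in counts.items())


for n, family in ((3, "triose"), (5, "pentose"), (6, "hexose")):
    counts = monosaccharide_formula(n)
    print(f"{family}: {formula_string(counts)}, about {molecular_mass(counts):.0f} daltons")
```

For n = 6 this gives the hexose formula shared by glucose, galactose, and mannose.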


           3-carbon (TRIOSES)   5-carbon (PENTOSES)   6-carbon (HEXOSES) 

ALDOSES    glyceraldehyde       ribose                glucose 

KETOSES    dihydroxyacetone     ribulose              fructose 


RING FORMATION 


In aqueous solution, the aldehyde or ketone group of a sugar 
molecule tends to react with a hydroxyl group of the same 
molecule, thereby closing the molecule into a ring. 

[Diagrams: ring formation in glucose and in ribose.] 

Note that each carbon atom has a number. 


ISOMERS 


Many monosaccharides differ only in the spatial arrangement 
of atoms; that is, they are isomers. For example, glucose, 
galactose, and mannose have the same formula (C6H12O6) but 
differ in the arrangement of groups around one or two carbon 
atoms. 

[Diagrams: ring structures of glucose and mannose.] 

These small differences make only minor changes in the 
chemical properties of the sugars. But they are recognized by 
enzymes and other proteins and therefore can have major 
biological effects. 














α AND β LINKS 


The hydroxyl group on the carbon that carries the 
aldehyde or ketone can rapidly change from one 
position to the other. These two positions are 
called α and β. 

[Diagram: β hydroxyl and α hydroxyl positions.] 

As soon as one sugar is linked to another, the α or 
β form is frozen. 





SUGAR DERIVATIVES 


The hydroxyl groups of a simple 
monosaccharide such as glucose 
can be replaced by other groups. 
For example, 


glucuronic acid 










DISACCHARIDES 


The carbon that carries the aldehyde 
or the ketone can react with any 
hydroxyl group on a second sugar 
molecule to form a disaccharide. 
The linkage is called a glycosidic 
bond. 


Three common disaccharides are 


maltose (glucose + glucose) 
lactose (galactose + glucose) 
sucrose (glucose + fructose) 


The reaction forming sucrose is 
shown here. 





sucrose 


β fructose 








OLIGOSACCHARIDES AND POLYSACCHARIDES 


Large linear and branched molecules can be made from simple repeating sugar 
subunits. Short chains are called oligosaccharides, while long chains are called 
polysaccharides. Glycogen, for example, is a polysaccharide made entirely of 
glucose units joined together. 


COMPLEX OLIGOSACCHARIDES 


In many cases a sugar sequence is nonrepetitive. Many different 
molecules are possible. Such complex oligosaccharides are 
usually linked to proteins or to lipids, as is this oligosaccharide, 
which is part of a cell-surface molecule that defines a particular 
blood group. 


glycogen 











PANEL 2-5: Fatty Acids and Other Lipids 


COMMON FATTY ACIDS 


These are carboxylic acids with long hydrocarbon tails. 

Hundreds of different kinds of fatty acids exist. Some have one or 
more double bonds in their hydrocarbon tail and are said to be 
unsaturated. Fatty acids with no double bonds are saturated. 

This double bond is rigid and creates a kink in the chain. The rest 
of the chain is free to rotate about the other C–C bonds. 

[Diagrams: space-filling model and carbon skeleton of an 
UNSATURATED and a SATURATED fatty acid.] 


TRIACYLGLYCEROLS 


Fatty acids are stored as an energy reserve (fats and 
oils) through an ester linkage to glycerol to form 
triacylglycerols, also known as triglycerides. 


CARBOXYL GROUP 


If free, the carboxyl group of a fatty acid will be ionized. 

But more usually it is linked to other groups to form either 
esters or amides. 


PHOSPHOLIPIDS 


Phospholipids are the major constituents of cell membranes. 

[Diagram: space-filling model of the phospholipid 
phosphatidylcholine, showing the hydrophilic head and the 
hydrophobic fatty acid tails.] 


In phospholipids, two of the -OH groups in glycerol are 
linked to fatty acids, while the third -OH group is linked 
to phosphoric acid. The phosphate is further linked to 
one of a variety of small polar groups, such as choline. 





LIPID AGGREGATES 


Fatty acids have a hydrophilic head 
and a hydrophobic tail. 

In water they can form a surface film 
or form small micelles. 

Their derivatives can form larger aggregates held together by hydrophobic forces: 

Triacylglycerols (triglycerides) can form large spherical fat droplets 
in the cell cytoplasm. 

Phospholipids and glycolipids form self-sealing lipid bilayers that 
are the basis for all cell membranes. 


OTHER LIPIDS 


Lipids are defined as the water-insoluble 
molecules in cells that are soluble in organic 
solvents. Two other common types of lipids 
are steroids and polyisoprenoids. Both are 
made from isoprene units. 


STEROIDS 


Steroids have a common multiple-ring structure. 

cholesterol: found in many membranes 
testosterone: male steroid hormone 


GLYCOLIPIDS 


Like phospholipids, these compounds are composed of a hydrophobic 
region, containing two long hydrocarbon tails, and a polar region, 
which contains one or more sugars and, unlike phospholipids, 
no phosphate. 


POLYISOPRENOIDS 


long-chain polymers of isoprene 


—used 
to carry activated sugars 
in the membrane-associated 
synthesis of glycoproteins 
and some polysaccharides 





PANEL 2-6: A Survey of the Nucleotides 


PHOSPHATES 


The phosphates are normally joined to 
the C5 hydroxyl of the ribose or 
deoxyribose sugar (designated 5'). Mono-, 
di-, and triphosphates are common. 


The phosphate makes a nucleotide 
negatively charged. 


SUGARS 


PENTOSE 


a five-carbon sugar 


The bases are nitrogen-containing ring 
compounds, either pyrimidines or purines. 


guanine N SN 


PYRIMIDINE PURINE 


NUCLEOTIDES 


A nucleotide consists of a nitrogen-containing 
base, a five-carbon sugar, and one or more 
phosphate groups. 


BASE 


PHOSPHATE 

Nucleotides are the subunits of the nucleic acids. 


two kinds are used 


Each numbered carbon on the sugar of a nucleotide is 
followed by a prime mark; therefore, one speaks of the 
“5-prime carbon,” etc. 


adenine 


BASE–SUGAR LINKAGE 


N-glycosidic bond 

The base is linked to the same carbon (C1) 
used in sugar-sugar bonds. 


β-D-ribose 
used in ribonucleic acid 


β-D-2-deoxyribose 
used in deoxyribonucleic acid 





NOMENCLATURE 


A nucleoside or nucleotide is named according to its nitrogenous base. 

BASE        NUCLEOSIDE 
adenine     adenosine 
guanine     guanosine 
cytosine    cytidine 
uracil      uridine 
thymine     thymidine 

Single-letter abbreviations are used variously as shorthand for 
(1) the base alone, (2) the nucleoside, or (3) the whole nucleotide; 
the context will usually make clear which of the three entities is 
meant. When the context is not sufficient, we will add the terms 
“base”, “nucleoside”, “nucleotide”, or, as in the examples below, 
use the full 3-letter nucleotide code. 

AMP = adenosine monophosphate 
dAMP = deoxyadenosine monophosphate 
UDP = uridine diphosphate 
ATP = adenosine triphosphate 

BASE + SUGAR = NUCLEOSIDE 
BASE + SUGAR + PHOSPHATE = NUCLEOTIDE 


NUCLEIC ACIDS 


Nucleotides are joined together by a phosphodiester linkage 
between 5′ and 3′ carbon atoms to form nucleic acids. The linear 
sequence of nucleotides in a nucleic acid chain is commonly 
abbreviated by a one-letter code, such as A–G–C–T–T–A–C–A, 
with the 5′ end of the chain at the left. 

[Diagram: a short nucleic acid chain (example: DNA), showing the 
5′ end of the chain, a phosphodiester linkage, the 3′ OH, and the 
3′ end of the chain.] 


NUCLEOTIDES HAVE MANY OTHER FUNCTIONS 


1. They carry chemical energy in their easily hydrolyzed 
phosphoanhydride bonds. 
example: ATP 

2. They combine with other groups to form coenzymes. 
example: coenzyme A (CoA) 

3. They are used as specific signaling molecules in the cell. 
example: cyclic AMP (cAMP) 





PANEL 2-7: Free Energy and Biological Reactions 


THE IMPORTANCE OF FREE ENERGY FOR CELLS 


Life is possible because of the complex network of interacting 
chemical reactions occurring in every cell. In viewing the 
metabolic pathways that comprise this network, one might 
suspect that the cell has had the ability to evolve an enzyme to 
carry out any reaction that it needs. But this is not so. Although 
enzymes are powerful catalysts, they can speed up only those 
reactions that are thermodynamically possible; other reactions 
proceed in cells only because they are coupled to very favorable 
reactions that drive them. The question of whether a reaction 


can occur spontaneously, or instead needs to be coupled to 
another reaction, is central to cell biology. The answer is 
obtained by reference to a quantity called the free energy: the 
total change in free energy during a set of reactions determines 
whether or not the entire reaction sequence can occur. In this 
panel, we shall explain some of the fundamental ideas—derived 
from a special branch of chemistry and physics called thermo- 
dynamics—that are required for understanding what free 
energy is and why it is so important to cells. 


ENERGY RELEASED BY CHANGES IN CHEMICAL BONDING IS CONVERTED INTO HEAT 


[Diagram: the “cell in a box,” an enclosed system surrounded by the 
sea, within the universe.] 


An enclosed system is defined as a collection of molecules that 
does not exchange matter with the rest of the universe (for 
example, the “cell in a box” shown above). Any such system will 
contain molecules with a total energy E. This energy will be 
distributed in a variety of ways: some as the translational energy 
of the molecules, some as their vibrational and rotational energies, 
but most as the bonding energies between the individual atoms 
that make up the molecules. Suppose that a reaction occurs in 
the system. The first law of thermodynamics places a constraint 
on what types of reactions are possible: it states that “in any 
process, the total energy of the universe remains constant.” 

For example, suppose that reaction A → B occurs somewhere in 
the box and releases a great deal of chemical-bond energy. This 
energy will initially increase the intensity of molecular motions 
(translational, vibrational, and rotational) in the system, which 
is equivalent to raising its temperature. However, these increased 
motions will soon be transferred out of the system by a series 
of molecular collisions that heat up first the walls of the box 
and then the outside world (represented by the sea in 
our example). In the end, the system returns to its initial 
temperature, by which time all the chemical-bond energy 
released in the box has been converted into heat energy and 
transferred out of the box to the surroundings. According to 
the first law, the change in the energy in the box (ΔEbox, which 
we shall denote as ΔE) must be equal and opposite to the 
amount of heat energy transferred, which we shall designate 
as h: that is, ΔE = –h. Thus, the energy in the box (E) decreases 
when heat leaves the system. 

E also can change during a reaction as a result of work being 
done on the outside world. For example, suppose that there is 
a small increase in the volume (ΔV) of the box during a reaction. 
Since the walls of the box must push against the constant 
pressure (P) in the surroundings in order to expand, this does 
work on the outside world and requires energy. The energy 
used is P(ΔV), which according to the first law must decrease 
the energy in the box (E) by the same amount. In most reactions, 
chemical-bond energy is converted into both work and heat. 
Enthalpy (H) is a composite function that includes both of these 
(H = E + PV). To be rigorous, it is the change in enthalpy 
(ΔH) in an enclosed system, and not the change in energy, that 
is equal to the heat transferred to the outside world during a 
reaction. Reactions in which H decreases release heat to the 
surroundings and are said to be “exothermic,” while reactions 
in which H increases absorb heat from the surroundings and 
are said to be “endothermic.” Thus, –h = ΔH. However, the 
volume change is negligible in most biological reactions, so to 
a good approximation 

–h = ΔH ≅ ΔE 


THE SECOND LAW OF THERMODYNAMICS 


Consider a container in which 1000 coins are all lying heads up. 
If the container is shaken vigorously, subjecting the coins to 
the types of random motions that all molecules experience due 
to their frequent collisions with other molecules, one will end 
up with about half the coins oriented heads down. The 
reason for this reorientation is that there is only a single way in 
which the original orderly state of the coins can be reinstated 
(every coin must lie heads up), whereas there are many different 
ways (about 10²⁹⁸) to achieve a disorderly state in which there is 
an equal mixture of heads and tails; in fact, there are more ways 
to achieve a 50-50 state than to achieve any other state. Each 
state has a probability of occurrence that is proportional to the 
number of ways it can be realized. The second law of 
thermodynamics states that “systems will change spontaneously 
from states of lower probability to states of higher probability.” 
Since states of lower probability are more “ordered” than 
states of high probability, the second law can be restated: 
“the universe constantly changes so as to become more 
disordered.” 
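
The counting argument behind the coin example can be reproduced directly. The sketch below assumes that each distinct assignment of heads and tails to the 1000 coins counts as one "way"; the function name is invented for the example.

```python
import math


def ways(n_coins: int, n_heads: int) -> int:
    """Number of distinct arrangements with exactly n_heads heads."""
    return math.comb(n_coins, n_heads)


N_COINS = 1000

# The fully ordered state (all heads) can be realized in only one way.
print("all heads:", ways(N_COINS, N_COINS))

# The 50-50 state can be realized in an astronomically large number of ways.
half_and_half = ways(N_COINS, N_COINS // 2)
print(f"half heads: about 10^{math.log10(half_and_half):.1f}")

# No other head count can be realized in more ways than the 50-50 state.
most_probable = max(range(N_COINS + 1), key=lambda k: ways(N_COINS, k))
print("most probable number of heads:", most_probable)
```

Since each state's probability is proportional to its number of ways, shaking the container almost inevitably moves it away from the single all-heads arrangement.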





THE ENTROPY, S 


The second law (but not the first law) allows one to predict the 
direction of a particular reaction. But to make it useful for this 
purpose, one needs a convenient measure of the probability or, 
equivalently, the degree of disorder of a state. The entropy (S) 
is such a measure. It is a logarithmic function of the probability 
such that the change in entropy (ΔS) that occurs when the 
reaction A → B converts one mole of A into one mole of B is 

ΔS = R ln (pB/pA) 

where pA and pB are the probabilities of the two states A and B, 
R is the gas constant (8.31 J K⁻¹ mole⁻¹), and ΔS is measured 
in entropy units (eu). In our initial example of 1000 coins, the 
relative probability of all heads (state A) versus half heads and 
half tails (state B) is equal to the ratio of the number of different 
ways that the two results can be obtained. One can calculate 
that pA = 1 and pB = 1000!/(500! × 500!) = 10²⁹⁹. Therefore, 
the entropy change for the reorientation of the coins when their 
container is vigorously shaken and an equal mixture of heads 
and tails is obtained is R ln (10²⁹⁸), or about 1370 eu per mole of 
such containers (6 × 10²³ containers). We see that, because ΔS 
defined above is positive for the transition from state A to 
state B (pB/pA > 1), reactions with a large increase in S (that is, 
for which ΔS > 0) are favored and will occur spontaneously. 

As discussed in Chapter 2, heat energy causes the random 
commotion of molecules. Because the transfer of heat from an 
enclosed system to its surroundings increases the number of 
different arrangements that the molecules in the outside world 
can have, it increases their entropy. It can be shown that the 
release of a fixed quantity of heat energy has a greater 
disordering effect at low temperature than at high temperature, and 
that the value of ΔS for the surroundings, as defined above 
(ΔSsea), is precisely equal to h, the amount of heat transferred to 
the surroundings from the system, divided by the absolute 
temperature (T): 

ΔSsea = h/T 


THE GIBBS FREE ENERGY, G 


When dealing with an enclosed biological system, one would 
like to have a simple way of predicting whether a given reaction 
will or will not occur spontaneously in the system. We have 
seen that the crucial question is whether the entropy change for 
the universe is positive or negative when that reaction occurs. 
In our idealized system, the cell in a box, there are two separate 
components to the entropy change of the universe: the entropy 
change for the system enclosed in the box and the entropy 
change for the surrounding “sea.” Both must be added 
together before any prediction can be made. For example, it is 
possible for a reaction to absorb heat and thereby decrease the 
entropy of the sea (ΔSsea < 0) and at the same time to cause 
such a large degree of disordering inside the box (ΔSbox > 0) 
that the total ΔSuniverse = ΔSsea + ΔSbox is greater than 0. In this 
case, the reaction will occur spontaneously, even though the 
sea gives up heat to the box during the reaction. An example of 
such a reaction is the dissolving of sodium chloride in a beaker 
containing water (the “box”), which is a spontaneous process 
even though the temperature of the water drops as the salt 
goes into solution. 

Chemists have found it useful to define a number of new 
“composite functions” that describe combinations of physical 
properties of a system. The properties that can be combined 
include the temperature (T), pressure (P), volume (V), energy 
(E), and entropy (S). The enthalpy (H) is one such composite 
function. But by far the most useful composite function for 
biologists is the Gibbs free energy, G. It serves as an accounting 
device that allows one to deduce the entropy change of the 
universe resulting from a chemical reaction in the box, while 
avoiding any separate consideration of the entropy change in 
the sea. The definition of G is 

G = H – TS 

where, for a box of volume V, H is the enthalpy described above 
(E + PV), T is the absolute temperature, and S is the entropy. 
Each of these quantities applies to the inside of the box only. The 
change in free energy during a reaction in the box (the G of the 
products minus the G of the starting materials) is denoted as ΔG 
and, as we shall now demonstrate, it is a direct measure of the 
amount of disorder that is created in the universe when the 
reaction occurs. 


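At constant temperature the change in free energy ($\Delta G$) 
during a reaction equals $\Delta H - T\Delta S$. Remembering that 
$\Delta H = -h$, the heat absorbed from the sea, we have 


$$-\Delta G = -\Delta H + T\Delta S$$
$$-\Delta G = h + T\Delta S, \quad \text{so} \quad -\Delta G/T = h/T + \Delta S$$


But h/T is equal to the entropy change of the sea ($\Delta S_{\text{sea}}$), and 
the $\Delta S$ in the above equation is $\Delta S_{\text{box}}$. Therefore 


$$-\Delta G/T = \Delta S_{\text{sea}} + \Delta S_{\text{box}} = \Delta S_{\text{universe}}$$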


We conclude that the free-energy change is a direct measure of 
the entropy change of the universe. A reaction will proceed 
in the direction that causes the change in the free energy ($\Delta G$) 
to be less than zero, because in this case there will be a positive 
entropy change in the universe when the reaction occurs. 
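
The bookkeeping in this derivation can be sketched in a few lines. The example below is only an illustration: the function name and the numerical inputs are hypothetical, and the units (J/mole, J/(K·mole), K) are an assumption chosen for convenience.

```python
def entropy_bookkeeping(delta_h: float, delta_s_box: float, temperature: float) -> dict:
    """Relate delta-G of a reaction in the box to the entropy change of the universe.

    delta_h      enthalpy change of the box, J/mole
    delta_s_box  entropy change inside the box, J/(K*mole)
    temperature  absolute temperature, K
    """
    delta_g = delta_h - temperature * delta_s_box   # delta-G = delta-H - T * delta-S
    heat_to_sea = -delta_h                          # h = -delta-H
    delta_s_sea = heat_to_sea / temperature         # delta-S(sea) = h / T
    delta_s_universe = delta_s_sea + delta_s_box    # should equal -delta-G / T
    return {
        "delta_g": delta_g,
        "delta_s_sea": delta_s_sea,
        "delta_s_universe": delta_s_universe,
        "spontaneous": delta_g < 0,
    }

# Hypothetical reaction: releases heat but becomes more ordered inside the box
result = entropy_bookkeeping(delta_h=-20000.0, delta_s_box=-30.0, temperature=310.0)
print(result)
```

In this made-up case the box loses entropy, yet the heat released to the sea more than compensates, so the entropy of the universe rises and ΔG is negative.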

For a complex set of coupled reactions involving many 
different molecules, the total free-energy change can be com- 
puted simply by adding up the free energies of all the different 
molecular species after the reaction and comparing this value 
with the sum of free energies before the reaction; for common 
substances the required free-energy values can be found from 
published tables. In this way, one can predict the direction of 
a reaction and thereby readily check the feasibility of any 
proposed mechanism. Thus, for example, from the observed 
values for the magnitude of the electrochemical proton gradient 
across the inner mitochondrial membrane and the $\Delta G$ for ATP 
hydrolysis inside the mitochondrion, one can be certain that ATP 
synthase requires the passage of more than one proton for each 
molecule of ATP that it synthesizes. 
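
The kind of feasibility check described here can be sketched as follows. All numbers are hypothetical placeholders chosen only to show the arithmetic; they are not measured values, and the function names are not taken from any library.

```python
import math

def reaction_delta_g(reactants: dict, products: dict) -> float:
    """Total delta-G: sum of free energies of products minus that of reactants."""
    return sum(products.values()) - sum(reactants.values())

def min_protons_per_atp(cost_of_atp_synthesis: float, energy_per_proton: float) -> int:
    """Smallest whole number of protons whose combined energy covers ATP synthesis."""
    return math.ceil(cost_of_atp_synthesis / energy_per_proton)

# Hypothetical tabulated free energies (kJ/mole) for a made-up reaction A + B -> C
dg = reaction_delta_g({"A": -100.0, "B": -50.0}, {"C": -180.0})
print(dg)  # negative, so the made-up reaction would run forward

# Hypothetical energies (kJ/mole): cost of making ATP, and energy released per proton
protons = min_protons_per_atp(cost_of_atp_synthesis=50.0, energy_per_proton=20.0)
print(protons)
```

With these placeholder values a single proton cannot supply enough free energy, which is the form of argument the text uses for ATP synthase.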


The large negative value of $\Delta G$ for 
ATP hydrolysis in a cell merely reflects the fact that cells keep 
the ATP hydrolysis reaction as much as 10 orders of magnitude 
away from equilibrium. If a reaction reaches equilibrium, 
$\Delta G = 0$, and the reaction then proceeds at precisely equal rates 
in the forward and backward direction. For ATP hydrolysis, 
equilibrium is reached when the vast majority of the ATP 
has been hydrolyzed, as occurs in a dead cell. 
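
To see what "10 orders of magnitude away from equilibrium" implies in energy terms, the sketch below uses the standard relation ΔG = RT ln(Q/K), where Q is the actual ratio of products to reactants and K is the equilibrium constant. The temperature of 310 K and the function name are assumptions made for illustration.

```python
import math

R = 8.314  # gas constant, J/(K*mole)

def delta_g_from_equilibrium(q_over_k: float, temperature: float = 310.0) -> float:
    """delta-G in kJ/mole when the mass-action ratio Q equals q_over_k times K."""
    return R * temperature * math.log(q_over_k) / 1000.0

far_from_equilibrium = delta_g_from_equilibrium(1e-10)  # ten orders of magnitude short
at_equilibrium = delta_g_from_equilibrium(1.0)

print(f"{far_from_equilibrium:.1f} kJ/mole")
print(f"{at_equilibrium:.1f} kJ/mole")
```

Under these assumptions a displacement of ten orders of magnitude corresponds to a ΔG of roughly -59 kJ/mole, and ΔG falls to zero as Q approaches K.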





PANEL 2-8: Details of the 10 Steps of Glycolysis 


For each step, the part of the molecule that undergoes a change is shadowed in blue, 
and the name of the enzyme that catalyzes the reaction is in a yellow box. 


Step 1 Glucose is phosphorylated by ATP to form a sugar phosphate. 
The negative charge of the phosphate prevents passage of the sugar 
phosphate through the plasma membrane, trapping glucose inside 
the cell. 


Step 2 A readily reversible rearrangement of the chemical 
structure (isomerization) moves the carbonyl oxygen from carbon 1 to 
carbon 2, forming a ketose from an aldose sugar. (See Panel 2-3, 
pp. 70-71.) 


Step 3 The new hydroxyl group on carbon 1 is phosphorylated by ATP, in 
preparation for the formation of two three-carbon sugar 
phosphates. The entry of sugars into glycolysis is controlled at 
this step, through regulation of the enzyme phosphofructokinase. 





Step 4 The six-carbon sugar is cleaved to produce 
two three-carbon molecules. Only the glyceraldehyde 
3-phosphate can proceed immediately through glycolysis. 


Step 5 The other product of step 4, dihydroxyacetone 
phosphate, is isomerized to form glyceraldehyde 
3-phosphate. 


Reactions of steps 1-5 (structural formulas not reproduced here): 

Step 1 (hexokinase): glucose + ATP → glucose 6-phosphate + ADP + H+ 
Step 2 (phosphoglucose isomerase): glucose 6-phosphate → fructose 6-phosphate 
Step 3 (phosphofructokinase): fructose 6-phosphate + ATP → fructose 1,6-bisphosphate + ADP + H+ 
Step 4: fructose 1,6-bisphosphate → dihydroxyacetone phosphate + glyceraldehyde 3-phosphate 
Step 5 (triose phosphate isomerase): dihydroxyacetone phosphate → glyceraldehyde 3-phosphate 




Step 6 The two molecules of glyceraldehyde 3-phosphate 
are oxidized. The energy-generation phase of 
glycolysis begins, as NADH and a new high-energy anhydride 
linkage to phosphate are formed (see Figure 13-5). 


Step 7 The transfer 
to ADP of the 
high-energy phosphate 
group that was 
generated in step 6 
forms ATP. 


Step 8 The remaining 
phosphate ester linkage in 
3-phosphoglycerate, 
which has a relatively low 
free energy of hydrolysis, 
is moved from carbon 3 
to carbon 2 to form 
2-phosphoglycerate. 


Step 9 The removal of 
water from 2-phosphoglycerate 
creates a high-energy enol 
phosphate linkage. 


Step 10 The transfer to 
ADP of the high-energy 
phosphate group that was 
generated in step 9 forms 
ATP, completing 
glycolysis. 


Reactions of steps 6-10 (structural formulas not reproduced here): 

Step 6 (glyceraldehyde 3-phosphate dehydrogenase): glyceraldehyde 3-phosphate + NAD+ + phosphate → 1,3-bisphosphoglycerate + NADH + H+ 
Step 7: 1,3-bisphosphoglycerate + ADP → 3-phosphoglycerate + ATP 
Step 8 (phosphoglycerate mutase): 3-phosphoglycerate → 2-phosphoglycerate 
Step 9 (enolase): 2-phosphoglycerate → phosphoenolpyruvate + H2O 
Step 10: phosphoenolpyruvate + ADP → pyruvate + ATP 


NET RESULT OF GLYCOLYSIS 


glucose → two molecules of pyruvate 

In addition to the pyruvate, the net products are 
two molecules of ATP and two molecules of NADH. 




PANEL 2-9: The Complete Citric Acid Cycle 


Overview of the complete citric acid cycle. The two carbons from acetyl CoA that 
enter this turn of the cycle (shadowed in red) will be converted to CO2 in 
subsequent turns of the cycle; it is the two carbons shadowed in blue that are 
converted to CO2 in this cycle. 

Pyruvate is converted to acetyl CoA (2C), which enters the cycle. Intermediates, in order: 
oxaloacetate (4C) + acetyl CoA (2C) → citrate (6C) → isocitrate (6C) → 
α-ketoglutarate (5C) → succinyl CoA (4C) → succinate (4C) → fumarate (4C) → 
malate (4C) → oxaloacetate (4C), which begins the next cycle. 

Details of these eight steps are shown below. In this part of the panel, for each step, the part of the molecule that undergoes 
a change is shadowed in blue, and the name of the enzyme that catalyzes the reaction is in a yellow box. 


Step 1 After the enzyme removes a proton from the CH3 group on 
acetyl CoA, the negatively charged CH2− forms a bond to a carbonyl carbon 
of oxaloacetate. The subsequent loss by hydrolysis of the coenzyme 
A (HS-CoA) drives the reaction strongly forward. 

Reaction (citrate synthase): acetyl CoA + oxaloacetate → S-citryl-CoA intermediate; 
S-citryl-CoA intermediate + H2O → citrate + HS-CoA + H+ 


Step 2 An isomerization reaction, in which water is 
first removed and then added back, moves the 
hydroxyl group from one carbon atom to its neighbor. 

Reaction (aconitase): citrate → cis-aconitate intermediate → isocitrate 





Step 3 In the first of 
four oxidation steps in the 
cycle, the carbon carrying 
the hydroxyl group is 
converted to a carbonyl 
group. The immediate 
product is unstable, losing 
CO2 while still bound to 
the enzyme. 


Step 4 The α-ketoglutarate 
dehydrogenase complex closely 
resembles the large enzyme 
complex that converts pyruvate 
to acetyl CoA, the pyruvate 
dehydrogenase complex in 
Figure 13-10. It likewise 
catalyzes an oxidation that 
produces NADH, CO2, and a 
high-energy thioester bond to 
coenzyme A (CoA). 


Step 5 A phosphate 
molecule from solution 
displaces the CoA, forming a 
high-energy phosphate 
linkage to succinate. This 
phosphate is then passed to 
GDP to form GTP. (In bacteria 
and plants, ATP is formed 
instead.) 


Step 6 In the third oxidation 
step in the cycle, FAD accepts two 
hydrogen atoms from succinate. 


Step 7 The addition of 
water to fumarate places a 
hydroxyl group next to a 
carbonyl carbon. 


Step 8 In the last of four 
oxidation steps in the cycle, the 
carbon carrying the hydroxyl 
group is converted 
to a carbonyl group, 
regenerating the oxaloacetate 
needed for step 1. 


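Reactions of steps 3-8 (structural formulas not reproduced here): 

Step 3 (isocitrate dehydrogenase): isocitrate → oxalosuccinate intermediate → α-ketoglutarate + CO2 
Step 4 (α-ketoglutarate dehydrogenase complex): α-ketoglutarate + HS-CoA → succinyl-CoA + CO2 + NADH 
Step 5 (succinyl-CoA synthetase): succinyl-CoA + phosphate + GDP → succinate + HS-CoA + GTP 
Step 6 (succinate dehydrogenase): succinate + FAD → fumarate + FADH2 
Step 7 (fumarase): fumarate + H2O → malate 
Step 8 (malate dehydrogenase): malate → oxaloacetate 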
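
The outputs of one turn of the cycle can be tallied in the same way. This is a sketch under stated assumptions: the data layout is illustrative, and NADH is taken to be the product of the oxidations in steps 3 and 8 (the panel text names NADH explicitly only for step 4).

```python
from collections import Counter

# Each entry: (step, substrate, product, outputs released in that step)
CYCLE_STEPS = [
    (1, "oxaloacetate + acetyl CoA", "citrate", {}),
    (2, "citrate", "isocitrate", {}),
    (3, "isocitrate", "alpha-ketoglutarate", {"NADH": 1, "CO2": 1}),
    (4, "alpha-ketoglutarate", "succinyl CoA", {"NADH": 1, "CO2": 1}),
    (5, "succinyl CoA", "succinate", {"GTP": 1}),
    (6, "succinate", "fumarate", {"FADH2": 1}),
    (7, "fumarate", "malate", {}),
    (8, "malate", "oxaloacetate", {"NADH": 1}),
]

totals = Counter()
for _step, _substrate, _product, outputs in CYCLE_STEPS:
    totals.update(outputs)

# Carbon balance: oxaloacetate (4C) + acetyl CoA (2C) enter, two CO2 leave
carbons_remaining = 4 + 2 - totals["CO2"]

print(dict(totals))
print(f"carbons left in regenerated oxaloacetate: {carbons_remaining}")
```

The carbon balance closes at four carbons, matching the regeneration of oxaloacetate (4C) at the end of each turn.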
Proteins 


When we look at a cell through a microscope or analyze its electrical or bio- 
chemical activity, we are, in essence, observing proteins. Proteins constitute most 
of a cell’s dry mass. They are not only the cell’s building blocks; they also execute 
the majority of the cell’s functions. Thus, proteins that are enzymes provide 
the intricate molecular surfaces inside a cell that catalyze its many chemical 
reactions. Proteins embedded in the plasma membrane form channels and 
pumps that control the passage of small molecules into and out of the cell. Other 
proteins carry messages from one cell to another, or act as signal integrators that 
relay sets of signals inward from the plasma membrane to the cell nucleus. Yet 
others serve as tiny molecular machines with moving parts: kinesin, for example, 
propels organelles through the cytoplasm; topoisomerase can untangle knotted 
DNA molecules. Other specialized proteins act as antibodies, toxins, hormones, 
antifreeze molecules, elastic fibers, ropes, or sources of luminescence. Before 
we can hope to understand how genes work, how muscles contract, how nerves 
conduct electricity, how embryos develop, or how our bodies function, we must 
attain a deep understanding of proteins. 


THE SHAPE AND STRUCTURE OF PROTEINS 


From a chemical point of view, proteins are by far the most structurally complex 
and functionally sophisticated molecules known. This is perhaps not surpris- 
ing, once we realize that the structure and chemistry of each protein has been 
developed and fine-tuned over billions of years of evolutionary history. The the- 
oretical calculations of population geneticists reveal that, over evolutionary time 
periods, a surprisingly small selective advantage is enough to cause a randomly 
altered protein sequence to spread through a population of organisms. Yet, even 
to experts, the remarkable versatility of proteins can seem truly amazing. 

In this section, we consider how the location of each amino acid in the long 
string of amino acids that forms a protein determines its three-dimensional shape. 
Later in the chapter, we use this understanding of protein structure at the atomic 
level to describe how the precise shape of each protein molecule determines its 
function in a cell. 


The Shape of a Protein Is Specified by Its Amino Acid Sequence 


There are 20 different types of amino acids in proteins that are coded for directly in an 
organism’s DNA, each with different chemical properties. A protein molecule 
is made from a long unbranched chain of these amino acids, each linked to its 
neighbor through a covalent peptide bond. Proteins are therefore also known as 
polypeptides. Each type of protein has a unique sequence of amino acids, and 
there are many thousands of different proteins in a cell. 

The repeating sequence of atoms along the core of the polypeptide chain is 
referred to as the polypeptide backbone. Attached to this repetitive chain are 
those portions of the amino acids that are not involved in making a peptide bond 
and that give each amino acid its unique properties: the 20 different amino acid 
side chains (Figure 3-1). Some of these side chains are nonpolar and hydrophobic 
(“water-fearing”), others are negatively or positively charged, some readily form 
covalent bonds, and so on. Panel 3-1 (pp. 112-113) shows their atomic structures 
and Figure 3-2 lists their abbreviations. 

As discussed in Chapter 2, atoms behave almost as if they were hard spheres 
with a definite radius (their van der Waals radius). The requirement that no two 
atoms overlap plus other constraints limit the possible bond angles in a poly- 
peptide chain (Figure 3-3), severely restricting the possible three-dimensional 
arrangements (or conformations) of atoms. Nevertheless, a long flexible chain 
such as a protein can still fold in an enormous number of ways. 

The folding of a protein chain is also determined by many different sets of 
weak noncovalent bonds that form between one part of the chain and another. 
These involve atoms in the polypeptide backbone, as well as atoms in the amino 
acid side chains. There are three types of these weak bonds: hydrogen bonds, elec- 
trostatic attractions, and van der Waals attractions, as explained in Chapter 2 (see 
p. 44). Individual noncovalent bonds are 30-300 times weaker than the typical 
covalent bonds that create biological molecules. But many weak bonds acting in 
parallel can hold two regions of a polypeptide chain tightly together. In this way, 
the combined strength of large numbers of such noncovalent bonds determines 
the stability of each folded shape (Figure 3-4). 





AMINO ACID                      SIDE CHAIN 
Aspartic acid    Asp    D       negative 
Glutamic acid    Glu    E       negative 





Figure 3-1 The components of a protein. 
A protein consists of a polypeptide 
backbone with attached side chains. Each 
type of protein differs in its sequence and 
number of amino acids; therefore, it is the 
sequence of the chemically different side 
chains that makes each protein distinct. 
The two ends of a polypeptide chain are 
chemically different: the end carrying the 
free amino group (NH3+, also written NH2) 
is the amino terminus, or N-terminus, 
and that carrying the free carboxyl group 
(COO−, also written COOH) is the carboxyl 
terminus or C-terminus. The amino acid 
sequence of a protein is always presented 
in the N-to-C direction, reading from left 
to right. 




Figure 3-2 The 20 amino acids commonly found in proteins. Each amino acid has a three-letter and a 
one-letter abbreviation. There are equal numbers of polar and nonpolar side chains; however, some side 
chains listed here as polar are large enough to have some nonpolar properties (for example, Tyr, Thr, 
Arg, Lys). For atomic structures, see Panel 3-1 (pp. 112-113). 
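
For readers who want the abbreviations in machine-readable form, here is a minimal sketch of such a lookup table. The names and one- and three-letter codes are the standard ones; the grouping into ten polar and ten nonpolar side chains follows a common textbook convention assumed here (for example, Gly and Cys counted as nonpolar), and other classification schemes exist. The helper function is hypothetical.

```python
# one-letter code -> (name, three-letter code, side-chain class)
AMINO_ACIDS = {
    "D": ("Aspartic acid", "Asp", "negative"),
    "E": ("Glutamic acid", "Glu", "negative"),
    "R": ("Arginine", "Arg", "positive"),
    "K": ("Lysine", "Lys", "positive"),
    "H": ("Histidine", "His", "positive"),
    "N": ("Asparagine", "Asn", "uncharged polar"),
    "Q": ("Glutamine", "Gln", "uncharged polar"),
    "S": ("Serine", "Ser", "uncharged polar"),
    "T": ("Threonine", "Thr", "uncharged polar"),
    "Y": ("Tyrosine", "Tyr", "uncharged polar"),
    "A": ("Alanine", "Ala", "nonpolar"),
    "G": ("Glycine", "Gly", "nonpolar"),
    "V": ("Valine", "Val", "nonpolar"),
    "L": ("Leucine", "Leu", "nonpolar"),
    "I": ("Isoleucine", "Ile", "nonpolar"),
    "P": ("Proline", "Pro", "nonpolar"),
    "F": ("Phenylalanine", "Phe", "nonpolar"),
    "M": ("Methionine", "Met", "nonpolar"),
    "W": ("Tryptophan", "Trp", "nonpolar"),
    "C": ("Cysteine", "Cys", "nonpolar"),
}

def nonpolar_fraction(sequence: str) -> float:
    """Fraction of residues in a one-letter sequence whose side chain is nonpolar."""
    nonpolar = sum(1 for aa in sequence if AMINO_ACIDS[aa][2] == "nonpolar")
    return nonpolar / len(sequence)

n_nonpolar = sum(1 for entry in AMINO_ACIDS.values() if entry[2] == "nonpolar")
print(n_nonpolar, len(AMINO_ACIDS) - n_nonpolar)
print(nonpolar_fraction("MKVLA"))  # made-up five-residue sequence
```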
Figure 3-3 Steric limitations on the bond angles in a polypeptide chain. (A) Each amino acid 
contributes three bonds (red) to the backbone of the chain. The peptide bond is planar (gray shading) 
and does not permit rotation. By contrast, rotation can occur about the Cα–C bond, whose angle of 
rotation is called psi (ψ), and about the N–Cα bond, whose angle of rotation is called phi (φ). By 
convention, an R group is often used to denote an amino acid side chain (purple circles). (B) The 
conformation of the main-chain atoms in a protein is determined by one pair of φ and ψ angles for each 
amino acid; because of steric collisions between atoms within each amino acid, most of the possible 
pairs of φ and ψ angles do not occur. In this so-called Ramachandran plot, each dot represents an 
observed pair of angles in a protein. The three differently shaded clusters of dots reflect three 
different “secondary structures” repeatedly found in proteins, as will be described in the text. 
(B, from J. Richardson, Adv. Prot. Chem. 34:174-175, 1981.) 
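
The φ and ψ angles in this figure are torsion (dihedral) angles, each defined by four consecutive backbone atoms. The following is a minimal sketch of how such an angle can be computed from atomic coordinates. It assumes the usual convention that φ is defined by C(i-1), N(i), Cα(i), C(i) and ψ by N(i), Cα(i), C(i), N(i+1); the function names and the test coordinates are illustrative, not real protein data.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_degrees(p0, p1, p2, p3):
    """Torsion angle in degrees about the p1-p2 bond for four points in 3D."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)  # unit vector along central bond
    # Components of the outer bonds perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Illustrative geometry: the first three points are fixed, the fourth is moved
cis = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
quarter = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(cis, abs(trans), quarter)
```

Applying such a function to every residue of a known structure and plotting ψ against φ would produce the kind of scatter shown in a Ramachandran plot.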
A fourth weak force, a hydrophobic clustering force, also has a central role 
in determining the shape of a protein. As described in Chapter 2, hydrophobic 
molecules, including the nonpolar side chains of particular amino acids, tend to 
be forced together in an aqueous environment in order to minimize their disruptive 
effect on the hydrogen-bonded network of water molecules (see Panel 2-2, 
pp. 92-93). Therefore, an important factor governing the folding of any protein is 
the distribution of its polar and nonpolar amino acids. The nonpolar (hydrophobic) 
side chains in a protein, belonging to such amino acids as phenylalanine, 
leucine, valine, and tryptophan, tend to cluster in the interior of the molecule 
(just as hydrophobic oil droplets coalesce in water to form one large droplet). This 
enables them to avoid contact with the water that surrounds them inside a cell. 
In contrast, polar groups, such as those belonging to arginine, glutamine, and 
histidine, tend to arrange themselves near the outside of the molecule, where 
they can form hydrogen bonds with water and with other polar molecules (Figure 
3-5). Polar amino acids buried within the protein are usually hydrogen-bonded to 
other polar amino acids or to the polypeptide backbone. 

Figure 3-4 Three types of noncovalent bonds help proteins fold. Although a single one of these bonds is quite weak, many 
of them act together to create a strong bonding arrangement, as in the example shown. As in the previous figure, R is used as a 
general designation for an amino acid side chain. 

THE AMINO ACID 

The general formula of an amino acid is H2N–CH(R)–COOH, in which an amino group, a carboxyl 
group, and a side-chain group (R) are attached to the α-carbon atom. R is commonly one of 
20 different side chains. At pH 7 both the amino and carboxyl groups are ionized. 

OPTICAL ISOMERS 

The α-carbon atom is asymmetric, which allows for two mirror-image (or stereo-) isomers, 
L and D. Proteins consist exclusively of L-amino acids. 

Amino acids are commonly joined together by an amide linkage, called a peptide bond. 
Peptide bond: The four atoms in each gray box form a rigid planar unit. There is no 
rotation around the C–N bond. 


Proteins Fold into a Conformation of Lowest Energy 


As a result of all of these interactions, most proteins have a particular 
three-dimensional structure, which is determined by the order of the amino acids in 
their chains. The final folded structure, or conformation, of any polypeptide chain is 
generally the one that minimizes its free energy. Biologists have studied protein 
folding in a test tube using highly purified proteins. Treatment with certain 
solvents, which disrupt the noncovalent interactions holding the folded chain 
together, unfolds, or denatures, a protein. This treatment converts the protein into 
a flexible polypeptide chain that has lost its natural shape. When the denaturing 
solvent is removed, the protein often refolds spontaneously, or renatures, into its 
original conformation. This indicates that the amino acid sequence contains all of 
the information needed for specifying the three-dimensional shape of a protein, a 
critical point for understanding cell biology. 

Most proteins fold up into a single stable conformation. However, this confor- 
mation changes slightly when the protein interacts with other molecules in the 
cell. This change in shape is often crucial to the function of the protein, as we see 
later. 

Although a protein chain can fold into its correct conformation without out- 
side help, in a living cell special proteins called molecular chaperones often assist 
in protein folding. Molecular chaperones bind to partly folded polypeptide chains 
and help them progress along the most energetically favorable folding pathway. In 
the crowded conditions of the cytoplasm, chaperones are required to prevent the 
temporarily exposed hydrophobic regions in newly synthesized protein chains 
from associating with each other to form protein aggregates (see p. 355). However, 
the final three-dimensional shape of the protein is still specified by its amino acid 
sequence: chaperones simply make reaching the folded state more reliable. 
Figure 3-5 How a protein folds into a 
compact conformation. The polar amino 
acid side chains tend to lie on the outside 
of the protein, where they can interact with 
water; the nonpolar amino acid side chains 
are buried on the inside forming a tightly 
packed hydrophobic core of atoms that 
are hidden from water. In this schematic 
drawing, the protein contains only about 
35 amino acids. 
Proteins come in a wide variety of shapes, and most are between 50 and 
2000 amino acids long. Large proteins usually consist of several distinct protein 
domains—structural units that fold more or less independently of each other, as 
we discuss below. The structure of even a small domain is complex, and for clarity, 
several different representations are conventionally used, each of which empha- 
sizes distinct features. As an example, Figure 3-6 presents four representations 
of a protein domain called SH2, a structure present in many different proteins in 
eukaryotic cells and involved in cell signaling (see Figure 15-46). 

Descriptions of protein structures are aided by the fact that proteins are built 
up from combinations of several common structural motifs, as we discuss next. 


The α Helix and the β Sheet Are Common Folding Patterns 


When we compare the three-dimensional structures of many different protein 
molecules, it becomes clear that, although the overall conformation of each pro- 
tein is unique, two regular folding patterns are often found within them. Both pat- 
terns were discovered more than 60 years ago from studies of hair and silk. The 
first folding pattern to be discovered, called the α helix, was found in the protein 
α-keratin, which is abundant in skin and its derivatives, such as hair, nails, and 
horns. Within a year of the discovery of the α helix, a second folded structure, 
called a β sheet, was found in the protein fibroin, the major constituent of silk. 
These two patterns are particularly common because they result from hydro- 
gen-bonding between the N-H and C=O groups in the polypeptide backbone, 
without involving the side chains of the amino acids. Thus, although incompatible 
with some amino acid side chains, many different amino acid sequences can form 
them. In each case, the protein chain adopts a regular, repeating conformation. 
Figure 3-7 illustrates the detailed structures of these two important conforma- 
tions, which in ribbon models of proteins are represented by a helical ribbon and 
by a set of aligned arrows, respectively. 
Figure 3-6 Four representations 
describing the structure of a small 
protein domain. Constructed from a string 
of 100 amino acids, the SH2 domain is part 
of many different proteins (see, for example, 
Figure 3-61). Here, the structure of the 
SH2 domain is displayed as (A) a 
polypeptide backbone model, (B) a ribbon 
model, (C) a wire model that includes the 
amino acid side chains, and (D) a space- 
filling model (Movie 3.1). These images 
are colored in a way that allows the 
polypeptide chain to be followed from its 
N-terminus (purple) to its C-terminus (red) 
(PDB code: 1SHA). 
Figure 3-7 The regular conformation of the polypeptide backbone in the α helix and the β sheet. The α helix is shown in 
(A) and (B). The N-H of every peptide bond is hydrogen-bonded to the C=O of a neighboring peptide bond located four peptide 
bonds away in the same chain. Note that all of the N-H groups point up in this diagram and that all of the C=O groups point 
down (toward the C-terminus); this gives a polarity to the helix, with the C-terminus having a partial negative and the N-terminus 
a partial positive charge (Movie 3.2). The β sheet is shown in (C) and (D). In this example, adjacent peptide chains run in 
opposite (antiparallel) directions. Hydrogen-bonding between peptide bonds in different strands holds the individual polypeptide 
chains (strands) together in a β sheet, and the amino acid side chains in each strand alternately project above and below the 
plane of the sheet (Movie 3.3). (A) and (C) show all the atoms in the polypeptide backbone, but the amino acid side chains are 
truncated and denoted by R. In contrast, (B) and (D) show only the carbon and nitrogen backbone atoms. 


The cores of many proteins contain extensive regions of β sheet. As shown 
in Figure 3-8, these β sheets can form either from neighboring segments of the 
polypeptide backbone that run in the same orientation (parallel chains) or from 
a polypeptide backbone that folds back and forth upon itself, with each section 
of the chain running in the direction opposite to that of its immediate 
neighbors (antiparallel chains). Both types of β sheet produce a very rigid structure, 
held together by hydrogen bonds that connect the peptide bonds in neighboring 
chains (see Figure 3-7C). 

An α helix is generated when a single polypeptide chain twists around on itself 
to form a rigid cylinder. A hydrogen bond forms between every fourth peptide 
bond, linking the C=O of one peptide bond to the N-H of another (see Figure 
3-7A). This gives rise to a regular helix with a complete turn every 3.6 amino acids. 
The SH2 protein domain illustrated in Figure 3-6 contains two α helices, as well as 
a three-stranded antiparallel β sheet. 
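
The regularity of this hydrogen-bonding pattern is easy to express in code. The sketch below is an idealized illustration: it numbers residues from 1 and assumes the common description in which the C=O of residue i is bonded to the N-H of residue i + 4; the function names are hypothetical.

```python
RESIDUES_PER_TURN = 3.6  # from the text: one complete turn every 3.6 amino acids

def helix_hbond_pairs(n_residues: int) -> list:
    """Pairs (i, i + 4): C=O of residue i bonded to N-H of residue i + 4."""
    return [(i, i + 4) for i in range(1, n_residues - 3)]

def helix_turns(n_residues: int) -> float:
    """Number of complete turns made by an ideal helix of n_residues."""
    return n_residues / RESIDUES_PER_TURN

pairs = helix_hbond_pairs(10)
print(pairs)
print(round(helix_turns(18), 2))
```

Because only backbone groups take part, the same pattern applies whatever the side chains are, which is why many different sequences can adopt this conformation.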

Regions of α helix are abundant in proteins located in cell membranes, such 
as transport proteins and receptors. As we discuss in Chapter 10, those portions 
of a transmembrane protein that cross the lipid bilayer usually cross as α helices 
composed largely of amino acids with nonpolar side chains. The polypeptide 
backbone, which is hydrophilic, is hydrogen-bonded to itself in the α helix and 
shielded from the hydrophobic lipid environment of the membrane by its 
protruding nonpolar side chains (see also Figure 3-75A). 

In other proteins, α helices wrap around each other to form a particularly 
stable structure, known as a coiled-coil. This structure can form when the two (or in 
some cases, three or four) α helices have most of their nonpolar (hydrophobic) 
side chains on one side, so that they can twist around each other with these side 
chains facing inward (Figure 3-9). Long rodlike coiled-coils provide the structural 
framework for many elongated proteins. Examples are α-keratin, which forms the 
intracellular fibers that reinforce the outer layer of the skin and its appendages, 
and the myosin molecules responsible for muscle contraction. 


Protein Domains Are Modular Units from Which Larger Proteins 
Are Built 


Even a small protein molecule is built from thousands of atoms linked together by 
precisely oriented covalent and noncovalent bonds. Biologists are aided in visu- 
alizing these extremely complicated structures by various graphic and comput- 
er-based three-dimensional displays. The student resource site that accompanies 
this book contains computer-generated images of selected proteins, displayed 
and rotated on the screen in a variety of formats. 

Scientists distinguish four levels of organization in the structure of a protein. 
The amino acid sequence is known as the primary structure. Stretches of 
polypeptide chain that form α helices and β sheets constitute the protein’s 
secondary structure. The full three-dimensional organization of a polypeptide chain is 
sometimes referred to as the tertiary structure, and if a particular protein mol- 
ecule is formed as a complex of more than one polypeptide chain, the complete 
structure is designated as the quaternary structure. 

Studies of the conformation, function, and evolution of proteins have also 
revealed the central importance of a unit of organization distinct from these four. 
This is the protein domain, a substructure produced by any contiguous part of 
a polypeptide chain that can fold independently of the rest of the protein into a 
compact, stable structure. A domain usually contains between 40 and 350 amino 
acids, and it is the modular unit from which many larger proteins are constructed. 

The different domains of a protein are often associated with different func- 
tions. Figure 3-10 shows an example—the Src protein kinase, which functions in 
signaling pathways inside vertebrate cells (Src is pronounced “sarc”). This protein 
Figure 3-8 Two types of β sheet 
structures. (A) An antiparallel β sheet (see 
Figure 3-7C). (B) A parallel β sheet. Both of 
these structures are common in proteins. 


Figure 3-9 A coiled-coil. (A) A single α 
helix, with successive amino acid side 
chains labeled in a sevenfold sequence, 
“abcdefg” (from bottom to top). Amino 
acids “a” and “d” in such a sequence lie 
close together on the cylinder surface, 
forming a “stripe” (green) that winds 
slowly around the α helix. Proteins that 
form coiled-coils typically have nonpolar 
amino acids at positions “a” and “d.” 
Consequently, as shown in (B), the two α 
helices can wrap around each other with 
the nonpolar side chains of one α helix 
interacting with the nonpolar side chains 
of the other. (C) The atomic structure 
of a coiled-coil determined by x-ray 
crystallography. The α-helical backbone 
is shown in red and the nonpolar side 
chains in green, while the more hydrophilic 
amino acid side chains, shown in gray, are 
left exposed to the aqueous environment 
(Movie 3.4). (PDB code: 3NMD.) 
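
The slowly winding “stripe” formed by positions “a” and “d” can be illustrated with a simple helical-wheel calculation. This is a sketch under an idealized assumption: a straight α helix with exactly 3.6 residues per turn, so that each residue is rotated 100 degrees from the previous one. The function name is illustrative.

```python
DEGREES_PER_RESIDUE = 100.0  # 360 degrees / 3.6 residues per turn (idealized)
HEPTAD = "abcdefg"

def wheel_angle(index: int) -> float:
    """Angular position (degrees) of residue `index` viewed down the helix axis."""
    return (index * DEGREES_PER_RESIDUE) % 360.0

# Positions of the "a" and "d" residues over three heptad repeats
stripe = [(i, HEPTAD[i % 7], wheel_angle(i))
          for i in range(21) if HEPTAD[i % 7] in "ad"]
for index, label, angle in stripe:
    print(index, label, angle)
```

In this idealized picture each successive “a” (and each successive “d”) sits 20 degrees further around the helix than the one before, so the nonpolar residues stay on one face while the stripe drifts gradually around the cylinder.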


is considered to have three domains: the SH2 and SH3 domains have regulatory 
roles, while the C-terminal domain is responsible for the kinase catalytic activity. 
Later in the chapter, we shall return to this protein, in order to explain how pro- 
teins can form molecular switches that transmit information throughout cells. 

Figure 3-11 presents ribbon models of three differently organized protein 
domains. As these examples illustrate, the central core of a domain can be 
constructed from α helices, from β sheets, or from various combinations of these two 
fundamental folding elements. 

The smallest protein molecules contain only a single domain, whereas larger 
proteins can contain several dozen domains, often connected to each other by 
short, relatively unstructured lengths of polypeptide chain that can act as flexible 
hinges between domains. 


Few of the Many Possible Polypeptide Chains Will Be Useful 
to Cells 


Since each of the 20 amino acids is chemically distinct and each can, in 
principle, occur at any position in a protein chain, there are 20 × 20 × 20 × 20 = 160,000 
different possible polypeptide chains four amino acids long, or $20^n$ different 
possible polypeptide chains n amino acids long. For a typical protein length of about 
300 amino acids, a cell could theoretically make more than $10^{390}$ ($20^{300}$) different 
polypeptide chains. This is such an enormous number that to produce just one 
molecule of each kind would require many more atoms than exist in the universe. 
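
These counts are easy to verify with exact integer arithmetic. In the sketch below the function name is illustrative, and the figure of roughly 10^80 atoms in the observable universe is a commonly quoted order-of-magnitude estimate used here as an assumption.

```python
def n_possible_chains(length: int, alphabet_size: int = 20) -> int:
    """Number of distinct polypeptide chains of a given length."""
    return alphabet_size ** length

four_long = n_possible_chains(4)
typical = n_possible_chains(300)

# Power of ten of the leading digit: number of decimal digits minus one
exponent = len(str(typical)) - 1

ATOMS_IN_UNIVERSE_LOG10 = 80  # assumed order-of-magnitude estimate

print(four_long)
print(f"20^300 is about 10^{exponent}")
print(exponent > ATOMS_IN_UNIVERSE_LOG10)
```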

Only a very small fraction of this vast set of conceivable polypeptide chains 
would adopt a stable three-dimensional conformation—by some estimates, less 







Figure 3-10 A protein formed from 
multiple domains. In the Src protein 
shown, a C-terminal domain with two lobes 
(yellow and orange) forms a protein kinase 
enzyme, while the SH2 and SH3 domains 
perform regulatory functions. (A) A ribbon 
model, with ATP substrate in red. (B) A 
space-filling model, with ATP substrate in 
red. Note that the site that binds ATP is 
positioned at the interface of the two lobes 
that form the kinase. The structure of the 
SH2 domain was illustrated in Figure 3-6. 
(PDB code: 2SRC.) 


Figure 3-11 Ribbon models of three 
different protein domains. (A) Cytochrome 
b562, a single-domain protein involved in
electron transport in mitochondria. This 
protein is composed almost entirely of 

α helices. (B) The NAD-binding domain of
the enzyme lactic dehydrogenase, which
is composed of a mixture of α helices and
parallel β sheets. (C) The variable domain
of an immunoglobulin (antibody) light
chain, composed of a sandwich of two
antiparallel β sheets. In these examples, the
α helices are shown in green, while strands
organized as β sheets are denoted by red
arrows. Note how the polypeptide chain 
generally traverses back and forth across 
the entire domain, making sharp turns only 
at the protein surface (Movie 3.5). It is the 
protruding loop regions (yellow) that often 
form the binding sites for other molecules. 
(Adapted from drawings courtesy of Jane 
Richardson.) 
than one in a billion. And yet the majority of proteins present in cells do adopt
unique and stable conformations. How is this possible? The answer lies in natu- 
ral selection. A protein with an unpredictably variable structure and biochemical 
activity is unlikely to help the survival of a cell that contains it. Such proteins would 
therefore have been eliminated by natural selection through the enormously long 
trial-and-error process that underlies biological evolution. 

Because evolution has selected for protein function in living organisms, the 
amino acid sequence of most present-day proteins is such that a single confor- 
mation is stable. In addition, this conformation has its chemical properties finely 
tuned to enable the protein to perform a particular catalytic or structural function 
in the cell. Proteins are so precisely built that the change of even a few atoms in 
one amino acid can sometimes disrupt the structure of the whole molecule so 
severely that all function is lost. And, as discussed later in this chapter, when cer- 
tain rare protein misfolding accidents occur, the results can be disastrous for the 
organisms that contain them. 


Proteins Can Be Classified into Many Families 


Once a protein had evolved that folded up into a stable conformation with use- 
ful properties, its structure could be modified during evolution to enable it to 
perform new functions. This process has been greatly accelerated by genetic 
mechanisms that occasionally duplicate genes, allowing one gene copy to evolve 
independently to perform a new function (discussed in Chapter 4). This type of 
event has occurred very often in the past; as a result, many present-day proteins 
can be grouped into protein families, each family member having an amino acid 
sequence and a three-dimensional conformation that resemble those of the other 
family members. 

Consider, for example, the serine proteases, a large family of protein-cleaving 
(proteolytic) enzymes that includes the digestive enzymes chymotrypsin, trypsin, 
and elastase, and several proteases involved in blood clotting. When the prote- 
ase portions of any two of these enzymes are compared, parts of their amino acid 
sequences are found to match. The similarity of their three-dimensional con- 
formations is even more striking: most of the detailed twists and turns in their 
polypeptide chains, which are several hundred amino acids long, are virtually 
identical (Figure 3-12). The many different serine proteases nevertheless have 
distinct enzymatic activities, each cleaving different proteins or the peptide bonds 
between different types of amino acids. Each therefore performs a distinct func- 
tion in an organism. 

The story we have told for the serine proteases could be repeated for hundreds 
of other protein families. In general, the structure of the different members of a 
Figure 3-12 A comparison of the
conformations of two serine proteases. 
The backbone conformations of elastase 
and chymotrypsin. Although only those 
amino acids in the polypeptide chain 
shaded in green are the same in the two 
proteins, the two conformations are very 
similar nearly everywhere. The active site of 
each enzyme is circled in red; this is where 
the peptide bonds of the proteins that 
serve as substrates are bound and cleaved 
by hydrolysis. The serine proteases derive 
their name from the amino acid serine, 
whose side chain is part of the active site 
of each enzyme and directly participates 

in the cleavage reaction. The two dots on 
the right side of the chymotrypsin molecule 
mark the new ends created when this 
enzyme cuts its own backbone. 
Figure 3-13 A comparison of a class of DNA-binding domains, called homeodomains, in a pair of proteins from
two organisms separated by more than a billion years of evolution. (A) A ribbon model of the structure common to
both proteins. (B) A trace of the α-carbon positions. The three-dimensional structures shown were determined by x-ray
crystallography for the yeast α2 protein (green) and the Drosophila engrailed protein (red). (C) A comparison of amino acid
sequences for the region of the proteins shown in (A) and (B). Black dots mark sites with identical amino acids. Orange dots
indicate the position of a three-amino-acid insert in the α2 protein. (Adapted from C. Wolberger et al., Cell 67:517-528, 1991.
With permission from Elsevier.)


protein family has been more highly conserved than has the amino acid sequence. 
In many cases, the amino acid sequences have diverged so far that we cannot be 
certain of a family relationship between two proteins without determining their 
three-dimensional structures. The yeast α2 protein and the Drosophila engrailed
protein, for example, are both gene regulatory proteins in the homeodomain fam- 
ily (discussed in Chapter 7). Because they are identical in only 17 of their 60 amino 
acid residues, their relationship became certain only by comparing their three-di- 
mensional structures (Figure 3-13). Many similar examples show that two pro- 
teins with more than 25% identity in their amino acid sequences usually share the 
same overall structure. 
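Percent identity, the measure used in this paragraph, is straightforward to compute once two sequences have been aligned. The sketch below assumes the alignment has already been made and that gaps are written as "-"; the function name and the toy fragments are invented for illustration and are not real protein sequences.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of aligned positions (no gap in either sequence) holding the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy, made-up aligned fragments: 4 matches over 6 compared positions.
print(round(percent_identity("MKT-LLV", "MRTALIV"), 1))  # 66.7

# The homeodomain comparison quoted in the text: 17 identical residues out of 60.
print(round(100.0 * 17 / 60, 1))  # 28.3
```

Other conventions exist for the denominator (for example, the length of the shorter sequence); the choice made here, counting only ungapped columns, is one common option and is an assumption of this sketch.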

The various members of a large protein family often have distinct functions. 
Some of the amino acid changes that make family members different were no 
doubt selected in the course of evolution because they resulted in useful changes 
in biological activity, giving the individual family members the different functional 
properties they have today. But many other amino acid changes are effectively 
“neutral,” having neither a beneficial nor a damaging effect on the basic structure 
and function of the protein. In addition, since mutation is a random process, there 
must also have been many deleterious changes that altered the three-dimensional 
structure of these proteins sufficiently to harm them. Such faulty proteins would 
have been lost whenever the individual organisms making them were at enough 
of a disadvantage to be eliminated by natural selection. 

Protein families are readily recognized when the genome of any organism is 
sequenced; for example, the determination of the DNA sequence for the entire 
human genome has revealed that we contain about 21,000 protein-coding genes. 
(Note, however, that as a result of alternative RNA splicing, human cells can produce
many more than 21,000 different proteins, as will be explained in Chapter
6.) Through sequence comparisons, we can assign the products of at least 40% of 
our protein-coding genes to known protein structures, belonging to more than 
500 different protein families. Most of the proteins in each family have evolved 
to perform somewhat different functions, as for the enzymes elastase and chy- 
motrypsin illustrated previously in Figure 3-12. As explained in Chapter 1 (see 
Figure 1-21), these are sometimes called paralogs to distinguish them from the 
many corresponding proteins in different organisms (orthologs, such as mouse 
and human elastase). 
As described in Chapter 8, because of the powerful techniques of x-ray crystallography
and nuclear magnetic resonance (NMR), we now know the three-dimensional
shapes, or conformations, of more than 100,000 proteins. By carefully
comparing the conformations of these proteins, structural biologists (that is, 
experts on the structure of biological molecules) have concluded that there are 
a limited number of ways in which protein domains fold up in nature—maybe as 
few as 2000, if we consider all organisms. For most of these so-called protein folds, 
representative structures have been determined. 

The present database of known protein sequences contains more than twenty 
million entries, and it is growing very rapidly as more and more genomes are 
sequenced—revealing huge numbers of new genes that encode proteins. The 
encoded polypeptides range widely in size, from 6 amino acids to a gigantic pro- 
tein of 33,000 amino acids. Protein comparisons are important because related 
structures often imply related functions. Many years of experimentation can be 
saved by discovering that a new protein has an amino acid sequence similarity 
with a protein of known function. Such sequence relationships, for example, first 
indicated that certain genes that cause mammalian cells to become cancerous 
encode protein kinases (discussed in Chapter 20). 


Some Protein Domains Are Found in Many Different Proteins


As previously stated, most proteins are composed of a series of protein domains, 
in which different regions of the polypeptide chain fold independently to form 
compact structures. Such multidomain proteins are believed to have originated 
from the accidental joining of the DNA sequences that encode each domain, cre- 
ating a new gene. In an evolutionary process called domain shuffling, many large 
proteins have evolved through the joining of preexisting domains in new com- 
binations (Figure 3-14). Novel binding surfaces have often been created at the 
juxtaposition of domains, and many of the functional sites where proteins bind to 
small molecules are found to be located there. 

A subset of protein domains has been especially mobile during evolution; 
these seem to have particularly versatile structures and are sometimes referred to 
as protein modules. The structure of one, the SH2 domain, was illustrated in Figure 
3-6. Three other abundant protein domains are illustrated in Figure 3-15. 

Each of the domains shown has a stable core structure formed from strands 
of β sheets, from which less-ordered loops of polypeptide chain protrude. The
loops are ideally situated to form binding sites for other molecules, as most clearly 
demonstrated for the immunoglobulin fold, which forms the basis for antibody 
molecules. Such β-sheet-based domains may have achieved their evolutionary
success because they provide a convenient framework for the generation of new 
binding sites for ligands, requiring only small changes to their protruding loops 
(see Figure 3-42). 
Figure 3-14 Domain shuffling. An
extensive shuffling of blocks of protein 
sequence (protein domains) has occurred 
during protein evolution. Those portions 

of a protein denoted by the same shape 
and color in this diagram are evolutionarily 
related. Serine proteases like chymotrypsin 
are formed from two domains (brown). In 
the three other proteases shown, which 
are highly regulated and more specialized, 
these two protease domains are connected 
to one or more domains that are similar to 
domains found in epidermal growth factor 
(EGF; green), to a calcium-binding protein 
(yellow), or to a “kringle” domain (blue). 
Chymotrypsin is illustrated in Figure 3-12. 


Figure 3-15 The three-dimensional 
structures of three commonly used 
protein domains. In these ribbon 
diagrams, β-sheet strands are shown

as arrows, and the N- and C-termini are 
indicated by red spheres. Many more such 
“modules” exist in nature. (Adapted from 
M. Baron, D.G. Norman and I.D. Campbell, 
Trends Biochem. Sci. 16:13-17, 1991, with 
permission from Elsevier, and D.J. Leahy 
et al., Science 258:987-991, 1992, with 
permission from AAAS.) 
Figure 3-16 An extended structure formed from a series of protein
domains. Four fibronectin type 3 domains (see Figure 3-15) from the
extracellular matrix molecule fibronectin are illustrated in (A) ribbon and (B)
space-filling models. (Adapted from D.J. Leahy, I. Aukhil and H.P. Erickson,
Cell 84:155-164, 1996. With permission from Elsevier.)


A second feature of these protein domains that explains their utility is the ease 
with which they can be integrated into other proteins. Two of the three domains 
illustrated in Figure 3-15 have their N- and C-terminal ends at opposite poles of 
the domain. When the DNA encoding such a domain undergoes tandem duplica- 
tion, which is not unusual in the evolution of genomes (discussed in Chapter 4), 
the duplicated domains with this “in-line” arrangement can be readily linked in 
series to form extended structures—either with themselves or with other in-line 
domains (Figure 3-16). Stiff extended structures composed of a series of domains 
are especially common in extracellular matrix molecules and in the extracellular 
portions of cell-surface receptor proteins. Other frequently used domains, includ- 
ing the kringle domain illustrated in Figure 3-15 and the SH2 domain, are of a 
“plug-in” type, with their N- and C-termini close together. After genomic rear- 
rangements, such domains are usually accommodated as an insertion into a loop 
region of a second protein. 

A comparison of the relative frequency of domain utilization in different 
eukaryotes reveals that, for many common domains, such as protein kinases, this 
frequency is similar in organisms as diverse as yeast, plants, worms, flies, and 
humans. But there are some notable exceptions, such as the Major Histocom- 
patibility Complex (MHC) antigen-recognition domain (see Figure 24-36) that 
is present in 57 copies in humans, but absent in the other four organisms just 
mentioned. Domains such as these have specialized functions that are not shared 
with the other eukaryotes; they are assumed to have been strongly selected for 
during recent evolution to produce the multiple copies observed. Similarly, the 
SH2 domain shows an unusual increase in its numbers in higher eukaryotes; such 
domains might be assumed to be especially useful for multicellularity. 


Certain Pairs of Domains Are Found Together in Many Proteins 


We can construct a large table displaying domain usage for each organism whose 
genome sequence is known. For example, the human genome contains the DNA 
sequences for about 1000 immunoglobulin domains, 500 protein kinase domains, 
250 DNA-binding homeodomains, 300 SH3 domains, and 120 SH2 domains. In 
addition, we find that more than two-thirds of all proteins consist of two or more 
domains, and that the same pairs of domains occur repeatedly in the same rela- 
tive arrangement in a protein. Although half of all domain families are common 
to archaea, bacteria, and eukaryotes, only about 5% of the two-domain combi- 
nations are similarly shared. This pattern suggests that most proteins containing 
especially useful two-domain combinations arose through domain shuffling rel- 
atively late in evolution. 


The Human Genome Encodes a Complex Set of Proteins, 
Revealing That Much Remains Unknown 


The result of sequencing the human genome has been surprising, because it 
reveals that our chromosomes contain only about 21,000 protein-coding genes. 
Based on this number alone, we would appear to be no more complex than the 
tiny mustard weed, Arabidopsis, and only about 1.3-fold more complex than a 
nematode worm. The genome sequences also reveal that vertebrates have inher- 
ited nearly all of their protein domains from invertebrates—with only 7% of iden- 
tified human domains being vertebrate-specific. 

Each of our proteins is on average more complicated, however (Figure 3-17). 
Domain shuffling during vertebrate evolution has given rise to many novel 
Figure 3-17 Domain structure of a group
of evolutionarily related proteins that
are thought to have a similar function. In
general, there is a tendency for the proteins
in more complex organisms, such as
humans, to contain additional domains, as
is the case for the DNA-binding protein
compared here.
combinations of protein domains, with the result that there are nearly twice as
many combinations of domains found in human proteins as in a worm or a fly. 
Thus, for example, the trypsinlike serine protease domain is linked to at least 18 
other types of protein domains in human proteins, whereas it is found covalently 
joined to only 5 different domains in the worm. This extra variety in our proteins 
greatly increases the range of protein-protein interactions possible (see Figure 
3-79), but how it contributes to making us human is not known. 

The complexity of living organisms is staggering, and it is quite sobering to 
note that we currently lack even the tiniest hint of what the function might be 
for more than 10,000 of the proteins that have thus far been identified through 
examining the human genome. There are certainly enormous challenges ahead 
for the next generation of cell biologists, with no shortage of fascinating mysteries 
to solve. 


Larger Protein Molecules Often Contain More Than One 
Polypeptide Chain 


The same weak noncovalent bonds that enable a protein chain to fold into a spe- 
cific conformation also allow proteins to bind to each other to produce larger 
structures in the cell. Any region of a protein’s surface that can interact with 
another molecule through sets of noncovalent bonds is called a binding site. A 
protein can contain binding sites for various large and small molecules. If a bind- 
ing site recognizes the surface of a second protein, the tight binding of two folded 
polypeptide chains at this site creates a larger protein molecule with a precisely 
defined geometry. Each polypeptide chain in such a protein is called a protein 
subunit. 

In the simplest case, two identical folded polypeptide chains bind to each 
other in a “head-to-head” arrangement, forming a symmetric complex of two 
protein subunits (a dimer) held together by interactions between two identical 
binding sites. The Cro repressor protein—a viral gene regulatory protein that binds 
to DNA to turn specific viral genes off in an infected bacterial cell—provides an 
example (Figure 3-18). Cells contain many other types of symmetric protein com- 
plexes, formed from multiple copies of a single polypeptide chain (for example, 
see Figure 3-20 below). 

Many of the proteins in cells contain two or more types of polypeptide chains. 
Hemoglobin, the protein that carries oxygen in red blood cells, contains two 
identical α-globin subunits and two identical β-globin subunits, symmetrically
arranged (Figure 3-19). Such multisubunit proteins are very common in cells, 
and they can be very large (Movie 3.6). 


Some Globular Proteins Form Long Helical Filaments


Most of the proteins that we have discussed so far are globular proteins, in which 
the polypeptide chain folds up into a compact shape like a ball with an irregu- 
lar surface. Some of these protein molecules can nevertheless assemble to form 
filaments that may span the entire length of a cell. Most simply, a long chain of 
identical protein molecules can be constructed if each molecule has a binding 
Figure 3-18 Two identical protein
subunits binding together to form
a symmetric protein dimer. The Cro
repressor protein from bacteriophage
lambda binds to DNA to turn off a specific
subset of viral genes. Its two identical
subunits bind head-to-head, held together
by a combination of hydrophobic forces
(blue) and a set of hydrogen bonds (yellow
region). (Adapted from D.H. Ohlendorf,
D.E. Tronrud and B.W. Matthews, J. Mol.
Biol. 280:129-136, 1998. With permission
from Academic Press.)





Figure 3-19 A protein formed as a 
symmetric assembly using two each of 
two different subunits. Hemoglobin is an 
abundant protein in red blood cells that 
contains two copies of α-globin (green)
and two copies of β-globin (blue). Each of
these four polypeptide chains contains a 
heme molecule (red), which is the site that 
binds oxygen (O2). Thus, each molecule 
of hemoglobin in the blood carries four 
molecules of oxygen. (PDB code: 2DHB.) 
Figure 3-20 Protein assemblies. (A) A protein with just one binding site can
form a dimer with another identical protein. (B) Identical proteins with two
different binding sites often form a long helical filament. (C) If the two binding
sites are disposed appropriately in relation to each other, the protein subunits
may form a closed ring instead of a helix. (For an example of A, see
Figure 3-18; for an example of B, see Figure 3-21; for examples of C, see
Figures 5-14 and 14-21.)


site complementary to another region of the surface of the same molecule (Figure 
3-20). An actin filament, for example, is a long helical structure produced from 
many molecules of the protein actin (Figure 3-21). Actin is a globular protein that 
is very abundant in eukaryotic cells, where it forms one of the major filament sys- 
tems of the cytoskeleton (discussed in Chapter 16). 

We will encounter many helical structures in this book. Why is a helix such a 
common structure in biology? As we have seen, biological structures are often 
formed by linking similar subunits into long, repetitive chains. If all the subunits 
are identical, the neighboring subunits in the chain can often fit together in only 
one way, adjusting their relative positions to minimize the free energy of the con- 
tact between them. As a result, each subunit is positioned in exactly the same way 
in relation to the next, so that subunit 3 fits onto subunit 2 in the same way that 
subunit 2 fits onto subunit 1, and so on. Because it is very rare for subunits to join 
up in a straight line, this arrangement generally results in a helix—a regular struc- 
ture that resembles a spiral staircase, as illustrated in Figure 3-22. Depending on 
the twist of the staircase, a helix is said to be either right-handed or left-handed 
(see Figure 3-22E). Handedness is not affected by turning the helix upside down, 
but it is reversed if the helix is reflected in the mirror. 

The observation that helices occur commonly in biological structures holds 
true whether the subunits are small molecules linked together by covalent bonds 
(for example, the amino acids in an α helix) or large protein molecules that are
linked by noncovalent forces (for example, the actin molecules in actin filaments). 
This is not surprising. A helix is an unexceptional structure, and it is generated 
simply by placing many similar subunits next to each other, each in the same 
strictly repeated relationship to the one before—that is, with a fixed rotation fol- 
lowed by a fixed translation along the helix axis, as in a spiral staircase. 
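The rule stated here, one fixed rotation followed by one fixed translation along the axis, can be turned directly into coordinates. The following is a minimal geometric sketch, with invented function names and arbitrary units, that builds such a helix along the z axis and then checks the two handedness properties described earlier: turning the helix upside down leaves its handedness unchanged, whereas reflecting it in a mirror reverses it.

```python
import math


def build_helix(n_subunits, rotation_deg, rise, radius=1.0):
    """Place subunits by repeating one fixed rotation plus one fixed translation along z."""
    points = []
    for k in range(n_subunits):
        angle = math.radians(rotation_deg) * k
        points.append((radius * math.cos(angle), radius * math.sin(angle), rise * k))
    return points


def handedness(points):
    """Return 'right' or 'left' for a helix whose axis lies along z."""
    total = 0.0
    for (x0, y0, z0), (x1, y1, z1) in zip(points, points[1:]):
        turn = x0 * y1 - y0 * x1   # positive for counterclockwise rotation seen from +z
        total += turn * (z1 - z0)  # combine sense of rotation with direction of travel
    return "right" if total > 0 else "left"


def upside_down(points):
    """Rotate the whole helix by 180 degrees about the x axis."""
    return [(x, -y, -z) for x, y, z in points]


def mirror(points):
    """Reflect the helix in the plane z = 0."""
    return [(x, y, -z) for x, y, z in points]


helix = build_helix(12, rotation_deg=60, rise=0.5)  # six subunits per turn
print(handedness(helix))               # right
print(handedness(upside_down(helix)))  # right
print(handedness(mirror(helix)))       # left
```

Changing the radius while keeping the rotation fixed gives a wider helix with the same number of subunits per turn; a negative rotation angle gives the left-handed form.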


Many Protein Molecules Have Elongated, Fibrous Shapes 


Enzymes tend to be globular proteins: even though many are large and compli- 
cated, with multiple subunits, most have an overall rounded shape. In Figure 3-21, 
we saw that a globular protein can also associate to form long filaments. But there
are also functions that require each individual protein molecule to span a large 
distance. These proteins generally have a relatively simple, elongated three-di- 
mensional structure and are commonly referred to as fibrous proteins. 

One large family of intracellular fibrous proteins consists of α-keratin, introduced
when we presented the α helix, and its relatives. Keratin filaments are
extremely stable and are the main component in long-lived structures such as
hair, horn, and nails. An α-keratin molecule is a dimer of two identical subunits,
with the long α helices of each subunit forming a coiled-coil (see Figure 3-9). The
coiled-coil regions are capped at each end by globular domains containing bind- 
ing sites. This enables this class of protein to assemble into ropelike intermediate 
filaments—an important component of the cytoskeleton that creates the cell’s 
internal structural framework (see Figure 16-67). 

Fibrous proteins are especially abundant outside the cell, where they are a 
main component of the gel-like extracellular matrix that helps to bind collections 
of cells together to form tissues. Cells secrete extracellular matrix proteins into 
their surroundings, where they often assemble into sheets or long fibrils. Colla- 
gen is the most abundant of these proteins in animal tissues. A collagen molecule 
consists of three long polypeptide chains, each containing the nonpolar amino 
Figure 3-21 Actin filaments.

(A) Transmission electron micrographs of 
negatively stained actin filaments. (B) The 
helical arrangement of actin molecules in an 
actin filament. (A, courtesy of Roger Craig.) 
Figure 3-22 Some properties of a helix.
(A-D) A helix forms when a series of 
subunits bind to each other in a regular 
way. At the bottom, each of these helices 
is viewed from directly above the helix and 
seen to have two (A), three (B), and six 
(C and D) subunits per helical turn. Note 
that the helix in (D) has a wider path 
than that in (C), but the same number of 
subunits per turn. (E) As discussed in the 
text, a helix can be either right-handed or 
left-handed. As a reference, it is useful to 
remember that standard metal screws, 
which insert when turned clockwise, are 

right-handed. Note that a helix retains the
same handedness when it is turned upside
down. (PDB code: 2DHB.)
acid glycine at every third position. This regular structure allows the chains to
wind around one another to generate a long regular triple helix (Figure 3-23A). 
Many collagen molecules then bind to one another side-by-side and end-to- 
end to create long overlapping arrays—thereby generating the extremely tough 
collagen fibrils that give connective tissues their tensile strength, as described in 
Chapter 19. 
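The glycine-at-every-third-position pattern described above is simple enough to test for in a sequence written in one-letter code. The sketch below is purely illustrative: the function name is invented and the repeat used as input is made up, not taken from a real collagen chain.

```python
def glycine_every_third(sequence):
    """Return the frame (0, 1, or 2) in which every third residue is glycine, else None."""
    for frame in range(3):
        positions = sequence[frame::3]
        if positions and all(residue == "G" for residue in positions):
            return frame
    return None


toy_collagen_like = "GPAGPKGEP" * 3  # made-up repeat, for illustration only
print(glycine_every_third(toy_collagen_like))  # 0
print(glycine_every_third("MKTALIVGG"))        # None
```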


Proteins Contain a Surprisingly Large Amount of Intrinsically 
Disordered Polypeptide Chain 


It has been well known for a long time that, in complete contrast to collagen, 
another abundant protein in the extracellular matrix, elastin, is formed as a highly 
disordered polypeptide. This disorder is essential for elastin’s function. Its rela- 
tively loose and unstructured polypeptide chains are covalently cross-linked to 
Figure 3-23 Collagen and elastin. (A) Collagen is a triple helix formed by three extended protein chains that wrap around one another (bottom).
Many rodlike collagen molecules are cross-linked together in the extracellular space to form unextendable collagen fibrils (top) that have the tensile
strength of steel. The striping on the collagen fibril is caused by the regular repeating arrangement of the collagen molecules within the fibril.
(B) Elastin polypeptide chains are cross-linked together in the extracellular space to form rubberlike, elastic fibers. Each elastin molecule uncoils into a
more extended conformation when the fiber is stretched and recoils spontaneously as soon as the stretching force is relaxed. The cross-linking in the
extracellular space mentioned creates covalent linkages between lysine side chains, but the chemistry is different for collagen and elastin.
produce a rubberlike, elastic meshwork that can be reversibly pulled from one
conformation to another, as illustrated in Figure 3-23B. The elastic fibers that 
result enable skin and other tissues, such as arteries and lungs, to stretch and 
recoil without tearing. 

Intrinsically disordered regions of proteins are frequent in nature, and they 
have important functions in the interior of cells. As we have already seen, proteins 
often have loops of polypeptide chain that protrude from the core region of a pro- 
tein domain to bind other molecules. Some of these loops remain largely unstruc- 
tured until they bind to a target molecule, adopting a specific folded conforma- 
tion only when this other molecule is bound. Many proteins were also known to 
have intrinsically disordered tails at one or the other end of a structured domain 
(see, for example, the histones in Figure 4-24). But the extent of such disordered 
structure only became clear when genomes were sequenced. This allowed bio- 
informatic methods to be used to analyze the amino acid sequences that genes 
encode, searching for disordered regions based on their unusually low hydropho- 
bicity and relatively high net charge. Combining these results with other data, it is 
now thought that perhaps a quarter of all eukaryotic proteins can adopt structures 
that are mostly disordered, fluctuating rapidly between many different conforma- 
tions. Many such intrinsically disordered regions contain repeated sequences of 
amino acids. What do these disordered regions do? 
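The two sequence properties mentioned above, low hydrophobicity and high net charge, can each be reduced to a single number per sequence. The sketch below computes both, using the Kyte-Doolittle hydropathy scale as one possible hydrophobicity measure and counting only Lys and Arg as positive and Asp and Glu as negative. These choices, the function names, and the toy sequences are assumptions made for illustration; the text does not say which scales or thresholds the actual bioinformatic methods use, and no classification threshold is implied here.

```python
# Kyte-Doolittle hydropathy values (higher means more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Simplifying assumption: histidine and the chain termini are treated as uncharged.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def mean_hydropathy(sequence):
    """Average Kyte-Doolittle hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


def net_charge_per_residue(sequence):
    """Net charge divided by sequence length, under the simple charge model above."""
    return sum(CHARGE.get(residue, 0) for residue in sequence) / len(sequence)


# Made-up sequences: one rich in charged residues, one rich in hydrophobic residues.
for name, seq in [("charged toy", "KEKSPKKEESPK"), ("hydrophobic toy", "LVIAALFVLIAG")]:
    print(name, round(mean_hydropathy(seq), 2), round(abs(net_charge_per_residue(seq)), 2))
```

A sequence with a low first number and a high second number has the composition the text associates with disorder; a real predictor would typically evaluate such features over sliding windows and combine them with other data.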

Some known functions are illustrated in Figure 3-24. One predominant func- 
tion is to form specific binding sites for other protein molecules that are of high 
specificity, but readily altered by protein phosphorylation, protein dephosphor- 
ylation, or any of the other covalent modifications that are triggered by cell sig- 
naling events (Figure 3-24A and B). We shall see, for example, that the eukaryotic 
RNA polymerase enzyme that produces mRNAs contains a long, unstructured 
C-terminal tail that is covalently modified as its RNA synthesis proceeds, thereby 
attracting specific other proteins to the transcription complex at different times 
(see Figure 6-22). And this unstructured tail interacts with a different type of low 
complexity domain when the RNA polymerase is recruited to the specific sites on 
the DNA where it begins synthesis. 

As illustrated in Figure 3-24C, an unstructured region can also serve as a 
“tether” to hold two protein domains in close proximity to facilitate their inter- 
action. For example, it is this tethering function that allows substrates to move 
between active sites in large multienzyme complexes (see Figure 3-54). A simi- 
lar tethering function allows large scaffold proteins with multiple protein-binding 
sites to concentrate sets of interacting proteins, both increasing reaction rates and 
confining their reaction to a particular site in a cell (see Figure 3-78). 

Like elastin, other proteins have a function that directly requires that they 
remain largely unstructured. Thus, large numbers of disordered protein chains 
in close proximity can create micro-regions of gel-like consistency inside the cell 
that restrict diffusion. For example, the abundant nucleoporins that coat the inner 
surface of the nuclear pore complex form a random coil meshwork (Figure 3-24D)
that is critical for selective nuclear transport (see Figure 12-8). 





Figure 3-24 Some important functions 
for intrinsically disordered protein 
sequences. (A) Unstructured regions 

of polypeptide chain often form binding 
sites for other proteins. Although these 
binding events are of high specificity, 

they are often of low affinity due to the 
free-energy cost of folding the normally 
unfolded partner (and they are thus readily 
reversible). (B) Unstructured regions can 
be easily modified covalently to change 
their binding preferences, and they are 
therefore frequently involved in cell signaling 
processes. In this schematic, multiple sites 
of protein phosphorylation are indicated. 
(C) Unstructured regions frequently create 
“tethers” that hold interacting protein 
domains in close proximity. (D) A dense 
network of unstructured proteins can form 
a diffusion barrier, as the nucleoporins do 
for the nuclear pore.

Covalent Cross-Linkages Stabilize Extracellular Proteins


Many protein molecules are either attached to the outside of a cell’s plasma mem- 
brane or secreted as part of the extracellular matrix. All such proteins are directly 
exposed to extracellular conditions. To help maintain their structures, the poly- 
peptide chains in such proteins are often stabilized by covalent cross-linkages. 
These linkages can either tie together two amino acids in the same protein, or 
connect different polypeptide chains in a multisubunit protein. Although many 
other types exist, the most common cross-linkages in proteins are covalent sulfur- 
sulfur bonds. These disulfide bonds (also called S-S bonds) form as cells prepare 
newly synthesized proteins for export. As described in Chapter 12, their formation 
is catalyzed in the endoplasmic reticulum by an enzyme that links together two 
pairs of -SH groups of cysteine side chains that are adjacent in the folded protein 
(Figure 3-25). Disulfide bonds do not change the conformation of a protein but 
instead act as atomic staples to reinforce its most favored conformation. For example,
lysozyme, an enzyme in tears that dissolves bacterial cell walls, retains its
antibacterial activity for a long time because it is stabilized by such cross-linkages. 

Disulfide bonds generally fail to form in the cytosol, where a high concentra- 
tion of reducing agents converts S-S bonds back to cysteine -SH groups. Appar- 
ently, proteins do not require this type of reinforcement in the relatively mild envi- 
ronment inside the cell. 


Protein Molecules Often Serve as Subunits for the Assembly of 
Large Structures 


The same principles that enable a protein molecule to associate with itself to form 
rings or a long filament also operate to generate much larger structures formed 
from a set of different macromolecules, such as enzyme complexes, ribosomes, 
viruses, and membranes. These large objects are not made as single, giant, cova- 
lently linked molecules. Instead they are formed by the noncovalent assembly of 
many separately manufactured molecules, which serve as the subunits of the final 
structure. 
The use of smaller subunits to build larger structures has several advantages: 


1. A large structure built from one or a few repeating smaller subunits requires
only a small amount of genetic information. 


2. Both assembly and disassembly can be readily controlled reversible pro- 
cesses, because the subunits associate through multiple bonds of relatively 
low energy. 


3. Errors in the synthesis of the structure can be more easily avoided, since 
correction mechanisms can operate during the course of assembly to 
exclude malformed subunits.

Figure 3-25 Disulfide bonds. Covalent
disulfide bonds form between adjacent 
cysteine side chains. These cross- 
linkages can join either two parts of the 
same polypeptide chain or two different 
polypeptide chains. Since the energy 
required to break one covalent bond is 
much larger than the energy required to 
break even a whole set of noncovalent 
bonds (see Table 2-1, p. 45), a disulfide 
bond can have a major stabilizing effect on 
a protein (Movie 3.7).

Some protein subunits assemble into flat sheets in which the subunits are
arranged in hexagonal patterns. Specialized membrane proteins are sometimes 
arranged this way in lipid bilayers. With a slight change in the geometry of the 
individual subunits, a hexagonal sheet can be converted into a tube (Figure 3-26) 
or, with more changes, into a hollow sphere. Protein tubes and spheres that bind 
specific RNA and DNA molecules in their interior form the coats of viruses. 

The formation of closed structures, such as rings, tubes, or spheres, provides 
additional stability because it increases the number of bonds between the protein 
subunits. Moreover, because such a structure is created by mutually dependent, 
cooperative interactions between subunits, a relatively small change that affects 
each subunit individually can cause the structure to assemble or disassemble. 
These principles are dramatically illustrated in the protein coat or capsid of many 
simple viruses, which takes the form of a hollow sphere based on an icosahedron 
(Figure 3-27). Capsids are often made of hundreds of identical protein subunits 
that enclose and protect the viral nucleic acid (Figure 3-28). The protein in such 
a capsid must have a particularly adaptable structure: not only must it make 
several different kinds of contacts to create the sphere, it must also change this 
arrangement to let the nucleic acid out to initiate viral replication once the virus 
has entered a cell. 
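
The subunit counts of such icosahedral shells follow a simple piece of arithmetic. The sketch below goes beyond the text: it assumes the Caspar-Klug rule, in which a shell contains 60T subunits and the triangulation number is T = h² + hk + k² for non-negative integers h and k. The function names are hypothetical. T = 1 gives the 60 strictly equivalent subunits mentioned in Figure 3-28, and T = 3 gives the 180 subunits of tomato bushy stunt virus, whose protein adopts three slightly different conformations.

```python
def triangulation_numbers(max_index):
    """Return sorted T = h*h + h*k + k*k for 0 <= k <= h <= max_index, excluding T = 0.

    Assumes the Caspar-Klug description of icosahedral capsids (not from the text).
    """
    values = set()
    for h in range(max_index + 1):
        for k in range(h + 1):
            t = h * h + h * k + k * k
            if t > 0:
                values.add(t)
    return sorted(values)


def subunits_in_capsid(t):
    """Number of protein subunits in an icosahedral shell with triangulation number t."""
    return 60 * t


if __name__ == "__main__":
    for t in triangulation_numbers(2):
        print(f"T = {t}: {subunits_in_capsid(t)} subunits")
```

Only certain capsid sizes are allowed under this rule, which is one way to see why a larger shell requires the same protein to fit into several slightly different environments.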


Many Structures in Cells Are Capable of Self-Assembly 


The information for forming many of the complex assemblies of macromolecules 
in cells must be contained in the subunits themselves, because purified subunits 
can spontaneously assemble into the final structure under the appropriate con- 
ditions. The first large macromolecular aggregate shown to be capable of self-as- 
sembly from its component parts was tobacco mosaic virus (TMV). This virus is 
a long rod in which a cylinder of protein is arranged around a helical RNA core 
(Figure 3-29). If the dissociated RNA and protein subunits are mixed together in 
solution, they recombine to form fully active viral particles. The assembly process 
is unexpectedly complex and includes the formation of double rings of protein, 
which serve as intermediates that add to the growing viral coat. 
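
The numbers quoted for TMV in Figure 3-29 allow a quick consistency check on the idea that a fixed amount of RNA is packaged by a fixed number of subunits. The sketch below uses only the figures from that caption for the first two calculations. The rod-length estimate additionally assumes helix parameters that are not given in the text (about 16.33 subunits per turn and a pitch of 2.3 nm), so it should be read as illustrative. All names are hypothetical.

```python
# Values taken from the caption of Figure 3-29.
RNA_NUCLEOTIDES = 6395
COAT_SUBUNITS = 2130
COAT_PROTEIN_LENGTH = 158  # amino acids

# Assumed helix parameters (NOT stated in the text).
SUBUNITS_PER_TURN = 16.33
PITCH_NM = 2.3


def nucleotides_per_subunit(rna_nucleotides, subunits):
    """Average number of RNA nucleotides covered by each coat protein subunit."""
    return rna_nucleotides / subunits


def coding_nucleotides(protein_length, codon_size=3):
    """Nucleotides needed to encode a protein of the given length (stop codon ignored)."""
    return protein_length * codon_size


def rod_length_nm(subunits, subunits_per_turn, pitch_nm):
    """Length of a helical rod built from the given number of subunits."""
    return subunits / subunits_per_turn * pitch_nm


if __name__ == "__main__":
    print(round(nucleotides_per_subunit(RNA_NUCLEOTIDES, COAT_SUBUNITS), 2))
    print(coding_nucleotides(COAT_PROTEIN_LENGTH))
    print(round(rod_length_nm(COAT_SUBUNITS, SUBUNITS_PER_TURN, PITCH_NM)))
```

The ratio comes out close to three nucleotides per subunit, and fewer than 500 nucleotides suffice to encode the protein that builds the entire coat, which illustrates how little genetic information a repeating subunit requires.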

Another complex macromolecular aggregate that can reassemble from its 
component parts is the bacterial ribosome. This structure is composed of about 
55 different protein molecules and 3 different rRNA molecules. Incubating a mix- 
ture of the individual components under appropriate conditions in a test tube 
causes them to spontaneously re-form the original structure. Most importantly, 
such reconstituted ribosomes are able to catalyze protein synthesis. As might be 
expected, the reassembly of ribosomes follows a specific pathway: after certain 
proteins have bound to the RNA, this complex is then recognized by other pro- 
teins, and so on, until the structure is complete. 

It is still not clear how some of the more elaborate self-assembly processes 
are regulated. Many structures in the cell, for example, seem to have a precisely 
defined length that is many times greater than that of their component macromol- 
ecules. How such length determination is achieved is in many cases a mystery. In 


Figure 3-26 Single protein subunits form 
protein assemblies that feature multiple 
protein-protein contacts. Hexagonally 
packed globular protein subunits are 
shown here forming either flat sheets or 
tubes. Generally, such large structures are 
not considered to be single “molecules.” 
Instead, like the actin filament described 
previously, they are viewed as assemblies 
formed of many different molecules.

Figure 3-27 The protein capsid of a
virus. The structure of the simian virus 
SV40 capsid has been determined by x-ray 
crystallography and, as for the capsids of 
many other viruses, it is known in atomic
detail. (Courtesy of Robert Grant, Stephan 
Crainic, and James M. Hogle.)

the simplest case, a long core protein or other macromolecule provides a scaffold
that determines the extent of the final assembly. This is the mechanism that deter- 
mines the length of the TMV particle, where the RNA chain provides the core. 
Similarly, a core protein interacting with actin is thought to determine the length 
of the thin filaments in muscle.

Figure 3-28 The structure of a spherical
virus. In viruses, many copies of a single 
protein subunit often pack together
to create a spherical shell (a capsid).
This capsid encloses the viral genome,
composed of either RNA or DNA (see also
Figure 3-27). For geometric reasons, no
more than 60 identical subunits can pack
together in a precisely symmetric way. If
slight irregularities are allowed, however,
more subunits can be used to produce
a larger capsid that retains icosahedral
symmetry. The tomato bushy stunt virus 
(TBSV) shown here, for example, is a 
spherical virus about 33 nm in diameter 
formed from 180 identical copies of a 
386-amino-acid capsid protein plus an 
RNA genome of 4500 nucleotides. To 
construct such a large capsid, the protein 
must be able to fit into three somewhat 
different environments. This requires three 
slightly different conformations, each of 
which is differently colored in the virus 
particle shown here. The postulated 
pathway of assembly is shown; the precise 
three-dimensional structure has been 
determined by x-ray diffraction. (Courtesy 
of Steve Harrison.)

Figure 3-29 The structure of tobacco mosaic virus (TMV). (A) An electron micrograph of the viral particle, which consists of
a single long RNA molecule enclosed in a cylindrical protein coat composed of identical protein subunits. (B) A model showing 
part of the structure of TMV. A single-stranded RNA molecule of 6395 nucleotides is packaged in a helical coat constructed 
from 2130 copies of a coat protein 158 amino acids long. Fully infective viral particles can self-assemble in a test tube from 
purified RNA and protein molecules. (A, courtesy of Robley Williams; B, courtesy of Richard J. Feldmann.)

Figure 3-30 Proteolytic cleavage in insulin assembly. The polypeptide
hormone insulin cannot spontaneously re-form efficiently if its disulfide bonds 
are disrupted. It is synthesized as a larger protein (proinsulin) that is cleaved
by a proteolytic enzyme after the protein chain has folded into a specific 
shape. Excision of part of the proinsulin polypeptide chain removes some of 
the information needed for the protein to fold spontaneously into its normal 
conformation. Once insulin has been denatured and its two polypeptide 
chains have separated, its ability to reassemble is lost. 


Assembly Factors Often Aid the Formation of Complex Biological
Structures


Not all cellular structures held together by noncovalent bonds self-assemble. A cil- 
ium, or a myofibril of a muscle cell, for example, cannot form spontaneously from 
a solution of its component macromolecules. In these cases, part of the assembly 
information is provided by special enzymes and other proteins that perform the 
function of templates, serving as assembly factors that guide construction but take 
no part in the final assembled structure. 

Even relatively simple structures may lack some of the ingredients necessary 
for their own assembly. In the formation of certain bacterial viruses, for example, 
the head, which is composed of many copies of a single protein subunit, is assem- 
bled on a temporary scaffold composed of a second protein that is produced by 
the virus. Because the second protein is absent from the final viral particle, the 
head structure cannot spontaneously reassemble once it has been taken apart. 
Other examples are known in which proteolytic cleavage is an essential and irre- 
versible step in the normal assembly process. This is even the case for some small 
protein assemblies, including the structural protein collagen and the hormone 
insulin (Figure 3-30). From these relatively simple examples, it seems certain that 
the assembly of a structure as complex as a cilium will involve a temporal and 
spatial ordering that is imparted by numerous other components. 


Amyloid Fibrils Can Form from Many Proteins 


A special class of protein structures, utilized for some normal cell functions, can 
also contribute to human diseases when not controlled. These are self-propagating,
stable β-sheet aggregates called amyloid fibrils. These fibrils are built from a
series of identical polypeptide chains that become layered one over the other to
create a continuous stack of β sheets, with the β strands oriented perpendicular
to the fibril axis to form a cross-beta filament (Figure 3-31). Typically, hundreds of 
monomers will aggregate to form an unbranched fibrous structure that is several 
micrometers long and 5 to 15 nm in width. A surprisingly large fraction of pro- 
teins have the potential to form such structures, because the short segment of the 
polypeptide chain that forms the spine of the fibril can have a variety of different 
sequences and follow one of several different paths (Figure 3-32). However, very 
few proteins will actually form this structure inside cells. 

In normal humans, the quality control mechanisms governing proteins grad- 
ually decline with age, occasionally permitting normal proteins to form patho- 
logical aggregates. The protein aggregates may be released from dead cells and 
accumulate as amyloid in the extracellular matrix. In extreme cases, the accumu- 
lation of such amyloid fibrils in the cell interior can kill the cells and damage tis- 
sues. Because the brain is composed of a highly organized collection of nerve cells 
that cannot regenerate, the brain is especially vulnerable to this sort of cumula- 
tive damage. Thus, although amyloid fibrils may form in different tissues, and are 
known to cause pathologies in several sites in the body, the most severe amyloid 
pathologies are neurodegenerative diseases. For example, the abnormal forma- 
tion of highly stable amyloid fibrils is thought to play a central causative role in 
both Alzheimer’s and Parkinson’s diseases. 

Prion diseases are a special type of these pathologies. They have attained 
special notoriety because, unlike Parkinson’s or Alzheimer’s, prion diseases can 
spread from one organism to another, providing that the second organism eats a

Figure 3-31 Detailed structure of the core of an amyloid fibril. Illustrated here
is the cross-beta spine of the amyloid fibril that is formed by a peptide of seven 
amino acids from the protein Sup35, an extensively studied yeast prion. Consisting 
of the sequence glycine-asparagine-asparagine-glutamine-glutamine-asparagine- 
tyrosine (GNNQQNY), its structure was determined by X-ray crystallography. 
Although the cross-beta spines of other amyloids are similar, being composed of 
two long β sheets held together by a “steric zipper,” different detailed structures
are observed depending on the short peptide sequence involved. (A) One half
of the spine is illustrated. Here, a standard parallel β-sheet structure (see
p. 116) is held together by a set of hydrogen bonds between two side chains plus
hydrogen bonds between two backbone atoms, as illustrated (oxygen atoms red
and nitrogen atoms blue). Note that in this example, the adjacent peptides are
exactly in register. Although only five layers are shown (each layer depicted as an
arrow), the actual structure would extend for many tens of thousands of layers
in the plane of the paper. (B) The complete cross-beta spine. A second, identical
β sheet is paired with the first one to form a two-sheet motif that runs the entire
length of the fibril. (C) View of the complete spine in (B) from the top. The closely
interdigitated side chains form a tight, water-free junction known as a steric zipper.
(Courtesy of David Eisenberg and Michael Sawaya, UCLA; based on R. Nelson et 
al., Nature 435:773-778, 2005. With permission from Macmillan Publishers Ltd.) 


tissue containing the protein aggregate. A set of closely related diseases—scra- 
pie in sheep, Creutzfeldt-Jakob disease (CJD) in humans, Kuru in humans, and 
bovine spongiform encephalopathy (BSE) in cattle—are caused by a misfolded, 
aggregated form of a particular protein called PrP (for prion protein). PrP is normally
located on the outer surface of the plasma membrane, most prominently in
neurons, and it has the unfortunate property of forming amyloid fibrils that are 
“infectious” because they convert normally folded molecules of PrP to the same 
pathological form (Figure 3-33). This property creates a positive feedback loop 
that propagates the abnormal form of PrP, called PrP*, and allows the pathological 
conformation to spread rapidly from cell to cell in the brain, eventually causing 
death. It can be dangerous to eat the tissues of animals that contain PrP*, as witnessed
by the spread of BSE (commonly referred to as “mad cow disease”) from
cattle to humans. Fortunately, in the absence of PrP*, PrP is extraordinarily diffi- 
cult to convert to its abnormal form. 
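
The positive feedback loop described in this paragraph can be sketched as a toy kinetic model. The code below assumes a simple autocatalytic rate law, in which normal PrP is converted at a rate proportional to the product of the PrP and PrP* amounts. The rate constant, time step, and starting amounts are arbitrary illustrative values, not measurements, and real prion propagation involves fibril growth and fragmentation that this sketch ignores.

```python
def simulate_conversion(prp, prp_star, rate_constant=0.01, dt=0.1, steps=500):
    """Euler integration of d[PrP*]/dt = k * [PrP] * [PrP*].

    Returns the amount of PrP* after each step (initial value first).
    The rate law and all parameter values are assumptions for illustration.
    """
    history = [prp_star]
    for _ in range(steps):
        converted = rate_constant * prp * prp_star * dt
        converted = min(converted, prp)  # cannot convert more PrP than remains
        prp -= converted
        prp_star += converted
        history.append(prp_star)
    return history


if __name__ == "__main__":
    unseeded = simulate_conversion(prp=100.0, prp_star=0.0)
    seeded = simulate_conversion(prp=100.0, prp_star=0.001)
    print("final PrP* without seed:", unseeded[-1])
    print("final PrP* with seed:", round(seeded[-1], 3))
```

In this model nothing happens without a seed, whereas even a trace of PrP* eventually converts nearly all of the normal protein, which mirrors the qualitative behavior described in the text.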

A closely related “protein-only inheritance” has been observed in yeast cells. 
The ability to study infectious proteins in yeast has clarified another remarkable 
feature of prions. These protein molecules can form several distinctively different 
types of amyloid fibrils from the same polypeptide chain. Moreover, each type of 
aggregate can be infectious, forcing normal protein molecules to adopt the same
type of abnormal structure. Thus, several different “strains” of infectious particles 
can arise from the same polypeptide chain.

Figure 3-32 The structure of an amyloid
fibril. (A) Schematic diagram of the 
structure of an amyloid fibril that is formed
by the aggregation of a protein. Only 

the cross-beta spine of an amyloid fibril 
resembles the structure shown in Figure 
3-31. (B) A cut-away view of a structure 
proposed for the amyloid fibril that can 

be formed in a test tube by the enzyme 
ribonuclease A, illustrating how the core 
of the fibril—formed by a short segment— 
relates to the rest of the structure. 

(C) Electron micrograph of amyloid fibrils. 
(A, from L. Esposito, C. Pedone and 

L. Vitagliano, Proc. Natl Acad. Sci. USA
103:11533-11538, 2006; B, from S. 
Sambashivan et al., Nature 437:266-269, 
2005; C, courtesy of David Eisenberg.)

Figure 3-33 The special protein aggregates that cause prion diseases.
(A) Schematic illustration of the type of conformational change in the PrP 
protein (prion protein) that produces material for an amyloid fibril. (B) The self- 
infectious nature of the protein aggregation that is central to prion diseases. 
PrP is highly unusual because the misfolded version of the protein, called 
PrP*, induces the normal PrP protein it contacts to change its conformation, 
as shown. 


Amyloid Structures Can Perform Useful Functions in Cells 


Amyloid fibrils were initially studied because they cause disease. But the same type 
of structure is now known to be exploited by cells for useful purposes. Eukaryotic 
cells, for example, store many different peptide and protein hormones that they 
will secrete in specialized “secretory granules,” which package a high concentration
of their cargo in dense cores with a regular structure (see Figure 13-65). We
now know that these structured cores consist of amyloid fibrils, which in this case 
have a structure that causes them to dissolve to release soluble cargo after being 
secreted by exocytosis to the cell exterior (Figure 3-34A). Many bacteria use the 
amyloid structure in a very different way, secreting proteins that form long amy- 
loid fibrils projecting from the cell exterior that help to bind bacterial neighbors 
into biofilms (Figure 3-34B). Because these biofilms help bacteria to survive in 
adverse environments (including in humans treated with antibiotics), new drugs 
that specifically disrupt the fibrous networks formed by bacterial amyloids have 
promise for treating human infections. 


Many Proteins Contain Low-complexity Domains that Can Form 
“Reversible Amyloids” 


Until recently, those amyloids with useful functions were thought to be either 
confined to the interior of specialized vesicles or expressed on the exterior of cells, 
as in Figure 3-34. However, new experiments reveal that a large set of low com- 
plexity domains can form amyloid fibers that have functional roles in both the 
cell nucleus and the cell cytoplasm. These domains are normally unstructured 
and consist of stretches of amino acid sequence that can span hundreds of amino 
acids, while containing only a small subset of the 20 different amino acids. In con- 
trast to the disease-associated amyloid in Figure 3-33, these newly discovered 
structures are held together by weaker noncovalent bonds and readily dissociate 
in response to signals—hence their name reversible amyloids. 
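
What "low complexity" means can be made concrete by counting amino acids. The sketch below is illustrative only: it reports how many distinct residue types a sequence uses and what fraction of it is drawn from a chosen subset, using one-letter amino acid codes. The example sequence is invented for demonstration and is not a real protein, and the function names are hypothetical. The subset G, S, Q, Y is chosen because the FUS domain discussed later in this section is dominated by those four residues.

```python
from collections import Counter


def distinct_residue_types(sequence):
    """Number of different amino acid letters that occur in the sequence."""
    return len(Counter(sequence.upper()))


def fraction_from_subset(sequence, subset):
    """Fraction of residues in `sequence` that belong to `subset` (one-letter codes)."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    allowed = set(subset.upper())
    return sum(1 for residue in sequence if residue in allowed) / len(sequence)


if __name__ == "__main__":
    toy_sequence = "GSYGQSGYSQGGYSQQSYGA"  # invented example, NOT a real protein
    print(distinct_residue_types(toy_sequence))
    print(fraction_from_subset(toy_sequence, "GSQY"))
```

A typical globular domain would use most of the 20 amino acids, so a high fraction from a small subset over a long stretch is a simple signature of a low-complexity region.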

Many proteins with such domains also contain a different set of domains that 
bind to specific other protein or RNA molecules. Thus, their controlled aggregation

Figure 3-34 Two normal functions for
amyloid fibrils. (A) In eukaryotic cells, 
protein cargo can be packed very densely 
in secretory vesicles and stored until 
signals cause a release of this cargo by 
exocytosis. For example, proteins and 
peptide hormones of the endocrine system, 
such as glucagon and calcitonin, are 
efficiently stored as short amyloid fibrils, 
which dissociate when they reach the cell 
exterior. (B) Bacteria produce amyloid fibrils 
on their surface by secreting the precursor 
proteins; these fibrils then create biofilms 
that link together, and help to protect, large 
numbers of individual bacteria.

Figure 3-35 Measuring the association between “reversible amyloids.” (A) Experimental setup. The fiber-forming domains
from proteins that contain a low-complexity domain were produced in large quantities by cloning the DNA sequence that encodes 
them into an E. coli plasmid so as to allow overproduction of that domain (see p. 483). After these protein domains were purified 
by affinity chromatography, a tiny droplet of concentrated solution of one of the domains (here the FUS low-complexity domain) 
was deposited onto a microscope dish and allowed to gel. The gel was then soaked in a dilute solution of a fluorescently 

labeled low-complexity domain from the same or a different protein, making the gel fluorescent. After replacing the dilute protein 
solution with buffer, the relative strength of binding of the various domains to each other could then be measured by the decay of 
fluorescence, as indicated. (B) Results. The low-complexity domain from the FUS protein binds more tightly to itself than it does 
to the low-complexity domains from the proteins hnRNPA1 or hnRNPA2. A separate experiment reveals that these three different 
RNA binding proteins associate by forming mixed amyloid fibrils. (Adapted from M. Kato et al., Cell 149:753-767, 2012.)


in the cell can form a hydrogel that pulls these and other molecules into punctate 
structures called intracellular bodies, or granules. Specific mRNAs can be seques- 
tered in such granules, where they are stored until made available by a controlled 
disassembly of the core amyloid structure that holds them together. 

Consider the FUS protein, an essential nuclear protein with roles in the transcription,
processing, and transport of specific mRNA molecules. Over 80 percent
of its N-terminal domain of two hundred amino acids is composed of only
four amino acids: glycine, serine, glutamine, and tyrosine. This low complexity 
domain is attached to several other domains that bind to RNA molecules. At high 
enough concentrations in a test tube, it forms a hydrogel that will associate with 
either itself or with the low complexity domains from other proteins. As illustrated 
by the experiment in Figure 3-35, although different low complexity domains 
bind to each other, homotypic interactions appear to be of greatest affinity (thus, 
the FUS low complexity domain binds most tightly to itself). Further experiments 
reveal that both the homotypic and the heterotypic bindings are mediated
through a B-sheet core structure forming amyloid fibrils, and that these structures 
bind to other types of repeat sequences in the manner indicated in Figure 3-36. 
Many of these interactions appear to be controlled by the phosphorylation of serine
side chains in one or both of the interacting partners. However, a great
deal remains to be learned concerning these newly discovered structures and the 
varied roles that they play in the cell biology of eukaryotic cells.

Figure 3-36 One type of complex that
is formed by reversible amyloids. The
structure shown is based on the observed 
interaction of RNA polymerase with a 
low-complexity domain of a protein that 
regulates DNA transcription. (Adapted from 
I. Kwon et al., Cell 155:1049-1060, 2013.)

Summary


A protein molecule’s amino acid sequence determines its three-dimensional con- 
formation. Noncovalent interactions between different parts of the polypeptide 
chain stabilize its folded structure. The amino acids with hydrophobic side chains 
tend to cluster in the interior of the molecule, and local hydrogen-bond interactions 
between neighboring peptide bonds give rise to α helices and β sheets.

Regions of amino acid sequence known as domains are the modular units from 
which many proteins are constructed. Such domains generally contain 40-350 
amino acids, often folded into a globular shape. Small proteins typically consist 
of only a single domain, while large proteins are formed from multiple domains 
linked together by various lengths of polypeptide chain, some of which can be rela- 
tively disordered. As proteins have evolved, domains have been modified and com- 
bined with other domains to construct large numbers of new proteins. 

Proteins are brought together into larger structures by the same noncovalent 
forces that determine protein folding. Proteins with binding sites for their own sur- 
face can assemble into dimers, closed rings, spherical shells, or helical polymers. The 
amyloid fibril is a long unbranched structure assembled through a repeating aggregate
of β sheets. Although some mixtures of proteins and nucleic acids can assemble
spontaneously into complex structures in a test tube, not all structures in the cell are 
capable of spontaneous reassembly after they have been dissociated into their com- 
ponent parts, because many biological assembly processes involve assembly factors 
that are not present in the final structure. 


PROTEIN FUNCTION 


We have seen that each type of protein consists of a precise sequence of amino 
acids that allows it to fold up into a particular three-dimensional shape, or con- 
formation. But proteins are not rigid lumps of material. They often have precisely 
engineered moving parts whose mechanical actions are coupled to chemical 
events. It is this coupling of chemistry and movement that gives proteins the 
extraordinary capabilities that underlie the dynamic processes in living cells. 

In this section, we explain how proteins bind to other selected molecules and 
how a protein’s activity depends on such binding. We show that the ability to bind 
to other molecules enables proteins to act as catalysts, signal receptors, switches, 
motors, or tiny pumps. The examples we discuss in this chapter by no means 
exhaust the vast functional repertoire of proteins. You will encounter the special- 
ized functions of many other proteins elsewhere in this book, based on similar 
principles. 


All Proteins Bind to Other Molecules 


A protein molecule’s physical interaction with other molecules determines its 
biological properties. Thus, antibodies attach to viruses or bacteria to mark them 
for destruction, the enzyme hexokinase binds glucose and ATP so as to catalyze a 
reaction between them, actin molecules bind to each other to assemble into actin 
filaments, and so on. Indeed, all proteins stick, or bind, to other molecules. In 
some cases, this binding is very tight; in others it is weak and short-lived. But the 
binding always shows great specificity, in the sense that each protein molecule can 
usually bind just one or a few molecules out of the many thousands of different 
types it encounters. The substance that is bound by the protein—whether it is an 
ion, a small molecule, or a macromolecule such as another protein—is referred to 
as a ligand for that protein (from the Latin word ligare, meaning “to bind”). 

The ability of a protein to bind selectively and with high affinity to a ligand 
depends on the formation of a set of weak noncovalent bonds—hydrogen bonds, 
electrostatic attractions, and van der Waals attractions—plus favorable hydropho- 
bic interactions (see Panel 2-3, pp. 94-95). Because each individual bond is weak, 
effective binding occurs only when many of these bonds form simultaneously. 
Such binding is possible only if the surface contours of the ligand molecule fit very 
closely to the protein, matching it like a hand in a glove (Figure 3-37).

The region of a protein that associates with a ligand, known as the ligand’s binding
site, usually consists of a cavity in the protein surface formed by a particular
arrangement of amino acids. These amino acids can belong to different portions 
of the polypeptide chain that are brought together when the protein folds (Figure 
3-38). Separate regions of the protein surface generally provide binding sites for 
different ligands, allowing the protein’s activity to be regulated, as we shall see 
later. And other parts of the protein act as a handle to position the protein in the 
cell—an example is the SH2 domain discussed previously, which often moves a 
protein containing it to particular intracellular sites in response to signals. 

Although the atoms buried in the interior of the protein have no direct contact 
with the ligand, they form the framework that gives the surface its contours and 
its chemical and mechanical properties. Even small changes to the amino acids in 
the interior of a protein molecule can change its three-dimensional shape enough 
to destroy a binding site on the surface. 
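
The earlier statement that effective binding requires many weak bonds acting together can be illustrated with the standard relationship between free energy and the equilibrium constant, K = exp(-ΔG°/RT). The sketch below assumes, as a simplification, that each noncovalent bond contributes an equal and independent free energy. The value of 1 kcal/mole per bond is an illustrative assumption rather than a figure from this section, and real contributions are neither equal nor strictly additive. K is expressed relative to a 1 M standard state, and the function names are hypothetical.

```python
import math

GAS_CONSTANT = 1.987e-3  # kcal / (mole * K)
TEMPERATURE = 310.0      # K, about 37 degrees C


def equilibrium_constant(delta_g):
    """Equilibrium constant for a standard free-energy change delta_g (kcal/mole)."""
    return math.exp(-delta_g / (GAS_CONSTANT * TEMPERATURE))


def binding_constant_for_bonds(n_bonds, energy_per_bond=-1.0):
    """Equilibrium constant if n_bonds form at once, each worth energy_per_bond.

    Assumes equal, additive contributions (a deliberate simplification).
    """
    return equilibrium_constant(n_bonds * energy_per_bond)


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(f"{n:2d} bonds: K = {binding_constant_for_bonds(n):.3g}")
```

Because the free energies add while the equilibrium constant depends on them exponentially, each additional bond multiplies the affinity by the same factor. A single weak bond gives almost no preference for the bound state, but ten of them acting together give a very large one.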


The Surface Conformation of a Protein Determines Its Chemistry 


The impressive chemical capabilities of proteins often require that the chemical 
groups on their surface interact in ways that enhance the chemical reactivity of 
one or more amino acid side chains. These interactions fall into two main cate- 
gories. 

First, the interaction of neighboring parts of the polypeptide chain may restrict 
the access of water molecules to that protein’s ligand-binding sites. Because water 
molecules readily form hydrogen bonds that can compete with ligands for sites

Figure 3-37 The selective binding of a
protein to another molecule. Many weak 
bonds are needed to enable a protein 

to bind tightly to a second molecule, or 
ligand. A ligand must therefore fit precisely 
into a protein’s binding site, like a hand 
into a glove, so that a large number of 
noncovalent bonds form between the 
protein and the ligand. (A) Schematic; 

(B) space-filling model. (PDB code: 1G6N.) 


Figure 3-38 The binding site of a 
protein. (A) The folding of the polypeptide 
chain typically creates a crevice or cavity on 
the protein surface. This crevice contains a 
set of amino acid side chains disposed in 
such a way that they can form noncovalent 
bonds only with certain ligands. (B) A 
close-up of an actual binding site, showing 
the hydrogen bonds and electrostatic 
interactions formed between a protein and 
its ligand. In this example, a molecule of 
cyclic AMP is the bound ligand. 
on the protein surface, a ligand will form tighter hydrogen bonds (and electrostatic
interactions) with a protein if water molecules are kept away. It might be
hard to imagine a mechanism that would exclude a molecule as small as water 
from a protein surface without affecting the access of the ligand itself. However, 
because of the strong tendency of water molecules to form water-water hydrogen 
bonds, water molecules exist in a large hydrogen-bonded network (see Panel 2-2, 
pp. 92-93). In effect, a protein can keep a ligand-binding site dry, increasing that 
site's reactivity, because it is energetically unfavorable for individual water mole- 
cules to break away from this network—as they must do to reach into a crevice on 
a protein’s surface. 

Second, the clustering of neighboring polar amino acid side chains can alter 
their reactivity. If protein folding forces together a number of negatively charged 
side chains against their mutual repulsion, for example, the affinity of the site for 
a positively charged ion is greatly increased. In addition, when amino acid side 
chains interact with one another through hydrogen bonds, normally unreactive 
groups (such as the -CH2OH on the serine shown in Figure 3-39) can become 
reactive, enabling them to be used to make or break selected covalent bonds. 

The surface of each protein molecule therefore has a unique chemical reac- 
tivity that depends not only on which amino acid side chains are exposed, but 
also on their exact orientation relative to one another. For this reason, two slightly 
different conformations of the same protein molecule can differ greatly in their 
chemistry. 


Sequence Comparisons Between Protein Family Members 
Highlight Crucial Ligand-Binding Sites 


As we have described previously, genome sequences allow us to group many of 
the domains in proteins into families that show clear evidence of their evolution 
from a common ancestor. The three-dimensional structures of members of the 
same domain family are remarkably similar. For example, even when the amino 
acid sequence identity falls to 25%, the backbone atoms in a domain can follow a 
common protein fold within 0.2 nanometers (2 Å).

We can use a method called evolutionary tracing to identify those sites in a 
protein domain that are the most crucial to the domain’s function. Those sites 
that bind to other molecules are the most likely to be maintained unchanged as
organisms evolve. Thus, in this method, those amino acids that are unchanged, or 
nearly unchanged, in all of the known protein family members are mapped onto 
a model of the three-dimensional structure of one family member. When this is 
done, the most invariant positions often form one or more clusters on the protein 
surface, as illustrated in Figure 3-40A for the SH2 domain described previously 
(see Figure 3-6). These clusters generally correspond to ligand-binding sites. 
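
The sequence-comparison step of evolutionary tracing can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `conserved_positions`, the `min_fraction` threshold, and the toy alignment are hypothetical and are not taken from the published method, which goes on to map the conserved positions onto a three-dimensional structure.

```python
from collections import Counter


def conserved_positions(alignment, min_fraction=1.0):
    """Return column indices whose most common residue reaches min_fraction.

    `alignment` is a list of equal-length, already aligned sequences.
    Columns dominated by the gap character "-" are never reported.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")

    conserved = []
    for i in range(length):
        column = [seq[i] for seq in alignment]
        residue, count = Counter(column).most_common(1)[0]
        if residue != "-" and count / len(column) >= min_fraction:
            conserved.append(i)
    return conserved


# Hypothetical toy alignment (not real SH2 sequences).
TOY_ALIGNMENT = [
    "GRKEAE",
    "GRSDAE",
    "GRKDSE",
    "GRTEAE",
]

invariant = conserved_positions(TOY_ALIGNMENT)
nearly_invariant = conserved_positions(TOY_ALIGNMENT, min_fraction=0.75)
```

With the default threshold only columns that are identical in every sequence are reported; lowering `min_fraction` also admits nearly invariant positions, which is closer to the "unchanged, or nearly unchanged" criterion described above.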

The SH2 domain functions to link two proteins together. It binds the protein 
containing it to a second protein that contains a phosphorylated tyrosine side 
chain in a specific amino acid sequence context, as shown in Figure 3-40B. The 
amino acids located at the binding site for the phosphorylated polypeptide have 
been the slowest to change during the long evolutionary process that produced 
Figure 3-39 An unusually reactive amino
acid at the active site of an enzyme.
This example is the “catalytic triad” Asp-
His-Ser found in chymotrypsin, elastase,
and other serine proteases (see Figure
3-12). The aspartic acid side chain (Asp) 
induces the histidine (His) to remove the 
proton from a particular serine (Ser). This 
activates the serine and enables it to form 
a covalent bond with an enzyme substrate, 
hydrolyzing a peptide bond. The many 
convolutions of the polypeptide chain are 
omitted here. 
the large SH2 family of peptide recognition domains. Mutation is a random process;
survival is not. Thus, natural selection (random mutation followed by nonrandom
survival) produces the sequence conservation by preferentially eliminating
organisms whose SH2 domains become altered in a way that inactivates the
SH2 binding site, destroying SH2 function.

Genome sequencing has revealed huge numbers of proteins whose functions 
are unknown. Once a three-dimensional structure has been determined for one 
member of a protein family, evolutionary tracing allows biologists to determine 
binding sites for the members of that family, providing a useful start in decipher- 
ing protein function. 


Proteins Bind to Other Proteins Through Several Types of 
Interfaces 


Proteins can bind to other proteins in multiple ways. In many cases, a portion 
of the surface of one protein contacts an extended loop of polypeptide chain (a 
“string”) on a second protein (Figure 3-41A). Such a surface-string interaction,
for example, allows the SH2 domain to recognize a phosphorylated polypeptide 
loop on a second protein, as just described, and it also enables a protein kinase to 
recognize the proteins that it will phosphorylate (see below). 

A second type of protein-protein interface forms when two α helices, one from
each protein, pair together to form a coiled-coil (Figure 3-41B). This type of pro- 
tein interface is found in several families of gene regulatory proteins, as discussed 
in Chapter 7. 

The most common way for proteins to interact, however, is by the precise 
matching of one rigid surface with that of another (Figure 3-41C). Such interac- 
tions can be very tight, since a large number of weak bonds can form between two 
surfaces that match well. For the same reason, such surface-surface interactions 
can be extremely specific, enabling a protein to select just one partner from the 
many thousands of different proteins found in a cell. 
Figure 3-40 The evolutionary trace
method applied to the SH2 domain.
(A) Front and back views of a space-
filling model of the SH2 domain, with
evolutionarily conserved amino acids on the 
protein surface colored yellow, and those 
more toward the protein interior colored 
red. (B) The structure of one specific SH2 
domain with its bound polypeptide. Here, 
those amino acids located within 0.4 nm 
of the bound ligand are colored blue. The 
two key amino acids of the ligand are 
yellow, and the others are purple. Note the 
high degree of correspondence between 
(A) and (B). (Adapted from O. Lichtarge, 
H.R. Bourne and F.E. Cohen, J. Mol. Biol. 
257:342-358, 1996. With permission from 
Elsevier; PDB codes: 1SPR, 1SPS.) 


Figure 3-41 Three ways in which two 
proteins can bind to each other. Only 
the interacting parts of the two proteins 
are shown. (A) A rigid surface on one 
protein can bind to an extended loop of 
polypeptide chain (a “string”) on a 
second protein. (B) Two α helices can
bind together to form a coiled-coil.
(C) Two complementary rigid surfaces
often link two proteins together. Binding
interactions can also involve the pairing of
β strands (see, for example, Figure 3-18).
Antibody Binding Sites Are Especially Versatile


All proteins must bind to particular ligands to carry out their various functions. 
The antibody family is notable for its capacity for tight, highly selective binding 
(discussed in detail in Chapter 24). 

Antibodies, or immunoglobulins, are proteins produced by the immune sys- 
tem in response to foreign molecules, such as those on the surface of an invad- 
ing microorganism. Each antibody binds tightly to a particular target molecule, 
thereby either inactivating the target molecule directly or marking it for destruc- 
tion. An antibody recognizes its target (called an antigen) with remarkable spec- 
ificity. Because there are potentially billions of different antigens that humans 
might encounter, we have to be able to produce billions of different antibodies. 

Antibodies are Y-shaped molecules with two identical binding sites that are 
complementary to a small portion of the surface of the antigen molecule. A 
detailed examination of the antigen-binding sites of antibodies reveals that they 
are formed from several loops of polypeptide chain that protrude from the ends 
of a pair of closely juxtaposed protein domains (Figure 3-42). Different antibod- 
ies generate an enormous diversity of antigen-binding sites by changing only the 
length and amino acid sequence of these loops, without altering the basic protein 
structure. 

Loops of this kind are ideal for grasping other molecules. They allow a large 
number of chemical groups to surround a ligand so that the protein can link to it 
with many weak bonds. For this reason, loops often form the ligand-binding sites 
in proteins. 


The Equilibrium Constant Measures Binding Strength 


Molecules in the cell encounter each other very frequently because of their con- 
tinual random thermal movements. Colliding molecules with poorly matching 
surfaces form few noncovalent bonds with one another, and the two molecules 
dissociate as rapidly as they come together. At the other extreme, when many 
noncovalent bonds form between two colliding molecules, the association can 
persist for a very long time (Figure 3-43). Strong interactions occur in cells when- 
ever a biological function requires that molecules remain associated for a long 
time—for example, when a group of RNA and protein molecules come together to 
make a subcellular structure such as a ribosome. 
Figure 3-42 An antibody molecule.
A typical antibody molecule is Y-shaped
and has two identical binding sites for
its antigen, one on each arm of the Y. As
explained in Chapter 24, the protein is
composed of four polypeptide chains (two
identical heavy chains and two identical
and smaller light chains) held together
by disulfide bonds. Each chain is made
up of several different immunoglobulin
domains, here shaded either blue or gray.
The antigen-binding site is formed where
a heavy-chain variable domain (VH) and
a light-chain variable domain (VL) come
close together. These are the domains that
differ most in their sequence and structure
in different antibodies. At the end of each
of the two arms of the antibody molecule,
these two domains form loops that bind to
the antigen (see Movie 24.5).
We can measure the strength with which any two molecules bind to each
other. As an example, consider a population of identical antibody molecules that 
suddenly encounters a population of ligands diffusing in the fluid surrounding 
them. At frequent intervals, one of the ligand molecules will bump into the bind- 
ing site of an antibody and form an antibody-ligand complex. The population of 
antibody-ligand complexes will therefore increase, but not without limit: over 
time, a second process, in which individual complexes break apart because of 
thermally induced motion, will become increasingly important. Eventually, any 
population of antibody molecules and ligands will reach a steady state, or equilib- 
rium, in which the number of binding (association) events per second is precisely 
equal to the number of “unbinding” (dissociation) events (see Figure 2-30). 

From the concentrations of the ligand, antibody, and antibody-ligand complex
at equilibrium, we can calculate a convenient measure of the strength of binding,
the equilibrium constant (K) (Figure 3-44A). This constant was described
in detail in Chapter 2, where its connection to free energy differences was derived 
(see p. 62). The equilibrium constant for a reaction in which two molecules (A and 
B) bind to each other to form a complex (AB) has units of liters/mole, and half 
of the binding sites will be occupied by ligand when that ligand’s concentration 
(in moles/liter) reaches a value that is equal to 1/K. This equilibrium constant is 
larger the greater the binding strength, and it is a direct measure of the free-en- 
ergy difference between the bound and free states (Figure 3-44B). Even a change 
the surfaces of molecules A and B,
and A and C, are a poor match and 
are capable of forming only a few 
weak bonds; thermal motion rapidly 
breaks them apart 


the surfaces of molecules A and D 
match well and therefore can form 
enough weak bonds to withstand 
thermal jolting; they therefore 
stay bound to each other 


Figure 3-43 How noncovalent bonds 
mediate interactions between 
macromolecules (see Movie 2.1). 


Figure 3-44 Relating standard
free-energy difference (ΔG°) to the
equilibrium constant (K). (A) The
equilibrium between molecules A and
B and the complex AB is maintained by
a balance between the two opposing
reactions shown in panels 1 and 2.
Molecules A and B must collide if they
are to react, and the association rate is
therefore proportional to the product of their
individual concentrations [A] × [B]. (Square
brackets indicate concentration.) As shown
in panel 3, the ratio of the rate constants
for the association and the dissociation
reactions is equal to the equilibrium
constant (K) for the reaction (see also p.
63). (B) The equilibrium constant in panel
3 is that for the reaction A + B → AB, and
the larger its value, the stronger the binding
between A and B. Note that for every 5.91
kJ/mole decrease in standard free energy,
the equilibrium constant increases by a
factor of 10 at 37°C.

The equilibrium constant here has units of
liters/mole; for simple binding interactions
it is also called the affinity constant or
association constant, denoted Ka. The
reciprocal of Ka is called the dissociation
constant, Kd (in units of moles/liter).
of a few noncovalent bonds can have a striking effect on a binding interaction,
as shown by the example in Figure 3-45. (Note that the equilibrium constant, as 
defined here, is also known as the association or affinity constant, Ka.) 
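
The two quantitative statements above (half of the sites are occupied when the ligand concentration equals 1/K, and K is a direct measure of the free-energy difference) can be checked with a short Python sketch. The function names are hypothetical, the occupancy formula assumes that the ligand is in large excess so that free and total ligand concentrations are nearly equal, and the free-energy formula is the standard relation ΔG° = -RT ln K evaluated at 37°C.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def fraction_bound(k_assoc, ligand_conc):
    """Fraction of binding sites occupied at equilibrium.

    k_assoc is the association constant K (liters/mole) and ligand_conc the
    free ligand concentration (moles/liter); assumes simple 1:1 binding.
    """
    x = k_assoc * ligand_conc
    return x / (1.0 + x)


def standard_free_energy_kj(k_assoc, temp_kelvin=310.15):
    """Standard free-energy difference in kJ/mole from dG = -RT ln K."""
    return -R * temp_kelvin * math.log(k_assoc) / 1000.0


# Half occupancy when the ligand concentration equals 1/K.
half = fraction_bound(1e8, 1e-8)

# Free-energy cost of a tenfold drop in K at 37 C (close to 5.9 kJ/mole).
per_decade = standard_free_energy_kj(1e9) - standard_free_energy_kj(1e10)
```

Computed this way, a tenfold change in K corresponds to about 5.9 kJ/mole, in line with the value quoted in the legend to Figure 3-44.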

We have used the case of an antibody binding to its ligand to illustrate the 
effect of binding strength on the equilibrium state, but the same principles apply 
to any molecule and its ligand. Many proteins are enzymes, which, as we now 
discuss, first bind to their ligands and then catalyze the breakage or formation of 
covalent bonds in these molecules. 


Enzymes Are Powerful and Highly Specific Catalysts 


Many proteins can perform their function simply by binding to another molecule. 
An actin molecule, for example, need only associate with other actin molecules 
to form a filament. There are other proteins, however, for which ligand binding is 
only a necessary first step in their function. This is the case for the large and very 
important class of proteins called enzymes. As described in Chapter 2, enzymes 


Consider 1000 molecules of A and
1000 molecules of B in a eukaryotic
cell. The concentration of both will
be about 10^-9 M.

If the equilibrium constant (K)
for A + B ⇌ AB is 10^10, then one can
calculate that at equilibrium there
will be about 270 molecules of A,
270 molecules of B, and 730 molecules
of AB.

If the equilibrium constant is a little
weaker at 10^8, which represents
a loss of 11.9 kilojoule/mole of
binding energy from the example
above, or 2-3 fewer hydrogen
bonds, then there will be about 915
molecules of A, 915 molecules of B,
and 85 molecules of AB.

are remarkable molecules that cause the chemical transformations that make and 
break covalent bonds in cells. They bind to one or more ligands, called substrates, 
and convert them into one or more chemically modified products, doing this over 
and over again with amazing rapidity. Enzymes speed up reactions, often by a 
factor of a million or more, without themselves being changed—that is, they act 
as catalysts that permit cells to make or break covalent bonds in a controlled way. 
It is the catalysis of organized sets of chemical reactions by enzymes that creates 
and maintains the cell, making life possible. 

We can group enzymes into functional classes that perform similar chemical 
reactions (Table 3-1). Each type of enzyme within such a class is highly specific, 





Figure 3-45 Small changes in the 
number of weak bonds can have drastic 
effects on a binding interaction. This 
example illustrates the dramatic effect of 
the presence or absence of a few weak 
noncovalent bonds in a biological context. 
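
The calculation behind this example can be reproduced with a short Python sketch. It assumes simple 1:1 binding with equal total concentrations c of A and B, so that K = x / (c - x)^2 where x is the concentration of AB; the function name and argument names are hypothetical.

```python
import math


def complexes_at_equilibrium(k_assoc, total_conc, n_molecules=1000):
    """Return (bound, free) molecule counts for A + B <-> AB at equilibrium.

    Solves K*x**2 - (2*K*c + 1)*x + K*c**2 = 0 for x = [AB], taking the
    smaller root because [AB] cannot exceed the total concentration c.
    """
    a = k_assoc
    b = -(2.0 * k_assoc * total_conc + 1.0)
    c = k_assoc * total_conc ** 2
    x = (-b - math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    bound = n_molecules * x / total_conc
    return bound, n_molecules - bound


# 1000 molecules each of A and B at about 1e-9 M, for two values of K.
tight_bound, tight_free = complexes_at_equilibrium(1e10, 1e-9)
weak_bound, weak_free = complexes_at_equilibrium(1e8, 1e-9)
```

Under these assumptions roughly 730 of the 1000 possible complexes exist when K is 10^10, but fewer than 100 when K is 10^8, which is the drastic effect the figure is meant to illustrate.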


TABLE 3-1

ENZYME              REACTION CATALYZED

Hydrolases          General term for enzymes that catalyze a hydrolytic cleavage
                    reaction; nucleases and proteases are more specific names
                    for subclasses of these enzymes
Nucleases           Break down nucleic acids by hydrolyzing bonds between
                    nucleotides. Endo- and exonucleases cleave nucleic acids
                    within and from the ends of the polynucleotide chains,
                    respectively
Proteases           Break down proteins by hydrolyzing bonds between amino acids
Synthases           Synthesize molecules in anabolic reactions by condensing two
                    smaller molecules together
Ligases             Join together (ligate) two molecules in an energy-dependent
                    process. DNA ligase, for example, joins two DNA molecules
                    together end-to-end through phosphodiester bonds
Isomerases          Catalyze the rearrangement of bonds within a single molecule
Polymerases         Catalyze polymerization reactions such as the synthesis of
                    DNA and RNA
Kinases             Catalyze the addition of phosphate groups to molecules.
                    Protein kinases are an important group of kinases that
                    attach phosphate groups to proteins
Phosphatases        Catalyze the hydrolytic removal of a phosphate group from a
                    molecule
Oxido-Reductases    General name for enzymes that catalyze reactions in which
                    one molecule is oxidized while the other is reduced. Enzymes
                    of this type are often more specifically named oxidases,
                    reductases, or dehydrogenases
ATPases             Hydrolyze ATP. Many proteins with a wide range of roles have
                    an energy-harnessing ATPase activity as part of their
                    function; for example, motor proteins such as myosin and
                    membrane transport proteins such as the sodium-potassium pump
GTPases             Hydrolyze GTP. A large family of GTP-binding proteins are
                    GTPases with central roles in the regulation of cell
                    processes

Enzyme names typically end in “-ase,” with the exception of some enzymes, such as pepsin, trypsin, thrombin, and lysozyme, that were
discovered and named before the convention became generally accepted at the end of the nineteenth century. The common name of an enzyme
usually indicates the substrate or product and the nature of the reaction catalyzed. For example, citrate synthase catalyzes the synthesis of citrate
by a reaction between acetyl CoA and oxaloacetate.
catalyzing only a single type of reaction. Thus, hexokinase adds a phosphate group
to D-glucose but ignores its optical isomer L-glucose; the blood-clotting enzyme 
thrombin cuts one type of blood protein between a particular arginine and its 
adjacent glycine and nowhere else, and so on. As discussed in detail in Chapter 2, 
enzymes work in teams, with the product of one enzyme becoming the substrate 
for the next. The result is an elaborate network of metabolic pathways that pro- 
vides the cell with energy and generates the many large and small molecules that 
the cell needs (see Figure 2-63). 


Substrate Binding Is the First Step in Enzyme Catalysis 


For a protein that catalyzes a chemical reaction (an enzyme), the binding of each 
substrate molecule to the protein is an essential prelude. In the simplest case, if 
we denote the enzyme by E, the substrate by S, and the product by P, the basic 
reaction path is E + S → ES → EP → E + P. There is a limit to the amount of substrate
that a single enzyme molecule can process in a given time. Although an
increase in the concentration of substrate increases the rate at which product is 
formed, this rate eventually reaches a maximum value (Figure 3-46). At that point 
the enzyme molecule is saturated with substrate, and the rate of reaction (Vmax) 
depends only on how rapidly the enzyme can process the substrate molecule. This 
maximum rate divided by the enzyme concentration is called the turnover num- 
ber. Turnover numbers are often about 1000 substrate molecules processed per 
second per enzyme molecule, although turnover numbers between 1 and 10,000 
are known. 

The other kinetic parameter frequently used to characterize an enzyme is its 
Km, the concentration of substrate that allows the reaction to proceed at one-half 
its maximum rate (0.5 Vmax) (see Figure 3-46). A low Km value means that the 
enzyme reaches its maximum catalytic rate at a low concentration of substrate and 
generally indicates that the enzyme binds to its substrate very tightly, whereas a 
high Km value corresponds to weak binding. The methods used to characterize 
enzymes in this way are explained in Panel 3-2 (pp. 142-143). 
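
The relationship between rate, Vmax, and Km described above can be made concrete with a minimal Python sketch of the Michaelis-Menten rate law derived in Panel 3-2, V = Vmax [S] / (Km + [S]). The function names and the numerical values (a hypothetical enzyme with Vmax = 10^-5 M/sec, Km = 2 × 10^-4 M, present at 10^-8 M) are assumptions chosen only for illustration.

```python
def michaelis_menten_rate(substrate_conc, vmax, km):
    """Reaction rate V at a given substrate concentration (same units as km)."""
    return vmax * substrate_conc / (km + substrate_conc)


def turnover_number(vmax, enzyme_conc):
    """Maximum rate divided by enzyme concentration (per second if vmax is M/sec)."""
    return vmax / enzyme_conc


# Hypothetical enzyme: Vmax = 1e-5 M/sec, Km = 2e-4 M, enzyme at 1e-8 M.
VMAX, KM, ENZYME = 1e-5, 2e-4, 1e-8

half_max = michaelis_menten_rate(KM, VMAX, KM)               # half of Vmax
near_saturation = michaelis_menten_rate(100 * KM, VMAX, KM)  # about 99% of Vmax
kcat = turnover_number(VMAX, ENZYME)                         # about 1000 per second
```

The sketch shows the two defining features of the curve in Figure 3-46: the rate is half-maximal when the substrate concentration equals Km, and it approaches but never exceeds Vmax as the substrate concentration rises.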


Enzymes Speed Reactions by Selectively Stabilizing Transition 
States 


Enzymes achieve extremely high rates of chemical reaction—rates that are far 
higher than for any synthetic catalysts. There are several reasons for this effi- 
ciency. First, when two molecules need to react, the enzyme greatly increases the 
local concentration of both of these substrate molecules at the catalytic site, hold- 
ing them in the correct orientation for the reaction that is to follow. More impor- 
tantly, however, some of the binding energy contributes directly to the catalysis. 
Substrate molecules must pass through a series of intermediate states of altered 
geometry and electron distribution before they form the ultimate products of the 
reaction. The free energy required to attain the most unstable intermediate state, 
called the transition state, is known as the activation energy for the reaction, and 
it is the major determinant of the reaction rate. Enzymes have a much higher 
affinity for the transition state of the substrate than they have for the stable form. 
Figure 3-46 Enzyme kinetics. The rate
of an enzyme reaction (V) increases as
the substrate concentration increases
until a maximum value (Vmax) is reached.
At this point all substrate-binding sites on
the enzyme molecules are fully occupied,
and the rate of reaction is limited by
the rate of the catalytic process on the
enzyme surface. For most enzymes,
the concentration of substrate at which
the reaction rate is half-maximal (Km) is
a measure of how tightly the substrate
is bound, with a large value of Km
corresponding to weak binding.
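
Vmax and Km are hard to read directly from a saturation curve of this kind, which is why Panel 3-2 describes a double-reciprocal plot of 1/V against 1/[S]. The following Python sketch fits that straight line by ordinary least squares; the function name is hypothetical and the data are synthetic, noise-free values generated from assumed parameters (Vmax = 60, Km = 5, arbitrary units), so it illustrates the rearrangement rather than a recommended way to analyze real measurements.

```python
def double_reciprocal_fit(substrate_concs, rates):
    """Estimate (Vmax, Km) from a least-squares line through (1/[S], 1/V).

    The line is 1/V = (Km/Vmax) * (1/[S]) + 1/Vmax, so
    Vmax = 1/intercept and Km = slope * Vmax.
    """
    xs = [1.0 / s for s in substrate_concs]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


# Synthetic data from assumed parameters (arbitrary units).
TRUE_VMAX, TRUE_KM = 60.0, 5.0
S_VALUES = [1.0, 2.0, 4.0, 8.0, 16.0]
V_VALUES = [TRUE_VMAX * s / (TRUE_KM + s) for s in S_VALUES]

est_vmax, est_km = double_reciprocal_fit(S_VALUES, V_VALUES)
```

With noise-free input the fit returns the assumed parameters. With real, noisy data the reciprocal transform gives disproportionate weight to the low-concentration points, so a direct nonlinear fit is often preferred.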
PANEL 3-2: Some of the Methods Used to Study Enzymes


WHY ANALYZE THE KINETICS OF ENZYMES?


Enzymes are the most selective and powerful catalysts known.
An understanding of their detailed mechanisms provides a
critical tool for the discovery of new drugs, for the large-scale
industrial synthesis of useful chemicals, and for appreciating
the chemistry of cells and organisms. A detailed study of the
rates of the chemical reactions that are catalyzed by a purified
enzyme (more specifically, how these rates change with
changes in conditions such as the concentrations of substrates,
products, inhibitors, and regulatory ligands) allows
biochemists to figure out exactly how each enzyme works.
For example, this is the way that the ATP-producing reactions
of glycolysis, shown previously in Figure 2-48, were
deciphered, allowing us to appreciate the rationale for this
critical enzymatic pathway.

In this Panel, we introduce the important field of enzyme
kinetics, which has been indispensable for deriving much of
the detailed knowledge that we now have about cell
chemistry.


STEADY-STATE ENZYME KINETICS


Many enzymes have only one substrate, which they bind and
then process to produce products according to the scheme
outlined in Figure 3-50A. In this case, the reaction is written as

$$E + S \;\underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}}\; ES \;\xrightarrow{k_{cat}}\; E + P$$

Here we have assumed that the reverse reaction, in which E + P
recombine to form EP and then ES, occurs so rarely that we can
ignore it. In this case, EP need not be represented, and we can
express the rate of the reaction, known as its velocity, V, as

$$V = k_{cat}[ES]$$

where [ES] is the concentration of the enzyme-substrate complex,
and kcat is the turnover number, a rate constant that has a value
equal to the number of substrate molecules processed per
enzyme molecule each second.

But how does the value of [ES] relate to the concentrations that
we know directly, which are the total concentration of the
enzyme, [Eo], and the concentration of the substrate, [S]? When
enzyme and substrate are first mixed, the concentration [ES] will
rise rapidly from zero to a so-called steady-state level, as
illustrated below.

[Figure: concentrations plotted against time, showing the pre-steady
state (ES forming) followed by the steady state (ES almost constant).]

At this steady state, [ES] is nearly constant, so that

$$k_{-1}[ES] + k_{cat}[ES] = k_{1}[E][S]$$

or, since the concentration of the free enzyme, [E], is equal
to [Eo] - [ES],

$$[ES] = \frac{k_{1}}{k_{-1} + k_{cat}}[E][S] = \frac{k_{1}}{k_{-1} + k_{cat}}\left([E_o] - [ES]\right)[S]$$

Rearranging, and defining the constant Km as

$$K_m = \frac{k_{-1} + k_{cat}}{k_{1}}$$

we get

$$[ES] = \frac{[E_o][S]}{K_m + [S]}$$

or, remembering that V = kcat [ES], we obtain the famous
Michaelis-Menten equation

$$V = \frac{k_{cat}[E_o][S]}{K_m + [S]}$$

As [S] is increased to higher and higher levels, essentially all of
the enzyme will be bound to substrate at steady state; at this
point, a maximum rate of reaction, Vmax, will be reached where
V = Vmax = kcat [Eo]. Thus, it is convenient to rewrite the
Michaelis-Menten equation as

$$V = \frac{V_{max}[S]}{K_m + [S]}$$


























THE DOUBLE-RECIPROCAL PLOT


A typical plot of V versus [S] for an enzyme that follows
Michaelis-Menten kinetics is shown below. From this plot,
neither the value of Vmax nor of Km is immediately clear.

[Plot: V versus [S].]

To obtain Vmax and Km from such data, a double-reciprocal
plot is often used, in which the Michaelis-Menten equation
has merely been rearranged, so that 1/V can be plotted
versus 1/[S].

$$\frac{1}{V} = \frac{K_m}{V_{max}}\cdot\frac{1}{[S]} + \frac{1}{V_{max}}$$

[Plot: 1/V versus 1/[S] (liters/mmole). The slope of the line is
Km/Vmax, and the line crosses the 1/[S] axis at 1/[S] = -1/Km.]


THE SIGNIFICANCE OF Km, kcat, AND kcat/Km


As described in the text, Km is an approximate measure of
substrate affinity for the enzyme: it is numerically equal to
the concentration of [S] at V = 0.5 Vmax. In general, a lower
value of Km means tighter substrate binding. In fact, for
those cases where kcat is much smaller than k₋₁, the Km will
be equal to Kd, the dissociation constant for substrate
binding to the enzyme (Kd = 1/Ka; see Figure 3-44).

We have seen that kcat is the turnover number for the
enzyme. At very low substrate concentrations, where
[S] << Km, most of the enzyme is free. Thus we can think of
[E] = [Eo], so that the Michaelis-Menten equation becomes

$$V = \frac{k_{cat}}{K_m}[E][S]$$

The ratio kcat/Km is therefore equivalent to the rate constant
for the reaction between free enzyme and free substrate.

For simplicity, in this Panel we have discussed enzymes
that have only one substrate, such as the lysozyme enzyme
described in the text (see p. 144). Most enzymes have two
substrates, one of which is often an active carrier
molecule, such as NADH or ATP.

A similar, but more complex, analysis is used to determine
the kinetics of such enzymes, allowing the order of substrate
binding and the presence of covalent intermediates along
the pathway to be revealed.


SOME ENZYMES ARE DIFFUSION LIMITED


The values of kcat, Km, and kcat/Km for some selected
enzymes are given below:

enzyme                substrate       kcat (sec^-1)   Km (M)       kcat/Km (sec^-1 M^-1)

acetylcholinesterase  acetylcholine   1.4 x 10^4      9 x 10^-5    1.6 x 10^8
catalase              H2O2            4 x 10^7        1            4 x 10^7
fumarase              fumarate        8 x 10^2        5 x 10^-6    1.6 x 10^8

Because an enzyme and its substrate must collide before
they can react, kcat/Km has a maximum possible value that is
limited by collision rates. If every collision forms an
enzyme-substrate complex, one can calculate from diffusion
theory that kcat/Km will be between 10^8 and 10^9 sec^-1 M^-1, in
the case where all subsequent steps proceed immediately.
Thus, it is claimed that enzymes like acetylcholinesterase and
fumarase are “perfect enzymes,” each enzyme having
evolved to the point where nearly every collision with its
substrate converts the substrate to a product.
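
The comparison with the diffusion limit can be sketched in Python. The kcat and Km values below are the ones read from the table in this Panel, with exponents chosen to agree with its kcat/Km column; the function names and the choice of 10^8 sec^-1 M^-1 (the lower end of the quoted range) as a threshold are assumptions made for illustration.

```python
# (kcat in 1/sec, Km in M), as read from the table in Panel 3-2.
PANEL_VALUES = {
    "acetylcholinesterase": (1.4e4, 9e-5),
    "catalase": (4e7, 1.0),
    "fumarase": (8e2, 5e-6),
}

# Lower end of the 10^8 to 10^9 sec^-1 M^-1 range quoted for diffusion control.
DIFFUSION_LIMIT_LOW = 1e8


def catalytic_efficiency(kcat, km):
    """Return kcat/Km in units of 1/(sec M)."""
    return kcat / km


def near_diffusion_limit(kcat, km, threshold=DIFFUSION_LIMIT_LOW):
    """True if kcat/Km reaches the assumed diffusion-controlled threshold."""
    return catalytic_efficiency(kcat, km) >= threshold


for name, (kcat, km) in PANEL_VALUES.items():
    ratio = catalytic_efficiency(kcat, km)
    flag = near_diffusion_limit(kcat, km)
    print(f"{name:22s} kcat/Km = {ratio:.1e}  near limit: {flag}")
```

By this criterion acetylcholinesterase and fumarase fall within the diffusion-controlled range, whereas catalase, despite its very large kcat, does not, because its Km is also large.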
Figure 3-47 Enzymatic acceleration of chemical reactions by decreasing
the activation energy. There is a single transition state in this example.
However, often both the uncatalyzed reaction (A) and the enzyme-catalyzed
reaction (B) go through a series of transition states. In that case, it is the
transition state with the highest energy (ST and EST) that determines the
activation energy and limits the rate of the reaction. (S = substrate;
P = product of the reaction; ES = enzyme-substrate complex;
EP = enzyme-product complex.)


Because this tight binding greatly lowers the energy of the transition state, the 
enzyme greatly accelerates a particular reaction by lowering the activation energy 
that is required (Figure 3-47). 
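
To get a feel for how strongly the rate depends on the activation energy, the following Python sketch uses the standard Arrhenius-type assumption that the rate is proportional to exp(-Ea/RT), with everything except the activation energy left unchanged. That relation is not stated in the text above, and the function names are hypothetical; the sketch is meant only to show the scale of the effect at 37°C.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def rate_acceleration(barrier_drop_kj, temp_kelvin=310.15):
    """Factor by which the rate rises when the activation energy falls.

    Assumes rate is proportional to exp(-Ea / RT) with an unchanged prefactor.
    """
    return math.exp(barrier_drop_kj * 1000.0 / (R * temp_kelvin))


def barrier_drop_kj_for(acceleration, temp_kelvin=310.15):
    """Activation-energy decrease (kJ/mole) needed for a given acceleration."""
    return R * temp_kelvin * math.log(acceleration) / 1000.0


tenfold = barrier_drop_kj_for(10.0)      # about 5.9 kJ/mole
millionfold = barrier_drop_kj_for(1e6)   # about 36 kJ/mole
```

Under this assumption, each tenfold acceleration corresponds to lowering the barrier by roughly 6 kJ/mole, so even a millionfold rate enhancement requires a decrease of only about 36 kJ/mole, comparable to the energy of a modest number of weak noncovalent bonds.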


Enzymes Can Use Simultaneous Acid and Base Catalysis 


Figure 3-48 compares the spontaneous reaction rates and the corresponding 
enzyme-catalyzed rates for five enzymes. Rate accelerations range from 10^9 to
10^23. Enzymes not only bind tightly to a transition state, they also contain precisely
positioned atoms that alter the electron distributions in the atoms that participate 
directly in the making and breaking of covalent bonds. Peptide bonds, for exam- 
ple, can be hydrolyzed in the absence of an enzyme by exposing a polypeptide to 
either a strong acid or a strong base. Enzymes are unique, however, in being able 
to use acid and base catalysis simultaneously, because the rigid framework of the 
protein constrains the acidic and basic residues and prevents them from combin- 
ing with each other, as they would do in solution (Figure 3-49). 

The fit between an enzyme and its substrate needs to be precise. A small 
change introduced by genetic engineering in the active site of an enzyme can 
therefore have a profound effect. Replacing a glutamic acid with an aspartic acid 
in one enzyme, for example, shifts the position of the catalytic carboxylate ion by 
only 1 Å (about the radius of a hydrogen atom); yet this is enough to decrease the
activity of the enzyme a thousandfold. 


Lysozyme Illustrates How an Enzyme Works 


To demonstrate how enzymes catalyze chemical reactions, we examine an enzyme 
that acts as a natural antibiotic in egg white, saliva, tears, and other secretions. 
Lysozyme catalyzes the cutting of polysaccharide chains in the cell walls of bac- 
teria. The bacterial cell is under pressure from osmotic forces, and cutting even a 
small number of these chains causes the cell wall to rupture and the cell to burst. 
A relatively small and stable protein that can be easily isolated in large quantities, 
lysozyme was the first enzyme to have its structure worked out in atomic detail by 
x-ray crystallography (in the mid-1960s). 

The reaction that lysozyme catalyzes is a hydrolysis: it adds a molecule of water 
to a single bond between two adjacent sugar groups in the polysaccharide chain, 
thereby causing the bond to break (see Figure 2-9). The reaction is energetically 
favorable because the free energy of the severed polysaccharide chain is lower 

Figure 3-48 The rate accelerations caused by five different enzymes. 
(Adapted from A. Radzicka and R. Wolfenden, Science 267:90-93, 1995.) 

than the free energy of the intact chain. However, there is an energy barrier to the 
reaction, and a colliding water molecule can break a bond linking two sugars only 
if the polysaccharide molecule is distorted into a particular shape—the transition 
state—in which the atoms around the bond have an altered geometry and elec- 
tron distribution. Because of this requirement, random collisions must supply a 
very large activation energy for the reaction to take place. In an aqueous solution 
at room temperature, the energy of collisions almost never exceeds the activation 
energy. The pure polysaccharide can therefore remain for years in water without 
being hydrolyzed to any detectable degree. 

This situation changes drastically when the polysaccharide binds to lysozyme. 
The active site of lysozyme, because its substrate is a polymer, is a long groove that 
holds six linked sugars at the same time. As soon as the polysaccharide binds to 
form an enzyme-substrate complex, the enzyme cuts the polysaccharide by add- 
ing a water molecule across one of its sugar-sugar bonds. The product chains are 
then quickly released, freeing the enzyme for further cycles of reaction (Figure 
3-50). 

An impressive increase in hydrolysis rate is possible because conditions are 
created in the microenvironment of the lysozyme active site that greatly reduce 
the activation energy necessary for the hydrolysis to take place. In particular, lyso- 
zyme distorts one of the two sugars connected by the bond to be broken from its 
normal, most stable conformation. The bond to be broken is also held close to two 
amino acids with acidic side chains (a glutamic acid and an aspartic acid) that 
participate directly in the reaction. Figure 3-51 shows the three central steps in 
this enzymatically catalyzed reaction, which occurs millions of times faster than 
uncatalyzed hydrolysis. 
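
The connection between a lower activation energy and a faster reaction can be made concrete with a short calculation. The sketch below is not taken from the figures in this chapter: it assumes the simple Arrhenius relationship, in which the rate is proportional to exp(-Ea/RT), and it assumes that the enzyme changes only the activation energy Ea and nothing else. The function names are illustrative.

```python
import math

R = 8.314  # gas constant in J/(mol*K)


def rate_acceleration(barrier_drop_kj, temperature_k=298.0):
    """Fold speedup when the activation energy falls by barrier_drop_kj (kJ/mol).

    Assumes an Arrhenius rate, k = A * exp(-Ea / RT), with an unchanged prefactor A.
    """
    return math.exp(barrier_drop_kj * 1000.0 / (R * temperature_k))


def barrier_drop_for(fold, temperature_k=298.0):
    """Activation-energy reduction (kJ/mol) that gives the requested fold speedup."""
    return R * temperature_k * math.log(fold) / 1000.0


if __name__ == "__main__":
    for fold in (1e3, 1e6):
        print(f"{fold:.0e}-fold faster needs a barrier about "
              f"{barrier_drop_for(fold):.0f} kJ/mol lower")
```

Under these assumptions, a millionfold acceleration at room temperature corresponds to lowering the barrier by roughly 34 kJ/mole, which is modest compared with the energy of a covalent bond.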

Other enzymes use similar mechanisms to lower activation energies and 
speed up the reactions they catalyze. In reactions involving two or more reactants, 
the active site also acts like a template, or mold, that brings the substrates together 
in the proper orientation for a reaction to occur between them (Figure 3-52A). As 
we saw for lysozyme, the active site of an enzyme contains precisely positioned 

Figure 3-49 Acid catalysis and base 
catalysis. (A) The start of the uncatalyzed 
reaction that hydrolyzes a peptide bond, 
with blue shading used to indicate electron 
distribution in the water and carbonyl 
bonds. (B) An acid likes to donate a proton 
(H+) to other atoms. By pairing with the 
carbonyl oxygen, an acid causes electrons 
to move away from the carbonyl carbon, 
making this atom much more attractive to 
the electronegative oxygen of an attacking 
water molecule. (C) A base likes to take 
up H+. By pairing with a hydrogen of the 
attacking water molecule, a base causes 
electrons to move toward the water oxygen, 
making it a better attacking group for the 
carbonyl carbon. (D) By having appropriately 
positioned atoms on its surface, an enzyme 
can perform both acid catalysis and base 
catalysis at the same time. 





Figure 3-50 The reaction catalyzed by lysozyme. (A) The enzyme lysozyme (E) catalyzes the cutting of a polysaccharide 
chain, which is its substrate (S). The enzyme first binds to the chain to form an enzyme-substrate complex (ES) and then 
catalyzes the cleavage of a specific covalent bond in the backbone of the polysaccharide, forming an enzyme-product complex 
(EP) that rapidly dissociates. Release of the severed chain (the products P) leaves the enzyme free to act on another substrate 
molecule. (B) A space-filling model of the lysozyme molecule bound to a short length of polysaccharide chain before cleavage 
(Movie 3.8). (B, courtesy of Richard J. Feldmann; PDB code: 3AB6.) 

This substrate is an oligosaccharide of six sugars, 
labeled A through F. Only sugars D and E are shown in detail. 

In the enzyme-substrate complex (ES), the 
enzyme forces sugar D into a strained 
conformation. The Glu35 in the enzyme is 
positioned to serve as an acid that attacks the 
adjacent sugar-sugar bond by donating a proton 
(H+) to sugar E; Asp52 is poised to attack the 
C1 carbon atom. 

The Asp52 has formed a covalent bond between 
the enzyme and the C1 carbon atom of sugar D. 
The Glu35 then polarizes a water molecule (red), 
so that its oxygen can readily attack the C1 
carbon atom and displace Asp52. 


atoms that speed up a reaction by using charged groups to alter the distribution of 
electrons in the substrates (Figure 3-52B). And as we have also seen, when a sub- 
strate binds to an enzyme, bonds in the substrate are often distorted, changing the 
substrate shape. These changes, along with mechanical forces, drive a substrate 
toward a particular transition state (Figure 3-52C). Finally, like lysozyme, many 
enzymes participate intimately in the reaction by transiently forming a covalent 
bond between the substrate and a side chain of the enzyme. Subsequent steps in 
the reaction restore the side chain to its original state, so that the enzyme remains 
unchanged after the reaction (see also Figure 2-48). 


Tightly Bound Small Molecules Add Extra Functions to Proteins 


Although we have emphasized the versatility of enzymes—and proteins in gen- 
eral—as chains of amino acids that perform remarkable functions, there are many 
instances in which the amino acids by themselves are not enough. Just as humans 





(A) enzyme binds to two substrate molecules and orients them precisely to 
encourage a reaction to occur between them 

(B) binding of substrate to enzyme rearranges electrons in the substrate, 
creating partial negative and positive charges that favor a reaction 

(C) enzyme strains the bound substrate molecule, forcing it toward a 
transition state to favor a reaction 


The reaction of the water molecule (red) 
completes the hydrolysis and returns the enzyme 
to its initial state, forming the final enzyme- 
product complex (EP). 

The final products are an oligosaccharide of four sugars 
(left) and a disaccharide (right), produced by hydrolysis. 


Figure 3-51 Events at the active site 
of lysozyme. The top left and top right 
drawings show the free substrate and the 
free products, respectively, whereas the 
other three drawings show the sequential 
events at the enzyme active site. Note 
the change in the conformation of sugar 
D in the enzyme-substrate complex; this 
shape change stabilizes the oxocarbenium 
ion-like transition states required for 
formation and hydrolysis of the covalent 
intermediate shown in the middle panel. 
It is also possible that a carbonium ion 
intermediate forms in step 2, but the 
covalent intermediate shown in the middle 
panel has been detected with a synthetic 
substrate (Movie 3.9). (See D.J. Vocadlo et 
al., Nature 412:835-838, 2001.) 


Figure 3-52 Some general strategies of 
enzyme catalysis. (A) Holding substrates 
together in a precise alignment. (B) Charge 
stabilization of reaction intermediates. 
(C) Applying forces that distort bonds in the 
substrate to increase the rate of a particular 
reaction. 

TABLE 3-2 

Vitamin                   Coenzyme                  Reactions requiring the coenzyme 
Thiamine (vitamin B1)     Thiamine pyrophosphate    Activation and transfer of aldehydes 
Riboflavin (vitamin B2)   FADH                      Oxidation-reduction 
                                                    Acyl group activation and transfer 
                          Pyridoxal phosphate       Amino acid activation; also glycogen phosphorylase 
                                                    CO2 activation and transfer 
                          NADH, NADPH               Oxidation-reduction 
                                                    Acyl group activation; oxidation-reduction 
                          Tetrahydrofolate          Activation and transfer of single carbon groups 
                          Cobalamin coenzymes       Isomerization and methyl group transfers 





employ tools to enhance and extend the capabilities of their hands, enzymes and 
other proteins often use small nonprotein molecules to perform functions that 
would be difficult or impossible to do with amino acids alone. Thus, enzymes fre- 
quently have a small molecule or metal atom tightly associated with their active 
site that assists with their catalytic function. Carboxypeptidase, for example, an 
enzyme that cuts polypeptide chains, carries a tightly bound zinc ion in its active 
site. During the cleavage of a peptide bond by carboxypeptidase, the zinc ion forms 
a transient bond with one of the substrate atoms, thereby assisting the hydrolysis 
reaction. In other enzymes, a small organic molecule serves a similar purpose. 
Such organic molecules are often referred to as coenzymes. An example is biotin, 
which is found in enzymes that transfer a carboxylate group (-COO⁻) from one 
molecule to another (see Figure 2-40). Biotin participates in these reactions by 
forming a transient covalent bond to the -COO⁻ group to be transferred, being 
better suited to this function than any of the amino acids used to make proteins. 
Because it cannot be synthesized by humans, and must therefore be supplied in 
small quantities in our diet, biotin is a vitamin. Many other coenzymes are either 
vitamins or derivatives of vitamins (Table 3-2). 

Other proteins also frequently require specific small-molecule adjuncts to 
function properly. Thus, the signal receptor protein rhodopsin, which is made by 
the photoreceptor cells in the retina, detects light by means of a small molecule, 
retinal, embedded in the protein (Figure 3-53A). Retinal, which is derived from 
vitamin A, changes its shape when it absorbs a photon of light, and this change 
causes the protein to trigger a cascade of enzymatic reactions that eventually lead 
to an electrical signal being carried to the brain. 


Figure 3-53 Retinal and heme. (A) The 
structure of retinal, the light-sensitive 
molecule attached to rhodopsin in the eye. 
The structure shown isomerizes when it 
absorbs light. (B) The structure of a heme 
group. The carbon-containing heme ring 
is red and the iron atom at its center is 
orange. A heme group is tightly bound 
to each of the four polypeptide chains in 
hemoglobin, the oxygen-carrying protein 
whose structure is shown in Figure 3-19. 

Another example of a protein with a nonprotein portion is hemoglobin (see 
Figure 3-19). Each molecule of hemoglobin carries four heme groups, ring-shaped 
molecules each with a single central iron atom (Figure 3-53B). Heme gives hemo- 
globin (and blood) its red color. By binding reversibly to oxygen gas through its 
iron atom, heme enables hemoglobin to pick up oxygen in the lungs and release 
it in the tissues. 

Sometimes these small molecules are attached covalently and permanently 
to their protein, thereby becoming an integral part of the protein molecule itself. 
We shall see in Chapter 10 that proteins are often anchored to cell membranes 
through covalently attached lipid molecules. And membrane proteins exposed on 
the surface of the cell, as well as proteins secreted outside the cell, are often mod- 
ified by the covalent addition of sugars and oligosaccharides. 


Multienzyme Complexes Help to Increase the Rate of Cell 
Metabolism 


The efficiency of enzymes in accelerating chemical reactions is crucial to the 
maintenance of life. Cells, in effect, must race against the unavoidable processes 
of decay, which—if left unattended—cause macromolecules to run downhill 
toward greater and greater disorder. If the rates of desirable reactions were not 
greater than the rates of competing side reactions, a cell would soon die. We can 
get some idea of the rate at which cell metabolism proceeds by measuring the 
rate of ATP utilization. A typical mammalian cell “turns over” (i.e., hydrolyzes and 
restores by phosphorylation) its entire ATP pool once every 1 or 2 minutes. For 
each cell, this turnover represents the utilization of roughly 10⁷ molecules of ATP 
per second (or, for the human body, about 1 gram of ATP every minute). 

The rates of reactions in cells are rapid because enzyme catalysis is so effective. 
Some enzymes have become so efficient that there is no possibility of further use- 
ful improvement. The factor that limits the reaction rate is no longer the enzyme’s 
intrinsic speed of action; rather, it is the frequency with which the enzyme collides 
with its substrate. Such a reaction is said to be diffusion-limited (see Panel 3-2, 
pp. 142-143). 

The amount of product produced by an enzyme will depend on the concen- 
tration of both the enzyme and its substrate. If a sequence of reactions is to occur 
extremely rapidly, each metabolic intermediate and enzyme involved must be 
present in high concentration. However, given the enormous number of different 
reactions performed by a cell, there are limits to the concentrations that can be 
achieved. In fact, most metabolites are present in micromolar (10⁻⁶ M) concentrations, 
and most enzyme concentrations are much lower. How is it possible, therefore, 
to maintain very fast metabolic rates? 

The answer lies in the spatial organization of cell components. The cell can 
increase reaction rates without raising substrate concentrations by bringing the 
various enzymes involved in a reaction sequence together to form a large protein 
assembly known as a multienzyme complex (Figure 3-54). Because this assembly 
is organized in a way that allows the product of enzyme A to be passed directly 
to enzyme B, and so on, diffusion rates need not be limiting, even when the con- 
centrations of the substrates in the cell as a whole are very low. It is perhaps not 
surprising, therefore, that such enzyme complexes are very common, and they 
are involved in nearly all aspects of metabolism—including the central genetic 
processes of DNA, RNA, and protein synthesis. In fact, few enzymes in eukaryotic 
cells diffuse freely in solution; instead, most seem to have evolved binding sites 
that concentrate them with other proteins of related function in particular regions 
of the cell, thereby increasing the rate and efficiency of the reactions that they 
catalyze (see p. 331). 

Eukaryotic cells have yet another way of increasing the rate of metabolic reac- 
tions: using their intracellular membrane systems. These membranes can segre- 
gate particular substrates and the enzymes that act on them into the same mem- 
brane-enclosed compartment, such as the endoplasmic reticulum or the cell 
nucleus. If, for example, a compartment occupies a total of 10% of the volume of 

Figure 3-54 How unstructured regions of polypeptide chain serving as tethers allow reaction intermediates to be 
passed from one active site to another in large multienzyme complexes. (A-C) The fatty acid synthase in mammals. (A) The 
location of seven protein domains with different activities in this 270 kilodalton protein. The numbers refer to the order in which 
each enzyme domain must function to complete each two-carbon addition step. After multiple cycles of two-carbon addition, 
the termination domain releases the final product once the desired length of fatty acid has been synthesized. (B) The structure 
of the dimeric enzyme, with the location of the five active sites in one monomer indicated. (C) How a flexible tether allows the 
substrate that remains linked to the acyl carrier domain (red) to be passed from one active site to another in each monomer, 
sequentially elongating and modifying the bound fatty acid intermediate (yellow). The five steps are repeated until the final length 
of fatty acid chain has been synthesized. (Only steps 1 through 4 are illustrated here.) 
(D) Multiple tethered subunits in the giant pyruvate dehydrogenase complex (9500 kilodaltons, larger than a ribosome) that 
catalyzes the conversion of pyruvate to acetyl CoA. As in (C), a covalently bound substrate held on a flexible tether (red balls 
with yellow substrate) is serially passed through active sites on subunits (here labeled 1 through 3) to produce the final products. 
Here, subunit 1 catalyzes the decarboxylation of pyruvate accompanied by the reductive acetylation of a lipoyl group linked to 
one of the red balls. Subunit 2 transfers this acetyl group to CoA, forming acetyl CoA, and subunit 3 reoxidizes the lipoyl group 
to prepare it for the next cycle. Only one-tenth of the subunits labeled 1 and 3, attached to the core formed by subunit 2, are 
illustrated here. This important reaction takes place in the mammalian mitochondrion, as part of the pathway that oxidizes 
sugars to CO2 and H2O (see page 82). (A-C, adapted from T. Maier et al., Quart. Rev. Biophys. 43:373-422, 2010; 
D, from J.L.S. Milne et al., J. Biol. Chem. 281:4364-4370, 2006.) 


the cell, the concentration of reactants in that compartment may be increased by 
10 times compared with a cell with the same number of enzyme and substrate 
molecules, but no compartmentalization. Reactions limited by the speed of diffu- 
sion can thereby be speeded up by a factor of 10. 
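
The arithmetic behind this estimate is simple enough to write down. The sketch below assumes, as the text does, that the same number of molecules is confined to a fraction of the cell volume, and that the rate at which a diffusion-limited enzyme meets its substrate is proportional to the substrate concentration. The function name is illustrative.

```python
def concentration_fold_increase(volume_fraction):
    """Fold increase in concentration when a fixed number of molecules is
    confined to a compartment occupying volume_fraction of the cell volume."""
    if not 0.0 < volume_fraction <= 1.0:
        raise ValueError("volume_fraction must be greater than 0 and at most 1")
    return 1.0 / volume_fraction


if __name__ == "__main__":
    # A compartment that fills 10% of the cell, as in the example in the text.
    fold = concentration_fold_increase(0.10)
    print(f"substrate concentration rises {fold:.0f}-fold")
    # Assumption: a diffusion-limited enzyme encounters substrate at a rate
    # proportional to substrate concentration, so it speeds up by the same factor.
    print(f"diffusion-limited rate per enzyme rises {fold:.0f}-fold")
```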


The Cell Regulates the Catalytic Activities of Its Enzymes 


A living cell contains thousands of enzymes, many of which operate at the same 
time and in the same small volume of the cytosol. By their catalytic action, these 
enzymes generate a complex web of metabolic pathways, each composed of 
chains of chemical reactions in which the product of one enzyme becomes the 
substrate of the next. In this maze of pathways, there are many branch points 
(nodes) where different enzymes compete for the same substrate. The system is 
complex (see Figure 2-63), and elaborate controls are required to regulate when 
and how rapidly each reaction occurs. 

Regulation occurs at many levels. At one level, the cell controls how many 
molecules of each enzyme it makes by regulating the expression of the gene that 
encodes that enzyme (discussed in Chapter 7). The cell also controls enzymatic 
activities by confining sets of enzymes to particular subcellular compartments, 
whether by enclosing them in a distinct membrane-bounded compartment (dis- 
cussed in Chapters 12 and 14) or by concentrating them on a protein scaffold (see 
Figure 3-77). As will be explained later in this chapter, enzymes are also cova- 
lently modified to control their activity. The rate of protein destruction by targeted 
proteolysis represents yet another important regulatory mechanism (see Figure 
6-86). But the most general process that adjusts reaction rates operates through 
a direct, reversible change in the activity of an enzyme in response to the specific 
small molecules that it binds. 

The most common type of control occurs when an enzyme binds a molecule 
that is not a substrate to a special regulatory site outside the active site, thereby 
altering the rate at which the enzyme converts its substrates to products. For exam- 
ple, in feedback inhibition, a product produced late in a reaction pathway inhib- 
its an enzyme that acts earlier in the pathway. Thus, whenever large quantities of 
the final product begin to accumulate, this product binds to the enzyme and slows 
down its catalytic action, thereby limiting the further entry of substrates into that 
reaction pathway (Figure 3-55). Where pathways branch or intersect, there are 
usually multiple points of control by different final products, each of which works 
to regulate its own synthesis (Figure 3-56). Feedback inhibition can work almost 
instantaneously, and it is rapidly reversed when the level of the product falls. 
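
A small numerical experiment shows why feedback inhibition holds the level of an end product steady. The model below is a deliberately simplified sketch, not a description of any real pathway: the whole pathway is collapsed into a single synthesis step with maximum rate v_max, the end product Z inhibits that step by an assumed factor 1/(1 + Z/K_i), and the cell consumes Z at a rate proportional to its concentration. All names and parameter values are hypothetical.

```python
def simulate_end_product(v_max, k_use, k_i=None, dt=0.01, t_end=200.0):
    """Return the level of end product Z after simple Euler integration of

        dZ/dt = v_max * inhibition(Z) - k_use * Z

    where inhibition(Z) = 1 / (1 + Z / k_i) when feedback is present
    and 1 when k_i is None (no feedback).
    """
    z = 0.0
    for _ in range(int(t_end / dt)):
        inhibition = 1.0 if k_i is None else 1.0 / (1.0 + z / k_i)
        z += dt * (v_max * inhibition - k_use * z)
    return z


if __name__ == "__main__":
    for k_use in (1.0, 0.1):  # demand for Z falls tenfold
        without = simulate_end_product(10.0, k_use)
        with_fb = simulate_end_product(10.0, k_use, k_i=1.0)
        print(f"k_use={k_use}: Z={without:.1f} without feedback, "
              f"Z={with_fb:.1f} with feedback")
```

With these illustrative numbers, a tenfold drop in demand lets Z climb from 10 to 100 when there is no feedback, but only from about 2.7 to about 9.5 when the product inhibits its own synthesis.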

Figure 3-55 Feedback inhibition of a 
single biosynthetic pathway. The end 
product Z inhibits the first enzyme that is 
unique to its synthesis and thereby controls 
its own level in the cell. This is an example 
of negative regulation. 

Figure 3-56 Multiple feedback inhibition. 
In this example, which shows the 
biosynthetic pathways for four different 
amino acids in bacteria, the red lines 
indicate positions at which products feed 
back to inhibit enzymes. Each amino acid 
controls the first enzyme specific to its 
own synthesis, thereby controlling its own 
levels and avoiding a wasteful, or even 
dangerous, buildup of intermediates. The 
products can also separately inhibit the 
initial set of reactions common to all the 
syntheses; in this case, three different 
enzymes catalyze the initial reaction, each 
inhibited by a different product. 

Feedback inhibition is negative regulation: it prevents an enzyme from acting. 
Enzymes can also be subject to positive regulation, in which a regulatory molecule 
stimulates the enzyme’s activity rather than shutting the enzyme down. Positive 
regulation occurs when a product in one branch of the metabolic network stimu- 
lates the activity of an enzyme in another pathway. As one example, the accumu- 
lation of ADP activates several enzymes involved in the oxidation of sugar mole- 
cules, thereby stimulating the cell to convert more ADP to ATP. 


Allosteric Enzymes Have Two or More Binding Sites That Interact 


A striking feature of both positive and negative feedback regulation is that the reg- 
ulatory molecule often has a shape totally different from the shape of the substrate 
of the enzyme. This is why the effect on a protein is termed allostery (from the 
Greek words allos, meaning “other,” and stereos, meaning “solid” or “three-dimensional”). 
As biologists learned more about feedback regulation, they recognized 
that the enzymes involved must have at least two different binding sites on their 
surface—an active site that recognizes the substrates, and a regulatory site that 
recognizes a regulatory molecule. These two sites must somehow communicate 
so that the catalytic events at the active site can be influenced by the binding of the 
regulatory molecule at its separate site on the protein’s surface. 

The interaction between separated sites on a protein molecule is now known 
to depend on a conformational change in the protein: binding at one of the sites 
causes a shift from one folded shape to a slightly different folded shape. During 
feedback inhibition, for example, the binding of an inhibitor at one site on the 
protein causes the protein to shift to a conformation that incapacitates its active 
site located elsewhere in the protein. 

It is thought that most protein molecules are allosteric. They can adopt two or 
more slightly different conformations, and a shift from one to another caused by 
the binding of a ligand can alter their activity. This is true not only for enzymes 
but also for many other proteins, including receptors, structural proteins, and 
motor proteins. In all instances of allosteric regulation, each conformation of the 
protein has somewhat different surface contours, and the protein’s binding sites 
for ligands are altered when the protein changes shape. Moreover, as we discuss 
next, each ligand will stabilize the conformation that it binds to most strongly, and 
thus—at high enough concentrations—will tend to “switch” the protein toward 
the conformation that the ligand prefers. 


Two Ligands Whose Binding Sites Are Coupled Must Reciprocally 
Affect Each Other's Binding 


The effects of ligand binding on a protein follow from a fundamental chemical 
principle known as linkage. Suppose, for example, that a protein that binds glu- 
cose also binds another molecule, X, at a distant site on the protein’s surface. If 
the binding site for X changes shape as part of the conformational change in the 
protein induced by glucose binding, the binding sites for X and for glucose are 
said to be coupled. Whenever two ligands prefer to bind to the same conformation 
of an allosteric protein, it follows from basic thermodynamic principles that each 
ligand must increase the affinity of the protein for the other. For example, if the 
shift of a protein to a conformation that binds glucose best also causes the binding 
site for X to fit X better, then the protein will bind glucose more tightly when X is 
present than when X is absent. In other words, X will positively regulate the pro- 
tein’s binding of glucose (Figure 3-57). 

Conversely, linkage operates in a negative way if two ligands prefer to bind 
to different conformations of the same protein. In this case, the binding of the 
first ligand discourages the binding of the second ligand. Thus, if a shape change 
caused by glucose binding decreases the affinity of a protein for molecule X, the 
binding of X must also decrease the protein’s affinity for glucose (Figure 3-58). 
The linkage relationship is quantitatively reciprocal, so that, for example, if glu- 
cose has a very large effect on the binding of X, X has a very large effect on the 
binding of glucose. 
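
Linkage can be checked numerically with a minimal two-state model. The sketch below assumes a protein that exists in only two conformations, open and closed, with an equilibrium ratio [closed]/[open] in the absence of ligands; each ligand has its own dissociation constant for each conformation. The model, the parameter values, and the function names are all illustrative assumptions rather than measurements.

```python
def apparent_kd(l_closed, kd_open, kd_closed,
                other_conc=0.0, other_kd_open=1.0, other_kd_closed=1.0):
    """Apparent dissociation constant of one ligand for a two-state protein.

    l_closed is [closed]/[open] with no ligand bound. The second ligand, at
    concentration other_conc, reweights the two conformations according to
    how tightly it binds each of them.
    """
    w_open = 1.0 + other_conc / other_kd_open
    w_closed = l_closed * (1.0 + other_conc / other_kd_closed)
    return (w_open + w_closed) / (w_open / kd_open + w_closed / kd_closed)


def coupling_factor(l_closed, a_open, a_closed, b_open, b_closed, saturating=1e9):
    """Fold tightening of ligand A's binding when ligand B goes from absent
    to (effectively) saturating."""
    alone = apparent_kd(l_closed, a_open, a_closed)
    with_b = apparent_kd(l_closed, a_open, a_closed, saturating, b_open, b_closed)
    return alone / with_b


if __name__ == "__main__":
    L = 0.01                       # protein is mostly open when empty
    glucose = (100.0, 1.0)         # (Kd open, Kd closed): prefers closed
    x_same = (50.0, 5.0)           # X also prefers closed
    x_opposite = (5.0, 50.0)       # X prefers open
    print("glucose alone:        ", apparent_kd(L, *glucose))
    print("with X (same pref.):  ", apparent_kd(L, *glucose, 50.0, *x_same))
    print("with X (opposite):    ", apparent_kd(L, *glucose, 50.0, *x_opposite))
    print("X tightens glucose by ", coupling_factor(L, *glucose, *x_same))
    print("glucose tightens X by ", coupling_factor(L, *x_same, *glucose))
```

In this sketch the apparent dissociation constant for glucose falls from 50.5 to about 16 when X prefers the same conformation, and rises to about 85 when X prefers the opposite one. The last two lines print the same factor, which is the quantitative reciprocity described in the text.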

The relationships shown in Figures 3-57 and 3-58 apply to all proteins, and 
they underlie all of cell biology. The principle seems so obvious in retrospect 
that we now take it for granted. But the discovery of linkage in studies of a few 
enzymes in the 1950s, followed by an extensive analysis of allosteric mechanisms 
in proteins in the early 1960s, had a revolutionary effect on our understanding of 
biology. Since molecule X in these examples binds at a site on the enzyme that 
is distinct from the site where catalysis occurs, it need not have any chemical 
relationship to the substrate that binds at the active site. Moreover, as we have 
just seen, for enzymes that are regulated in this way, molecule X can either turn 
the enzyme on (positive regulation) or turn it off (negative regulation). By such a 
mechanism, allosteric proteins serve as general switches that, in principle, can 
allow one molecule in a cell to affect the fate of any other. 


Symmetric Protein Assemblies Produce Cooperative Allosteric 
Transitions 


A single-subunit enzyme that is regulated by negative feedback can at most 
decrease from 90% to about 10% activity in response to a 100-fold increase in 
the concentration of an inhibitory ligand that it binds (Figure 3-59, red line). 
Responses of this type are apparently not sharp enough for optimal cell regulation, 
and most enzymes that are turned on or off by ligand binding consist of symmet- 
ric assemblies of identical subunits. With this arrangement, the binding of a mol- 
ecule of ligand to a single site on one subunit can promote an allosteric change in 
the entire assembly that helps the neighboring subunits bind the same ligand. As 
a result, a cooperative allosteric transition occurs (Figure 3-59, blue line), allowing 

Figure 3-57 Positive regulation caused 
by conformational coupling between 
two separate binding sites. In this 
example, both glucose and molecule X 
bind best to the closed conformation of a 
protein with two domains. Because both 
glucose and molecule X drive the protein 
toward its closed conformation, each 
ligand helps the other to bind. Glucose 
and molecule X are therefore said to bind 
cooperatively to the protein. 


Figure 3-58 Negative regulation caused 
by conformational coupling between 
two separate binding sites. The scheme 
here resembles that in the previous figure, 
but here molecule X prefers the open 
conformation, while glucose prefers the 
closed conformation. Because glucose 
and molecule X drive the protein toward 
opposite conformations (closed and open, 
respectively), the presence of either ligand 
interferes with the binding of the other. 

a relatively small change in ligand concentration in the cell to switch the whole 
assembly from an almost fully active to an almost fully inactive conformation (or 
vice versa). 
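
The difference in steepness between a single-subunit enzyme and a cooperative assembly can be estimated with a short calculation. The sketch below assumes the idealized case in which all n inhibitor molecules bind together, so that activity follows 1/(1 + ([I]/K)^n); for n = 1 this reduces to simple equilibrium binding. This all-or-none form is an assumption made for illustration, and the function names are hypothetical.

```python
def activity(inhibitor, k_half=1.0, n_sites=1):
    """Fraction of enzyme still active at a given inhibitor concentration,
    assuming idealized all-or-none binding of n_sites inhibitor molecules."""
    return 1.0 / (1.0 + (inhibitor / k_half) ** n_sites)


def fold_increase_90_to_10(n_sites):
    """Fold rise in inhibitor concentration needed to take activity
    from 90% down to 10%."""
    # Solving activity = a for the inhibitor gives k_half * ((1 - a) / a) ** (1 / n).
    conc_at_90 = (0.1 / 0.9) ** (1.0 / n_sites)
    conc_at_10 = (0.9 / 0.1) ** (1.0 / n_sites)
    return conc_at_10 / conc_at_90


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(f"{n} subunit(s): {fold_increase_90_to_10(n):.0f}-fold "
              f"increase in inhibitor switches activity from 90% to 10%")
```

Under these assumptions a single-site enzyme needs an 81-fold increase in inhibitor, in line with the roughly 100-fold figure quoted above, whereas idealized two-subunit and four-subunit enzymes need only 9-fold and 3-fold increases.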

The principles involved in a cooperative “all-or-none” transition are the same 
for all proteins, whether or not they are enzymes. Thus, for example, they are crit- 
ical for the efficient uptake and release of O2 by hemoglobin in our blood. But 
they are perhaps easiest to visualize for an enzyme that forms a symmetric dimer. 
In the example shown in Figure 3-60, the first molecule of an inhibitory ligand 
binds with great difficulty since its binding disrupts an energetically favorable 
interaction between the two identical monomers in the dimer. A second molecule 
of inhibitory ligand now binds more easily, however, because its binding restores 
the energetically favorable monomer-monomer contacts of a symmetric dimer 
(this also completely inactivates the enzyme). 

As an alternative to this induced fit model for a cooperative allosteric transi- 
tion, we can view such a symmetric enzyme as having only two possible confor- 
mations, corresponding to the “enzyme on” and “enzyme off” structures in Figure 
3-60. In this view, ligand binding perturbs an all-or-none equilibrium between 
these two states, thereby changing the proportion of active molecules. Both mod- 
els represent true and useful concepts. 


Many Changes in Proteins Are Driven by Protein Phosphorylation 


Proteins are regulated by more than the reversible binding of other molecules. A 
second method that eukaryotic cells use extensively to regulate a protein’s function 
is the covalent addition of a smaller molecule to one or more of its amino acid 
side chains. The most common such regulatory modification in higher eukaryotes 
is the addition of a phosphate group. We shall therefore use protein phosphoryla- 
tion to illustrate some of the general principles involved in the control of protein 
function through the modification of amino acid side chains. 

A phosphorylation event can affect the protein that is modified in three 
important ways. First, because each phosphate group carries two negative 
charges, the enzyme-catalyzed addition of a phosphate group to a protein can 
cause a major conformational change in the protein by, for example, attracting a 
cluster of positively charged amino acid side chains. This can, in turn, affect the 
binding of ligands elsewhere on the protein surface, dramatically changing the 


Figure 3-60 A cooperative allosteric transition in an enzyme composed of 
two identical subunits. This diagram illustrates how the conformation of one 
subunit can influence that of its neighbor. The binding of a single molecule of 
an inhibitory ligand (yellow) to one subunit of the enzyme occurs with difficulty 
because it changes the conformation of this subunit and thereby disrupts the 
symmetry of the enzyme. Once this conformational change has occurred, 
however, the energy gained by restoring the symmetric pairing interaction 
between the two subunits makes it especially easy for the second subunit 
to bind the inhibitory ligand and undergo the same conformational change. 
Because the binding of the first molecule of ligand increases the affinity with 
which the other subunit binds the same ligand, the response of the enzyme to 
changes in the concentration of the ligand is much steeper than the response 
of an enzyme with only one subunit (see Figure 3-59 and Movie 3.10). 
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Figure 3-59 Enzyme activity versus 

the concentration of inhibitory ligand 
for single-subunit and multisubunit 
allosteric enzymes. For an enzyme with a 
single subunit (red line), a drop from 90% 
enzyme activity to 10% activity (indicated 
by the two dots on the curve) requires a 
100-fold increase in the concentration of 
inhibitor. The enzyme activity is calculated 
from the simple equilibrium relationship 

K = [IPJ/[I|[P], where P is active protein, 

| is inhibitor, and IP is the inactive protein 
bound to inhibitor. An identical curve 
applies to any simple binding interaction 
between two molecules, A and B. In 
contrast, a multisubunit allosteric enzyme 
can respond in a switchlike manner to a 
change in ligand concentration: the steep 
response is caused by a cooperative 
binding of the ligand molecules, as 
explained in Figure 3-60. Here, the green 
line represents the idealized result expected 
for the cooperative binding of two inhibitory 
ligand molecules to an allosteric enzyme 
with two subunits, and the blue line shows 
the idealized response of an enzyme with 
four subunits. As indicated by the two dots 
on each of these curves, the more complex 
enzymes drop from 90% to 10% activity 
over a much narrower range of inhibitor 
concentration than does the enzyme 
composed of a single subunit. 
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protein’s activity. When a second enzyme removes the phosphate group, the pro- 
tein returns to its original conformation and restores its initial activity. 
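
The 90%-to-10% comparison in the caption of Figure 3-59 can be checked numerically. The following is a minimal sketch, not the calculation used for the figure: it assumes an idealized all-or-none (Hill-type) model in which the active fraction is 1 / (1 + (K[I])^n), where n is the number of cooperatively binding subunits. The function names and the choice K = 1 are illustrative assumptions.

```python
def active_fraction(inhibitor, n=1, K=1.0):
    """Fraction of enzyme still active at a given inhibitor concentration.

    Assumes idealized all-or-none binding of n inhibitor molecules
    (a Hill-type model); for n = 1 this is K = [IP]/([I][P]) rearranged.
    """
    return 1.0 / (1.0 + (K * inhibitor) ** n)


def inhibitor_for_activity(activity, n=1, K=1.0):
    """Inhibitor concentration at which the active fraction equals `activity`."""
    return ((1.0 - activity) / activity) ** (1.0 / n) / K


def fold_change_90_to_10(n):
    """Fold increase in inhibitor needed to drop from 90% to 10% activity."""
    return inhibitor_for_activity(0.10, n) / inhibitor_for_activity(0.90, n)


# Compare one, two, and four cooperatively binding subunits.
for subunits in (1, 2, 4):
    print(subunits, round(fold_change_90_to_10(subunits), 2))
```

Under these assumptions the single-subunit enzyme needs an 81-fold increase in inhibitor (the roughly 100-fold range quoted in the caption), whereas the idealized two- and four-subunit enzymes need only 9-fold and 3-fold increases.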

Second, an attached phosphate group can form part of a structure that the 
binding sites of other proteins recognize. As previously discussed, the SH2 domain 
binds to a short peptide sequence containing a phosphorylated tyrosine side 
chain (see Figure 3-40B). More than ten other common domains provide binding 
sites for attaching their protein to phosphorylated peptides in other protein mol- 
ecules, each recognizing a phosphorylated amino acid side chain in a different 
protein context. Third, the addition of a phosphate group can mask a binding site 
that otherwise holds two proteins together, and thereby disrupt protein-protein 
interactions. As a result, protein phosphorylation and dephosphorylation very 
often drive the regulated assembly and disassembly of protein complexes (see, for 
example, Figure 15-11). 

Reversible protein phosphorylation controls the activity, structure, and cellu- 
lar localization of both enzymes and many other types of proteins in eukaryotic 
cells. In fact, this regulation is so extensive that more than one-third of the 10,000 
or so proteins in a typical mammalian cell are thought to be phosphorylated at
any given time—many with more than one phosphate. As might be expected, the 
addition and removal of phosphate groups from specific proteins often occur in 
response to signals that specify some change in a cell’s state. For example, the 
complicated series of events that takes place as a eukaryotic cell divides is largely 
timed in this way (discussed in Chapter 17), and many of the signals mediating 
cell-cell interactions are relayed from the plasma membrane to the nucleus by a 
cascade of protein phosphorylation events (discussed in Chapter 15). 


A Eukaryotic Cell Contains a Large Collection of Protein Kinases 
and Protein Phosphatases 


Protein phosphorylation involves the enzyme-catalyzed transfer of the terminal 
phosphate group of an ATP molecule to the hydroxyl group on a serine, threonine,
or tyrosine side chain of the protein (Figure 3-61). A protein kinase catalyzes 
this reaction, and the reaction is essentially unidirectional because of the large 
amount of free energy released when the phosphate-phosphate bond in ATP is 
broken to produce ADP (discussed in Chapter 2). A protein phosphatase cata- 
lyzes the reverse reaction of phosphate removal, or dephosphorylation. Cells con- 
tain hundreds of different protein kinases, each responsible for phosphorylating a 
different protein or set of proteins. There are also many different protein phospha- 
tases; some are highly specific and remove phosphate groups from only one or a 
few proteins, whereas others act on a broad range of proteins and are targeted to 
specific substrates by regulatory subunits. The state of phosphorylation of a pro- 
tein at any moment, and thus its activity, depends on the relative activities of the 
protein kinases and phosphatases that modify it. 

The protein kinases that phosphorylate proteins in eukaryotic cells belong to a 
very large family of enzymes that share a catalytic (kinase) sequence of about 290 
amino acids. The various family members contain different amino acid sequences 
on either end of the kinase sequence (for example, see Figure 3-10), and often 
have short amino acid sequences inserted into loops within it. Some of these 
additional amino acid sequences enable each kinase to recognize the specific set 
of proteins it phosphorylates, or to bind to structures that localize it in specific 
regions of the cell. Other parts of the protein regulate the activity of each kinase, 
so it can be turned on and off in response to different specific signals, as described
below. 

By comparing the number of amino acid sequence differences between the 
various members of a protein family, we can construct an “evolutionary tree” that 
is thought to reflect the pattern of gene duplication and divergence that gave rise 
to the family. Figure 3-62 shows an evolutionary tree of protein kinases. Kinases 
with related functions are often located on nearby branches of the tree: the pro- 
tein kinases involved in cell signaling that phosphorylate tyrosine side chains, for 
example, are all clustered in the top left corner of the tree. The other kinases shown
phosphorylate either a serine or a threonine side chain, and many are organized
into clusters that seem to reflect their function: in transmembrane signal
transduction, intracellular signal amplification, cell-cycle control, and so on.

Figure 3-61 Protein phosphorylation. Many thousands of proteins in a typical
eukaryotic cell are modified by the covalent addition of a phosphate group.
(A) The general reaction transfers a phosphate group from ATP to an
amino acid side chain of the target protein by a protein kinase. Removal of the
phosphate group is catalyzed by a second enzyme, a protein phosphatase. In this
example, the phosphate is added to a serine side chain; in other cases, the phosphate
is instead linked to the -OH group of a threonine or a tyrosine in the protein.
(B) The phosphorylation of a protein by a protein kinase can either increase
or decrease the protein’s activity, depending on the site of phosphorylation and the
structure of the protein.
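
The tree-building idea described above starts from counts of sequence differences between family members. The sketch below illustrates only that first step, under simplifying assumptions: the sequences are already aligned and of equal length, and the three fragments are made-up toy strings, not real kinase sequences. The function names are hypothetical.

```python
from itertools import combinations


def sequence_differences(seq_a, seq_b):
    """Count positions at which two pre-aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def closest_pair(aligned):
    """Return the pair of names whose sequences differ at the fewest positions."""
    return min(
        combinations(sorted(aligned), 2),
        key=lambda pair: sequence_differences(aligned[pair[0]], aligned[pair[1]]),
    )


# Made-up toy fragments for illustration, NOT real kinase sequences.
toy_family = {
    "kinase_A": "GSGAFGKVYL",
    "kinase_B": "GSGAFGEVYL",
    "kinase_C": "GQGTYGSVWK",
}
print(closest_pair(toy_family))
```

Members with the fewest differences would be placed on neighboring branches; real tree construction uses proper alignments and more careful distance and clustering methods than this count.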

As a result of the combined activities of protein kinases and protein phos- 
phatases, the phosphate groups on proteins are continually turning over—being 
added and then rapidly removed. Such phosphorylation cycles may seem waste- 
ful, but they are important in allowing the phosphorylated proteins to switch rap- 
idly from one state to another: the more rapid the cycle, the faster a population of 
protein molecules can change its state of phosphorylation in response to a sud- 
den change in its phosphorylation rate (see Figure 15-14). The energy required to 
drive this phosphorylation cycle is derived from the free energy of ATP hydrolysis, 
one molecule of which is consumed for each phosphorylation event. 
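
The link between cycle speed and response time can be made concrete with a simple two-state model. This is a sketch under stated assumptions, not a model taken from the text: kinase and phosphatase are treated as first-order rates acting on a single site, so the phosphorylated fraction p obeys dp/dt = k_kinase(1 - p) - k_phosphatase p. All names and rate values are illustrative.

```python
import math


def steady_state(k_kinase, k_phosphatase):
    """Steady-state phosphorylated fraction for first-order rates."""
    return k_kinase / (k_kinase + k_phosphatase)


def phosphorylated_fraction(t, p0, k_kinase, k_phosphatase):
    """Phosphorylated fraction at time t, starting from fraction p0.

    Analytic solution of dp/dt = k_kinase * (1 - p) - k_phosphatase * p.
    """
    p_ss = steady_state(k_kinase, k_phosphatase)
    return p_ss + (p0 - p_ss) * math.exp(-(k_kinase + k_phosphatase) * t)


def half_time(k_kinase, k_phosphatase):
    """Time needed to move halfway from the old to the new steady state."""
    return math.log(2) / (k_kinase + k_phosphatase)


def atp_use_at_steady_state(k_kinase, k_phosphatase):
    """Phosphorylation events (ATP consumed) per protein per unit time."""
    return k_kinase * (1.0 - steady_state(k_kinase, k_phosphatase))


# Slow and fast cycles that reach the same new steady state (0.75)
# after the kinase rate suddenly triples (arbitrary time units).
for label, k_kin, k_pase in (("slow", 3.0, 1.0), ("fast", 30.0, 10.0)):
    print(label, steady_state(k_kin, k_pase), half_time(k_kin, k_pase),
          atp_use_at_steady_state(k_kin, k_pase))
```

In this toy model the fast cycle settles ten times sooner than the slow one but also consumes ten times more ATP at steady state, which is the trade-off the paragraph describes.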


The Regulation of the Src Protein Kinase Reveals How a Protein 
Can Function as a Microprocessor 


The hundreds of different protein kinases in a eukaryotic cell are organized into 
complex networks of signaling pathways that help to coordinate the cell’s activ- 
ities, drive the cell cycle, and relay signals into the cell from the cell’s environ- 
ment. Many of the extracellular signals involved need to be both integrated and 
amplified by the cell. Individual protein kinases (and other signaling proteins) 
serve as input-output devices, or “microprocessors,” in the integration process.
An important part of the input to these signal-processing proteins comes from the 
control that is exerted by phosphates added and removed from them by protein 
kinases and protein phosphatases, respectively. 

The Src family of protein kinases (see Figure 3-10) exhibits such behavior. The 
Src protein (pronounced “sarc” and named for the type of tumor, a sarcoma, that 
its deregulation can cause) was the first tyrosine kinase to be discovered. It is now 
known to be part of a subfamily of nine very similar protein kinases, which are 
found only in multicellular animals. As indicated by the evolutionary tree in Fig- 
ure 3-62, sequence comparisons suggest that tyrosine kinases as a group were 
a relatively late innovation that branched off from the serine/threonine kinases, 
with the Src subfamily being only one subgroup of the tyrosine kinases created in 
this way. 

The Src protein and its relatives contain a short N-terminal region that becomes 
covalently linked to a strongly hydrophobic fatty acid, which anchors the kinase at 
the cytoplasmic face of the plasma membrane. Next along the linear sequence of
amino acids come two peptide-binding domains, a Src homology 3 (SH3) domain
and an SH2 domain, followed by the kinase catalytic domain (Figure 3-63). These
kinases normally exist in an inactive conformation, in which a phosphorylated
tyrosine near the C-terminus is bound to the SH2 domain, and the SH3 domain
is bound to an internal peptide in a way that distorts the active site of the enzyme
and helps to render it inactive.

Figure 3-62 An evolutionary tree of selected protein kinases. A higher
eukaryotic cell contains hundreds of such enzymes, and the human genome codes
for more than 500. Note that only some of these, those discussed in this book, are
shown.

As shown in Figure 3-64, turning the kinase on involves at least two specific 
inputs: removal of the C-terminal phosphate and the binding of the SH3 domain 
by a specific activating protein. In this way, the activation of the Src kinase sig- 
nals the completion of a particular set of separate upstream events (Figure 3-65). 
Thus, the Src family of protein kinases serves as specific signal integrators, con- 
tributing to the web of information-processing events that enable the cell to com- 
pute useful responses to a complex set of different conditions. 
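
The signal-integrating behavior described above amounts to a logical AND of the two inputs named in the text. The toy model below is only an illustration of that logic: the class and attribute names are hypothetical, and it ignores any additional inputs (the text says "at least two") and any partial activation.

```python
from dataclasses import dataclass


@dataclass
class SrcKinase:
    """Toy two-input model of a Src-type kinase (illustrative only)."""

    # True while the SH2 domain is clamped onto the C-terminal phosphotyrosine.
    c_terminal_tyrosine_phosphorylated: bool = True
    # True once the SH3 domain has been captured by an activating protein.
    sh3_bound_to_activating_protein: bool = False

    def is_active(self):
        # Both inhibitory interactions must be released: a logical AND.
        return (not self.c_terminal_tyrosine_phosphorylated
                and self.sh3_bound_to_activating_protein)


# Enumerate all four input combinations; only one turns the kinase on.
for phosphorylated in (True, False):
    for sh3_engaged in (False, True):
        kinase = SrcKinase(phosphorylated, sh3_engaged)
        print(phosphorylated, sh3_engaged, kinase.is_active())
```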


Proteins That Bind and Hydrolyze GTP Are Ubiquitous Cell 
Regulators 


We have described how the addition or removal of phosphate groups on a protein 
can be used by a cell to control the protein’s activity. In the example just discussed, 
a kinase transfers a phosphate from an ATP molecule to an amino acid side chain 
of a target protein. Eukaryotic cells also have another way to control protein activ- 
ity by phosphate addition and removal. In this case, the phosphate is not attached 
directly to the protein; instead, it is a part of the guanine nucleotide GTP, which 
binds very tightly to a class of proteins known as GTP-binding proteins. In general, 
proteins regulated in this way are in their active conformations with GTP bound. 
The loss of a phosphate group occurs when the bound GTP is hydrolyzed to GDP 
in a reaction catalyzed by the protein itself, and in its GDP-bound state the protein 
is inactive. In this way, GTP-binding proteins act as on-off switches whose activity 
is determined by the presence or absence of an additional phosphate on a bound 
GDP molecule (Figure 3-66). 

GTP-binding proteins (also called GTPases because of the GTP hydrolysis 
they catalyze) comprise a large family of proteins that all contain variations on 
the same GTP-binding globular domain. When a tightly bound GTP is hydrolyzed 
by the GTP-binding protein to GDP, this domain undergoes a conformational
change that inactivates the protein. The three-dimensional structure of a prototypical
member of this family, the monomeric GTPase called Ras, is shown in Figure 3-67.

Figure 3-63 The domain structure of the Src family of protein kinases, mapped
along the amino acid sequence. For the three-dimensional structure of Src, see
Figure 3-13.

Figure 3-64 The activation of a Src-type protein kinase by two sequential events.
As described in the text, the requirement for multiple upstream events to trigger
these processes allows the kinase to serve as a signal integrator (Movie 3.11).
(Adapted from S.C. Harrison et al., Cell 112:737-740, 2003. With permission from
Elsevier.)

Figure 3-65 How a Src-type protein kinase acts as a signal-integrating device.

The Ras protein has an important role in cell signaling (discussed in Chapter 
15). In its GTP-bound form, it is active and stimulates a cascade of protein phos- 
phorylations in the cell. Most of the time, however, the protein is in its inactive, 
GDP-bound form. It becomes active when it exchanges its GDP for a GTP mole- 
cule in response to extracellular signals, such as growth factors, that bind to recep- 
tors in the plasma membrane (see Figure 15-47). 




















(Figure 3-65 panel text: “Src-type protein kinase activity turns on fully only if the answers to all of the ...”)


Regulatory Proteins GAP and GEF Control the Activity of GTP-Binding Proteins


GTP-binding proteins are controlled by regulatory proteins that determine
whether GTP or GDP is bound, just as phosphorylated proteins are turned on 
and off by protein kinases and protein phosphatases. Thus, Ras is inactivated by a 
GTPase-activating protein (GAP), which binds to the Ras protein and induces Ras 
to hydrolyze its bound GTP molecule to GDP—which remains tightly bound— 
and inorganic phosphate (Pi), which is rapidly released. The Ras protein stays in
its inactive, GDP-bound conformation until it encounters a guanine nucleotide 
exchange factor (GEF), which binds to GDP-Ras and causes Ras to release its GDP. 
Because the empty nucleotide-binding site is immediately filled by a GTP mol- 
ecule (GTP is present in large excess over GDP in cells), the GEF activates Ras 
by indirectly adding back the phosphate removed by GTP hydrolysis. Thus, in a 
sense, the roles of GAP and GEF are analogous to those of a protein phosphatase 
and a protein kinase, respectively (Figure 3-68). 
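
The GAP/GEF cycle just described can be summarized as a two-state switch. The class below is a hypothetical teaching sketch, not a kinetic model: it treats each encounter with a GEF or a GAP as an instantaneous, certain event, and the method names are invented for illustration.

```python
class GTPase:
    """Toy model of a monomeric GTPase such as Ras (illustrative only)."""

    def __init__(self):
        # Most of the time the protein is in its inactive, GDP-bound form.
        self.nucleotide = "GDP"

    @property
    def active(self):
        return self.nucleotide == "GTP"

    def meet_gef(self):
        # A GEF makes the protein release GDP; GTP, which is in excess
        # in the cell, immediately fills the empty binding site.
        if self.nucleotide == "GDP":
            self.nucleotide = "GTP"

    def meet_gap(self):
        # A GAP induces hydrolysis of the bound GTP: GDP stays bound
        # and inorganic phosphate is released.
        if self.nucleotide == "GTP":
            self.nucleotide = "GDP"


ras = GTPase()
ras.meet_gef()
print(ras.nucleotide, ras.active)
ras.meet_gap()
print(ras.nucleotide, ras.active)
```

Here `meet_gef` plays the role that a protein kinase plays for a phosphorylated protein, and `meet_gap` the role of a protein phosphatase.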


Proteins Can Be Regulated by the Covalent Addition of Other 
Proteins 


Cells contain a special family of small proteins whose members are covalently 
attached to many other proteins to determine the activity or fate of the second 
protein. In each case, the carboxyl end of the small protein becomes linked to 
the amino group of a lysine side chain of a “target” protein through an isopeptide 
bond. The first such protein discovered, and the most abundantly used, is ubiq- 
uitin (Figure 3-69A). Ubiquitin can be covalently attached to target proteins in a 
variety of ways, each of which has a different meaning for cells. The major form of 
ubiquitin addition produces polyubiquitin chains in which—once the first ubiq- 
uitin molecule is attached to the target—each subsequent ubiquitin molecule 
links to Lys48 of the previous ubiquitin, creating a chain of Lys48-linked ubiqui- 
tins that are attached to a single lysine side chain of the target protein. This form 
of polyubiquitin directs the target protein to the interior of a proteasome, where it
is digested to small peptides (see Figure 6-84). In other circumstances, only single
molecules of ubiquitin are added to proteins. In addition, some target proteins are
modified with a different type of polyubiquitin chain. These modifications have
different functional consequences for the protein that is targeted (Figure 3-69B).

Figure 3-66 GTP-binding proteins as molecular switches. The activity
of a GTP-binding protein (also called a GTPase) generally requires the presence
of a tightly bound GTP molecule (switch “on”). Hydrolysis of this GTP molecule
by the GTP-binding protein produces GDP and inorganic phosphate (Pi), and it
causes the protein to convert to a different, usually inactive, conformation (switch “off”).
Resetting the switch requires that the tightly bound GDP dissociates. This is a slow
step that is greatly accelerated by specific signals; once the GDP has dissociated, a
molecule of GTP is quickly rebound.
[Figure labels: ON (active, GTP bound); OFF (inactive, GDP bound); GTP hydrolysis; slow; fast.]

Related structures are created when a different member of the ubiquitin fam- 
ily, such as SUMO (small ubiquitin-related modifier), is covalently attached to a 
lysine side chain of target proteins. Not surprisingly, all such modifications are 
reversible. Cells contain sets of ubiquitylating and deubiquitylating (and sumoy- 
lating and desumoylating) enzymes that manipulate these covalent adducts, 
thereby playing roles analogous to the protein kinases and phosphatases that add 
and remove phosphates from protein side chains. 


An Elaborate Ubiquitin-Conjugating System Is Used to Mark 
Proteins 


How do cells select target proteins for ubiquitin addition? As an initial step, the 
carboxyl end of ubiquitin needs to be activated. This activation is accomplished 
when a protein called a ubiquitin-activating enzyme (E1) uses ATP hydrolysis 
energy to attach ubiquitin to itself through a high-energy covalent bond (a thioester).
E1 then passes this activated ubiquitin to one of a set of ubiquitin-conjugating
(E2) enzymes, each of which acts in conjunction with a set of accessory
(E3) proteins called ubiquitin ligases. There are roughly 30 structurally similar 
but distinct E2 enzymes in mammals, and hundreds of different E3 proteins that 
form complexes with specific E2 enzymes. 

Figure 3-70 illustrates the process used to mark proteins for proteasomal 
degradation. [Similar mechanisms are used to attach ubiquitin (and SUMO) to 
other types of target proteins.] Here, the ubiquitin ligase binds to specific degradation
signals, called degrons, in protein substrates, thereby helping E2 to form a
polyubiquitin chain linked to a lysine of the substrate protein. This polyubiqui- 
tin chain on a target protein will then be recognized by a specific receptor in the 
proteasome, causing the target protein to be destroyed. Distinct ubiquitin ligases 
recognize different degradation signals, thereby targeting distinct subsets of 
intracellular proteins for destruction, often in response to specific signals (see
Figure 6-86).

Figure 3-68 A comparison of two major intracellular signaling mechanisms in eukaryotic
cells. In both cases, a signaling protein is activated by the addition of a phosphate group and
inactivated by the removal of this phosphate. Note that the addition of a phosphate to a protein
can also be inhibitory. (Adapted from E.R. Kantrowitz and W.N. Lipscomb, Trends Biochem. Sci.
15:53-59, 1990.)

Figure 3-67 The structure of the Ras protein in its GTP-bound form. This
monomeric GTPase illustrates the structure of a GTP-binding domain, which is present
in a large family of GTP-binding proteins. The red regions change their conformation
when the GTP molecule is hydrolyzed to GDP and inorganic phosphate by the
protein; the GDP remains bound to the protein, while the inorganic phosphate is
released. The special role of the “switch helix” in proteins related to Ras is
explained in the text (see Figure 3-72 and Movie 15.7).


Protein Complexes with Interchangeable Parts Make Efficient Use
of Genetic Information 


The SCF ubiquitin ligase is a protein complex that binds different “target proteins” 
at different times in the cell cycle, covalently adding polyubiquitin polypeptide 
chains to these targets. Its C-shaped structure is formed from five protein sub- 
units, the largest of which serves as a scaffold on which the rest of the complex 
is built. The structure underlies a remarkable mechanism (Figure 3-71). At one 
end of the C is an E2 ubiquitin-conjugating enzyme. At the other end is a sub- 
strate-binding arm, a subunit known as an F-box protein. These two subunits are 
separated by a gap of about 5 nm. When this protein complex is activated, the 
F-box protein binds to a specific site on a target protein, positioning the protein 
in the gap so that some of its lysine side chains contact the ubiquitin-conjugating 
enzyme. The enzyme can then catalyze repeated additions of ubiquitin polypep- 
tide to these lysines (see Figure 3-71C), producing polyubiquitin chains that mark 
the target proteins for rapid destruction in a proteasome.

Figure 3-69 The marking of proteins by ubiquitin. (A) The three-dimensional
structure of ubiquitin, a small protein of 76 amino acids. A family of special enzymes
couples its carboxyl end to the amino group of a lysine side chain in a target
protein molecule, forming an isopeptide bond. (B) Some modification patterns that
have specific meanings to the cell. Note that the two types of polyubiquitylation
differ in the way the ubiquitin molecules are linked together. Linkage through
Lys48 signifies degradation by the proteasome (see Figure 6-84), whereas
that through Lys63 has other meanings. Ubiquitin markings are “read” by proteins
that specifically recognize each type of modification.

Figure 3-70 The marking of proteins with ubiquitin. (A) The C-terminus of ubiquitin is initially activated by being linked via a high-energy thioester
bond to a cysteine side chain on the E1 protein. This reaction requires ATP, and it proceeds via a covalent AMP-ubiquitin intermediate. The
activated ubiquitin on E1, also known as the ubiquitin-activating enzyme, is then transferred to the cysteine on an E2 molecule. (B) The addition of
a polyubiquitin chain to a target protein. In a mammalian cell, there are several hundred distinct E2-E3 complexes. The E2s are called
ubiquitin-conjugating enzymes. The E3s are referred to as ubiquitin ligases. (Adapted from D.R. Knighton et al., Science 253:407-414, 1991.)

In this manner, specific proteins are targeted for rapid destruction in response
to specific signals, thereby helping to drive the cell cycle (discussed in Chapter 17). 
The timing of the destruction often involves creating a specific pattern of phos- 
phorylation on the target protein that is required for its recognition by the F-box 
subunit. It also requires the activation of an SCF ubiquitin ligase that carries the 
appropriate substrate-binding arm. Many of these arms (the F-box subunits) are 
interchangeable in the protein complex (see Figure 3-71B), and there are more 
than 70 human genes that encode them. 

As emphasized previously, once a successful protein has evolved, its genetic 
information tends to be duplicated to produce a family of related proteins. Thus, 
for example, not only are there many F-box proteins—making possible the rec- 
ognition of different sets of target proteins—but there is also a family of scaffolds 
(known as cullins) that give rise to a family of SCF-like ubiquitin ligases. 

A protein machine like the SCF ubiquitin ligase, with its interchangeable parts, 
makes economical use of the genetic information in cells. It also creates opportu- 
nities for “rapid” evolution, inasmuch as new functions can evolve for the entire 
complex simply by producing an alternative version of one of its subunits. 

Ubiquitin ligases form a diverse family of protein complexes. Some of these 
complexes are far larger and more complicated than SCF, but their underlying 
enzymatic function remains the same (Figure 3-71D). 


A GTP-Binding Protein Shows How Large Protein Movements 
Can Be Generated 


Detailed structures obtained for one of the GTP-binding protein family members, 
the EF-Tu protein, provide a good example of how allosteric changes in protein 
conformations can produce large movements by amplifying a small, local confor- 
mational change. As will be discussed in Chapter 6, EF-Tu is an abundant mole- 
cule that serves as an elongation factor (hence the EF) in protein synthesis, load- 
ing each aminoacyl-tRNA molecule onto the ribosome. EF-Tu contains a Ras-like 
domain (see Figure 3-67), and the tRNA molecule forms a tight complex with its 
GTP-bound form. This tRNA molecule can transfer its amino acid to the growing
polypeptide chain only after the GTP bound to EF-Tu is hydrolyzed, dissociating
the EF-Tu. Since this GTP hydrolysis is triggered by a proper fit of the tRNA to the
mRNA molecule on the ribosome, the EF-Tu serves as a factor that discriminates
between correct and incorrect mRNA-tRNA pairings (see Figure 6-65).

Figure 3-71 The structure and mode of action of an SCF ubiquitin ligase. (A) The
structure of the five-protein ubiquitin ligase complex that includes an E2
ubiquitin-conjugating enzyme. Four proteins form the E3 portion. The protein denoted
here as adaptor protein 1 is the Rbx1/Hrt1 protein, adaptor protein 2 is the Skp1
protein, and the cullin is the Cul1 protein. One of the many different F-box proteins
completes the complex. (B) Comparison of the same complex with two different
substrate-binding arms, the F-box proteins Skp2 (top) and β-trCP1 (bottom),
respectively. (C) The binding and ubiquitylation of a target protein by the SCF
ubiquitin ligase. If, as indicated, a chain of ubiquitin molecules is added to the
same lysine of the target protein, that protein is marked for rapid destruction by the
proteasome. (D) Comparison of SCF (bottom) with a low-resolution electron microscopy
structure of a ubiquitin ligase called the anaphase-promoting complex (APC/C; top)
at the same scale. The APC/C is a large, 15-protein complex. As discussed in Chapter
17, its ubiquitylations control the late stages of mitosis. It is distantly related to
SCF and contains a cullin subunit (green) that lies along the side of the complex at
right, only partly visible in this view. E2 proteins are not shown here, but their
binding sites are indicated in orange, along with substrate-binding sites in purple.
(A and B, adapted from G. Wu et al., Mol. Cell 11:1445-1456, 2003. With permission
from Elsevier; D, adapted from P. da Fonseca et al., Nature 470:274-278, 2011. With
permission from Macmillan Publishers Ltd.)

Figure 3-72 The large conformational change in EF-Tu caused by GTP hydrolysis. (A and B) The three-dimensional
structure of EF-Tu with GTP bound. The domain at the top has a structure similar to the Ras protein, and its red α helix is the
switch helix, which moves after GTP hydrolysis. (C) The change in the conformation of the switch helix in domain 1 allows
domains 2 and 3 to rotate as a single unit by about 90 degrees toward the viewer, which releases the tRNA that was bound to
this structure (see also Figure 3-73). (A, adapted from H. Berchtold et al., Nature 365:126-132, 1993. With permission from
Macmillan Publishers Ltd. B, courtesy of Mathias Sprinzl and Rolf Hilgenfeld. PDB code: 1EFT.)

By comparing the three-dimensional structure of EF-Tu in its GTP-bound and
GDP-bound forms, we can see how the repositioning of the tRNA occurs. The
dissociation of the inorganic phosphate group (Pi), which follows the reaction
GTP → GDP + Pi, causes a shift of a few tenths of a nanometer at the GTP-binding
site, just as it does in the Ras protein. This tiny movement, equivalent to a few
times the diameter of a hydrogen atom, causes a conformational change to propagate
along a crucial piece of α helix, called the switch helix, in the Ras-like domain
of the protein. The switch helix seems to serve as a latch that adheres to a specific 
site in another domain of the molecule, holding the protein in a “shut” conforma- 
tion. The conformational change triggered by GTP hydrolysis causes the switch 
helix to detach, allowing separate domains of the protein to swing apart, through 
a distance of about 4 nm (Figure 3-72). This releases the bound tRNA molecule, 
allowing its attached amino acid to be used (Figure 3-73). 

Notice in this example how cells have exploited a simple chemical change that 
occurs on the surface of a small protein domain to create a movement 50 times 
larger. Dramatic shape changes of this type also cause the very large movements 
that occur in motor proteins, as we discuss next. 








Motor Proteins Produce Large Movements in Cells 


We have seen that conformational changes in proteins have a central role in 
enzyme regulation and cell signaling. We now discuss proteins whose major 
function is to move other molecules. These motor proteins generate the forces 
responsible for muscle contraction and the crawling and swimming of cells. 
Motor proteins also power smaller-scale intracellular movements: they help to 
move chromosomes to opposite ends of the cell during mitosis (discussed in
Chapter 17), to move organelles along molecular tracks within the cell (discussed
in Chapter 16), and to move enzymes along a DNA strand during the synthesis of
a new DNA molecule (discussed in Chapter 5). All these fundamental processes
depend on proteins with moving parts that operate as force-generating machines.

How do these machines work? In other words, how do cells use shape changes
in proteins to generate directed movements? If, for example, a protein is required
to walk along a narrow thread such as a DNA molecule, it can do this by undergoing
a series of conformational changes, such as those shown in Figure 3-74.
But with nothing to drive these changes in an orderly sequence, they are perfectly
reversible, and the protein can only wander randomly back and forth along the
thread. We can look at this situation in another way. Since the directional movement
of a protein does work, the laws of thermodynamics (discussed in Chapter 2)
demand that such movement use free energy from some other source (otherwise
the protein could be used to make a perpetual motion machine). Therefore, without
an input of energy, the protein molecule can only wander aimlessly.

Figure 3-73 An aminoacyl tRNA molecule bound to EF-Tu. Note how the bound
[...] GTP hydrolysis triggers the conformational changes shown in Figure 3-72C,
dissociating the protein-tRNA complex. [...] called EF-1 (Movie 3.12). (Coordinates
determined by P. Nissen et al., Science 270:1464-1472, 1995. PDB code: 1B23.)

How can the cell make such a series of conformational changes unidirectional? 
To force the entire cycle to proceed in one direction, it is enough to make any one 
of the changes in shape irreversible. Most proteins that are able to walk in one 
direction for long distances achieve this motion by coupling one of the confor- 
mational changes to the hydrolysis of an ATP molecule that is tightly bound to the 
protein. The mechanism is similar to the one just discussed that drives allosteric 
protein shape changes by GTP hydrolysis. Because ATP (or GTP) hydrolysis 
releases a great deal of free energy, it is very unlikely that the nucleotide-binding 
protein will undergo the reverse shape change needed for moving backward— 
since this would require that it also reverse the ATP hydrolysis by adding a phos- 
phate molecule to ADP to form ATP. 

In the model shown in Figure 3-75A, ATP binding shifts a motor protein from 
conformation 1 to conformation 2. The bound ATP is then hydrolyzed to produce 
ADP and inorganic phosphate (Pi), causing a change from conformation 2 to 
conformation 3. Finally, the release of the bound ADP and Pi drives the protein back 
to conformation 1. Because the energy provided by ATP hydrolysis drives the 
transition 2 → 3, this series of conformational changes is effectively irreversible. Thus, 
the entire cycle goes in only one direction, causing the protein molecule to walk 
continuously to the right in this example. 
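
The ratchet argument can be checked with a small simulation. The sketch below is only a toy model, not taken from the text: it assumes that every attempted transition between the three conformations is equally likely to go forward or backward, and it represents the ATP-driven case simply by forbidding the backward transition 3 → 2. The function name, the number of attempts, and the random seed are all illustrative choices.

```python
import random

def simulate_walker(n_attempts, hydrolysis_reversible, seed=1):
    """Return the net number of three-step cycles completed along the thread.

    The protein cycles through conformations 1 -> 2 -> 3 -> 1. Each attempted
    transition goes forward or backward with equal probability (an assumption).
    If hydrolysis_reversible is False, the backward transition 3 -> 2, which
    would require resynthesizing ATP, is never allowed.
    """
    rng = random.Random(seed)
    phase = 0  # phase % 3 == 0, 1, 2 stands for conformation 1, 2, 3
    for _ in range(n_attempts):
        if rng.random() < 0.5:
            phase += 1  # forward transition
        elif phase % 3 == 2 and not hydrolysis_reversible:
            pass  # blocked: the protein cannot undo ATP hydrolysis
        else:
            phase -= 1  # backward transition
    return phase / 3.0

unpowered = simulate_walker(30_000, hydrolysis_reversible=True)
ratchet = simulate_walker(30_000, hydrolysis_reversible=False)
print(f"no energy input:    net cycles = {unpowered:+.1f}")
print(f"ATP-driven ratchet: net cycles = {ratchet:+.1f}")
```

In this toy model the unpowered walker ends up close to where it started, whereas blocking a single backward transition is enough to produce a steady net drift in one direction. The exact numbers depend on the seed and on the assumed transition probabilities.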

Figure 3-74 An allosteric “walking” protein. Although its three different 
conformations allow it to wander randomly back and forth while bound to a thread 
or a filament, the protein cannot move uniformly in a single direction. 

Figure 3-75 How a protein can walk in one direction. (A) An allosteric 
motor protein driven by ATP hydrolysis. The transition between three different 
conformations includes a step driven by the hydrolysis of a bound ATP molecule, 
creating a “unidirectional ratchet” that makes the entire cycle essentially 
irreversible. By repeated cycles, the protein therefore moves continuously to the 
right along the thread. (B) Direct visualization of a walking myosin motor protein 
by high-speed atomic force microscopy; the elapsed time between steps was less 
than 0.5 sec (see Movie 16.3). (B, modified from N. Kodera et al., Nature 
468:72-76, 2010. With permission from Macmillan Publishers Ltd.) 

Many motor proteins generate directional movement through the use of 
a similar unidirectional ratchet, including the muscle motor protein myosin, 
which walks along actin filaments (Figure 3-75B), and the kinesin proteins that 
walk along microtubules (both discussed in Chapter 16). These movements can 
be rapid: some of the motor proteins involved in DNA replication (the DNA 
helicases) propel themselves along a DNA strand at rates as high as 1000 nucleotides 
per second. 


Membrane-Bound Transporters Harness Energy to Pump 
Molecules Through Membranes 


We have thus far seen how proteins that undergo allosteric shape changes can act 
as microprocessors (Src family kinases), as assembly factors (EF-Tu), and as gen- 
erators of mechanical force and motion (motor proteins). Allosteric proteins can 
also harness energy derived from ATP hydrolysis, ion gradients, or electron-trans- 
port processes to pump specific ions or small molecules across a membrane. We 
consider one example here that will be discussed in more detail in Chapter 11. 

The ABC transporters (ATP-binding cassette transporters) constitute an 
important class of membrane-bound pump proteins. In humans, at least 48 dif- 
ferent genes encode them. These transporters mostly function to export hydro- 
phobic molecules from the cytoplasm, serving to remove toxic molecules at the 
mucosal surface of the intestinal tract, for example, or at the blood-brain barrier. 
The study of ABC transporters is of intense interest in clinical medicine, because 
the overproduction of proteins in this class contributes to the resistance of tumor 
cells to chemotherapeutic drugs. In bacteria, the same types of proteins primarily 
function to import essential nutrients into the cell. 

A typical ABC transporter contains a pair of membrane-spanning subunits 
linked to a pair of ATP-binding subunits located just below the plasma mem- 
brane. As in other examples we have discussed, the hydrolysis of the bound ATP 
molecules drives conformational changes in the protein, transmitting forces that 
cause the membrane-spanning subunits to move their bound molecules across 
the lipid bilayer (Figure 3-76). 

Figure 3-76 The ABC (ATP-binding cassette) transporter, a protein machine 
that pumps molecules through a membrane. (A) How this large family of 
transporters pumps molecules into the cell in bacteria. As indicated, the binding 
of two molecules of ATP causes the two ATP-binding domains to clamp together 
tightly, opening a channel to the cell exterior. The binding of a substrate molecule 
to the extracellular face of the protein complex then triggers ATP hydrolysis 
followed by ADP release, which opens the cytoplasmic gate; the pump is then reset 
for another cycle. (B) As discussed in Chapter 11, in eukaryotes an opposite 
process occurs, causing selected substrate molecules to be pumped out of the cell. 
(C) The structure of a bacterial ABC transporter (see Movie 11.5). (C, from 
R.J. Dawson and K.P. Locher, Nature 443:180-185, 2006. With permission from 
Macmillan Publishers Ltd; PDB code: 2HYD.) 

Humans have invented many different types of mechanical pumps, and it 
should not be surprising that cells also contain membrane-bound pumps that 
function in other ways. Among the most notable are the rotary pumps that couple 
the hydrolysis of ATP to the transport of H+ ions (protons). These pumps 
resemble miniature turbines, and they are used to acidify the interior of lysosomes and 
other eukaryotic organelles. Like other ion pumps that create ion gradients, they 
can function in reverse to catalyze the reaction ADP + Pi → ATP, if the gradient 
across their membrane of the ion that they transport is steep enough. 

One such pump, the ATP synthase, harnesses a gradient of proton concentra- 
tion produced by electron-transport processes to produce most of the ATP used in 
the living world. This ubiquitous pump has a central role in energy conversion, and 
we shall discuss its three-dimensional structure and mechanism in Chapter 14. 


Proteins Often Form Large Complexes That Function as Protein 
Machines 


Large proteins formed from many domains are able to perform more elaborate 
functions than small, single-domain proteins. But large protein assemblies formed 
from many protein molecules linked together by noncovalent bonds perform 
the most impressive tasks. Now that it is possible to reconstruct most biological 
processes in cell-free systems in the laboratory, it is clear that each of the central 
processes in a cell—such as DNA replication, protein synthesis, vesicle budding, 
or transmembrane signaling—is catalyzed by a highly coordinated, linked set of 
10 or more proteins. In most such protein machines, an energetically favorable 
reaction such as the hydrolysis of bound nucleoside triphosphates (ATP or GTP) 
drives an ordered series of conformational changes in one or more of the indi- 
vidual protein subunits, enabling the ensemble of proteins to move coordinately. 
In this way, each enzyme can be moved directly into position, as the machine 
catalyzes successive reactions in a series (Figure 3-77). This is what occurs, for 
example, in protein synthesis on a ribosome (discussed in Chapter 6)—or in DNA 
replication, where a large multiprotein complex moves rapidly along the DNA 
(discussed in Chapter 5). 

Cells have evolved protein machines for the same reason that humans have 
invented mechanical and electronic machines. For accomplishing almost any 
task, manipulations that are spatially and temporally coordinated through linked 
processes are much more efficient than the use of many separate tools. 


Scaffolds Concentrate Sets of Interacting Proteins 


As scientists have learned more of the details of cell biology, they have recognized 
an increasing degree of sophistication in cell chemistry. Thus, not only do we now 
know that protein machines play a predominant role, but it has also become clear 
that they are very often localized to specific sites in the cell, being assembled and 
activated only where and when they are needed. As one example, when extra- 
cellular signaling molecules bind to receptor proteins in the plasma membrane, 
the activated receptors often recruit a set of other proteins to the inside surface of 
the plasma membrane to form a large protein complex that passes the signal on 
(discussed in Chapter 15). 

The mechanisms frequently involve scaffold proteins. These are proteins with 
binding sites for multiple other proteins, and they serve both to link together spe- 
cific sets of interacting proteins and to position them at specific locations inside a 
cell. At one extreme are rigid scaffolds, such as the cullin in SCF ubiquitin ligase 
(see Figure 3-71). At the other extreme are the large, flexible scaffold proteins 
that often underlie regions of specialized plasma membrane. These include the 
Discs-large protein (Dlg), a protein of about 900 amino acids that is concentrated 
in special regions beneath the plasma membrane in epithelial cells and at 
synapses. Dlg contains binding sites for at least seven other proteins, interspersed 
with regions of more flexible polypeptide chain. An ancient protein, conserved in 
organisms as diverse as sponges, worms, flies, and humans, Dlg derives its name 
from the mutant phenotype of the organism in which it was first discovered; the 
cells in the imaginal discs of a Drosophila embryo with a mutation in the Dlg gene 
fail to stop proliferating when they should, and they produce unusually large discs 
whose epithelial cells can form tumors. 

Figure 3-77 How “protein machines” carry out complex functions. 
These machines are made of individual proteins that collaborate to perform 
a specific task (Movie 3.13). The movement of these proteins is often 
coordinated by the hydrolysis of a bound nucleotide such as ATP or GTP. 
Directional allosteric conformational changes of proteins that are driven in this 
way often occur in a large protein assembly in which the activities of several 
different protein molecules are coordinated by such movements within the 
complex. 

Although incompletely studied, Dlg and a large number of similar scaffold 
proteins are thought to function like the protein that is schematically illustrated in 
Figure 3-78. By binding a specific set of interacting proteins, these scaffolds can 
enhance the rate of critical reactions, while also confining them to the particular 
region of the cell that contains the scaffold. For similar reasons, cells also make 
extensive use of scaffold RNA molecules, as discussed in Chapter 7. 
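
A rough calculation suggests why confinement on a scaffold can speed reactions. The sketch below estimates the local concentration of a single protein molecule that is free to explore only a small sphere around its tether point. The spherical geometry and the 10 nm radius are illustrative assumptions, not values given in the text, and the function name is hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole

def effective_concentration_molar(radius_nm):
    """Concentration (mol/L) of one molecule confined to a sphere of given radius."""
    radius_dm = radius_nm * 1e-8  # 1 nm = 1e-8 dm, and 1 dm^3 = 1 liter
    volume_liters = (4.0 / 3.0) * math.pi * radius_dm ** 3
    return 1.0 / (AVOGADRO * volume_liters)

# Assumed tether radius of 10 nm (illustrative only)
local = effective_concentration_molar(10.0)
print(f"effective local concentration: {local * 1e3:.2f} mM")
```

Under these assumptions a single tethered molecule behaves as though it were present at a concentration of a few tenths of a millimolar in the neighborhood of its partners. Because the volume scales with the cube of the radius, halving the tether length would raise this estimate eightfold.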


Many Proteins Are Controlled by Covalent Modifications That 
Direct Them to Specific Sites Inside the Cell 


We have thus far described only a few ways in which proteins are post-translation- 
ally modified. A large number of other such modifications also occur, more than 
200 distinct types being known. To give a sense of the variety, Table 3-3 presents 
a few of the modifying groups with known regulatory roles. As in phosphate 
and ubiquitin additions described previously, these groups are added and then 
removed from proteins according to the needs of the cell. 

TABLE 3-3 

Drives the assembly of a protein into larger complexes (see Figure 15-11) 

Helps to create distinct regions in chromatin through forming either mono-, di-, 
or trimethyl lysine in histones (see Figure 4-36) 

Helps to activate genes in chromatin by modifying histones (see Figure 4-33) 

This fatty acid addition drives protein association with membranes 
(see Figure 10-18) 

Controls enzyme activity and gene expression in glucose homeostasis 

Ubiquitin on Lys 
    Monoubiquitin addition regulates the transport of membrane proteins in 
    vesicles (see Figure 13-50) 
    A polyubiquitin chain targets a protein for degradation (see Figure 3-70) 

Ubiquitin is a 76-amino-acid polypeptide; there are at least 10 other 
ubiquitin-related proteins in mammalian cells. 

Figure 3-78 How the proximity created by scaffold proteins can greatly speed 
reactions in a cell. In this example, long unstructured regions of polypeptide chain 
in a large scaffold protein connect a series of structured domains that bind a set of 
reacting proteins. The unstructured regions serve as flexible “tethers” that greatly 
speed reaction rates by causing a rapid, random collision of all of the proteins that 
are bound to the scaffold. (For specific examples of protein tethering, see 
Figure 3-54 and Figure 16-18; for scaffold RNA molecules, see Figure 7-49B.) 

Figure 3-79 Multisite protein modification and its effects. (A) A protein that 
carries a post-translational addition to more than one of its amino acid side chains 
can be considered to carry a combinatorial regulatory code. Multisite modifications 
are added to (and removed from) a protein through signaling networks, and the 
resulting combinatorial regulatory code on the protein is read to alter its behavior in 
the cell. (B) The pattern of some covalent modifications to the protein p53. 
[Panel A title: A spectrum of covalent modifications produces a regulatory 
protein code.] 

A large number of proteins are now known to be modified on more than one 
amino acid side chain, with different regulatory events producing a different pat- 
tern of such modifications. A striking example is the protein p53, which plays a 
central part in controlling a cell’s response to adverse circumstances (see Figure 
17-62). Through one of four different types of molecular additions, this protein 
can be modified at 20 different sites. Because an enormous number of different 
combinations of these 20 modifications are possible, the protein’s behavior can 
in principle be altered in a huge number of ways. Such modifications will often 
create a site on the modified protein that binds it to a scaffold protein in a specific 
region of the cell, thereby connecting it—via the scaffold—to the other proteins 
required for a reaction at that site. 
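
To get a feel for the size of this "enormous number," one can count the possible modification patterns under a simplifying assumption that is not stated in the text: that each of the 20 sites is independently either modified or unmodified. The function name below is illustrative.

```python
def modification_patterns(n_sites, states_per_site=2):
    """Number of distinct patterns if every site independently takes one state."""
    return states_per_site ** n_sites

# 20 sites, each assumed to be either modified or unmodified
print(modification_patterns(20))  # 1048576
```

Even in this simplest case there are more than a million possible patterns. Not every pattern need occur in a cell or have a distinct effect, so the count is an upper bound on the combinatorial code rather than a measurement.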

One can view each protein’s set of covalent modifications as a combinatorial 
regulatory code. Specific modifying groups are added to or removed from a pro- 
tein in response to signals, and the code then alters protein behavior—changing 
the activity or stability of the protein, its binding partners, and/or its specific loca- 
tion within the cell (Figure 3-79). As a result, the cell is able to respond rapidly 
and with great versatility to changes in its condition or environment. 


A Complex Network of Protein Interactions Underlies Cell Function 


There are many challenges facing cell biologists in this information-rich era when 
a large number of complete genome sequences are known. One is the need to 
dissect and reconstruct each one of the thousands of protein machines that exist 
in an organism such as ourselves. To understand these remarkable protein com- 
plexes, each will need to be reconstituted from its purified protein parts, so that 
we can study its detailed mode of operation under controlled conditions in a test 
tube, free from all other cell components. This alone is a massive task. But we now 
know that each of these subcomponents of a cell also interacts with other sets of 
macromolecules, creating a large network of protein-protein and protein-nucleic 
acid interactions throughout the cell. To understand the cell, therefore, we will 
need to analyze most of these other interactions as well. 




We can gain some idea of the complexity of intracellular protein networks 
from a particularly well-studied example described in Chapter 16: the many doz- 
ens of proteins that interact with the actin cytoskeleton to control actin filament 
behavior (see Panel 16-3, p. 905). 

The extent of such protein-protein interactions can also be estimated more 
generally. An enormous amount of valuable information is now freely available in 
protein databases on the Internet: tens of thousands of three-dimensional protein 
structures plus tens of millions of protein sequences derived from the nucleotide 
sequences of genes. Scientists have been developing new methods for mining 
this great resource to increase our understanding of cells. In particular, comput- 
er-based bioinformatics tools are being combined with robotics and other tech- 
nologies to allow thousands of proteins to be investigated in a single set of exper- 
iments. Proteomics is a term that is often used to describe such research focused 
on the analysis of large sets of proteins, analogous to the term genomics describing 
the large-scale analysis of DNA sequences and genes. 

A biochemical method based on affinity tagging and mass spectroscopy 
has proven especially powerful for determining the direct binding interactions 
between the many different proteins in a cell (discussed in Chapter 8). The results 
are being tabulated and organized in Internet databases. This allows a cell biolo- 
gist studying a small set of proteins to readily discover which other proteins in the 
same cell are likely to bind to, and thus interact with, that set of proteins. When 
displayed graphically as a protein interaction map, each protein is represented by 
a box or dot in a two-dimensional network, with a straight line connecting those 
proteins that have been found to bind to each other. 
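
A protein interaction map of this kind is naturally stored as a graph, in which each protein is a node and each observed binding interaction is an edge. The following minimal sketch uses placeholder protein names (A, B, C, D) and hypothetical function names; it does not reproduce the schema of any particular database.

```python
from collections import defaultdict

def build_interaction_map(binding_pairs):
    """Build an undirected graph: protein name -> set of direct binding partners."""
    graph = defaultdict(set)
    for first, second in binding_pairs:
        graph[first].add(second)
        graph[second].add(first)
    return graph

def binding_partners(graph, protein):
    """Return the direct binding partners of one protein, sorted by name."""
    return sorted(graph.get(protein, set()))

# Placeholder data: each tuple is one observed binding interaction
pairs = [("A", "B"), ("B", "C"), ("B", "D")]
interaction_map = build_interaction_map(pairs)

print(binding_partners(interaction_map, "B"))  # ['A', 'C', 'D']
print("C" in interaction_map["A"])             # False
```

The second query illustrates a limit of such maps: A and C both bind B, but the graph records no direct interaction between them.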

When hundreds or thousands of proteins are displayed on the same map, the 
network diagram becomes bewilderingly complicated, serving to illustrate the 
enormous challenges that face scientists attempting to understand the cell (Fig- 
ure 3-80). Much more useful are small subsections of these maps, centered on a 
few proteins of interest. 

We have previously described the structure and mode of action of the SCF 
ubiquitin ligase, using it to illustrate how protein complexes are constructed from 
interchangeable parts (see Figure 3-71). Figure 3-81 shows a network of protein- 
protein interactions for the five proteins that form this protein complex in a yeast 
cell. Four of the subunits of this ligase are located at the bottom right of this figure. 
The remaining subunit, the F-box protein that serves as its substrate-binding arm, 
appears as a set of 15 different gene products that bind to adaptor protein 2 (the 
Skp1 protein). Along the top and left of the figure are sets of additional protein 
interactions marked with yellow and green shading: as indicated, these protein 
sets function at the origin of DNA replication, in cell cycle regulation, in 
methionine synthesis, in the kinetochore, and in vacuolar H+-ATPase assembly. We 
shall use this figure to explain how such protein interaction maps are used, and 
what they do and do not mean. 


1. Protein interaction maps are useful for identifying the likely function of 
previously uncharacterized proteins. Examples are the products of the 
genes that have thus far only been inferred to exist from the yeast genome 
sequence, which are the three proteins in the figure that lack a simple 
three-letter abbreviation (white letters beginning with Y). The three in this 
diagram are F-box proteins that bind to Skp1; these are therefore likely to 
function as part of the ubiquitin ligase, serving as substrate-binding arms 
that recognize different target proteins. However, as we discuss next, none 
of these assignments can be considered certain without additional data. 


2. Protein interaction networks need to be interpreted with caution because, 
as a result of evolution making efficient use of each organism’s genetic 
information, the same protein can be used as part of different protein 
complexes that have different types of functions. Thus, although protein A 
binds to protein B and protein B binds to protein C, proteins A and C need 
not function in the same process. For example, we know from detailed 
biochemical studies that the functions of Skp1 in the kinetochore and in 
vacuolar H+-ATPase assembly (yellow shading) are separate from its 
function in the SCF ubiquitin ligase. In fact, only the remaining three functions 
of Skp1 illustrated in the diagram (methionine synthesis, cell cycle 
regulation, and origin of replication; green shading) involve ubiquitylation. 


3. In cross-species comparisons, those proteins displaying similar patterns of 
interactions in the two protein interaction maps are likely to have the same 
function in the cell. Thus, as scientists generate more and more highly 
detailed maps for multiple organisms, the results will become increasingly 
useful for inferring protein function. These map comparisons will be a 
particularly powerful tool for deciphering the functions of human proteins, 
because a vast amount of direct information about protein function can 
be obtained from genetic engineering, mutational, and genetic analyses in 
experimental organisms (such as yeast, worms, and flies) that are not 
feasible in humans. 


Figure 3-80 A network of protein-binding interactions in a yeast cell. 
Each line connecting a pair of dots (proteins) indicates a protein-protein 
interaction. (From A. Guimerá and M. Sales-Pardo, Mol. Syst. Biol. 2:42, 
2006. With permission from Macmillan Publishers Ltd.) 

Figure 3-81 A map of some protein-protein interactions of the SCF ubiquitin 
ligase and other proteins in the yeast S. cerevisiae. The symbols and/or colors 
used for the five proteins of the ligase are those in Figure 3-71. Note that 15 
different F-box proteins are shown (purple); those with white lettering (beginning 
with Y) are known from the genome sequence as open reading frames. For 
additional details, see text. (Courtesy of Peter Bowers and David Eisenberg, 
UCLA-DOE Institute for Genomics and Proteomics, UCLA.) 


What does the future hold? There are likely to be on the order of 10,000 differ- 
ent proteins in a typical human cell, each of which interacts with 5 to 10 differ- 
ent partners. Despite the enormous progress made in recent years, we cannot yet 
claim to understand even the simplest known cells, such as the small Mycoplasma 
bacterium formed from only about 500 gene products (see Figure 1-10). How then 
can we hope to understand a human? Clearly, a great deal of new biochemistry 
will be essential, in which each protein in a particular interacting set is purified so 
that its chemistry and interactions can be dissected in a test tube. But in addition, 
more powerful ways of analyzing networks will be needed based on mathematical 
and computational tools not yet invented, as we shall emphasize in Chapter 8. 
Clearly, there are many wonderful challenges that remain for future generations 
of cell biologists. 
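
The figures quoted above imply a rough size for the interaction network of a human cell. The sketch below assumes, for illustration only, that every interaction is pairwise and mutual, so that each one is counted twice when partners are tallied protein by protein. The function name is hypothetical.

```python
def pairwise_interactions(n_proteins, partners_per_protein):
    """Estimate distinct pairwise interactions, counting each pair only once."""
    return n_proteins * partners_per_protein // 2

low = pairwise_interactions(10_000, 5)
high = pairwise_interactions(10_000, 10)
print(f"{low} to {high} distinct interactions")  # 25000 to 50000 distinct interactions
```

On these assumptions, a typical human cell would contain on the order of 25,000 to 50,000 distinct protein-protein interactions to be characterized.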


Summary 


Proteins can form enormously sophisticated chemical devices, whose functions 
largely depend on the detailed chemical properties of their surfaces. Binding sites 
for ligands are formed as surface cavities in which precisely positioned amino acid 
side chains are brought together by protein folding. In this way, normally unreac- 
tive amino acid side chains can be activated to make and break covalent bonds. 
Enzymes are catalytic proteins that greatly speed up reaction rates by binding the 
high-energy transition states for a specific reaction path; they also can perform acid 
catalysis and base catalysis simultaneously. The rates of enzyme reactions are often 
so fast that they are limited only by diffusion. Rates can be further increased only if 
enzymes that act sequentially on a substrate are joined into a single multienzyme 
complex, or if the enzymes and their substrates are attached to protein scaffolds, or 
otherwise confined to the same part of the cell. 

Proteins reversibly change their shape when ligands bind to their surface. The 
allosteric changes in protein conformation produced by one ligand affect the bind- 
ing of a second ligand, and this linkage between two ligand-binding sites provides 
a crucial mechanism for regulating cell processes. Metabolic pathways, for exam- 
ple, are controlled by feedback regulation: some small molecules inhibit and other 
small molecules activate enzymes early in a pathway. Enzymes controlled in this 
way generally form symmetric assemblies, allowing cooperative conformational 
changes to create a steep response to changes in the concentrations of the ligands 
that regulate them. 

The expenditure of chemical energy can drive unidirectional changes in protein 
shape. By coupling allosteric shape changes to ATP hydrolysis, for example, pro- 
teins can do useful work, such as generating a mechanical force or moving for long 
distances in a single direction. The three-dimensional structures of proteins have 
revealed how a small local change caused by nucleoside triphosphate hydrolysis is 
amplified to create major changes elsewhere in the protein. By such means, these 
proteins can serve as input-output devices that transmit information, as assem- 
bly factors, as motors, or as membrane-bound pumps. Highly efficient protein 
machines are formed by incorporating many different protein molecules into larger 
assemblies that coordinate the allosteric movements of the individual components. 
Such machines perform most of the important reactions in cells. 

Proteins are subjected to many reversible, post-translational modifications, such 
as the covalent addition of a phosphate or an acetyl group to a specific amino acid 
side chain. The addition of these modifying groups is used to regulate the activity of 
a protein, changing its conformation, its binding to other proteins, and its location 
inside the cell. A typical protein in a cell will interact with more than five different 
partners. Through proteomics, biologists can analyze thousands of proteins in one 
set of experiments. One important result is the production of detailed protein inter- 
action maps, which aim at describing all of the binding interactions between the 
thousands of distinct proteins in a cell. However, understanding these networks will 
require new biochemistry, through which small sets of interacting proteins can be 
purified and their chemistry dissected in detail. In addition, new computational 
techniques will be required to deal with the enormous complexity. 


WHAT WE DON’T KNOW 


• What are the functions of the surprisingly large amount of unfolded 
polypeptide chain found in proteins? 


• How many types of protein functions remain to be discovered? What are 
the most promising approaches for discovering them? 


• When will scientists be able to take any amino acid sequence and 
accurately predict both that protein’s three-dimensional conformations 
and its chemical properties? What breakthroughs will be needed to 
accomplish this important goal? 


• Are there ways to reveal the detailed workings of a protein machine that do 
not require the purification of each of its component parts in large amounts, 
so that the machine’s functions can be reconstituted and dissected using 
chemical techniques in a test tube? 


• What are the roles of the dozens of different types of covalent 
modifications of proteins that have been found in addition to those listed 
in Table 3-3? Which ones are critical for cell function and why? 


• Why is amyloid toxic to cells and how does it contribute to 
neurodegenerative diseases such as Alzheimer’s disease? 


PROBLEMS 


Which statements are true? Explain why or why not. 


3-1 Each strand in a β sheet is a helix with two amino 
acids per turn. 


3-2 Intrinsically disordered regions of proteins can be 
identified using bioinformatic methods to search genes for 
encoded amino acid sequences that possess high hydro- 
phobicity and low net charge. 


3-3 Loops of polypeptide that protrude from the sur- 
face of a protein often form the binding sites for other mol- 
ecules. 


3-4 An enzyme reaches a maximum rate at high sub- 
strate concentration because it has a fixed number of 
active sites where substrate binds. 


3-5 Higher concentrations of enzyme give rise to a 
higher turnover number. 


3-6 Enzymes that undergo cooperative allosteric tran- 
sitions invariably consist of symmetric assemblies of mul- 
tiple subunits. 


3-7 Continual addition and removal of phosphates 
by protein kinases and protein phosphatases is wasteful 
of energy—since their combined action consumes ATP— 
but it is a necessary consequence of effective regulation by 
phosphorylation. 


Discuss the following problems. 


3-8 Consider the following statement. “To produce 
one molecule of each possible kind of polypeptide chain, 
300 amino acids in length, would require more atoms than 
exist in the universe.” Given the size of the universe, do you 
suppose this statement could possibly be correct? Since 
counting atoms is a tricky business, consider the problem 
from the standpoint of mass. The mass of the observable 
universe is estimated to be about 10^80 grams, give or take 
an order of magnitude or so. Assuming that the average 
mass of an amino acid is 110 daltons, what would be the 
mass of one molecule of each possible kind of polypeptide 
chain 300 amino acids in length? Is this greater than the 
mass of the universe? 


3-9 A common strategy for identifying distantly related 
homologous proteins is to search the database using a short 
signature sequence indicative of the particular protein 
function. Why is it better to search with a short sequence 
than with a long sequence? Do you not have more chances 
for a “hit” in the database with a long sequence? 


3-10 The so-called kelch motif consists of a four-stranded 
β sheet, which forms what is known as a β propeller. 
It is usually found to be repeated four to seven times, 
forming a kelch repeat domain in a multidomain protein. 
One such kelch repeat domain is shown in Figure Q3-1. 
Would you classify this domain as an “in-line” or “plug-in” 
type domain? 


Figure Q3-1 The kelch repeat domain of galactose oxidase from 
D. dendroides (Problem 3-10). The seven individual β propellers are color 
coded and labeled. The N- and C-termini are indicated by N and C. 





3-11 Titin, which has a molecular weight of about 
3 × 10^6, is the largest polypeptide yet described. Titin 
molecules extend from muscle thick filaments to the 
Z disc; they are thought to act as springs to keep the thick 
filaments centered in the sarcomere. Titin is composed of a 
large number of repeated immunoglobulin (Ig) sequences 
of 89 amino acids, each of which is folded into a domain 
about 4 nm in length (Figure Q3-2A). 

You suspect that the springlike behavior of titin is 
caused by the sequential unfolding (and refolding) of indi- 
vidual Ig domains. You test this hypothesis using the atomic 
force microscope, which allows you to pick up one end of 
a protein molecule and pull with an accurately measured 
force. For a fragment of titin containing seven repeats of the 
Ig domain, this experiment gives the sawtooth force-ver- 
sus-extension curve shown in Figure Q3-2B. If the experi- 
ment is repeated in a solution of 8 M urea (a protein dena- 
turant), the peaks disappear and the measured extension 
becomes much longer for a given force. If the experiment 
is repeated after the protein has been cross-linked by treat- 
ment with glutaraldehyde, once again the peaks disappear 
but the extension becomes much smaller for a given force. 


Figure Q3-2 Springlike behavior of titin (Problem 3-11). (A) The
structure of an individual Ig domain. (B) Force in piconewtons versus
extension in nanometers obtained by atomic force microscopy.




A. Are the data consistent with your hypothesis that 
titin’s springlike behavior is due to the sequential unfold- 
ing of individual Ig domains? Explain your reasoning. 

B. Is the extension for each putative domain-un- 
folding event the magnitude you would expect? (In an 
extended polypeptide chain, amino acids are spaced at 
intervals of 0.34 nm.) 

C. Why is each successive peak in Figure Q3-2B a lit- 
tle higher than the one before? 

D. Why does the force collapse so abruptly after each 
peak? 


3-12 Rous sarcoma virus (RSV) carries an oncogene 
called Src, which encodes a continuously active protein 
tyrosine kinase that leads to unchecked cell proliferation. 
Normally, Src carries an attached fatty acid (myristoylate) 
group that allows it to bind to the cytoplasmic side of the 
plasma membrane. A mutant version of Src that does not 
allow attachment of myristoylate does not bind to the 
membrane. Infection of cells with RSV encoding either the 
normal or the mutant form of Src leads to the same high 
level of protein tyrosine kinase activity, but the mutant Src 
does not cause cell proliferation. 

A. Assuming that the normal Src is all bound to the 
plasma membrane and that the mutant Src is distributed 
throughout the cytoplasm, calculate their relative concen- 
trations in the neighborhood of the plasma membrane. For 
the purposes of this calculation, assume that the cell is a 
sphere with a radius (r) of 10 μm and that the mutant Src
is distributed throughout the cell, whereas the normal Src
is confined to a 4-nm-thick layer immediately beneath the
membrane. [For this problem, assume that the membrane
has no thickness. The volume of a sphere is $(4/3)\pi r^{3}$.]

B. The target (X) for phosphorylation by Src resides in 
the membrane. Explain why the mutant Src does not cause 
cell proliferation. 
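
For part A, one way to set up the arithmetic is sketched below in Python. It uses the geometry stated in the problem (a spherical cell of radius 10 μm and a 4-nm shell for membrane-bound Src). It also assumes that equal numbers of normal and mutant Src molecules are present per cell, which the equal kinase activity suggests but the problem does not state outright. The function and variable names are illustrative only.

```python
import math

def sphere_volume(radius_um: float) -> float:
    """Volume of a sphere, (4/3)*pi*r^3, in cubic micrometers."""
    return (4.0 / 3.0) * math.pi * radius_um ** 3

CELL_RADIUS_UM = 10.0       # radius given in the problem
SHELL_THICKNESS_UM = 0.004  # 4 nm expressed in micrometers

cell_volume = sphere_volume(CELL_RADIUS_UM)
# Volume of the thin layer just beneath the membrane.
shell_volume = cell_volume - sphere_volume(CELL_RADIUS_UM - SHELL_THICKNESS_UM)

# Assumption: the same number of molecules in both cases, so the
# concentration ratio is the inverse ratio of the volumes they occupy.
enrichment = cell_volume / shell_volume

print(f"cell volume : {cell_volume:.1f} um^3")
print(f"shell volume: {shell_volume:.2f} um^3")
print(f"normal Src is about {enrichment:.0f}-fold more concentrated near the membrane")
```

Under these assumptions the shell occupies only a small fraction of one percent of the cell volume, so confining Src to it raises the local concentration by several hundredfold.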


3-13 An antibody binds to another protein with an
equilibrium constant, K, of $5 \times 10^{9}\ \mathrm{M}^{-1}$. When it binds to
a second, related protein, it forms three fewer hydrogen
bonds, reducing its binding affinity by 11.9 kJ/mole. What
is the K for its binding to the second protein? (Free-energy
change is related to the equilibrium constant by the equation
$\Delta G^{\circ} = -2.3\,RT \log K$, where R is $8.3 \times 10^{-3}$ kJ/(mole K)
and T is 310 K.)
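
The relation in this problem can be applied in both directions, as the short Python sketch below illustrates. It takes R, T, and the first equilibrium constant from the problem statement, and it interprets "reducing its binding affinity by 11.9 kJ/mole" as making the free-energy change 11.9 kJ/mole less negative. The helper names are hypothetical.

```python
import math

R = 8.3e-3            # kJ/(mole K), as given in the problem
T = 310.0             # K
K_FIRST = 5e9         # M^-1, equilibrium constant for the first protein
AFFINITY_LOSS = 11.9  # kJ/mole lost with three fewer hydrogen bonds

def delta_g(k: float) -> float:
    """Standard free-energy change, -2.3 * R * T * log10(K), in kJ/mole."""
    return -2.3 * R * T * math.log10(k)

def k_from_delta_g(dg: float) -> float:
    """Invert the relation above to recover K from the free-energy change."""
    return 10 ** (-dg / (2.3 * R * T))

dg_first = delta_g(K_FIRST)
dg_second = dg_first + AFFINITY_LOSS  # weaker binding: less negative
k_second = k_from_delta_g(dg_second)

print(f"dG (first protein)  = {dg_first:.1f} kJ/mole")
print(f"dG (second protein) = {dg_second:.1f} kJ/mole")
print(f"K  (second protein) = {k_second:.2e} M^-1")
```

Because 2.3 RT is close to 5.9 kJ/mole at 310 K, each 5.9 kJ/mole of lost binding energy lowers K by about a factor of ten.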


3-14 The protein SmpB binds to a special species of
tRNA, tmRNA, to eliminate the incomplete proteins made
from truncated mRNAs in bacteria. If the binding of SmpB
to tmRNA is plotted as fraction tmRNA bound versus SmpB
concentration, one obtains a symmetrical S-shaped curve
as shown in Figure Q3-3. This curve is a visual display of
a very useful relationship between $K_d$ and concentration,
which has broad applicability. The general expression for
fraction of ligand bound is derived from the equation for
$K_d$ ($K_d = [\text{Pr}][\text{L}]/[\text{Pr-L}]$) by substituting
($[\text{L}]_{\text{TOT}} - [\text{L}]$) for $[\text{Pr-L}]$ and rearranging.
Because the total concentration of ligand ($[\text{L}]_{\text{TOT}}$) is equal
to the free ligand ($[\text{L}]$) plus bound ligand ($[\text{Pr-L}]$),

$$\text{fraction bound} = \frac{[\text{Pr-L}]}{[\text{L}]_{\text{TOT}}} = \frac{[\text{Pr}]}{[\text{Pr}] + K_d}$$
Figure Q3-3 Fraction of tmRNA bound versus SmpB concentration
(Problem 3-14).
For SmpB and tmRNA, the fraction bound =
$[\text{SmpB-tmRNA}]/[\text{tmRNA}]_{\text{TOT}} = [\text{SmpB}]/([\text{SmpB}] + K_d)$.
Using this relationship, calculate the fraction of tmRNA bound
for SmpB concentrations equal to $10^{4} K_d$, $10^{3} K_d$, $10^{2} K_d$,
$10^{1} K_d$, $K_d$, $10^{-1} K_d$, $10^{-2} K_d$, $10^{-3} K_d$, and $10^{-4} K_d$.
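
The relationship above is easy to tabulate. The following Python sketch is one possible way to do so; it sets the dissociation constant to 1 in arbitrary units, which is an assumption made purely for convenience, since only the ratio of SmpB concentration to the dissociation constant enters the expression.

```python
def fraction_bound(protein_conc: float, kd: float) -> float:
    """Fraction of ligand bound = [Pr] / ([Pr] + Kd)."""
    return protein_conc / (protein_conc + kd)

KD = 1.0  # arbitrary units (assumption); only [SmpB]/Kd matters

# Concentrations from 10^4 Kd down to 10^-4 Kd, as listed in the problem.
exponents = range(4, -5, -1)
table = [(n, fraction_bound(10.0 ** n * KD, KD)) for n in exponents]

for n, f in table:
    print(f"[SmpB] = 10^{n:+d} Kd -> fraction bound = {f:.4f}")
```

The values for concentrations of 10^n and 10^-n times the dissociation constant always sum to 1, which is why the curve looks symmetrical about the midpoint when the concentration axis is logarithmic.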


3-15 Many enzymes obey simple Michaelis-Menten 
kinetics, which are summarized by the equation 


$$\text{rate} = \frac{V_{\max}[S]}{[S] + K_m}$$


where $V_{\max}$ = maximum velocity, [S] = concentration of
substrate, and $K_m$ = the Michaelis constant.

It is instructive to plug a few values of [S] into the
equation to see how rate is affected. What are the rates for
[S] equal to zero, equal to $K_m$, and equal to infinite concentration?
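
A quick numerical check of the Michaelis-Menten equation can make the limiting behavior concrete. The sketch below is in Python; the values chosen for the maximum velocity and the Michaelis constant are hypothetical and in arbitrary units, and a very large substrate concentration stands in for "infinite."

```python
def mm_rate(s: float, vmax: float, km: float) -> float:
    """Michaelis-Menten rate = Vmax * [S] / ([S] + Km)."""
    return vmax * s / (s + km)

VMAX, KM = 1.0, 2.0  # hypothetical values, arbitrary units

for s in (0.0, KM, 10 * KM, 1000 * KM, 1e6 * KM):
    rate = mm_rate(s, VMAX, KM)
    print(f"[S] = {s:>10g} -> rate = {rate:.6f} (fraction of Vmax: {rate / VMAX:.6f})")
```

The rate is zero without substrate, exactly half of the maximum when [S] equals the Michaelis constant, and it approaches the maximum velocity without ever exceeding it as [S] grows.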


3-16 The enzyme hexokinase adds a phosphate to 
D-glucose but ignores its mirror image, L-glucose. Suppose 
that you were able to synthesize hexokinase entirely from 
D-amino acids, which are the mirror image of the normal 
L-amino acids. 

A. Assuming that the “D” enzyme would fold to a sta- 
ble conformation, what relationship would you expect it to 
bear to the normal “L” enzyme?

B. Do you suppose the “D” enzyme would add a 
phosphate to L-glucose, and ignore D-glucose? 


3-17 How do you suppose that a molecule of hemoglo- 
bin is able to bind oxygen efficiently in the lungs, and yet 
release it efficiently in the tissues? 


3-18 Synthesis of the purine nucleotides AMP and 
GMP proceeds by a branched pathway starting with ribose 
5-phosphate (R5P), as shown schematically in Figure 
Q3-4. Using the principles of feedback inhibition, propose 
a regulatory strategy for this pathway that ensures an ade- 
quate supply of both AMP and GMP and minimizes the 
buildup of the intermediates (A-I) when supplies of AMP 
and GMP are adequate. 


                        ↗ F → G → AMP
R5P → A → B → C → D → E
                        ↘ H → I → GMP


Figure Q3-4 Schematic diagram of the metabolic pathway for 
synthesis of AMP and GMP from R5P (Problem 3-18). 
BASIC GENETIC MECHANISMS


DNA, Chromosomes, 
and Genomes 


Life depends on the ability of cells to store, retrieve, and translate the genetic 
instructions required to make and maintain a living organism. This hereditary 
information is passed on from a cell to its daughter cells at cell division, and from 
one generation of an organism to the next through the organism’s reproductive 
cells. The instructions are stored within every living cell as its genes, the infor- 
mation-containing elements that determine the characteristics of a species as a 
whole and of the individuals within it. 

As soon as genetics emerged as a science at the beginning of the twentieth cen- 
tury, scientists became intrigued by the chemical structure of genes. The informa- 
tion in genes is copied and transmitted from cell to daughter cell millions of times 
during the life of a multicellular organism, and it survives the process essentially 
unchanged. What form of molecule could be capable of such accurate and almost 
unlimited replication and also be able to exert precise control, directing multi- 
cellular development as well as the daily life of every cell? What kind of instruc- 
tions does the genetic information contain? And how can the enormous amount 
of information required for the development and maintenance of an organism fit 
within the tiny space of a cell? 

The answers to several of these questions began to emerge in the 1940s. At 
this time researchers discovered, from studies in simple fungi, that genetic infor- 
mation consists largely of instructions for making proteins. Proteins are phenom- 
enally versatile macromolecules that perform most cell functions. As we saw in 
Chapter 3, they serve as building blocks for cell structures and form the enzymes 
that catalyze most of the cell’s chemical reactions. They also regulate gene expres- 
sion (Chapter 7), and they enable cells to communicate with each other (Chapter 
15) and to move (Chapter 16). The properties and functions of cells and organisms 
are determined to a great extent by the proteins that they are able to make. 

Painstaking observations of cells and embryos in the late nineteenth century 
had led to the recognition that the hereditary information is carried on chro- 
mosomes—threadlike structures in the nucleus of a eukaryotic cell that become 
visible by light microscopy as the cell begins to divide (Figure 4-1). Later, when 
biochemical analysis became possible, chromosomes were found to consist of 
deoxyribonucleic acid (DNA) and protein, with both being present in roughly the 
same amounts. For many decades, the DNA was thought to be merely a structural 
Figure 4-1 Chromosomes in cells. (A) Two adjacent plant cells
photographed through a light microscope. The DNA has been stained with a
fluorescent dye (DAPI) that binds to it. The DNA is present in chromosomes,
which become visible as distinct structures in the light microscope only when
they become compact, sausage-shaped structures in preparation for cell
division, as shown on the left. The cell on the right, which is not dividing,
contains identical chromosomes, but they cannot be clearly distinguished
at this phase in the cell’s life cycle, because they are in a more extended
conformation. (B) Schematic diagram of the outlines of the two cells along
with their chromosomes. (A, courtesy of Peter Shaw.)


element. However, the other crucial advance made in the 1940s was the identifica- 
tion of DNA as the likely carrier of genetic information. This breakthrough in our 
understanding of cells came from studies of inheritance in bacteria (Figure 4-2). 
But still, as the 1950s began, both how proteins could be specified by instructions 
in the DNA and how this information might be copied for transmission from cell 
to cell seemed completely mysterious. The puzzle was suddenly solved in 1953, 
when James Watson and Francis Crick derived the mechanism from their model 
of DNA structure. As outlined in Chapter 1, the determination of the double-he- 
lical structure of DNA immediately solved the problem of how the information 
in this molecule might be copied, or replicated. It also provided the first clues as 
to how a molecule of DNA might use the sequence of its subunits to encode the 
instructions for making proteins. Today, the fact that DNA is the genetic material 
is so fundamental to biological thought that it is difficult to appreciate the enor- 
mous intellectual gap that was filled by this breakthrough discovery. 

We begin this chapter by describing the structure of DNA. We see how, despite 
its chemical simplicity, the structure and chemical properties of DNA make it 
ideally suited as the raw material of genes. We then consider how the many pro- 
teins in chromosomes arrange and package this DNA. The packing has to be done 
in an orderly fashion so that the chromosomes can be replicated and apportioned 
correctly between the two daughter cells at each cell division. And it must also 
allow access to chromosomal DNA, both for the enzymes that repair DNA damage 
and for the specialized proteins that direct the expression of its many genes. 

In the past two decades, there has been a revolution in our ability to deter- 
mine the exact order of subunits in DNA molecules. As a result, we now know the 
Figure 4-2 The first experimental demonstration that DNA is the genetic
material. These experiments, carried out in the 1920s (A) and 1940s (B),
showed that adding purified DNA to a bacterium changed the bacterium’s
properties and that this change was faithfully passed on to subsequent
generations. Two closely related strains of the bacterium Streptococcus
pneumoniae differ from each other in both their appearance under the
microscope and their pathogenicity. One strain appears smooth (S) and causes
death when injected into mice, and the other appears rough (R) and is
nonlethal. (A) An initial experiment shows that some substance present in the
S strain can change (or transform) the R strain into the S strain and that this
change is inherited by subsequent generations of bacteria. (B) This experiment,
in which the R strain has been incubated with various classes of biological
molecules purified from the S strain, identifies the active substance as DNA.
sequence of the 3.2 billion nucleotide pairs that provide the information for
producing a human adult from a fertilized egg, as well as having the DNA sequences
for thousands of other organisms. Detailed analyses of these sequences are pro- 
viding exciting insights into the process of evolution, and it is with this subject that 
the chapter ends. 

This is the first of four chapters that deal with basic genetic mechanisms—the 
ways in which the cell maintains, replicates, and expresses the genetic informa- 
tion carried in its DNA. In the next chapter (Chapter 5), we shall discuss the mech- 
anisms by which the cell accurately replicates and repairs DNA; we also describe 
how DNA sequences can be rearranged through the process of genetic recombi- 
nation. Gene expression—the process through which the information encoded in 
DNA is interpreted by the cell to guide the synthesis of proteins—is the main topic 
of Chapter 6. In Chapter 7, we describe how this gene expression is controlled by 
the cell to ensure that each of the many thousands of proteins and RNA molecules 
encrypted in its DNA is manufactured only at the proper time and place in the life 
of a cell. 


THE STRUCTURE AND FUNCTION OF DNA 


Biologists in the 1940s had difficulty in conceiving how DNA could be the genetic 
material. The molecule seemed too simple: a long polymer composed of only four 
types of nucleotide subunits, which resemble one another chemically. Early in the 
1950s, DNA was examined by x-ray diffraction analysis, a technique for determin- 
ing the three-dimensional atomic structure of a molecule (discussed in Chapter 
8). The early x-ray diffraction results indicated that DNA was composed of two 
strands of the polymer wound into a helix. The observation that DNA was dou- 
ble-stranded provided one of the major clues that led to the Watson-Crick model 
for DNA structure that, as soon as it was proposed in 1953, made DNA’s potential 
for replication and information storage apparent. 


A DNA Molecule Consists of Two Complementary Chains of 
Nucleotides 


A deoxyribonucleic acid (DNA) molecule consists of two long polynucleotide 
chains composed of four types of nucleotide subunits. Each of these chains is 
known as a DNA chain, or a DNA strand. The chains run antiparallel to each other, 
and hydrogen bonds between the base portions of the nucleotides hold the two 
chains together (Figure 4-3). As we saw in Chapter 2 (Panel 2-6, pp. 100-101), 
nucleotides are composed of a five-carbon sugar to which are attached one or 
more phosphate groups and a nitrogen-containing base. In the case of the nucle- 
otides in DNA, the sugar is deoxyribose attached to a single phosphate group 
(hence the name deoxyribonucleic acid), and the base may be either adenine (A), 
cytosine (C), guanine (G), or thymine (T). The nucleotides are covalently linked 
together in a chain through the sugars and phosphates, which thus form a “back- 
bone” of alternating sugar-phosphate-sugar-phosphate. Because only the base 
differs in each of the four types of nucleotide subunit, each polynucleotide chain 
in DNA is analogous to a sugar-phosphate necklace (the backbone), from which 
hang the four types of beads (the bases A, C, G, and T). These same symbols (A, 
C, G, and T) are commonly used to denote either the four bases or the four entire 
nucleotides—that is, the bases with their attached sugar and phosphate groups. 
The way in which the nucleotides are linked together gives a DNA strand a 
chemical polarity. If we think of each sugar as a block with a protruding knob (the 
5’ phosphate) on one side and a hole (the 3’ hydroxyl) on the other (see Figure 
4-3), each completed chain, formed by interlocking knobs with holes, will have 
all of its subunits lined up in the same orientation. Moreover, the two ends of the 
chain will be easily distinguishable, as one has a hole (the 3’ hydroxyl) and the 
other a knob (the 5’ phosphate) at its terminus. This polarity in a DNA chain is 
indicated by referring to one end as the 3’ end and the other as the 5’ end, names 
derived from the orientation of the deoxyribose sugar. With respect to DNA’s 
information-carrying capacity, the chain of nucleotides in a DNA strand, being
both directional and linear, can be read in much the same way as the letters on 
this page. 

The three-dimensional structure of DNA—the DNA double helix—arises from 
the chemical and structural features of its two polynucleotide chains. Because 
these two chains are held together by hydrogen-bonding between the bases on 
the different strands, all the bases are on the inside of the double helix, and the 
sugar-phosphate backbones are on the outside (see Figure 4-3). In each case, a 
bulkier two-ring base (a purine; see Panel 2-6, pp. 100-101) is paired with a sin- 
gle-ring base (a pyrimidine): A always pairs with T, and G with C (Figure 4-4). 
This complementary base-pairing enables the base pairs to be packed in the ener- 
getically most favorable arrangement in the interior of the double helix. In this 
arrangement, each base pair is of similar width, thus holding the sugar-phosphate 
backbones a constant distance apart along the DNA molecule. To maximize the 
efficiency of base-pair packing, the two sugar-phosphate backbones wind around 
each other to form a right-handed double helix, with one complete turn roughly
every ten base pairs (Figure 4-5).

The members of each base pair can fit together within the double helix only 
if the two strands of the helix are antiparallel—that is, only if the polarity of one 
strand is oriented opposite to that of the other strand (see Figures 4-3 and 4-4). 
A consequence of DNA’s structure and base-pairing requirements is that each 
strand of a DNA molecule contains a sequence of nucleotides that is exactly com- 
plementary to the nucleotide sequence of its partner strand. 
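
The pairing rules and the antiparallel arrangement described above can be expressed in a few lines of Python. This is a minimal sketch, not a bioinformatics tool: the function name and the example sequence are made up for illustration, and the code handles only the four standard bases.

```python
# Watson-Crick pairing: A with T, G with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def partner_strand(strand_5_to_3: str) -> str:
    """Return the complementary strand, also written 5' to 3'.

    Because the two strands are antiparallel, the partner is the
    base-by-base complement read in the opposite direction.
    """
    return "".join(PAIRS[base] for base in reversed(strand_5_to_3.upper()))

strand = "GATTACA"  # arbitrary example sequence
partner = partner_strand(strand)
print(f"5'-{strand}-3'")
print(f"3'-{partner[::-1]}-5'   (partner, aligned base to base)")
print(f"5'-{partner}-3'   (partner, written in the conventional direction)")
```

Applying the function twice returns the original sequence, which reflects the statement that each strand fully specifies its partner.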


Figure 4-3 DNA and its building blocks. DNA is made of four types of
nucleotides, which are linked covalently into a polynucleotide chain (a DNA
strand) with a sugar-phosphate backbone from which the bases (A, C, G, and T)
extend. A DNA molecule is composed of two antiparallel DNA strands held
together by hydrogen bonds between the paired bases. The arrowheads at the
ends of the DNA strands indicate the polarities of the two strands. In the
diagram at the bottom left of the figure, the DNA molecule is shown
straightened out; in reality, it is twisted into a double helix, as shown on
the right. For details, see Figure 4-5 and Movie 4.1.
Figure 4-4 Complementary base pairs in the DNA double helix. The shapes
and chemical structures of the bases allow hydrogen bonds to form efficiently
only between A and T and between G and C, because atoms that are able to
form hydrogen bonds (see Panel 2-3, pp. 94-95) can then be brought close
together without distorting the double helix. As indicated, two hydrogen bonds
form between A and T, while three form between G and C. The bases can pair in
this way only if the two polynucleotide chains that contain them are
antiparallel to each other.
The Structure of DNA Provides a Mechanism for Heredity


The discovery of the structure of DNA immediately suggested answers to the two 
most fundamental questions about heredity. First, how could the information to 
specify an organism be carried in a chemical form? And second, how could this 
information be duplicated and copied from generation to generation? 

The answer to the first question came from the realization that DNA is a linear 
polymer of four different kinds of monomer, strung out in a defined sequence like 
the letters of a document written in an alphabetic script. 

The answer to the second question came from the double-stranded nature of 
the structure: because each strand of DNA contains a sequence of nucleotides 
that is exactly complementary to the nucleotide sequence of its partner strand, 
each strand can act as a template, or mold, for the synthesis of a new complemen- 
tary strand. In other words, if we designate the two DNA strands as S and S’, strand 


Figure 4-5 The DNA double helix. (A) A space-filling model of 1.5 turns of
the DNA double helix. Each turn of DNA is made up of 10.4 nucleotide pairs,
and the center-to-center distance between adjacent [...] smaller the minor
groove, as indicated. (B) A short section of the double helix viewed from its
side, showing four base pairs. The nucleotides are linked together covalently
by phosphodiester bonds that join the 3’-hydroxyl (-OH) group of one sugar to
the 5’-hydroxyl group of the next sugar. Thus, each polynucleotide strand has
a chemical polarity; that is, its two ends are chemically different. The 5’ end
of the DNA polymer is by convention often illustrated carrying a phosphate
group, while the 3’ end is shown with a hydroxyl.
S can serve as a template for making a new strand S’, while strand S’ can serve as a
template for making a new strand S (Figure 4-6). Thus, the genetic information in 
DNA can be accurately copied by the beautifully simple process in which strand 
S separates from strand S’, and each separated strand then serves as a template 
for the production of a new complementary partner strand that is identical to its 
former partner. 
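
The copying scheme just described, in which S templates a new S’ and S’ templates a new S, can be mimicked with a toy model. The Python sketch below treats a duplex as a pair of strings, each written 5’ to 3’; this representation and the names `complement` and `replicate` are assumptions made for illustration and say nothing about the enzymatic machinery.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Complementary strand, read in the opposite (antiparallel) direction."""
    return "".join(PAIRS[base] for base in reversed(strand))

def replicate(duplex: tuple[str, str]) -> list[tuple[str, str]]:
    """Separate strands S and S' and use each one as a template.

    Each daughter duplex keeps one old strand and gains one new strand.
    """
    s, s_prime = duplex
    new_s_prime = complement(s)   # templated by the old S
    new_s = complement(s_prime)   # templated by the old S'
    return [(s, new_s_prime), (new_s, s_prime)]

parent = ("GATTACA", complement("GATTACA"))
daughters = replicate(parent)
for i, daughter in enumerate(daughters, start=1):
    print(f"daughter {i}: {daughter}  identical to parent: {daughter == parent}")
```

Both daughters come out identical to the parent because each new strand is fully determined by the old strand it was built on.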

The ability of each strand of a DNA molecule to act as a template for producing 
a complementary strand enables a cell to copy, or replicate, its genome before 
passing it on to its descendants. We shall describe the elegant machinery that the 
cell uses to perform this task in Chapter 5. 

Organisms differ from one another because their respective DNA molecules 
have different nucleotide sequences and, consequently, carry different biological 
messages. But how is the nucleotide alphabet used to make messages, and what 
do they spell out? 

As discussed above, it was known well before the structure of DNA was deter- 
mined that genes contain the instructions for producing proteins. If genes are 
made of DNA, the DNA must therefore somehow encode proteins (Figure 4-7). 
As discussed in Chapter 3, the properties of a protein, which are responsible for its 
biological function, are determined by its three-dimensional structure. This struc- 
ture is determined in turn by the linear sequence of the amino acids of which it is 
composed. The linear sequence of nucleotides in a gene must therefore somehow 
spell out the linear sequence of amino acids in a protein. The exact correspon- 
dence between the four-letter nucleotide alphabet of DNA and the twenty-letter 
amino acid alphabet of proteins—the genetic code—is not at all obvious from the 
DNA structure, and it took over a decade after the discovery of the double helix 
before it was worked out. In Chapter 6, we will describe this code in detail in the 
course of elaborating the process of gene expression, through which a cell converts 
the nucleotide sequence of a gene first into the nucleotide sequence of an RNA 
molecule, and then into the amino acid sequence of a protein. 
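
A simple counting argument shows why a one-to-one correspondence between single nucleotides and amino acids cannot work. The Python sketch below only counts how many distinct "words" a four-letter alphabet can form at each word length; it gives a lower bound on word length and is not a description of how the code was actually worked out.

```python
NUCLEOTIDE_LETTERS = 4    # A, C, G, T
AMINO_ACID_LETTERS = 20   # the twenty-letter amino acid alphabet

def words(length: int) -> int:
    """Number of distinct nucleotide words of the given length."""
    return NUCLEOTIDE_LETTERS ** length

for n in range(1, 5):
    enough = "enough" if words(n) >= AMINO_ACID_LETTERS else "too few"
    print(f"word length {n}: {words(n):3d} possible words ({enough} for 20 amino acids)")

# Shortest word length that can distinguish all twenty amino acids.
min_len = next(n for n in range(1, 10) if words(n) >= AMINO_ACID_LETTERS)
print(f"minimum word length: {min_len}")
```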

The complete store of information in an organism’s DNA is called its genome, 
and it specifies all the RNA molecules and proteins that the organism will ever 
synthesize. (The term genome is also used to describe the DNA that carries this 
information.) The amount of information contained in genomes is staggering. The 
nucleotide sequence of a very small human gene, written out in the four-letter 
nucleotide alphabet, occupies a quarter of a page of text (Figure 4-8), while the 
complete sequence of nucleotides in the human genome would fill more than a 
thousand books the size of this one. In addition to other critical information, it 
includes roughly 21,000 protein-coding genes, which (through alternative splic- 
ing; see p. 415) give rise to a much greater number of distinct proteins. 
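
To put the size of the genome in more familiar units, the raw information content of the sequence can be estimated. The Python sketch below assumes that each position is one of four equally likely letters, so it carries two bits; this is an upper-bound estimate that ignores any redundancy in real sequences, and the choice of decimal megabytes is arbitrary.

```python
import math

GENOME_NUCLEOTIDE_PAIRS = 3.2e9  # one copy of the human genome, from the text
ALPHABET_SIZE = 4                # A, C, G, T

bits_per_position = math.log2(ALPHABET_SIZE)
total_bits = GENOME_NUCLEOTIDE_PAIRS * bits_per_position
total_megabytes = total_bits / 8 / 1e6  # decimal megabytes

print(f"bits per nucleotide position: {bits_per_position:.0f}")
print(f"raw sequence information: {total_bits:.2e} bits, about {total_megabytes:.0f} MB")
```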


In Eukaryotes, DNA Is Enclosed in a Cell Nucleus 


As described in Chapter 1, nearly all the DNA in a eukaryotic cell is sequestered in 
a nucleus, which in many cells occupies about 10% of the total cell volume. This 
compartment is delimited by a nuclear envelope formed by two concentric lipid 


Figure 4-6 DNA as a template for its own duplication. Because the
nucleotide A successfully pairs only with T, and G pairs with C, each strand
of DNA can act as a template to specify the sequence of nucleotides in its
complementary strand. In this way, double-helical DNA can be copied
precisely, with each parental DNA helix producing two identical daughter DNA
helices.
Figure 4-7 The relationship between genetic information carried in DNA and
proteins. (Discussed in Chapter 1.)
Figure 4-8 The nucleotide sequence of the human β-globin gene. By
convention, a nucleotide sequence is written from its 5’ end to its 3’ end,
and it should be read from left to right in successive lines down the page as
though it were normal English text. This gene carries the information for the
amino acid sequence of one of the two types of subunits of the hemoglobin
molecule; a different gene, the α-globin gene, carries the information for the
other. (Hemoglobin, the protein that carries oxygen in the blood, has four
subunits, two of each type.) Only one of the two strands of the DNA double
helix containing the β-globin gene is shown; the other strand has the exact
complementary sequence. The DNA sequences highlighted in yellow show
the three regions of the gene that specify the amino acid sequence for the
β-globin protein. We shall see in Chapter 6 how the cell splices these three
sequences together at the level of messenger RNA in order to synthesize a
full-length β-globin protein.


bilayer membranes (Figure 4-9). These membranes are punctured at intervals 
by large nuclear pores, through which molecules move between the nucleus and 
the cytosol. The nuclear envelope is directly connected to the extensive system 
of intracellular membranes called the endoplasmic reticulum, which extend out 
from it into the cytoplasm. And it is mechanically supported by a network of inter- 
mediate filaments called the nuclear lamina—a thin feltlike mesh just beneath 
the inner nuclear membrane (see Figure 4-9B). 

The nuclear envelope allows the many proteins that act on DNA to be concen- 
trated where they are needed in the cell, and, as we see in subsequent chapters, it 
also keeps nuclear and cytosolic enzymes separate, a feature that is crucial for the 
proper functioning of eukaryotic cells. 


Summary 


Genetic information is carried in the linear sequence of nucleotides in DNA. Each 
molecule of DNA is a double helix formed from two complementary antiparallel 
strands of nucleotides held together by hydrogen bonds between G-C and A-T base 
pairs. Duplication of the genetic information occurs by the use of one DNA strand 
as a template for the formation of a complementary strand. The genetic information 
stored in an organism’s DNA contains the instructions for all the RNA molecules and 
proteins that the organism will ever synthesize and is said to comprise its genome. 
In eukaryotes, DNA is contained in the cell nucleus, a large membrane-bound com- 
partment. 


CHROMOSOMAL DNA AND ITS PACKAGING IN THE 
CHROMATIN FIBER 


The most important function of DNA is to carry genes, the information that
specifies all the RNA molecules and proteins that make up an organism, including
information about when, in what types of cells, and in what quantity each RNA 
molecule and protein is to be made. The nuclear DNA of eukaryotes is divided up 
into chromosomes, and in this section we see how genes are typically arranged on 
each chromosome. In addition, we describe the specialized DNA sequences that 
are required for a chromosome to be accurately duplicated as a separate entity 
and passed on from one generation to the next. 

We also confront the serious challenge of DNA packaging. If the double helices 
comprising all 46 chromosomes in a human cell could be laid end to end, they 
would reach approximately 2 meters; yet the nucleus, which contains the DNA, is 
only about 6 μm in diameter. This is geometrically equivalent to packing 40 km (24
miles) of extremely fine thread into a tennis ball. The complex task of packaging 
DNA is accomplished by specialized proteins that bind to the DNA and fold it, 
generating a series of organized coils and loops that provide increasingly higher 
levels of organization, and prevent the DNA from becoming an unmanageable 
tangle. Amazingly, although the DNA is very tightly compacted, it nevertheless 
remains accessible to the many enzymes in the cell that replicate it, repair it, and 
use its genes to produce RNA molecules and proteins. 
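
The figure of roughly 2 meters can be checked with a short calculation. The Python sketch below takes the 3.2 billion nucleotide pairs and the 6-μm nuclear diameter from the text; the spacing of 0.34 nm between adjacent nucleotide pairs along the helix is the standard value for the common form of the double helix and is treated here as an assumption, as is the simple doubling for the maternal and paternal chromosome sets.

```python
NUCLEOTIDE_PAIRS_PER_GENOME_COPY = 3.2e9  # from the text
COPIES_PER_CELL = 2                       # maternal and paternal sets
RISE_PER_PAIR_NM = 0.34                   # assumed spacing between adjacent pairs
NUCLEUS_DIAMETER_UM = 6.0                 # from the text

# Total contour length of the DNA in one cell, in meters.
total_length_m = (
    NUCLEOTIDE_PAIRS_PER_GENOME_COPY * COPIES_PER_CELL * RISE_PER_PAIR_NM * 1e-9
)

# How many nuclear diameters the DNA would span if stretched out.
length_to_diameter = total_length_m / (NUCLEUS_DIAMETER_UM * 1e-6)

print(f"total DNA length per cell: {total_length_m:.2f} m")
print(f"length / nuclear diameter: {length_to_diameter:.2e}")
```

On these assumptions the DNA is several hundred thousand times longer than the nucleus is wide, which gives a sense of the degree of folding that the packaging proteins must achieve.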
[Figure 4-8: nucleotide sequence of the human β-globin gene.]
Figure 4-9 A cross-sectional view of a typical cell nucleus. (A) Electron micrograph of a thin section through the nucleus of
a human fibroblast. (B) Schematic drawing, showing that the nuclear envelope consists of two membranes, the outer one being 
continuous with the endoplasmic reticulum (ER) membrane (see also Figure 12-7). The space inside the endoplasmic reticulum 
(the ER lumen) is colored yellow; it is continuous with the space between the two nuclear membranes. The lipid bilayers of the 
inner and outer nuclear membranes are connected at each nuclear pore. A sheetlike network of intermediate filaments (brown) 
inside the nucleus forms the nuclear lamina (brown), providing mechanical support for the nuclear envelope (for details, see 
Chapter 12). The dark-staining heterochromatin contains specially condensed regions of DNA that will be discussed later. (A, 
courtesy of E.G. Jordan and J. McGovern.) 


Eukaryotic DNA Is Packaged into a Set of Chromosomes 


Each chromosome in a eukaryotic cell consists of a single, enormously long linear 
DNA molecule along with the proteins that fold and pack the fine DNA thread into 
a more compact structure. In addition to the proteins involved in packaging, chro- 
mosomes are also associated with many other proteins (as well as numerous RNA 
molecules). These are required for the processes of gene expression, DNA repli- 
cation, and DNA repair. The complex of DNA and tightly bound protein is called 
chromatin (from the Greek chroma, “color,” because of its staining properties). 
Bacteria lack a special nuclear compartment, and they generally carry their 
genes on a single DNA molecule, which is often circular (see Figure 1-24). This 
DNA is also associated with proteins that package and condense it, but they are 
different from the proteins that perform these functions in eukaryotes. Although 
the bacterial DNA with its attendant proteins is often called the bacterial
“chromosome,” it does not have the same structure as eukaryotic chromosomes, and
less is known about how the bacterial DNA is packaged. Therefore, our discussion 
of chromosome structure will focus almost entirely on eukaryotic chromosomes. 
With the exception of the gametes (eggs and sperm) and a few highly special- 
ized cell types that cannot multiply and either lack DNA altogether (for example, 
red blood cells) or have replicated their DNA without completing cell division (for 
example, megakaryocytes), each human cell nucleus contains two copies of each 
chromosome, one inherited from the mother and one from the father. The mater- 
nal and paternal chromosomes of a pair are called homologous chromosomes 
(homologs). The only nonhomologous chromosome pairs are the sex chromo- 
somes in males, where a Y chromosome is inherited from the father and an X 
chromosome from the mother. Thus, each human cell contains a total of 46 chro- 
mosomes—22 pairs common to both males and females, plus two so-called sex 
chromosomes (X and Y in males, two Xs in females). These human chromosomes 
can be readily distinguished by “painting” each one a different color using a tech- 
nique based on DNA hybridization (Figure 4-10). In this method (described in 
detail in Chapter 8), a short strand of nucleic acid tagged with a fluorescent dye 
serves as a “probe” that picks out its complementary DNA sequence, lighting up 
the target chromosome at any site where it binds. Chromosome painting is most 
frequently done at the stage in the cell cycle called mitosis, when chromosomes
are especially compacted and easy to visualize (see below). 
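
The logic of probe hybridization can be illustrated with a toy calculation. The following Python sketch is not a laboratory protocol: the short sequences and the function names `reverse_complement` and `binding_sites` are invented for illustration, and real probes are far longer and tolerate some mismatches. It simply finds the positions on one target strand at which a probe could form perfect base pairs.

```python
from typing import List

# Toy model of probe hybridization; all sequences here are made up for illustration.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the strand that would base-pair with `seq`, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def binding_sites(probe: str, target: str) -> List[int]:
    """Return 0-based positions on `target` where `probe` can hybridize.

    The probe pairs wherever the target contains the probe's reverse complement.
    """
    site = reverse_complement(probe)
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)  # allow overlapping matches
    return positions


target_strand = "GGATCCTTAGCAATTGCAGGATCC"  # hypothetical chromosome fragment
probe = "TGCAATTG"                          # hypothetical dye-labeled probe

print(reverse_complement(probe))            # sequence the probe pairs with
print(binding_sites(probe, target_strand))  # where the probe would bind
```

In this toy case the probe pairs with the sequence CAATTGCA, which occurs once in the target strand, so a single site would be labeled.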

Another more traditional way to distinguish one chromosome from another 
is to stain them with dyes that reveal a striking and reproducible pattern of bands 
along each mitotic chromosome (Figure 4-11). These banding patterns presum- 
ably reflect variations in chromatin structure, but their basis is not well under- 
stood. Nevertheless, the pattern of bands on each type of chromosome is unique, 
and it provided the initial means to identify and number each human chromo- 
some reliably. 
Figure 4-10 The complete set of human
chromosomes. These chromosomes,
from a female, were isolated from a cell
undergoing nuclear division (mitosis)
and are therefore highly compacted.
Each chromosome has been “painted” a
different color to permit its unambiguous
identification under the fluorescence
microscope, using a technique called
“spectral karyotyping.” Chromosome
painting can be performed by exposing
the chromosomes to a large collection of
DNA molecules whose sequence matches
known DNA sequences from the human
genome. The set of sequences matching
each chromosome is coupled to a different
combination of fluorescent dyes. DNA
molecules derived from chromosome 1 are
labeled with one specific dye combination,
those from chromosome 2 with another,
and so on. Because the labeled DNA can
form base pairs, or hybridize, only to the
chromosome from which it was derived,
each chromosome becomes labeled
with a different combination of dyes. For
such experiments, the chromosomes are
subjected to treatments that separate
the two strands of double-helical DNA in
a way that permits base-pairing with the
single-stranded labeled DNA, but keeps
the overall chromosome structure relatively
intact. (A) The chromosomes visualized as
they originally spilled from the lysed cell.
(B) The same chromosomes artificially
lined up in their numerical order. This
arrangement of the full chromosome
set is called a karyotype. (Adapted from
N. McNeil and T. Ried, Expert Rev. Mol.
Med. 2:1-14, 2000. With permission from
Cambridge University Press.)


Figure 4-11 The banding patterns of 
human chromosomes. Chromosomes 
1-22 are numbered in approximate order 
of size. A typical human cell contains two 
of each of these chromosomes, plus two 
sex chromosomes—two X chromosomes 
in a female, one X and one Y chromosome 
in a male. The chromosomes used to 
make these maps were stained at an early 
stage in mitosis, when the chromosomes 
are incompletely compacted. The 
horizontal red line represents the position 
of the centromere (see Figure 4-19),
which appears as a constriction on
mitotic chromosomes. The red knobs on
chromosomes 13, 14, 15, 21, and 22
indicate the positions of genes that code
for the large ribosomal RNAs (discussed
in Chapter 6). These banding patterns are
obtained by staining chromosomes with 
Giemsa stain, and they can be observed 
under the light microscope. (Adapted from 
U. Francke, Cytogenet. Cell Genet. 31:24- 
32, 1981. With permission from the author.) 
Figure 4-12 Aberrant human chromosomes. (A) Two normal human
chromosomes, 4 and 6. (B) In an individual carrying a balanced chromosomal 
translocation, the DNA double helix in one chromosome has crossed over 
with the DNA double helix in the other chromosome due to an abnormal 
recombination event. The chromosome painting technique used on the 
chromosomes in each of the sets allows the identification of even short 
pieces of chromosomes that have become translocated, a frequent event in 
cancer cells. (Courtesy of Zhenya Tang and the NIGMS Human Genetic Cell 
Repository at the Coriell Institute for Medical Research: GM21880.) 


The display of the 46 human chromosomes at mitosis is called the human 
karyotype. If parts of chromosomes are lost or are switched between chromo- 
somes, these changes can be detected either by changes in the banding patterns 
or—with greater sensitivity—by changes in the pattern of chromosome painting 
(Figure 4-12). Cytogeneticists use these alterations to detect inherited chromo- 
some abnormalities and to reveal the chromosome rearrangements that occur in 
cancer cells as they progress to malignancy (discussed in Chapter 20). 


Chromosomes Contain Long Strings of Genes 


Chromosomes carry genes—the functional units of heredity. A gene is often 
defined as a segment of DNA that contains the instructions for making a particu- 
lar protein (or a set of closely related proteins), but this definition is too narrow. 
Genes that code for protein are indeed the majority, and most of the genes with 
clear-cut mutant phenotypes fall under this heading. In addition, however, there 
are many “RNA genes” —segments of DNA that generate a functionally significant 
RNA molecule, instead of a protein, as their final product. We shall say more about 
the RNA genes and their products later. 

As might be expected, some correlation exists between the complexity of 
an organism and the number of genes in its genome (see Table 1-2, p. 29). For 
example, some simple bacteria have only 500 genes, compared to about 30,000 
for humans. Bacteria, archaea, and some single-celled eukaryotes, such as yeast, 
have concise genomes, consisting of little more than strings of closely packed 
genes. However, the genomes of multicellular plants and animals, as well as many 
other eukaryotes, contain, in addition to genes, a large quantity of interspersed 
DNA whose function is poorly understood (Figure 4-13). Some of this additional 
DNA is crucial for the proper control of gene expression, and this may in part 
explain why there is so much of it in multicellular organisms, whose genes have to 
be switched on and off according to complicated rules during development (dis- 
cussed in Chapters 7 and 21). 

Differences in the amount of DNA interspersed between genes, far more than 
differences in numbers of genes, account for the astonishing variations in genome 
size that we see when we compare one species with another (see Figure 1-32). For 
example, the human genome is 200 times larger than that of the yeast Saccharo- 
myces cerevisiae, but 30 times smaller than that of some plants and amphibians 
and 200 times smaller than that of a species of amoeba. Moreover, because of dif- 
ferences in the amount of noncoding DNA, the genomes of closely related organ- 
isms (bony fish, for example) can vary several hundredfold in their DNA content, 
even though they contain roughly the same number of genes. Whatever the excess 
Figure 4-13 The arrangement of
genes in the genome of S. cerevisiae
compared to humans. (A) S. cerevisiae is 
a budding yeast widely used for brewing 
and baking. The genome of this single- 
celled eukaryote is distributed over 16 
chromosomes. A small region of one 
chromosome has been arbitrarily selected 
to show its high density of genes. (B) A 
region of the human genome of equal 
length to the yeast segment in (A). The 
human genes are much less densely 
packed and the amount of interspersed 
DNA sequence is far greater. Not shown in 
this sample of human DNA is the fact that 
most human genes are much longer than 
yeast genes (see Figure 4-15). 
DNA may do, it seems clear that it is not a great handicap for a eukaryotic cell to
carry a large amount of it. 

How the genome is divided into chromosomes also differs from one eukaryotic 
species to the next. For example, while the cells of humans have 46 chromosomes, 
those of some small deer have only 6, while those of the common carp contain 
over 100. Even closely related species with similar genome sizes can have very 
different numbers and sizes of chromosomes (Figure 4-14). Thus, there is no sim- 
ple relationship between chromosome number, complexity of the organism, and 
total genome size. Rather, the genomes and chromosomes of modern-day species 
have each been shaped by a unique history of seemingly random genetic events, 
acted on by poorly understood selection pressures over long evolutionary times. 


The Nucleotide Sequence of the Human Genome Shows How 
Our Genes Are Arranged 


With the publication of the full DNA sequence of the human genome in 2004, it 
became possible to see in detail how the genes are arranged along each of our 
chromosomes (Figure 4-15). It will be many decades before the information con- 
tained in the human genome sequence is fully analyzed, but it has already stimu- 
lated new experiments that have had major effects on the content of every chapter 
in this book. 
Figure 4-14 Two closely related species
of deer with very different chromosome 
numbers. In the evolution of the Indian 
muntjac, initially separate chromosomes 
fused, without having a major effect on the 
animal. These two species contain a similar 
number of genes. (Chinese muntjac photo 
courtesy of Deborah Carreno, Natural 
Wonders Photography.) 


Figure 4-15 The organization of genes on
a human chromosome. (A) Chromosome
22, one of the smallest human chromosomes,
contains 48 × 10⁶ nucleotide pairs and
makes up approximately 1.5% of the human
genome. Most of the left arm of chromosome
22 consists of short repeated sequences
of DNA that are packaged in a particularly
compact form of chromatin (heterochromatin)
discussed later in this chapter. (B) A tenfold
expansion of a portion of chromosome 22,
with about 40 genes indicated. Those in dark
brown are known genes and those in red are
predicted genes. (C) An expanded portion of
(B) showing four genes. (D) The intron-exon
arrangement of a typical gene is shown
after a further tenfold expansion. Each exon
(red) codes for a portion of the protein, while 
the DNA sequence of the introns (gray) is 
relatively unimportant, as discussed in detail 
in Chapter 6. 

The human genome (3.2 × 10⁹ nucleotide
pairs) is the totality of genetic information 
belonging to our species. Almost all of this 
genome is distributed over the 22 different 
autosomes and 2 sex chromosomes (see 
Figures 4-10 and 4-11) found within the 
nucleus. A minute fraction of the human 
genome (16,569 nucleotide pairs—in multiple 
copies per cell) is found in the mitochondria 
(introduced in Chapter 1, and discussed 
in detail in Chapter 14). The term human 
genome sequence refers to the complete 
nucleotide sequence of DNA in the 24 
nuclear chromosomes and the mitochondria. 
Being diploid, a human somatic cell nucleus 
contains roughly twice the haploid amount of 
DNA, or 6.4 × 10⁹ nucleotide pairs, when not
duplicating its chromosomes in preparation 
for division. (Adapted from International 
Human Genome Sequencing Consortium, 
Nature 409:860-921, 2001. With permission 
from Macmillan Publishers Ltd.) 
TABLE 4-1

DNA length | 3.2 × 10⁹ nucleotide pairs*
Percentage of DNA sequence in exons (protein-coding sequences) | 1.5%
Percentage of DNA in other highly conserved sequences**** |
Percentage of DNA in high-copy-number repetitive elements | Approximately 50%


* The sequence of 2.85 billion nucleotides is known precisely (error rate of only about 1 in 
100,000 nucleotides). The remaining DNA primarily consists of short sequences that are 
tandemly repeated many times over, with repeat numbers differing from one individual to the 
next. These highly repetitive blocks are hard to sequence accurately. 

** This number is only a very rough estimate. 

*** A pseudogene is a DNA sequence closely resembling that of a functional gene, but
containing numerous mutations that prevent its proper expression or function. Most 
pseudogenes arise from the duplication of a functional gene followed by the accumulation of 
damaging mutations in one copy. 

**** These conserved functional regions include DNA encoding 5’ and 3’ UTRs (untranslated
regions of mRNA), DNA specifying structural and functional RNAs, and DNA with conserved 
protein-binding sites. 


The first striking feature of the human genome is how little of it (only a few 
percent) codes for proteins (Table 4-1 and Figure 4-16). It is also notable that 
nearly half of the chromosomal DNA is made up of mobile pieces of DNA that 
have gradually inserted themselves in the chromosomes over evolutionary time, 
multiplying like parasites in the genome (see Figure 4-62). We discuss these trans- 
posable elements in detail in later chapters. 

A second notable feature of the human genome is the large average gene 
size—about 27,000 nucleotide pairs. As discussed above, a typical gene carries in 
its linear sequence of nucleotides the information for the linear sequence of the 
amino acids of a protein. Only about 1300 nucleotide pairs are required to encode 
a protein of average size (about 430 amino acids in humans). Most of the remain- 
ing sequence in a gene consists of long stretches of noncoding DNA that interrupt 
the relatively short segments of DNA that code for protein. As will be discussed in 
detail in Chapter 6, the coding sequences are called exons; the intervening (non- 
coding) sequences in genes are called introns (see Figure 4-15 and Table 4-1). 
The majority of human genes thus consist of a long string of alternating exons and 
introns, with most of the gene consisting of introns. In contrast, the majority of 
genes from organisms with concise genomes lack introns. This accounts for the 
much smaller size of their genes (about one-twentieth that of human genes), as 
well as for the much higher fraction of coding DNA in their chromosomes. 
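
The arithmetic behind these figures can be checked in a few lines. The sketch below is only a rough consistency check, using the averages quoted in the text and the fact that each amino acid is specified by three nucleotides; the variable names are illustrative.

```python
# Rough check of how much of an average human gene actually codes for protein.
MEAN_GENE_LENGTH_BP = 27_000    # average human gene size, from the text
MEAN_PROTEIN_LENGTH_AA = 430    # average human protein size, from the text
NUCLEOTIDES_PER_CODON = 3       # three nucleotides specify one amino acid

coding_bp = MEAN_PROTEIN_LENGTH_AA * NUCLEOTIDES_PER_CODON
coding_fraction = coding_bp / MEAN_GENE_LENGTH_BP

print(f"coding sequence: {coding_bp} nucleotide pairs")
print(f"coding fraction of an average gene: {coding_fraction:.1%}")
```

The result, about 1300 nucleotide pairs or roughly one-twentieth of the average gene, agrees with the statement that most of a typical human gene consists of introns.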







Figure 4-16 Scale of the human genome. 
If drawn with a 1 mm space between each 
nucleotide pair, as in (A), the human genome 
would extend 3200 km (approximately 
2000 miles), far enough to stretch across 
the center of Africa, the site of our human 
origins (red line in B). At this scale, there 
would be, on average, a protein-coding 
gene every 150 m. An average gene would 
extend for 30 m, but the coding sequences 
in this gene would add up to only just over 
a meter. 
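
The distances quoted in this figure follow from simple scaling, as the short Python sketch below suggests. It assumes the figure's drawing scale of 1 mm per nucleotide pair and the approximate values given in the text (21,000 protein-coding genes, a 27,000-nucleotide-pair average gene, and about 1300 coding nucleotide pairs per gene); the helper `to_metres` is a hypothetical name.

```python
# Scale model of the human genome at 1 mm per nucleotide pair.
GENOME_BP = 3.2e9               # haploid human genome, nucleotide pairs
PROTEIN_CODING_GENES = 21_000   # approximate number of protein-coding genes
MEAN_GENE_BP = 27_000           # average gene size
MEAN_CODING_BP = 1_300          # coding sequence of an average gene
MM_PER_BP = 1.0                 # drawing scale used in the figure


def to_metres(nucleotide_pairs: float) -> float:
    """Convert a length in nucleotide pairs to metres at the drawing scale."""
    return nucleotide_pairs * MM_PER_BP / 1000.0


genome_km = to_metres(GENOME_BP) / 1000.0
gene_spacing_m = to_metres(GENOME_BP / PROTEIN_CODING_GENES)
gene_length_m = to_metres(MEAN_GENE_BP)
coding_m = to_metres(MEAN_CODING_BP)

print(f"genome: {genome_km:.0f} km")
print(f"one gene every {gene_spacing_m:.0f} m")
print(f"average gene: {gene_length_m:.0f} m, of which coding: {coding_m:.1f} m")
```

The computed gene length of 27 m is rounded to 30 m in the caption; the other values match to within rounding.
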
In addition to introns and exons, each gene is associated with regulatory DNA
sequences, which are responsible for ensuring that the gene is turned on or off at 
the proper time, expressed at the appropriate level, and only in the proper type of 
cell. In humans, the regulatory sequences for a typical gene are spread out over 
tens of thousands of nucleotide pairs. As would be expected, these regulatory 
sequences are much more compressed in organisms with concise genomes. We 
discuss how regulatory DNA sequences work in Chapter 7. 

Research in the last decade has surprised biologists with the discovery that, 
in addition to 21,000 protein-coding genes, the human genome contains many 
thousands of genes that encode RNA molecules that do not produce proteins, but 
instead have a variety of other important functions. What is thus far known about 
these molecules will be presented in Chapters 6 and 7. Last, but not least, the 
nucleotide sequence of the human genome has revealed that the archive of infor- 
mation needed to produce a human seems to be in an alarming state of chaos. As 
one commentator described our genome, “In some ways it may resemble your 
garage/bedroom/refrigerator/life: highly individualistic, but unkempt; little evi- 
dence of organization; much accumulated clutter (referred to by the uninitiated 
as ‘junk’); virtually nothing ever discarded; and the few patently valuable items 
indiscriminately, apparently carelessly, scattered throughout.” We shall discuss
how this is thought to have come about in the final sections of this chapter entitled 
“How Genomes Evolve.” 


Each DNA Molecule That Forms a Linear Chromosome Must 
Contain a Centromere, Two Telomeres, and Replication Origins 


To form a functional chromosome, a DNA molecule must be able to do more than 
simply carry genes: it must be able to replicate, and the replicated copies must be 
separated and reliably partitioned into daughter cells at each cell division. This 
process occurs through an ordered series of stages, collectively known as the cell 
cycle, which provides for a temporal separation between the duplication of chromosomes
and their segregation into two daughter cells. The cell cycle is
summarized in Figure 4-17, and it is discussed in detail in Chapter 17. Briefly,
during a long interphase, genes are expressed and chromosomes are replicated, 
with the two replicas remaining together as a pair of sister chromatids. Through- 
out this time, the chromosomes are extended and much of their chromatin exists 
as long threads in the nucleus so that individual chromosomes cannot be easily 
distinguished. It is only during a much briefer period of mitosis that each chro- 
mosome condenses so that its two sister chromatids can be separated and dis- 
tributed to the two daughter nuclei. The highly condensed chromosomes in a 
dividing cell are known as mitotic chromosomes (Figure 4-18). This is the form 
in which chromosomes are most easily visualized; in fact, the images of chromo- 
somes shown so far in the chapter are of chromosomes in mitosis. 

Each chromosome operates as a distinct structural unit: for a copy to be passed 
on to each daughter cell at division, each chromosome must be able to replicate, 
and the newly replicated copies must subsequently be separated and partitioned 
Figure 4-17 A simplified view of the
eukaryotic cell cycle. During interphase,
the cell is actively expressing its genes
and is therefore synthesizing proteins.
Also, during interphase and before cell 
division, the DNA is replicated and each 
chromosome is duplicated to produce two 
closely paired sister DNA molecules (called 
sister chromatids). A cell with only one type 
of chromosome, present in maternal and 
paternal copies, is illustrated here. Once 
DNA replication is complete, the cell can 
enter M phase, when mitosis occurs and 
the nucleus is divided into two daughter 
nuclei. During this stage, the chromosomes 
condense, the nuclear envelope breaks 
down, and the mitotic spindle forms from 
microtubules and other proteins. The 
condensed mitotic chromosomes are 
captured by the mitotic spindle, and one 
complete set of chromosomes is then 
pulled to each end of the cell by separating 
the members of each sister-chromatid pair. 
A nuclear envelope re-forms around each 
chromosome set, and in the final step of
M phase, the cell divides to produce two
daughter cells. Most of the time in the cell 
cycle is spent in interphase; M phase is 
brief in comparison, occupying only about 
an hour in many mammalian cells. 
correctly into the two daughter cells. These basic functions are controlled by three
types of specialized nucleotide sequences in the DNA, each of which binds spe- 
cific proteins that guide the machinery that replicates and segregates chromo- 
somes (Figure 4-19). 

Experiments in yeasts, whose chromosomes are relatively small and easy to 
manipulate, have identified the minimal DNA sequence elements responsible for 
each of these functions. One type of nucleotide sequence acts as a DNA repli- 
cation origin, the location at which duplication of the DNA begins. Eukaryotic 
chromosomes contain many origins of replication to ensure that the entire chro- 
mosome can be replicated rapidly, as discussed in detail in Chapter 5. 

After DNA replication, the two sister chromatids that form each chromosome 
remain attached to one another and, as the cell cycle proceeds, are condensed 
further to produce mitotic chromosomes. The presence of a second specialized 
DNA sequence, called a centromere, allows one copy of each duplicated and con- 
densed chromosome to be pulled into each daughter cell when a cell divides. A 
protein complex called a kinetochore forms at the centromere and attaches the 
duplicated chromosomes to the mitotic spindle, allowing them to be pulled apart 
(discussed in Chapter 17). 

The third specialized DNA sequence forms telomeres, the ends of a chromo- 
some. Telomeres contain repeated nucleotide sequences that enable the ends of 
chromosomes to be efficiently replicated. Telomeres also perform another func- 
tion: the repeated telomere DNA sequences, together with the regions adjoining 
them, form structures that protect the end of the chromosome from being mis- 
taken by the cell for a broken DNA molecule in need of repair. We discuss both this 
type of repair and the structure and function of telomeres in Chapter 5. 

In yeast cells, the three types of sequences required to propagate a chromo- 
some are relatively short (typically less than 1000 base pairs each) and therefore 
use only a tiny fraction of the information-carrying capacity of a chromosome. 
Although telomere sequences are fairly simple and short in all eukaryotes, the 
DNA sequences that form centromeres and replication origins in more complex 
organisms are much longer than their yeast counterparts. For example, experi- 
ments suggest that a human centromere can contain up to a million nucleotide 
pairs and that it may not require a stretch of DNA with a defined nucleotide 
sequence. Instead, as we shall discuss later in this chapter, a human centromere 
is thought to consist of a large, regularly repeating protein-nucleic acid structure 
that can be inherited when a chromosome replicates. 

















[Diagram: INTERPHASE, MITOSIS, INTERPHASE; labels: replicated chromosome, portion of mitotic spindle, DIVISION, duplicated chromosomes in separate daughter cells]





Figure 4-18 A mitotic chromosome. 
A mitotic chromosome is a condensed 
duplicated chromosome in which the 
two new chromosomes, called sister 
chromatids, are still linked together (see 
Figure 4-17). The constricted region 
indicates the position of the centromere. 
(Courtesy of Terry D. Allen.) 


Figure 4-19 The three DNA sequences 
required to produce a eukaryotic 
chromosome that can be replicated and 
then segregated accurately at mitosis. 
Each chromosome has multiple origins
of replication, one centromere, and two
telomeres. Shown here is the sequence of 
events that a typical chromosome follows 
during the cell cycle. The DNA replicates 
in interphase, beginning at the origins of 
replication and proceeding bidirectionally 
from the origins across the chromosome. 
In M phase, the centromere attaches the 
duplicated chromosomes to the mitotic 
spindle so that a copy of the entire genome 
is distributed to each daughter cell during 
mitosis; the special structure that attaches 
the centromere to the spindle is a protein 
complex called the kinetochore (dark 
green). The centromere also helps to hold 
the duplicated chromosomes together 
until they are ready to be moved apart. 
The telomeres form special caps at each 
chromosome end. 


CHROMOSOMAL DNA AND ITS PACKAGING IN THE CHROMATIN FIBER 


DNA Molecules Are Highly Condensed in Chromosomes 


All eukaryotic organisms have special ways of packaging DNA into chromosomes. 
For example, if the 48 million nucleotide pairs of DNA in human chromosome 
22 could be laid out as one long perfect double helix, the molecule would extend 
for about 1.5 cm if stretched out end to end. But chromosome 22 measures only 
about 2 μm in length in mitosis (see Figures 4-10 and 4-11), representing an
end-to-end compaction ratio of over 7000-fold. This remarkable feat of compression
is performed by proteins that successively coil and fold the DNA into higher and 
higher levels of organization. Although much less condensed than mitotic chro- 
mosomes, the DNA of human interphase chromosomes is still tightly packed. 
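
The compaction estimate can be reproduced with a short calculation. The Python sketch below assumes a spacing of 0.34 nm between adjacent nucleotide pairs along the double helix, a standard value for B-form DNA that is not stated in this passage; the other numbers come from the text, and the variable names are illustrative.

```python
# Back-of-the-envelope compaction estimate for human chromosome 22.
CHR22_BP = 48e6            # nucleotide pairs, from the text
RISE_PER_BP_NM = 0.34      # assumed rise per nucleotide pair in a B-form double helix
MITOTIC_LENGTH_UM = 2.0    # length of the mitotic chromosome, from the text

extended_nm = CHR22_BP * RISE_PER_BP_NM    # length of the fully extended helix
extended_cm = extended_nm * 1e-7           # 1 nm = 1e-7 cm
mitotic_nm = MITOTIC_LENGTH_UM * 1e3       # 1 micrometer = 1000 nm
compaction = extended_nm / mitotic_nm

print(f"extended length: {extended_cm:.2f} cm")
print(f"compaction ratio: about {compaction:,.0f}-fold")
```

With this assumed spacing the extended molecule comes to roughly 1.6 cm and the ratio to roughly 8000-fold, consistent with the rounded figures of about 1.5 cm and over 7000-fold given above.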

In reading these sections it is important to keep in mind that chromosome 
structure is dynamic. We have seen that each chromosome condenses to an 
extreme degree in the M phase of the cell cycle. Much less visible, but of enormous 
interest and importance, specific regions of interphase chromosomes decon- 
dense to allow access to specific DNA sequences for gene expression, DNA repair, 
and replication—and then recondense when these processes are completed. The 
packaging of chromosomes is therefore accomplished in a way that allows rapid 
localized, on-demand access to the DNA. In the next sections, we discuss the spe- 
cialized proteins that make this type of packaging possible. 


Nucleosomes Are a Basic Unit of Eukaryotic Chromosome 
Structure


The proteins that bind to the DNA to form eukaryotic chromosomes are tradi- 
tionally divided into two classes: the histones and the non-histone chromosomal 
proteins, each contributing about the same mass to a chromosome as the DNA. 
The complex of both classes of protein with the nuclear DNA of eukaryotic cells is 
known as chromatin (Figure 4-20). 

Histones are responsible for the first and most basic level of chromosome 
packing, the nucleosome, a protein-DNA complex discovered in 1974. When 
interphase nuclei are broken open very gently and their contents examined under 
the electron microscope, most of the chromatin appears to be in the form of a 
fiber with a diameter of about 30 nm (Figure 4-21A). If this chromatin is sub- 
jected to treatments that cause it to unfold partially, it can be seen under the elec- 
tron microscope as a series of “beads on a string” (Figure 4-21B). The string is 
DNA, and each bead is a “nucleosome core particle” that consists of DNA wound 
around a histone core (Movie 4.2). 

The structural organization of nucleosomes was determined after first isolat- 
ing them from unfolded chromatin by digestion with particular enzymes (called 
nucleases) that break down DNA by cutting between the nucleosomes. After 
digestion for a short period, the exposed DNA between the nucleosome core par- 
ticles, the linker DNA, is degraded. Each individual nucleosome core particle con- 
sists of a complex of eight histone proteins—two molecules each of histones H2A, 
Figure 4-20 Chromatin. As illustrated,
chromatin consists of DNA bound to both 
histone and non-histone proteins. The 
mass of histone protein present is about 
equal to the total mass of non-histone 
protein, but—as schematically indicated 
here—the latter class is composed of an 
enormous number of different species. In 
total, a chromosome is about one-third 
DNA and two-thirds protein by mass. 
H2B, H3, and H4—and double-stranded DNA that is 147 nucleotide pairs long.
The histone octamer forms a protein core around which the double-stranded DNA 
is wound (Figure 4-22). 

The region of linker DNA that separates each nucleosome core particle from 
the next can vary in length from a few nucleotide pairs up to about 80. (The term 
nucleosome technically refers to a nucleosome core particle plus one of its adjacent
DNA linkers, but it is often used synonymously with nucleosome core particle.) 
On average, therefore, nucleosomes repeat at intervals of about 200 nucleotide 
pairs. For example, a diploid human cell with 6.4 × 10⁹ nucleotide pairs contains
approximately 30 million nucleosomes. The formation of nucleosomes converts a 
DNA molecule into a chromatin thread about one-third of its initial length. 
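
The nucleosome count follows directly from the repeat length, as this brief Python sketch shows. It uses only numbers quoted in the text; the variable names are illustrative, and the average linker length is simply the repeat length minus the 147 nucleotide pairs wrapped around each core.

```python
# Estimate the number of nucleosomes in a diploid human nucleus.
DIPLOID_GENOME_BP = 6.4e9     # nucleotide pairs in a diploid human cell
NUCLEOSOME_REPEAT_BP = 200    # average spacing between nucleosomes
CORE_DNA_BP = 147             # DNA wrapped around one histone octamer

nucleosomes = DIPLOID_GENOME_BP / NUCLEOSOME_REPEAT_BP
mean_linker_bp = NUCLEOSOME_REPEAT_BP - CORE_DNA_BP

print(f"nucleosomes per nucleus: about {nucleosomes / 1e6:.0f} million")
print(f"average linker length: {mean_linker_bp} nucleotide pairs")
```

The estimate of about 32 million is rounded to approximately 30 million in the text, and the implied average linker of 53 nucleotide pairs falls within the stated range of a few to about 80.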


The Structure of the Nucleosome Core Particle Reveals How DNA 
Is Packaged 


The high-resolution structure of a nucleosome core particle, solved in 1997, 
revealed a disc-shaped histone core around which the DNA was tightly wrapped 
in a left-handed coil of 1.7 turns (Figure 4-23). All four of the histones that make 
up the core of the nucleosome are relatively small proteins (102-135 amino acids), 
and they share a structural motif, known as the histone fold, formed from three α
helices connected by two loops (Figure 4-24). In assembling a nucleosome, the 
histone folds first bind to each other to form H3-H4 and H2A-H2B dimers, and 
the H3-H4 dimers combine to form tetramers. An H3-H4 tetramer then further 
combines with two H2A-H2B dimers to form the compact octamer core, around 
which the DNA is wound. 

The interface between DNA and histone is extensive: 142 hydrogen bonds are 
formed between DNA and the histone core in each nucleosome. Nearly half of 
these bonds form between the amino acid backbone of the histones and the sug- 
ar-phosphate backbone of the DNA. Numerous hydrophobic interactions and salt 
linkages also hold DNA and protein together in the nucleosome. More than one- 
fifth of the amino acids in each of the core histones are either lysine or arginine 
(two amino acids with basic side chains), and their positive charges can effectively 
neutralize the negatively charged DNA backbone. These numerous interactions 
explain in part why DNA of virtually any sequence can be bound on a histone 
octamer core. The path of the DNA around the histone core is not smooth; rather, 
several kinks are seen in the DNA, as expected from the nonuniform surface of the 
core. The bending requires a substantial compression of the minor groove of the 
DNA helix. Certain dinucleotides in the minor groove are especially easy to 
compress, and some nucleotide sequences bind the nucleosome more tightly than 
others (Figure 4-25). This probably explains some striking, but unusual, cases 
of very precise positioning of nucleosomes along a stretch of DNA. However, the 
sequence preference of nucleosomes must be weak enough to allow other factors 
to dominate, inasmuch as nucleosomes can occupy any one of a number of 
positions relative to the DNA sequence in most chromosomal regions. 


Figure 4-21 Nucleosomes as seen in the electron microscope. (A) Chromatin 
isolated directly from an interphase nucleus appears in the electron microscope 
as a thread about 30 nm thick. (B) This electron micrograph shows a length of 
chromatin that has been experimentally unpacked, or decondensed, after isolation 
to show the nucleosomes. (A, courtesy of Barbara Hamkalo; B, courtesy of 
Victoria Foe.) 


Figure 4-22 Structural organization of the nucleosome. A nucleosome 
contains a protein core made of eight histone molecules. In biochemical 
experiments, the nucleosome core particle can be released from isolated 
chromatin by digestion of the linker DNA with a nuclease, an enzyme that 
breaks down DNA. (The nuclease can degrade the exposed linker DNA but 
cannot attack the DNA wound tightly around the nucleosome core.) After 
dissociation of the isolated nucleosome into its protein core and DNA, the 
length of the DNA that was wound around the core can be determined. 
This length of 147 nucleotide pairs is sufficient to wrap 1.7 times around the 
histone core. 


Figure 4-23 The structure of a nucleosome core particle, as determined by x-ray 
diffraction analyses of crystals. Each histone is colored according to the scheme 
in Figure 4-22, with the DNA double helix in light gray. (Adapted from K. Luger 
et al., Nature 389:251-260, 1997. With permission from Macmillan Publishers Ltd.) 


Figure 4-24 The overall structural organization of the core histones. (A) Each of 
the core histones contains an N-terminal tail, which is subject to several forms of 
covalent modification, and a histone fold region, as indicated. (B) The structure 
of the histone fold, which is formed by all four of the core histones. (C) Histones 
2A and 2B form a dimer through an interaction known as the “handshake.” 
Histones H3 and H4 form a dimer through the same type of interaction. (D) The 
final histone octamer on DNA. Note that all eight N-terminal tails of the histones 
protrude from the disc-shaped core structure. Their conformations are highly 
flexible, and they serve as binding sites for sets of other proteins. 

In addition to its histone fold, each of the core histones has an N-terminal 
amino acid “tail,” which extends out from the DNA-histone core (see Figure 
4-24D). These histone tails are subject to several different types of covalent mod- 
ifications that in turn control critical aspects of chromatin structure and function, 
as we shall discuss shortly. 

As a reflection of their fundamental role in DNA function through controlling 
chromatin structure, the histones are among the most highly conserved eukary- 
otic proteins. For example, the amino acid sequence of histone H4 from a pea 
differs from that of a cow at only 2 of the 102 positions. This strong evolution- 
ary conservation suggests that the functions of histones involve nearly all of their 
amino acids, so that a change in any position is deleterious to the cell. But in addi- 
tion to this remarkable conservation, eukaryotic organisms also produce smaller 
amounts of specialized variant core histones that differ in amino acid sequence 
from the main ones. As discussed later, these variants, combined with the surpris- 
ingly large number of covalent modifications that can be added to the histones in 
nucleosomes, give rise to a variety of chromatin structures in cells. 
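
The degree of conservation in the pea and cow example can be expressed as a percent identity. The sketch below is a minimal illustration: the figures 102 and 2 come from the text, while the helper names and the short sequences in the usage line are hypothetical and stand in for real aligned histone sequences.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def percent_identity(length: int, differences: int) -> float:
    """Percentage of positions that are identical."""
    return 100.0 * (length - differences) / length


# Histone H4, pea versus cow: 2 differences in 102 positions (from the text).
h4_identity = percent_identity(102, 2)
print(f"H4 identity: {h4_identity:.1f}%")

# Hypothetical toy sequences, for illustration only.
print(count_differences("ACDEFG", "ACDQFG"))
```

On these numbers, histone H4 from the two species is about 98% identical.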


Nucleosomes Have a Dynamic Structure, and Are Frequently 
Subjected to Changes Catalyzed by ATP-Dependent Chromatin 
Remodeling Complexes 


For many years biologists thought that, once formed in a particular position on 
DNA, a nucleosome would remain fixed in place because of the very tight asso- 
ciation between its core histones and DNA. If true, this would pose problems for 
genetic readout mechanisms, which in principle require easy access to many 
specific DNA sequences. It would also hinder the rapid passage of the DNA tran- 
scription and replication machinery through chromatin. But kinetic experiments 
show that the DNA in an isolated nucleosome unwraps from each end at a rate of 
about four times per second, remaining exposed for 10 to 50 milliseconds before 
the partially unwrapped structure recloses. Thus, most of the DNA in an isolated 
nucleosome is in principle available for binding other proteins. 
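
A rough sense of what these kinetic figures imply can be had by multiplying the unwrapping rate by the time spent exposed. The sketch below is a back-of-the-envelope estimate only. It assumes that unwrapping events at one DNA end do not overlap and that the quoted rate counts events per second of total time; the names are illustrative.

```python
# Back-of-the-envelope estimate of how long nucleosomal DNA stays exposed.
UNWRAP_EVENTS_PER_SECOND = 4.0          # "about four times per second"
EXPOSED_SECONDS_RANGE = (0.010, 0.050)  # "10 to 50 milliseconds"


def exposed_fraction(events_per_second: float, exposed_seconds: float) -> float:
    """Fraction of each second spent unwrapped, assuming events do not overlap."""
    return events_per_second * exposed_seconds


low, high = (
    exposed_fraction(UNWRAP_EVENTS_PER_SECOND, t) for t in EXPOSED_SECONDS_RANGE
)
print(f"DNA end exposed roughly {low:.0%} to {high:.0%} of the time")
```

Under these assumptions each DNA end would be unwrapped for somewhere between a few percent and about a fifth of the time, which is consistent with the conclusion that the DNA is in principle accessible to other proteins.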

For the chromatin in a cell, a further loosening of DNA-histone contacts is 
clearly required, because eukaryotic cells contain a large variety of ATP-depen- 
dent chromatin remodeling complexes. These complexes include a subunit that 
hydrolyzes ATP (an ATPase evolutionarily related to the DNA helicases discussed 
in Chapter 5). This subunit binds both to the protein core of the nucleosome and 
to the double-stranded DNA that winds around it. By using the energy of ATP 
hydrolysis to move this DNA relative to the core, the protein complex changes the 
structure of a nucleosome temporarily, making the DNA less tightly bound to the 
histone core. Through repeated cycles of ATP hydrolysis that pull the nucleosome 
core along the DNA double helix, the remodeling complexes can catalyze 
nucleosome sliding. In this way, they can reposition nucleosomes to expose specific 
regions of DNA, thereby making them available to other proteins in the cell (Fig- 
ure 4-26). In addition, by cooperating with a variety of other proteins that bind to 
histones and serve as histone chaperones, some remodeling complexes are able to 
remove either all or part of the nucleosome core from a nucleosome—catalyzing 
either an exchange of its H2A-H2B histones, or the complete removal of the oct- 
americ core from the DNA (Figure 4-27). As a result of such processes, measure- 
ments reveal that a typical nucleosome is replaced on the DNA every one or two 
hours inside the cell. 


Figure 4-25 The bending of DNA in a nucleosome. The DNA helix makes 
1.7 tight turns around the histone octamer. This diagram illustrates how the 
minor groove is compressed on the inside of the turn. Owing to structural 
features of the DNA molecule, the indicated dinucleotides are preferentially 
accommodated in such a narrow minor groove, which helps to explain why 
certain DNA sequences will bind more tightly than others to the nucleosome core. 




Cells contain dozens of different ATP-dependent chromatin remodeling com- 
plexes that are specialized for different roles. Most are large protein complexes 
that can contain 10 or more subunits, some of which bind to specific modifica- 
tions on histones (see Figure 4-26C). The activity of these complexes is carefully 
controlled by the cell. As genes are turned on and off, chromatin remodeling com- 
plexes are brought to specific regions of DNA where they act locally to influence 
chromatin structure (discussed in Chapter 7; see also Figure 4-40, below). 

Although some DNA sequences bind more tightly than others to the nucleo- 
some core (see Figure 4-25), the most important influence on nucleosome posi- 
tioning appears to be the presence of other tightly bound proteins on the DNA. 
Some bound proteins favor the formation of a nucleosome adjacent to them. 
Others create obstacles that force the nucleosomes to move elsewhere. The exact 
positions of nucleosomes along a stretch of DNA therefore depend mainly on the 
presence and nature of other proteins bound to the DNA. And due to the presence 
of ATP-dependent chromatin remodeling complexes, the arrangement of nucle- 
osomes on DNA can be highly dynamic, changing rapidly according to the needs 
of the cell. 


Nucleosomes Are Usually Packed Together into a Compact 
Chromatin Fiber 


Although enormously long strings of nucleosomes form on the chromosomal 
DNA, chromatin in a living cell probably rarely adopts the extended “beads-on-a- 
string” form. Instead, the nucleosomes are packed on top of one another, gener- 
ating arrays in which the DNA is even more highly condensed. Thus, when nuclei 
are very gently lysed onto an electron microscope grid, much of the chromatin is 
seen to be in the form of a fiber with a diameter of about 30 nm, which is 
considerably wider than chromatin in the “beads-on-a-string” form (see Figure 4-21). 


Figure 4-26 The nucleosome sliding catalyzed by ATP-dependent chromatin 
remodeling complexes. (A) Using the energy of ATP hydrolysis, the remodeling 
complex is thought to push on the DNA of its bound nucleosome and loosen its 
attachment to the nucleosome core. Each cycle of ATP binding, ATP hydrolysis, 
and release of the ADP and Pi products thereby moves the DNA with respect to 
the histone octamer in the direction of the arrow in this diagram. It requires many 
such cycles to produce the nucleosome sliding shown. (B) The structure of a 
nucleosome-bound dimer of the two identical ATPase subunits (green) that slide 
nucleosomes back and forth in the ISWI family of chromatin remodeling 
complexes. (C) The structure of a large chromatin remodeling complex, showing 
how it is thought to wrap around a nucleosome. Modeled in green is the yeast 
RSC complex, which contains 15 subunits, including an ATPase and at least four 
subunits with domains that recognize specific covalently modified histones. 
(B, from L.R. Racki et al., Nature 462:1016-1021, 2009. With permission from 
Macmillan Publishers Ltd; C, adapted from A.E. Leschziner et al., Proc. Natl 
Acad. Sci. USA 104:4913-4918, 2007.) 


How nucleosomes are organized into condensed arrays is unclear. The structure 
of a tetranucleosome (a complex of four nucleosomes) obtained by x-ray 
crystallography and high-resolution electron microscopy of reconstituted chro- 
matin have been used to support a zigzag model for the stacking of nucleosomes 
in a 30-nm fiber (Figure 4-28). But cryoelectron microscopy of carefully prepared 
nuclei suggests that most regions of chromatin are less regularly structured. 

What causes nucleosomes to stack so tightly on each other? Nucleosome-to- 
nucleosome linkages that involve histone tails, most notably the H4 tail, consti- 
tute one important factor (Figure 4-29). Another important factor is an additional 
histone that is often present in a 1-to-1 ratio with nucleosome cores, known as 
histone H1. This so-called linker histone is larger than the individual core histones 
and it has been considerably less well conserved during evolution. A single his- 
tone H1 molecule binds to each nucleosome, contacting both DNA and protein, 
and changing the path of the DNA as it exits from the nucleosome. This change in 
the exit path of DNA is thought to help compact nucleosomal DNA (Figure 4-30). 
Most eukaryotic organisms make several histone H1 proteins of related but quite 
distinct amino acid sequences. The presence of many other DNA-binding 
proteins, as well as proteins that bind directly to histones, is certain to add 
important additional features to any array of nucleosomes. 


Figure 4-27 Nucleosome removal and histone exchange catalyzed by 
ATP-dependent chromatin remodeling complexes. By cooperating with specific 
members of a large family of different histone chaperones, some chromatin 
remodeling complexes can remove the H2A-H2B dimers from a nucleosome (top 
series of reactions) and replace them with dimers that contain a variant histone, 
such as the H2AZ-H2B dimer (see Figure 4-35). Other remodeling complexes are 
attracted to specific sites on chromatin and cooperate with histone chaperones 
to remove the histone octamer completely and/or to replace it with a different 
nucleosome core (bottom series of reactions). Highly simplified views of the 
processes are illustrated here. 


Figure 4-28 A zigzag model for the 30-nm chromatin fiber. (A) The conformation 
of two of the four nucleosomes in a tetranucleosome, from a structure 
determined by x-ray crystallography. (B) Schematic of the entire 
tetranucleosome; the fourth nucleosome is not visible, being stacked on the 
bottom nucleosome and behind it in this diagram. (C) Diagrammatic illustration 
of a possible zigzag structure that could account for the 30-nm chromatin fiber. 
(A, PDB code: 1ZBB; C, adapted from C.L. Woodcock, Nat. Struct. Mol. Biol. 
12:639-640, 2005. With permission from Macmillan Publishers Ltd.) 


Summary 


A gene is a nucleotide sequence in a DNA molecule that acts as a functional unit 
for the production of a protein, a structural RNA, or a catalytic or regulatory RNA 
molecule. In eukaryotes, protein-coding genes are usually composed of a string of 
alternating introns and exons associated with regulatory regions of DNA. A chro- 
mosome is formed from a single, enormously long DNA molecule that contains a 
linear array of many genes, bound to a large set of proteins. The human genome 
contains 3.2 × 10⁹ DNA nucleotide pairs, divided between 22 different autosomes 
(present in two copies each) and 2 sex chromosomes. Only a small percentage of this 
DNA codes for proteins or functional RNA molecules. A chromosomal DNA mole- 
cule also contains three other types of important nucleotide sequences: replication 
origins and telomeres allow the DNA molecule to be efficiently replicated, while a 
centromere attaches the sister DNA molecules to the mitotic spindle, ensuring their 
accurate segregation to daughter cells during the M phase of the cell cycle. 

The DNA in eukaryotes is tightly bound to an equal mass of histones, which 
form repeated arrays of DNA-protein particles called nucleosomes. The nucleosome 
is composed of an octameric core of histone proteins around which the DNA dou- 
ble helix is wrapped. Nucleosomes are spaced at intervals of about 200 nucleotide 
pairs, and they are usually packed together (with the aid of histone H1 molecules) 
into quasi-regular arrays to form a 30-nm chromatin fiber. Even though compact, 
the structure of chromatin must be highly dynamic to allow access to the DNA. 
There is some spontaneous DNA unwrapping and rewrapping in the nucleosome 
itself; however, the general strategy for reversibly changing local chromatin struc- 
ture features ATP-driven chromatin remodeling complexes. Cells contain a large set 
of such complexes, which are targeted to specific regions of chromatin at appropri- 
ate times. The remodeling complexes collaborate with histone chaperones to allow 
nucleosome cores to be repositioned, reconstituted with different histones, or 
completely removed to expose the underlying DNA. 


Figure 4-29 A model for the role played by histone tails in the compaction of 
chromatin. (A) A schematic diagram shows the approximate exit points of the 
eight histone tails, one from each histone protein, that extend from each 
nucleosome. The actual structure is shown to its right. In the high-resolution 
structure of the nucleosome, the tails are largely unstructured, suggesting that 
they are highly flexible. (B) As indicated, the histone tails are thought to be 
involved in interactions between nucleosomes that help to pack them together. 
(A, PDB code: 1KX5.) 


Figure 4-30 How the linker histone binds to the nucleosome. The position 
and structure of histone H1 are shown. The H1 core region constrains an 
additional 20 nucleotide pairs of DNA where it exits from the nucleosome core 
and is important for compacting chromatin. (A) Schematic, and (B) structure 
inferred for a single nucleosome from a structure determined by high-resolution 
electron microscopy of a reconstituted chromatin fiber (C). (B and C, adapted 
from F. Song et al., Science 344:376-380, 2014.) 


CHROMATIN STRUCTURE AND FUNCTION 


Having described how DNA is packaged into nucleosomes to create a chromatin 
fiber, we now turn to the mechanisms that create different chromatin structures 
in different regions of a cell’s genome. Mechanisms of this type have a variety of 
important functions in cells. Most strikingly, certain types of chromatin structure 
can be inherited; that is, the structure can be directly passed down from a cell 
to its descendants. Because the cell memory that results is based on an inher- 
ited chromatin structure rather than on a change in DNA sequence, this is a form 
of epigenetic inheritance. The prefix epi is Greek for “on”; this is appropriate, 
because epigenetics represents a form of inheritance that is superimposed on the 
genetic inheritance based on DNA. 

In Chapter 7, we shall introduce the many different ways in which the expres- 
sion of genes is regulated. There we discuss epigenetic inheritance in detail and 
present several different mechanisms that can produce it. Here, we are con- 
cerned with only one, that based on chromatin structure. We begin this section by 
reviewing the observations that first demonstrated that chromatin structures can 
be inherited. We then describe some of the chemistry that makes this possible— 
the covalent modification of histones in nucleosomes. These modifications have 
many functions, inasmuch as they serve as recognition sites for protein domains 
that link specific protein complexes to different regions of chromatin. Histones 
thereby have effects on gene expression, as well as on many other DNA-linked 
processes. Through such mechanisms, chromatin structure plays an important 
role in the development, growth, and maintenance of all eukaryotic organisms, 
including ourselves. 


Heterochromatin Is Highly Organized and Restricts Gene 
Expression 


Light-microscope studies in the 1930s distinguished two types of chromatin in 
the interphase nuclei of many higher eukaryotic cells: a highly condensed form, 
called heterochromatin, and all the rest, which is less condensed, called euchro- 
matin. Heterochromatin represents an especially compact form of chromatin 
(see Figure 4-9), and we are finally beginning to understand its molecular prop- 
erties. It is highly concentrated in certain specialized regions, most notably at the 
centromeres and telomeres introduced previously (see Figure 4-19), but it is also 
present at many other locations along chromosomes—locations that can vary 
according to the physiological state of the cell. In a typical mammalian cell, more 
than 10% of the genome is packaged in this way. 

The DNA in heterochromatin typically contains few genes, and when euchro- 
matic regions are converted to a heterochromatic state, their genes are generally 
switched off as a result. However, we know now that the term heterochromatin 
encompasses several distinct modes of chromatin compaction that have different 
implications for gene expression. Thus, heterochromatin should not be thought 
of as simply encapsulating “dead” DNA, but rather as a descriptor for compact 
chromatin domains that share the common feature of being unusually resistant 
to gene expression. 


The Heterochromatic State Is Self-Propagating 


Through chromosome breakage and rejoining, whether brought about by a nat- 
ural genetic accident or by experimental artifice, a piece of chromosome that is 
normally euchromatic can be translocated into the neighborhood of heteroch- 
romatin. Remarkably, this often causes silencing—inactivation—of the normally 
active genes. This phenomenon is referred to as a position effect. It reflects a 
spreading of the heterochromatic state into the originally euchromatic region, 
and it has provided important clues to the mechanisms that create and maintain 
heterochromatin. First recognized in Drosophila, position effects have now been 
observed in many eukaryotes, including yeasts, plants, and humans. 


Figure 4-31 The cause of position effect variegation in Drosophila. (A) Heterochromatin (green) is normally prevented from 
spreading into adjacent regions of euchromatin (red) by barrier DNA sequences, which we shall discuss shortly. In flies that 
inherit certain chromosomal rearrangements, however, this barrier is no longer present. (B) During the early development of such 
flies, heterochromatin can spread into neighboring chromosomal DNA, proceeding for different distances in different cells. This 
spreading soon stops, but the established pattern of heterochromatin is subsequently inherited, so that large clones of progeny 
cells are produced that have the same neighboring genes condensed into heterochromatin and thereby inactivated (hence the 
“variegated” appearance of some of these flies; see Figure 4-32). Although “spreading” is used to describe the formation of 
new heterochromatin close to previously existing heterochromatin, the term may not be wholly accurate. There is evidence that 
during expansion, the condensation of DNA into heterochromatin can “skip over” some regions of chromatin, sparing the genes 
that lie within them from repressive effects. 


In chromosome breakage-and-rejoining events of the sort just described, the 
zone of silencing, where euchromatin is converted to a heterochromatic state, 
spreads for different distances in different early cells in the fly embryo. Remark- 
ably, these differences then are perpetuated for the rest of the animal’s life: in 
each cell, once the heterochromatic condition is established on a piece of chro- 
matin, it tends to be stably inherited by all of that cell’s progeny (Figure 4-31). This 
remarkable phenomenon, called position effect variegation, was first recognized 
through a detailed genetic analysis of the mottled loss of red pigment in the fly eye 
(Figure 4-32). It shares features with the extensive spread of heterochromatin that 
inactivates one of the two X chromosomes in female mammals. There too, a ran- 
dom process acts in each cell of the early embryo to dictate which X chromosome 
will be inactivated, and that same X chromosome then remains inactive in all the 
cell’s progeny, creating a mosaic of different clones of cells in the adult body (see 
Figure 7-50). 
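
The logic of position effect variegation, in which spreading goes a random distance in each early cell and the outcome is then inherited clonally, can be illustrated with a toy simulation. The sketch below is a deliberately simplified model and not a description of the real mechanism: distances are in arbitrary units, the uniform distribution of spreading distances is an assumption, and every name and parameter is hypothetical.

```python
import random


def simulate_variegation(n_founders, gene_distance, max_spread, divisions, seed=0):
    """Return one (silenced, clone_size) pair per founder cell.

    All quantities are hypothetical and in arbitrary units.
    """
    rng = random.Random(seed)
    clones = []
    for _ in range(n_founders):
        # Spreading proceeds a different distance in each early cell, then stops.
        spread = rng.uniform(0.0, max_spread)
        # The gene is silenced only if heterochromatin reached it.
        silenced = spread >= gene_distance
        # Every descendant inherits the founder's state, so the clone is uniform.
        clones.append((silenced, 2 ** divisions))
    return clones


clones = simulate_variegation(
    n_founders=12, gene_distance=5.0, max_spread=10.0, divisions=6
)
patches = "".join("W" if silenced else "R" for silenced, _ in clones)
print(patches)  # W = white (silenced) patch, R = red (expressed) patch
```

Because the state is fixed in each founder cell and copied to all of its progeny, the output is a patchwork of uniform clones, which is the essence of the mottled eye described above.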

These observations, taken together, point to a fundamental strategy of het- 
erochromatin formation: heterochromatin begets more heterochromatin. This 
positive feedback can operate both in space, causing the heterochromatic state to 
spread along the chromosome, and in time, across cell generations, propagating 
the heterochromatic state of the parent cell to its daughters. The challenge is to 
explain the molecular mechanisms that underlie this remarkable behavior. 


Figure 4-32 The discovery of position effects on gene expression. The White 
gene in the fruit fly Drosophila controls eye pigment production and is named 
after the mutation that first identified it. Wild-type flies with a normal White 
gene (White⁺) have normal pigment production, which gives them red eyes, but 
if the White gene is mutated and inactivated, the mutant flies (White⁻) make no 
pigment and have white eyes. In flies in which a normal White gene has been 
moved near a region of heterochromatin, the eyes are mottled, with both red and 
white patches. The white patches represent cell lineages in which the White gene 
has been silenced by the effects of the heterochromatin. In contrast, the red 
patches represent cell lineages in which the White gene is expressed. Early in 
development, when the heterochromatin is first formed, it spreads into 
neighboring euchromatin to different extents in different embryonic cells (see 
Figure 4-31). The presence of large patches of red and white cells reveals that 
the state of transcriptional activity, as determined by the packaging of this gene 
into chromatin in those ancestor cells, is inherited by all daughter cells. 


[Figure 4-33, panel (A) title: Lysine acetylation and methylation are competing 
reactions. Surviving structure labels: acetyl lysine, monomethyl lysine, 
dimethyl lysine.] 


Figure 4-33 Some prominent types of covalent amino acid side-chain 
modifications found on nucleosomal histones. (A) Three different levels 
of lysine methylation are shown; each can be recognized by a different 
binding protein and thus each can have a different significance for the cell. 
Note that acetylation removes the plus charge on lysine, and that, most 
importantly, an acetylated lysine cannot be methylated, and vice versa. 
(B) Serine phosphorylation adds a negative charge to a histone. Modifications 
of histones not shown here include the mono- or dimethylation of an arginine, 
the phosphorylation of a threonine, the addition of ADP-ribose to a glutamic 
acid, and the addition of a ubiquityl, sumoyl, or biotin group to a lysine. 


As a first step, one can carry out a search for the molecules that are involved. 
This has been done by means of genetic screens, in which large numbers of 
mutants are generated, after which one picks out those that show an abnormal- 
ity of the process in question. Extensive genetic screens in Drosophila, fungi, and 
mice have identified more than 100 genes whose products either enhance or sup- 
press the spread of heterochromatin and its stable inheritance—in other words, 
genes that serve as either enhancers or suppressors of position effect variegation. 
Many of these genes turn out to code for non-histone chromosomal proteins that 
interact with histones and are involved in modifying or maintaining chromatin 
structure. We shall discuss how they work in the sections that follow. 


The Core Histones Are Covalently Modified at Many Different Sites 


The amino acid side chains of the four histones in the nucleosome core are sub- 
jected to a remarkable variety of covalent modifications, including the acetylation 
of lysines, the mono-, di-, and trimethylation of lysines, and the phosphorylation 
of serines (Figure 4-33). A large number of these side-chain modifications occur 
on the eight relatively unstructured N-terminal “histone tails” that protrude from 
the nucleosome (Figure 4-34). However, there are also more than 20 specific side- 
chain modifications on the nucleosome’s globular core. 

All of the above types of modifications are reversible, with one enzyme serv- 
ing to create a particular type of modification, and another to remove it. These 
enzymes are highly specific. Thus, for example, acetyl groups are added to specific 
lysines by a set of different histone acetyl transferases (HATs) and removed by a set 
of histone deacetylase complexes (HDACs). Likewise, methyl groups are added to 
lysine side chains by a set of different histone methyl transferases and removed 
by a set of histone demethylases. Each enzyme is recruited to specific sites on 
the chromatin at defined times in each cell’s life history. For the most part, the 
initial recruitment depends on transcription regulator proteins (sometimes called 
“transcription factors”). As we shall explain in Chapter 7, these proteins recognize 
and bind to specific DNA sequences in the chromosomes. They are produced at 
different times and places in the life of an organism, thereby determining where 
and when the chromatin-modifying enzymes will act. In this way, the DNA 
sequence ultimately determines how histones are modified. But in at least some 
cases, the covalent modifications on nucleosomes can persist long after the 
transcription regulator proteins that first induced them have disappeared, thereby 
providing the cell with a memory of its developmental history. Most remarkably, 
as in the related phenomenon of position effect variegation discussed above, this 
memory can be transmitted from one cell generation to the next. 


Figure 4-34 The covalent modification of core histone tails. (A) The structure of the nucleosome highlighting the location of 
the first 30 amino acids in each of its eight N-terminal histone tails (green). These tails are unstructured and highly mobile, and 
thus will change their conformation depending on other bound proteins. (B) Well-documented modifications of the four histone 
core proteins are indicated. Although only a single symbol is used here for methylation (M), each lysine (K) or arginine (R) can be 
methylated in several different ways. Note also that some positions (e.g., lysine 9 of H3) can be modified either by methylation 
or by acetylation, but not both. Most of the modifications shown add a relatively small molecule onto the histone tails; the 
exception is ubiquitin, a 76-amino-acid protein also used for other cell processes (see Figure 3-69). Not shown are more than 
20 possible modifications located in the globular core of the histones. (A, PDB: 1KX5; B, adapted from H. Santos-Rosa and 
C. Caldas, Eur. J. Cancer 41:2381-2402, 2005. With permission from Elsevier.) 

Very different patterns of covalent modification are found on different groups 
of nucleosomes, depending both on their exact position in the genome and on 
the history of the cell. The modifications of the histones are carefully controlled, 
and they have important consequences. The acetylation of lysines on the N-ter- 
minal tails loosens chromatin structure, in part because adding an acetyl group 
to lysine removes its positive charge, thereby reducing the affinity of the tails for 
adjacent nucleosomes. However, the most profound effects of the histone modifi- 
cations lie in their ability to recruit specific other proteins to the modified stretch 
of chromatin. Trimethylation of one specific lysine on the histone H3 tail, for 
instance, attracts the heterochromatin-specific protein HP1 and contributes to 
the establishment and spread of heterochromatin. More generally, the recruited 
proteins act with the modified histones to determine how and when genes will be 
expressed, as well as other chromosome functions. In this way, the precise struc- 
ture of each domain of chromatin governs the readout of the genetic information 
that it contains, and thereby the structure and function of the eukaryotic cell. 


Chromatin Acquires Additional Variety Through the Site-Specific 
Insertion of a Small Set of Histone Variants 


In addition to the four highly conserved standard core histones, eukaryotes also 
contain a few variant histones that can also assemble into nucleosomes. These 
histones are present in much smaller amounts than the major histones, and they 
have been less well conserved over long evolutionary times. Variants are known 
for each of the core histones with the exception of H4; some examples are shown 
in Figure 4-35. 

The major histones are synthesized primarily during the S phase of the cell 
cycle and assembled into nucleosomes on the daughter DNA helices just behind 
the replication fork (see Figure 5-32). In contrast, most histone variants are syn- 
thesized throughout interphase. They are often inserted into already-formed 
chromatin, which requires a histone-exchange process catalyzed by the ATP-de- 
pendent chromatin remodeling complexes discussed previously. These remodel- 
ing complexes contain subunits that cause them to bind both to specific sites on 
chromatin and to histone chaperones that carry a particular variant. As a result, 
each histone variant is inserted into chromatin in a highly selective manner (see 
Figure 4-27). 


Covalent Modifications and Histone Variants Act in Concert to 
Control Chromosome Functions 


The number of possible distinct markings on an individual nucleosome is in prin- 
ciple enormous, and this potential for diversity is still greater when we allow for 
nucleosomes that contain histone variants. However, the histone modifications 
are known to occur in coordinated sets. More than 15 such sets can be identified 
in mammalian cells. However, it is not yet clear how many different types of chro- 
matin are functionally important in cells. 

Some combinations are known to have a specific meaning for the cell in the 
sense that they determine how and when the DNA packaged in the nucleosomes 
is to be accessed or manipulated—a fact that led to the idea of a “histone code.” 
For example, one type of marking signals that a stretch of chromatin has been 
newly replicated, another signals that the DNA in that chromatin has been dam- 
aged and needs repair, while others signal when and how gene expression should 
take place. Various regulatory proteins contain small domains that bind to spe- 
cific marks, recognizing, for example, a trimethylated lysine 4 on histone H3 (Fig- 
ure 4-36). These domains are often linked together as modules in a single large 


Figure 4-35 The structure of some histone 
variants compared with the major histone 
that they replace. The histone variants 
are inserted into nucleosomes at specific 
sites on chromosomes by ATP-dependent 
chromatin remodeling enzymes that act in 
concert with histone chaperones (see Figure 
4-27). The CENP-A (Centromere Protein-A) 
variant of histone H3 is discussed later in 
this chapter (see Figure 4-42); other variants 
are discussed in Chapter 7. The sequences 
in each variant that are colored differently 
(compared to the major histone above it) 
denote regions with an amino acid sequence 
different from this major histone. (Adapted 
from K. Sarma and D. Reinberg, Nat. Rev. 
Mol. Cell Biol. 6:139-149, 2005. With 
permission from Macmillan Publishers Ltd.) 

Figure 4-36 How a mark on a nucleosome is read. The figure shows the structure of a protein module (called an ING PHD 
domain) that specifically recognizes histone H3 trimethylated on lysine 4. (A) A trimethyl group. (B) Space-filling model of an ING 
PHD domain bound to a histone tail (green, with the trimethyl group highlighted in yellow). (C) A ribbon model showing how 
the N-terminal six amino acids in the H3 tail are recognized. The red lines represent hydrogen bonds. This is one of a family of 
PHD domains that recognize methylated lysines on histones; different members of the family bind tightly to lysines located at 
different positions, and they can discriminate between a mono-, di-, and trimethylated lysine. In a similar way, other small protein 
modules recognize specific histone side chains that have been marked with acetyl groups, phosphate groups, and so on. 
(Adapted from P.V. Peña et al., Nature 442:100-108, 2006. With permission from Macmillan Publishers Ltd.) 


protein or protein complex, which thereby recognizes a specific combination of 
histone modifications (Figure 4-37). The result is a reader complex that allows 
particular combinations of markings on chromatin to attract additional proteins, 
so as to execute an appropriate biological function at the right time (Figure 4-38). 
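
The logic of combinatorial recognition can be sketched in a few lines of Python. This is only an illustration under a strong simplifying assumption: binding is treated as all-or-none, whereas real reader complexes bind with graded affinities. The mark labels and the function name `reader_complex_binds` are hypothetical; the two-module reader mirrors the PHD domain plus bromodomain pairing described for NURF in Figure 4-37.

```python
from typing import FrozenSet, Set

# Hypothetical labels for histone marks, chosen only for this illustration.
H3K4ME3 = "H3K4me3"   # trimethylated lysine 4 on histone H3
H4K16AC = "H4K16ac"   # acetylated lysine 16 on histone H4
H3K9ME3 = "H3K9me3"   # trimethylated lysine 9 on histone H3


def reader_complex_binds(required: FrozenSet[str], nucleosome_marks: Set[str]) -> bool:
    """Return True only if every mark read by the complex is present.

    Simplification: binding is modeled as all-or-none rather than graded.
    """
    return required.issubset(nucleosome_marks)


# A reader with two linked modules, one per mark.
two_module_reader = frozenset({H3K4ME3, H4K16AC})

print(reader_complex_binds(two_module_reader, {H3K4ME3, H4K16AC}))  # both marks present
print(reader_complex_binds(two_module_reader, {H3K4ME3}))           # one mark missing
```

In this toy picture, a nucleosome that carries only one of the two marks is ignored, which is the sense in which a specific combination of marks, rather than any single mark, attracts the complex.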

The marks on nucleosomes due to covalent additions to histones are dynamic, 
being constantly removed and added at rates that depend on their chromosomal 
locations. Because the histone tails extend outward from the nucleosome core 
and are likely to be accessible even when chromatin is condensed, they would 
seem to provide an especially suitable format for creating marks that can be 
readily altered as a cell’s needs change. Although much remains to be learned 
about the meaning of the different histone modifications, a few well-studied 
examples of the information that can be encoded in the histone H3 tail are listed 
in Figure 4-39. 


A Complex of Reader and Writer Proteins Can Spread Specific 
Chromatin Modifications Along a Chromosome 


The phenomenon of position effect variegation described previously requires that 
some modified forms of chromatin have the ability to spread for substantial dis- 
tances along a chromosomal DNA molecule (see Figure 4-31). How is this possi- 
ble? 

The enzymes that add or remove modifications to histones in nucleosomes 
are part of multisubunit complexes. They can initially be brought to a particu- 
lar region of chromatin by one of the sequence-specific DNA-binding proteins 
(transcription regulators) discussed in Chapters 6 and 7 (for a specific example, 





Figure 4-37 Recognition of a specific combination of marks on a 
nucleosome. In the example shown, two adjacent domains that are part of 
the NURF (Nucleosome Remodeling Factor) chromatin remodeling complex 
bind to the nucleosome, with the PHD domain (red) recognizing a methylated 
H3 lysine 4 and another domain (a bromodomain, blue) recognizing an 
acetylated H4 lysine 16. These two histone marks constitute a unique histone 
modification pattern that occurs in subsets of nucleosomes in human cells. 
Here the two histone tails are indicated by green dotted lines, and only half 
of one nucleosome is shown. (Adapted from A.J. Ruthenburg et al., Cell 
145:692-706, 2011. With permission from Elsevier.) 

see Figure 7-20). But after a modifying enzyme “writes” its mark on one or a few 
neighboring nucleosomes, events that resemble a chain reaction can ensue. In 
such a case, the “writer enzyme” works in concert with a “reader protein” located 
in the same protein complex. The reader protein contains a module that recog- 
nizes the mark and binds tightly to the newly modified nucleosome (see Figure 

Figure 4-38 Schematic diagram showing 
how a particular combination of histone 
modifications can be recognized by a 
reader complex. A large protein complex 
that contains a series of protein modules, 
each of which recognizes a specific histone 
mark, is schematically illustrated (green). 
This “reader complex” will bind tightly only 
to a region of chromatin that contains 
several of the different histone marks that 
it recognizes. Therefore, only a specific 
combination of marks will cause the 
complex to bind to chromatin and attract 
the additional protein complexes (purple) 
needed to catalyze a biological function. 


Figure 4-39 Some specific meanings 
of histone modifications. (A) The 
modifications on the histone H3 N-terminal 
tail are shown, repeated from Figure 
4-34. (B) The H3 tail can be marked by 
different sets of modifications that act in 
combination to convey a specific meaning. 
Only a small number of the meanings 
are known, including the three examples 
shown. Not illustrated is the fact that, as 
just implied (see Figure 4-38), reading a 
histone mark generally involves the joint 
recognition of marks at other sites on the 
nucleosome along with the indicated H3 
tail recognition. In addition, specific levels 
of methylation (mono-, di-, or trimethyl 
groups) are generally required. Thus, 
for example, the trimethylation of lysine 
9 attracts the heterochromatin-specific 
protein HP1, which induces a spreading 
wave of further lysine 9 trimethylation 
followed by further HP1 binding, according 
to the general scheme that will be 
illustrated shortly (see Figure 4-40). Also 
important in this process, however, is a 
synergistic trimethylation of the histone H4 
N-terminal tail on lysine 20. 

4-36), activating an attached writer enzyme and positioning it near an adjacent 
nucleosome. Through many such read-write cycles, the reader protein can carry 
the writer enzyme along the DNA—spreading the mark in a hand-over-hand man- 
ner along the chromosome (Figure 4-40). 
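
The read-write cycle just described can be caricatured as a simple rule applied repeatedly to a row of nucleosomes. The Python sketch below is an illustration only, under assumptions that are not part of the text: chromatin is a one-dimensional array, every marked nucleosome is read in each cycle, and the attached writer always succeeds in marking the immediate neighbors on both sides. The names `spread_mark` and `show` are hypothetical.

```python
from typing import List


def spread_mark(marked: List[bool], cycles: int) -> List[bool]:
    """Run a number of read-write cycles on a linear array of nucleosomes.

    In each cycle a reader binds every marked nucleosome, and the attached
    writer marks the immediate neighbors (an idealized, error-free step).
    """
    state = list(marked)
    for _ in range(cycles):
        nxt = list(state)
        for i, has_mark in enumerate(state):
            if has_mark:
                if i > 0:
                    nxt[i - 1] = True
                if i + 1 < len(state):
                    nxt[i + 1] = True
        state = nxt
    return state


def show(state: List[bool]) -> str:
    """Render marked nucleosomes as 'M' and unmarked ones as '.'."""
    return "".join("M" if m else "." for m in state)


# One nucleosome is marked initially, for example after recruitment of the
# writer by a transcription regulator.
start = [False] * 11
start[5] = True

for n_cycles in range(4):
    print(n_cycles, show(spread_mark(start, n_cycles)))
```

After two cycles the single starting mark has grown into a block of five marked nucleosomes. The deterministic rule is chosen for clarity; it ignores the multiple readers, writers, and remodeling proteins discussed next.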

In reality, the process is more complicated than the scheme just described. 
Both readers and writers are part of a protein complex that is likely to contain 
multiple readers and writers, and to require multiple marks on the nucleosome to 
spread. Moreover, many of these reader-writer complexes also contain an ATP-de- 
pendent chromatin remodeling protein (see Figure 4-26C), and the reader, writer, 
and remodeling proteins can work in concert to either decondense or condense 
long stretches of chromatin as the reader moves progressively along the nucleo- 
some-packaged DNA. 

A similar process is used to remove histone modifications from specific regions 
of the DNA; in this case, an “eraser enzyme,” such as a histone demethylase or his- 
tone deacetylase, is recruited to the complex. As for the writer complex in Figure 
4-40, sequence-specific DNA-binding proteins (transcription regulators) direct 
where such modifications occur (discussed in Chapter 7). 

Some idea of the complexity of the above processes can be derived from the 
results of genetic screens for genes that either enhance or suppress the spreading 
and stability of heterochromatin, as manifest in effects on position effect varie- 
gation in Drosophila (see Figure 4-32). As pointed out previously, more than 100 
such genes are known, and most of them are likely to code for subunits in one or 
more reader-writer-remodeling protein complexes. 

Figure 4-40 How the recruitment 
of a reader-writer complex can 
spread chromatin changes along a 
chromosome. The writer is an enzyme 
that creates a specific modification on one 
or more of the four nucleosomal histones. 
After its recruitment to a specific site on a 
chromosome by a transcription regulatory 
protein, the writer collaborates with a 
reader protein to spread its mark from 
nucleosome to nucleosome by means of 
the indicated reader-writer complex. For 
this mechanism to work, the reader must 
recognize the same histone modification 
mark that the writer produces; its binding 
to that mark can be shown to activate 
the writer. In this schematic example, a 
spreading wave of chromatin condensation 
is thereby induced. Not shown are the 
additional proteins involved, including an 
ATP-dependent chromatin remodeling 
complex required to reposition the modified 
nucleosomes. 

Barrier DNA Sequences Block the Spread of Reader-Writer 
Complexes and thereby Separate Neighboring Chromatin 
Domains 


The above mechanism for spreading chromatin structures raises a potential prob- 
lem. Inasmuch as each chromosome contains one continuous, very long DNA 
molecule, what prevents a cacophony of confusing cross-talk between adjacent 
chromatin domains of different structure and function? Early studies of position 
effect variegation had suggested an answer: certain DNA sequences mark the 
boundaries of chromatin domains and separate one such domain from another 
(see Figure 4-31). Several such barrier sequences have now been identified and 
characterized through the use of genetic engineering techniques that allow spe- 
cific DNA segments to be deleted from, or inserted in, chromosomes. 

For example, in cells that are destined to give rise to red blood cells, a sequence 
called HS4 normally separates the active chromatin domain that contains the 
human β-globin locus from an adjacent region of silenced, condensed chromatin. 
If this sequence is deleted, the β-globin locus is invaded by condensed chromatin. 
This chromatin silences the genes it covers, and it spreads to a different extent in 
different cells, causing position effect variegation similar to that observed in Dro- 
sophila. As described in Chapter 7, the consequences are dire: the globin genes 
are poorly expressed, and individuals who carry such a deletion have a severe 
form of anemia. 
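
The effect of deleting a barrier can be illustrated with a toy calculation. The Python sketch below is not a model of the β-globin locus; the array length, the positions of the seed, barrier, and protected locus, and the function name `spread_with_barrier` are all assumptions made for illustration. Condensed chromatin spreads one nucleosome per cycle from a seed, and a barrier position simply cannot acquire the silencing mark.

```python
from typing import List, Set


def spread_with_barrier(n: int, seed: int, barriers: Set[int], cycles: int) -> List[bool]:
    """Spread a silencing mark from `seed` along `n` nucleosomes.

    Positions listed in `barriers` never acquire the mark, so the mark
    cannot pass through them (an idealized, fully effective barrier).
    """
    state = [False] * n
    state[seed] = True
    for _ in range(cycles):
        nxt = list(state)
        for i, silenced in enumerate(state):
            if not silenced:
                continue
            for j in (i - 1, i + 1):
                if 0 <= j < n and j not in barriers:
                    nxt[j] = True
        state = nxt
    return state


# Hypothetical layout: heterochromatin seed at position 0, barrier at
# position 4, and an active locus occupying positions 6 to 8.
LOCUS = range(6, 9)

with_barrier = spread_with_barrier(n=10, seed=0, barriers={4}, cycles=10)
barrier_deleted = spread_with_barrier(n=10, seed=0, barriers=set(), cycles=10)

print("locus silenced with barrier:   ", any(with_barrier[i] for i in LOCUS))
print("locus silenced, barrier deleted:", any(barrier_deleted[i] for i in LOCUS))
```

With the barrier in place the mark stops at position 3 and the locus stays active; with the barrier removed the same number of cycles silences the whole array. Varying `cycles` from cell to cell would give spreading to different extents, loosely analogous to the variegation described above.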

In genetic engineering experiments, the HS4 sequence is often added to both 
ends of a gene that is to be inserted into a mammalian genome, in order to protect 
that gene from the silencing caused by spreading heterochromatin. Analysis of 
this barrier sequence reveals that it contains a cluster of binding sites for histone 
acetylase enzymes. Since the acetylation of a lysine side chain is incompatible 
with the methylation of the same side chain, and specific lysine methylations are 
required to spread heterochromatin, histone acetylases are logical candidates for 
the formation of DNA barriers to spreading (Figure 4-41). However, several other 
types of chromatin modifications are known that can also protect genes from 
silencing. 

Figure 4-41 Some mechanisms of 
barrier action. These models are derived 
from experimental analyses of barrier 
action, and a combination of several of 
them may function at any one site. 

(A) The tethering of a region of chromatin to 
a large fixed site, such as the nuclear pore 
complex illustrated here, can form a barrier 
that stops the spread of heterochromatin. 
(B) The tight binding of barrier proteins to 
a group of nucleosomes can make this 
chromatin resistant to heterochromatin 
spreading. (C) By recruiting a group of 
highly active histone-modifying enzymes, 
barriers can erase the histone marks that 
are required for heterochromatin to spread. 
For example, a potent acetylation of lysine 
9 on histone H3 will compete with lysine 9 
methylation, thereby preventing the binding 
of the HP1 protein needed to form a major 
form of heterochromatin. (Based on 
A.G. West and P. Fraser, Hum. Mol. Genet. 
14:R101-R111, 2005. With permission 
from Oxford University Press.) 




Figure 4-42 [...] Saccharomyces cerevisiae, a special 
centromeric DNA sequence assembles a 
single nucleosome in which two copies of 
an H3 variant histone (called CENP-A in 
most organisms) replace the normal H3. 
(B) How peptide sequences unique to 
this variant histone (see Figure 4-35) help 
to assemble additional proteins, some 
of which form a kinetochore. The yeast 
kinetochore is unusual in capturing only 
a single microtubule; humans have much 
larger centromeres and form kinetochores 
that can capture 20 or more microtubules 
(see Figure 4-43). The kinetochore is 
discussed in detail in Chapter 17. (Adapted 
from A. Joglekar et al., Nat. Cell Biol. 
8:581-585, 2006. With permission from 
Macmillan Publishers Ltd.) 
(Figure labels: normal nucleosome; nucleosome with centromere-specific 
histone H3; sequence-specific DNA-binding protein; yeast centromeric DNA.) 

The Chromatin in Centromeres Reveals How Histone Variants Can 
Create Special Structures 


Nucleosomes carrying histone variants have a distinctive character and are 
thought to be able to produce marks in chromatin that are unusually long-lasting. 
An important example is seen in the formation and inheritance of the specialized 
chromatin structure at the centromere, the region of each chromosome required 
for attachment to the mitotic spindle and orderly segregation of the duplicated 
copies of the genome into daughter cells each time a cell divides. In many com- 
plex organisms, including humans, each centromere is embedded in a stretch of 
special centromeric chromatin that persists throughout interphase, even though 
the centromere-mediated attachment to the spindle and movement of DNA occur 
only during mitosis. This chromatin contains a centromere-specific variant H3 
histone, known as CENP-A (Centromere Protein-A; see Figure 4-35), plus addi- 
tional proteins that pack the nucleosomes into particularly dense arrangements 
and form the kinetochore, the special structure required for attachment of the 
mitotic spindle (see Figure 4-19). 

A specific DNA sequence of approximately 125 nucleotide pairs is sufficient to 
serve as a centromere in the yeast S. cerevisiae. Despite its small size, more than 
a dozen different proteins assemble on this DNA sequence; the proteins include 
the CENP-A histone H3 variant, which, along with the three other core histones, 
forms a centromere-specific nucleosome. The additional proteins at the yeast 
centromere attach this nucleosome to a single microtubule from the yeast mitotic 
spindle (Figure 4-42). 

The centromeres in more complex organisms are considerably larger than 
those in budding yeasts. For example, fly and human centromeres extend over 
hundreds of thousands of nucleotide pairs and, while they contain CENP-A, they 
do not seem to contain a centromere-specific DNA sequence. These centromeres 
largely consist of short, repeated DNA sequences, known as alpha satellite DNA 
in humans. But the same repeat sequences are also found at other (non-centro- 
meric) positions on chromosomes, indicating that they are not sufficient to direct 
centromere formation. Most strikingly, in some unusual cases, new human cen- 
tromeres (called neocentromeres) have been observed to form spontaneously on 
fragmented chromosomes. Some of these new positions were originally euchro- 
matic and lack alpha satellite DNA altogether (Figure 4-43). It seems that cen- 
tromeres in complex organisms are defined by an assembly of proteins, rather 
than by a specific DNA sequence. 

Inactivation of some centromeres and genesis of others de novo seem to have 
played an essential part in evolution. Different species, even when quite closely 

Figure 4-43 Evidence for the plasticity of human centromere formation. (A) A series of A-T-rich alpha satellite DNA 
sequences is repeated many thousands of times at each human centromere (red), and is surrounded by pericentric 
heterochromatin (brown). However, due to an ancient chromosome breakage-and-rejoining event, some human chromosomes 
contain two blocks of alpha satellite DNA, each of which presumably functioned as a centromere in its original chromosome. 
Usually, chromosomes with two functional centromeres are not stably propagated because they attach improperly to the 
spindle and are broken apart during mitosis. In chromosomes that do survive, however, one of the centromeres has somehow 
become inactivated, even though it contains all the necessary DNA sequences. This allows the chromosome to be stably 
propagated. (B) In a small fraction (1/2000) of human births, extra chromosomes are observed in cells of the offspring. Some of 
these extra chromosomes, which have formed from a breakage event, lack alpha satellite DNA altogether, yet new centromeres 
(neocentromeres) have arisen from what was originally euchromatic DNA. 


The complexity of centromeric chromatin is not illustrated in these diagrams. The alpha satellite DNA that forms centromeric 
chromatin in humans is packaged into alternating blocks of chromatin. One block is formed from a long string of nucleosomes 
containing the CENP-A H3 variant histone; the other block contains nucleosomes that are specially marked with dimethyl lysine 
4 on the normal H3 histone. Each block is more than a thousand nucleosomes long. This centromeric chromatin is flanked by 
pericentric heterochromatin, as shown. The pericentric chromatin contains methylated lysine 9 on its H3 histones, along with 
HP1 protein, and it is an example of “classical” 


related, often have different numbers of chromosomes; see Figure 4-14 for an 
extreme example. As we shall discuss below, detailed genome comparisons show 
that in many cases the changes in chromosome numbers have arisen through 
chromosome breakage-and-rejoining events, creating novel chromosomes, some 
of which must initially have contained abnormal numbers of centromeres—either 
more than one, or none at all. Yet stable inheritance requires that each chromo- 
some should contain one centromere, and one only. It seems that surplus cen- 
tromeres must have been inactivated, and/or new centromeres created, so as to 
allow the rearranged chromosome sets to be stably maintained. 


Some Chromatin Structures Can Be Directly Inherited 


The changes in centromere activity just discussed, once established, need to be 
perpetuated through subsequent cell generations. What could be the mechanism 
of this type of epigenetic inheritance? 

It has been proposed that de novo centromere formation requires an initial 
seeding event, involving the formation of a specialized DNA-protein structure that 
contains nucleosomes formed with the CENP-A variant of histone H3. In humans, 
this seeding event happens more readily on arrays of alpha satellite DNA than 
on other DNA sequences. The H3-H4 tetramers from each nucleosome on the 
parental DNA helix are directly inherited by the sister DNA helices at a replication 
fork (see Figure 5-32). Therefore, once a set of CENP-A-containing nucleosomes 
has been assembled on a stretch of DNA, it is easy to understand how a new cen- 
tromere could be generated in the same place on both daughter chromosomes 
following each round of cell division. One need only assume that the presence of 
the CENP-A histone in an inherited nucleosome selectively recruits more CENP-A 
histone to its newly formed neighbors. 
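
The assumption in the last sentence can be made concrete with a small Python sketch. It is a toy illustration, not a description of the actual mechanism: parental nucleosomes are handed out to the two daughter helices in a strictly alternating pattern (an assumed stand-in for their distribution at the fork), and each gap is then filled with a CENP-A nucleosome if an inherited neighbor carries CENP-A, or with a standard H3 nucleosome otherwise. The symbols and the names `replicate` and `fill_gaps` are hypothetical.

```python
from typing import List, Tuple

CENPA = "C"  # nucleosome containing the CENP-A variant of histone H3
H3 = "h"     # nucleosome containing standard histone H3
GAP = "_"    # position awaiting a newly assembled nucleosome


def replicate(parent: List[str]) -> Tuple[List[str], List[str]]:
    """Distribute parental nucleosomes alternately to two daughters.

    The alternating pattern is an assumption made to keep the example
    deterministic.
    """
    d1 = [n if i % 2 == 0 else GAP for i, n in enumerate(parent)]
    d2 = [n if i % 2 == 1 else GAP for i, n in enumerate(parent)]
    return d1, d2


def fill_gaps(daughter: List[str]) -> List[str]:
    """Fill each gap according to its inherited neighbors.

    A gap next to an inherited CENP-A nucleosome receives CENP-A;
    every other gap receives standard H3.
    """
    filled = list(daughter)
    for i, n in enumerate(daughter):
        if n != GAP:
            continue
        neighbors = daughter[max(i - 1, 0):i] + daughter[i + 1:i + 2]
        filled[i] = CENPA if CENPA in neighbors else H3
    return filled


# Parental chromosome with a CENP-A block at positions 4 to 9.
parent = list("hhhhCCCCCChhhh")
for half_filled in replicate(parent):
    print("".join(half_filled), "->", "".join(fill_gaps(half_filled)))
```

Both daughters regain CENP-A at every position of the parental block, so the centromere reappears in the same place. The simple neighbor rule also lets the block extend by one nucleosome at one edge, which is an artifact of the simplification rather than a claim about real centromeres.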

There are some striking similarities between the formation and maintenance 
of centromeres and the formation and maintenance of some other regions of 


heterochromatin (see Figure 4-39). 

heterochromatin. In particular, the entire centromere forms as an all-or-none 
entity, suggesting that the creation of centromeric chromatin is a highly coop- 
erative process, spreading out from an initial seed in a manner reminiscent of 
the phenomenon of position effect variegation that we discussed earlier. In both 
cases, a particular chromatin structure, once formed, seems to be directly inher- 
ited on the DNA following each round of chromosome replication. A cooperative 
recruitment of proteins, along with the action of reader-writer complexes, can 
thus not only account for the spreading of specific forms of chromatin in space 
along the chromosome, but also for its propagation across cell generations—from 
parent cell to daughter cell (Figure 4-44). 


Experiments with Frog Embryos Suggest that both Activating and 
Repressive Chromatin Structures Can Be Inherited Epigenetically 


Epigenetic inheritance plays a central part in the creation of multicellular organ- 
isms. Their differentiated cell types become established during development, and 
persist thereafter even through repeated cell-division cycles. The daughters of a 
liver cell persist as liver cells, those of an epidermal cell as epidermal cells, and so 
on, even though they all contain the same genome; and this is because distinctive 
patterns of gene expression are passed on faithfully from parent cell to daughter 
cell. Chromatin structure has a role in this epigenetic transmission of information 
from one cell generation to the next. 

One type of evidence comes from studies in which the nucleus of a cell from 
a frog or tadpole is transplanted into a frog egg whose own nucleus has been 
removed (an enucleated egg). In a classic set of experiments performed in 1968, 
it was shown that a nucleus taken from a differentiated donor cell can be repro- 
grammed in this way to support development of a whole new tadpole (see Figure 
7-2). But this reprogramming occurs only with difficulty, and it becomes less and 
less efficient as nuclei from older animals are used. Thus, for example, less than 
2% of the enucleated eggs injected with a nucleus from a tadpole epithelial cell 
developed to the swimming tadpole stage, compared with 35% when the donor 
nuclei were taken instead from an early (gastrula-stage) embryo. With new exper- 
imental tools, the cause of this resistance to reprogramming can now be traced. 
It arises, at least in part, because specific chromatin structures in the original dif- 
ferentiated nucleus tend to persist and be transmitted through the many cell-di- 
vision cycles required for embryonic development. In experiments with Xenopus 
embryos, specific forms of either repressive or active chromatin structures could 
be demonstrated to persist through as many as 24 cell divisions, causing the mis- 
placed expression of genes. Figure 4-45 briefly describes one such experiment, 

Figure 4-44 How the packaging of 
DNA in chromatin can be inherited 
following chromosome replication. 
In this model, some of the specialized 
chromatin components are distributed 
to each sister chromosome after DNA 
duplication, along with the specially marked 
nucleosomes that they bind. After DNA 
replication, the inherited nucleosomes that 
are specially modified, acting in concert 
with the inherited chromatin components, 
change the pattern of histone modification 
on the newly formed nucleosomes nearby. 
This creates new binding sites for the 
same chromatin components, which then 
assemble to complete the structure. The 
latter process is likely to involve reader- 
writer-remodeling complexes operating in a 
manner similar to that previously illustrated 
in Figure 4-40. 

focused on chromatin containing the histone variant, H3.3. We shall return to 
these phenomena in the final section of Chapter 22, where we discuss stem cells 
and the ways in which one cell type can be converted into another. 


Chromatin Structures Are Important for Eukaryotic Chromosome 
Function 


Although a great deal remains to be learned about the functions of different chro- 
matin structures, the packaging of DNA into nucleosomes was probably crucial 
for the evolution of eukaryotes like ourselves. To form a complex multicellular 
organism, the cells in different lineages must specialize by changing the acces- 
sibility and activity of many hundreds of genes. As described in Chapter 21, this 
process depends on cell memory: each cell holds a record of its past developmen- 
tal history in the regulatory circuits that control its many genes. That record, it 
seems, is partly stored in the structure of the chromatin. 

Although bacteria also have cell memory mechanisms, the complexity of the 
memory circuits in higher eukaryotes is unparalleled. Strategies based on local 
variations in chromatin structure, unique to eukaryotes, can enable individual 
genes, once they are switched on or switched off, to stay in that state until some 
new factor acts to reverse it. At one extreme are structures like centromeric chro- 
matin that, once established, are stably inherited from one cell generation to the 
next. Likewise, the major “classical” type of heterochromatin, which contains long 
arrays of the HP1 protein (see Figure 4-39), can persist stably throughout life. In 
contrast, a form of condensed chromatin that is created by the Polycomb group of 
proteins serves to silence genes that must be kept inactive in some conditions, but 
are active in others. The latter mechanism governs the expression of a large num- 
ber of genes that encode transcription regulators important in early embryonic 
development, as discussed in Chapter 21. There are many other variant forms of 
chromatin, some with much shorter lifetimes, often less than the division time of 
the cell. We shall say more about the variety of chromatin types in the next section. 


Figure 4-45 Evidence for the inheritance 
of a gene-activating chromatin state. 
The well-characterized MyoD gene 
encodes a master transcription regulatory 
protein for muscle, MyoD (see p. 399). This 
gene is normally turned on in the indicated 
region of the young embryo where somites 
form. When a nucleus from this region is 
injected into an enucleated egg as shown, 
many of the progeny cell nuclei abnormally 
express the MyoD protein in non-muscle 
regions of the “nuclear transplant embryo” 
that forms. This abnormal expression can 
be attributed to maintenance of the MyoD 
promoter region in its active chromatin 
state through the many cycles of cell 
division that produce the blastula-stage 
embryo, a so-called “epigenetic memory” 
that persists in this case in the absence 
of transcription. The active chromatin 
surrounding the MyoD promoter contains 
the variant histone H3.3 (see Figure 4-35) 
in a Lys4 methylated form. As indicated, 
an overproduction of this histone caused 
by injecting excess mRNA encoding the 
normal H3.3 protein increases both H3.3 
occupancy on the MyoD promoter and 
the epigenetic MyoD production, whereas 
injection of an mRNA producing a mutant 
form of H3.3 that cannot be methylated 
at Lys4 reduces the epigenetic MyoD 
production. Such experiments provide 
evidence that an inherited chromatin state 
underlies the epigenetic memory observed. 
(Adapted from R.K. Ng and J.B. Gurdon, 
Nat. Cell Biol. 10:102-109, 2008. With 
permission from Macmillan Publishers Ltd.) 

Summary 


In the chromosomes of eukaryotes, DNA is uniformly assembled into nucleosomes, 
but a variety of different chromatin structures is possible. This variety is based on a 
large set of reversible covalent modifications of the four histones in the nucleosome 
core. These modifications include the mono-, di-, and trimethylation of many differ- 
ent lysine side chains, an important reaction that is incompatible with the acetyla- 
tion that can occur on the same lysines. Specific combinations of the modifications 
mark many nucleosomes, governing their interactions with other proteins. These 
marks are read when protein modules that are part of a larger protein complex 
bind to the modified nucleosomes in a region of chromatin. These reader proteins 
then attract additional proteins that perform various functions. 

Some reader protein complexes contain a histone-modifying enzyme, such as a 
histone lysine methylase, that “writes” the same mark that the reader recognizes. A 
reader-writer-remodeling complex of this type can spread a specific form of chro- 
matin along a chromosome. In particular, large regions of condensed heterochro- 
matin are thought to be formed in this way. Heterochromatin is commonly found 
around centromeres and near telomeres, but it is also present at many other posi- 
tions in chromosomes. The tight packaging of DNA into heterochromatin usually 
silences the genes within it. 

The phenomenon of position effect variegation provides strong evidence for the 
inheritance of condensed states of chromatin from one cell generation to the next. A 
similar mechanism appears to be responsible for maintaining the specialized chro- 
matin at centromeres. More generally, the ability to propagate specific chromatin 
structures across cell generations makes possible an epigenetic cell memory process 
that plays a role in maintaining the set of different cell states required by complex 
multicellular organisms. 

Having discussed the DNA and protein molecules from which the chromatin fiber 
is made, we now turn to the organization of the chromosome on a more global 
scale and the way in which its various domains are arranged in space. As a 30-nm 
fiber, a typical human chromosome would still be 0.1 cm in length and able to 
span the nucleus more than 100 times. Clearly, there must be a still higher level 
of folding, even in interphase chromosomes. Although the molecular details are 
still largely a mystery, this higher-order packaging almost certainly involves the 
folding of the chromatin into a series of loops and coils. This chromatin packing is 
fluid, frequently changing in response to the needs of the cell. 
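
The figures in this paragraph can be checked with a little arithmetic. The sketch below is only a back-of-envelope estimate: the chromosome size (about 130 million nucleotide pairs), the 0.34 nm rise per nucleotide pair, the roughly 40-fold compaction of a 30-nm fiber, and the 6 µm nuclear diameter are assumed round numbers, not values given in this section.

```python
# Back-of-envelope estimate of how long a chromosome would be as a 30-nm fiber.
# All four constants are assumptions chosen for illustration.
NM_PER_NUCLEOTIDE_PAIR = 0.34      # rise per nucleotide pair in B-form DNA
PAIRS_PER_CHROMOSOME = 1.3e8       # a "typical" human chromosome (assumed)
FIBER_COMPACTION = 40.0            # fold compaction in a 30-nm fiber (assumed)
NUCLEUS_DIAMETER_UM = 6.0          # diameter of the nucleus (assumed)


def fiber_length_cm(pairs=PAIRS_PER_CHROMOSOME, compaction=FIBER_COMPACTION):
    """Length of the chromosome, in cm, if packed only as a 30-nm fiber."""
    extended_nm = pairs * NM_PER_NUCLEOTIDE_PAIR   # fully extended double helix
    return extended_nm / compaction * 1e-7         # 1 nm = 1e-7 cm


def nucleus_spans(length_cm, diameter_um=NUCLEUS_DIAMETER_UM):
    """How many times a fiber of this length could cross the nucleus."""
    return length_cm * 1e4 / diameter_um           # 1 cm = 1e4 micrometers


if __name__ == "__main__":
    length = fiber_length_cm()
    print(f"30-nm fiber length: {length:.2f} cm")
    print(f"spans of the nucleus: {nucleus_spans(length):.0f}")
```

With these assumptions the fiber comes out at roughly 0.1 cm and would cross the nucleus well over 100 times, in line with the statement in the text.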

We begin this section by describing some unusual interphase chromosomes 
that can be easily visualized. Exceptional though they are, these special cases 
reveal features that are thought to be representative of all interphase chromo- 
somes. Moreover, they provide ways to investigate some fundamental aspects of 
chromatin structure that we have touched on in the previous section. Next, we 
describe how a typical interphase chromosome is arranged in the mammalian 
cell nucleus. Finally, we shall discuss the additional tenfold compaction that chro- 
mosomes undergo in the passage from interphase to mitosis. 


Chromosomes Are Folded into Large Loops of Chromatin 


Insight into the structure of the chromosomes in interphase cells has come from 
studies of the stiff and enormously extended chromosomes in growing amphib- 
ian oocytes (immature eggs). These very unusual lampbrush chromosomes (the 
largest chromosomes known), paired in preparation for meiosis, are clearly visi- 
ble even in the light microscope, where they are seen to be organized into a series 
of large chromatin loops emanating from a linear chromosomal axis (Figure 4-46 
and Figure 4-47). 

In these chromosomes, a given loop always contains the same DNA sequence 
that remains extended in the same manner as the oocyte grows. These chromo- 
somes are producing large amounts of RNA for the oocyte, and most of the genes 
Figure 4-46 A model for the chromatin 
domains in a lampbrush chromosome. 
Shown is a small portion of one pair of 
sister chromatids. Here, two identical DNA 
double helices are aligned side by side, 
packaged into different types of chromatin. 
The set of lampbrush chromosomes 

in many amphibians contains a total of 
about 10,000 loops resembling those 
shown here. The rest of the DNA in each 
chromosome (the great majority) remains 
highly condensed. Four copies of each 
loop are present in the cell, since each 
lampbrush chromosome consists of two 
aligned sets of paired chromatids. This 
four-stranded structure is characteristic of 
this stage of development of the oocyte, 
which has arrested at the diplotene stage 
of meiosis; see Figure 17-56. 
present in the DNA loops are being actively expressed. The majority of the DNA, 
however, is not in loops but remains highly condensed on the chromosome axis, 
where genes are generally not expressed. 

It is thought that the interphase chromosomes of all eukaryotes are similarly 
arranged in loops. Although these loops are normally too small and fragile to be 
easily observed in a light microscope, other methods can be used to infer their 
presence. For example, modern DNA technologies have made it possible to assess 
the frequency with which any two loci along an interphase chromosome are held 
together, thus revealing likely candidates for the sites on chromatin that form the 
bases of loop structures (Figure 4-48). These experiments and others suggest that 
the DNA in human chromosomes is likely to be organized into loops of various 
lengths. A typical loop might contain between 50,000 and 200,000 nucleotide 
pairs of DNA, although loops of a million nucleotide pairs have also been sug- 
gested (Figure 4-49). 
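
The logic of inferring loop bases from contact frequencies can be sketched in a few lines. This is a minimal illustration, not the analysis used in any particular study: the input format (one pair of restriction-fragment indices per ligation product), the function names, and both thresholds are hypothetical.

```python
from collections import Counter


def contact_frequencies(junctions):
    """Turn ligation products into relative contact frequencies.

    junctions: iterable of (fragment_i, fragment_j) index pairs, one per
    ligation product detected. Self-ligations are ignored.
    """
    counts = Counter(tuple(sorted(pair)) for pair in junctions
                     if pair[0] != pair[1])
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


def loop_base_candidates(freqs, min_separation, min_frequency):
    """Pairs of fragments that are far apart along the DNA yet often ligated.

    Neighboring fragments ligate frequently simply because they are adjacent,
    so only pairs at least min_separation fragments apart are reported.
    """
    return sorted(pair for pair, freq in freqs.items()
                  if pair[1] - pair[0] >= min_separation
                  and freq >= min_frequency)


if __name__ == "__main__":
    # Made-up counts: fragments 2 and 40 are distant but frequently joined.
    observed = [(2, 3)] * 10 + [(40, 2)] * 6 + [(10, 11)] * 3 + [(5, 30)]
    freqs = contact_frequencies(observed)
    print(loop_base_candidates(freqs, min_separation=10, min_frequency=0.1))
```

In this toy data set only the pair (2, 40) is both distant and frequently ligated, so it is the one reported as a candidate for the base of a loop.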


Polytene Chromosomes Are Uniquely Useful for Visualizing 
Chromatin Structures 


Further insight has come from another unusual class of cells—the polytene cells of 
flies, such as the fruit fly Drosophila. Some types of cells, in many organisms, grow 
abnormally large through multiple cycles of DNA synthesis without cell division. 
Such cells, containing increased numbers of standard chromosomes, are said to 
be polyploid. In the salivary glands of fly larvae, this process is taken to an extreme 
degree, creating huge cells that contain hundreds or thousands of copies of the 
Figure 4-47 Lampbrush chromosomes. 
(A) A light micrograph of lampbrush 
chromosomes in an amphibian oocyte. 
Early in oocyte differentiation, each 
chromosome replicates to begin 

meiosis, and the homologous replicated 
chromosomes pair to form this highly 
extended structure containing a total of 
four replicated DNA double helices, or 
chromatids. The lampbrush chromosome 
stage persists for months or years, while 
the oocyte builds up a supply of materials 
required for its ultimate development into 
a new individual. (B) An enlarged region 
of a similar chromosome, stained with a 
fluorescent reagent that makes the loops 
active in RNA synthesis clearly visible. 
(Courtesy of Joseph G. Gall.) 


THE GLOBAL STRUCTURE OF CHROMOSOMES 


genome. Moreover, in this case, all the copies of each chromosome are aligned 
side by side in exact register, like drinking straws in a box, to create giant polytene 
chromosomes. These allow features to be detected that are thought to be shared 
with ordinary interphase chromosomes, but are normally hard to see. 

When polytene chromosomes from a fly’s salivary glands are viewed in the 
light microscope, distinct alternating dark bands and light interbands are visible 
(Figure 4-50), each formed from a thousand identical DNA sequences arranged 
side by side in register. About 95% of the DNA in polytene chromosomes is in 
bands, and 5% is in interbands. A very thin band can contain 3000 nucleotide 
pairs, while a thick band may contain 200,000 nucleotide pairs in each of its 
chromatin strands. The chromatin in each band appears dark because the DNA is more 
condensed than the DNA in interbands; it may also contain a higher concentra- 
tion of proteins (Figure 4-51). This banding pattern seems to reflect the same sort 
of organization detected in the amphibian lampbrush chromosomes described 
earlier. 
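
The copy numbers quoted above follow from simple doubling. As a rough sketch, assuming that every round of DNA synthesis without division doubles each chromatin strand exactly (real polytene cells can under-replicate some regions), ten rounds are enough to reach about a thousand aligned copies. The function names are illustrative only.

```python
def copies_after(rounds):
    """Aligned copies of a sequence after a number of doubling rounds."""
    return 2 ** rounds


def rounds_needed(target_copies):
    """Smallest number of doubling rounds giving at least target_copies."""
    rounds = 0
    while copies_after(rounds) < target_copies:
        rounds += 1
    return rounds


def band_dna_total(pairs_per_strand, strands):
    """Total nucleotide pairs in one band, summed over all aligned strands."""
    return pairs_per_strand * strands


if __name__ == "__main__":
    n = rounds_needed(1000)
    print(n, copies_after(n))
    # A thick band of 200,000 nucleotide pairs per strand, summed over strands.
    print(band_dna_total(200_000, copies_after(n)))
```

On these assumptions a thick band holds on the order of 200 million nucleotide pairs in total, which helps explain why such bands are visible in the light microscope.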

There are approximately 3700 bands and 3700 interbands in the complete set 
of Drosophila polytene chromosomes. The bands can be recognized by their dif- 
ferent thicknesses and spacings, and each one has been given a number to gener- 
ate a chromosome “map” that has been indexed to the finished genome sequence 
of this fly. 

The Drosophila polytene chromosomes provide a good starting point for exam- 
ining how chromatin is organized on a large scale. In the previous section, we 
saw that there are many forms of chromatin, each of which contains nucleosomes 
with a different combination of modified histones. Specific sets of non-histone 
proteins assemble on these nucleosomes to affect biological function in differ- 
ent ways. Recruitment of some of these non-histone proteins can spread for long 
distances along the DNA, imparting a similar chromatin structure to broad tracts 
Figure 4-48 A method for determining 
the position of loops in interphase 
chromosomes. In this technique, known 
as the chromosome conformation 
capture (3C) method, cells are treated 
with formaldehyde to create the indicated 
covalent DNA-protein and DNA-DNA 
cross-links. The DNA is then treated with 
an enzyme (a restriction nuclease) that 
chops the DNA into many pieces, cutting 
at strictly defined nucleotide sequences 
and forming sets of identical “cohesive 
ends” (see Figure 8-28). The cohesive 
ends can be made to join through their 
complementary base-pairing. Importantly, 
prior to the ligation step shown, the DNA 
is diluted so that the fragments that have 
been kept in close proximity to each other 
(through cross-linking) are the ones most 
likely to join. Finally, the cross-links are 
reversed and the newly ligated fragments 
of DNA are identified and quantified by 
PCR (the polymerase chain reaction, 
described in Chapter 8). From the results, 
combined with DNA sequence information, 
one can derive models for the interphase 
conformation of chromosomes. 
Figure 4-49 A model for the organization of an interphase chromosome. A section of an interphase chromosome is shown folded into a series 
of looped domains, each containing perhaps 50,000-200,000 or more nucleotide pairs of double-helical DNA condensed into a chromatin fiber. 
The chromatin in each individual loop is further condensed through poorly understood folding processes that are reversed when the cell requires 
direct access to the DNA packaged in the loop. Neither the composition of the postulated chromosomal axis nor how the folded chromatin fiber is 
anchored to it is clear. However, in mitotic chromosomes, the bases of the chromosomal loops are enriched both in condensins (discussed below) 
and in DNA topoisomerase II enzymes (discussed in Chapter 5), two proteins that may form much of the axis at metaphase. 
of the genome (see Figure 4-40). Such regions, where all of the chromatin has 
a similar structure, are separated from neighboring domains by barrier proteins 
(see Figure 4-41). At low resolution, the interphase chromosome can therefore 
be considered as a mosaic of chromatin structures, each containing particular 
nucleosome modifications associated with a particular set of non-histone pro- 
teins. Polytene chromosomes allow us to see details of this mosaic of domains in 
the light microscope, as well as to observe some of the changes associated with 
gene expression. 


There Are Multiple Forms of Chromatin 


By staining Drosophila polytene chromosomes with antibodies, or by using a 
more recent technique called ChIP (chromatin immunoprecipitation) analysis 
(see Chapter 8), the locations of the histone and non-histone proteins in chro- 
matin can be mapped across the entire DNA sequence of an organism’s genome. 
Such an analysis in Drosophila has thus far localized more than 50 different chro- 
matin proteins and histone modifications. The results suggest that three major 
types of repressive chromatin predominate in this organism, along with two major 
types of chromatin on actively transcribed genes, and that each type is associated 
with a different complex of non-histone proteins. Thus, classical heterochromatin 
contains more than six such proteins, including heterochromatin protein 1 (HP1), 
Figure 4-50 The entire set of polytene 
chromosomes in one Drosophila salivary 
cell. In this drawing of a light micrograph, 
the giant chromosomes have been 
spread out for viewing by squashing them 
against a microscope slide. Drosophila 
has four chromosomes, and there are four 
different chromosome pairs present. But 
each chromosome is tightly paired with 
its homolog (so that each pair appears 
as a single structure), which is not true 
in most nuclei (except in meiosis). Each 
chromosome has undergone multiple 
rounds of replication, and the homologs 
and all their duplicates have remained in 
exact register with each other, resulting 
in huge chromatin cables many DNA 
strands thick. 

The four polytene chromosomes 
are normally linked together by 
heterochromatic regions near their 
centromeres that aggregate to create 
a single large chromocenter (pink 
region). In this preparation, however, the 
chromocenter has been split into two 
halves by the squashing procedure used. 
(Adapted from T.S. Painter, J. Hered. 
25:465-476, 1934. With permission from 
Oxford University Press.) 


Figure 4-51 Micrographs of polytene 
chromosomes from Drosophila salivary 
glands. (A) Light micrograph of a portion of 
a chromosome. The DNA has been stained 
with a fluorescent dye, but a reverse image 
is presented here that renders the DNA 
black rather than white; the bands are 
clearly seen to be regions of increased 
DNA concentration. This chromosome 

has been processed by a high-pressure 
treatment so as to show its distinct pattern 
of bands and interbands more clearly. 

(B) An electron micrograph of a small 
section of a Drosophila polytene 
chromosome seen in thin section. Bands 
of very different thickness can be readily 
distinguished, separated by interbands, 
which contain less condensed chromatin. 
(A, adapted from D.V. Novikov, I. Kireev 
and A.S. Belmont, Nat. Methods 4:483- 
485, 2007. With permission from 
Macmillan Publishers Ltd; B, courtesy 

of Veikko Sorsa.) 
Figure 4-52 RNA synthesis in polytene chromosome puffs. 

An autoradiograph of a single puff in a polytene chromosome from the 
salivary glands of the freshwater midge Chironomus tentans. As outlined 

in Chapter 1 and described in detail in Chapter 6, the first step in gene 
expression is the synthesis of an RNA molecule using the DNA as a template. 
The decondensed portion of the chromosome is undergoing RNA synthesis 
and has become labeled with 3H-uridine, an RNA precursor molecule that is 
incorporated into growing RNA chains. (Courtesy of José Bonner.) 


whereas the so-called Polycomb form of heterochromatin contains a similar 
number of proteins of a different set (PcG proteins). In addition to the five major 
chromatin types, other more minor forms of chromatin appear to be present, each 
of which may be differently regulated and have distinct roles in the cell. 

The set of proteins bound as part of the chromatin at a given locus varies 
depending on the cell type and its stage of development. These variations make 
the accessibility of specific genes different in different tissues, helping to generate 
the cell diversification that accompanies embryonic development (described in 
Chapter 21). 


Chromatin Loops Decondense When the Genes Within Them Are 
Expressed 


When an insect progresses from one developmental stage to another, distinc- 
tive chromosome puffs arise and old puffs recede in its polytene chromosomes 
as new genes become expressed and old ones are turned off (Figure 4-52). From 
inspection of each puff when it is relatively small and the banding pattern is still 
discernible, it seems that most puffs arise from the decondensation of a single 
chromosome band. 

The individual chromatin fibers that make up a puff can be visualized with 
an electron microscope. In favorable cases, loops are seen, much like those 
observed in amphibian lampbrush chromosomes. When genes in the loop are 
not expressed, the loop assumes a thickened structure, possibly that of a folded 
30-nm fiber, but when gene expression is occurring, the loop becomes more 
extended. In electron micrographs, the chromatin located on either side of the 
decondensed loop appears considerably more compact, suggesting that a loop 
constitutes a distinct functional domain of chromatin structure. 

Observations in human cells also suggest that highly folded loops of chromatin 
expand to occupy an increased volume when a gene within them is expressed. For 
example, quiescent chromosome regions from 0.4 to 2 million nucleotide pairs in 
length appear as compact dots in an interphase nucleus when visualized by fluo- 
rescence microscopy. However, the same DNA is seen to occupy a larger territory 
when its genes are expressed, with elongated, punctate structures replacing the 
original dot. 

New ways of visualizing individual chromosomes have shown that each of the 
46 interphase chromosomes in a human cell tends to occupy its own discrete ter- 
ritory within the nucleus: that is, the chromosomes are not extensively entangled 
with one another (Figure 4-53). However, pictures such as these present only 
an average view of the DNA in each chromosome. Experiments that specifically 
localize the heterochromatic regions of a chromosome reveal that they are often 


Figure 4-53 Simultaneous visualization of the chromosome territories 
for all of the human chromosomes in a single interphase nucleus. Here, 
a mixture of DNA probes for each chromosome has been labeled so as to 
fluoresce with a different spectrum; this allows DNA-DNA hybridization to be 
used to detect each chromosome, as in Figure 4-10. Three-dimensional 
reconstructions were then produced. Below the micrograph, each 
chromosome is identified in a schematic of the actual image. Note that 
homologous chromosomes (e.g., the two copies of chromosome 9) are not 
in general co-located. (From M.R. Speicher and N.P. Carter, Nat. Rev. Genet. 
6:782-792, 2005. With permission from Macmillan Publishers Ltd.) 
Figure 4-54 The distribution of gene-rich regions of the human genome 
in an interphase nucleus. Gene-rich regions have been visualized with a 
fluorescent probe that hybridizes to the Alu interspersed repeat, which is 
present in more than a million copies in the human genome (see page 292). 
For unknown reasons, these sequences cluster in chromosomal regions 

rich in genes. In this representation, regions enriched for the Alu sequence 
are green, regions depleted for these sequences are red, while the average 
regions are yellow. The gene-rich regions are seen to be largely absent in 
the DNA near the nuclear envelope. (From A. Bolzer et al., PLoS Biol. 
3:826-842, 2005.) 


closely associated with the nuclear lamina, regardless of the chromosome exam- 
ined. And DNA probes that preferentially stain gene-rich regions of human chro- 
mosomes produce a striking picture of the interphase nucleus that presumably 
reflects different average positions for active and inactive genes (Figure 4-54). 
How is most of the chromatin in each interphase chromosome condensed 
when its genes are not being expressed? A powerful extension of the chromosome 
conformation capture method described previously (see Figure 4-48), which 
exploits a high-throughput DNA sequencing technology called massive parallel 
sequencing (see Panel 8-1, pp. 478-481), allows the connections between all of 
the different one-megabase (1 Mb) segments of the human genome to be mapped 
in human interphase chromosomes. The results reveal that most regions of our 
chromosomes are folded into a conformation referred to as a fractal globule: a 
knot-free arrangement that facilitates maximally dense packing while, at the same 
time, preserving the ability of the chromatin fiber to unfold and fold (Figure 4-55). 
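
The bookkeeping behind such a genome-wide contact map can be sketched as follows. This is a simplified illustration for a single chromosome, assuming each sequenced ligation product has already been reduced to a pair of positions in nucleotide pairs; the function names and input format are hypothetical. Models of folding are then judged by how the average contact count falls off with separation along the DNA.

```python
def contact_matrix(read_pairs, chrom_length, bin_size=1_000_000):
    """Count contacts between fixed-size segments (1 Mb by default).

    read_pairs: iterable of (position_a, position_b), one per ligation
    product. Returns a symmetric list-of-lists matrix of counts.
    """
    n_bins = -(-chrom_length // bin_size)          # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in read_pairs:
        i, j = pos_a // bin_size, pos_b // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1                      # keep the map symmetric
    return matrix


def mean_contacts_by_distance(matrix):
    """Average contact count for segments d bins apart, for every d."""
    n = len(matrix)
    profile = []
    for d in range(n):
        values = [matrix[i][i + d] for i in range(n - d)]
        profile.append(sum(values) / len(values))
    return profile


if __name__ == "__main__":
    pairs = [(100, 200), (500_000, 1_500_000), (2_500_000, 300_000)]
    m = contact_matrix(pairs, chrom_length=3_000_000)
    print(m)
    print(mean_contacts_by_distance(m))
```

For the whole human genome at 1 Mb resolution the same procedure yields a matrix of roughly three thousand by three thousand segments.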


Chromatin Can Move to Specific Sites Within the Nucleus to Alter 
Gene Expression 


A variety of different types of experiments has led to the conclusion that the 
position of a gene in the interior of the nucleus changes when it becomes highly 
expressed. Thus, a region that becomes very actively transcribed is sometimes 
found to extend out of its chromosome territory, as if in an extended loop (Figure 
4-56). We will see in Chapter 6 that the initiation of transcription—the first step in 
gene expression—requires the assembly of over 100 proteins, and it makes sense 
that this would be facilitated in regions of the nucleus enriched in these proteins. 

More generally, it is clear that the nucleus is very heterogeneous, with func- 
tionally different regions to which portions of chromosomes can move as they are 
subjected to different biochemical processes—such as when their gene expres- 
sion changes. It is this issue that we discuss next. 
Figure 4-55 A fractal globule model for 
interphase chromatin. An extension of 
the 3C method in Figure 4-48, called Hi-C, 
was used to measure the extent to which 
each of the three thousand 1 Mb segments 
in the human genome was located adjacent 
to any other of these segments. The 
results support the type of model shown. 

In the enlarged fractal globule illustrated, 

a region of 5 million base pairs is seen to 
fold in a way that keeps regions that are 
neighbors along the one-dimensional DNA 
helix as neighbors in three dimensions; 

this gives rise to monochromatic blocks in 
this representation that are obvious both 
on the surface and in cross section. The 
fractal globule is a knot-free conformation 
of the DNA that permits dense packing, yet 
retains an ability to easily fold and unfold 
any genomic locus. (Adapted from 

E. Lieberman-Aiden et al., Science 
326:289-293, 2009. With permission 

from AAAS.) 
Networks of Macromolecules Form a Set of Distinct Biochemical 
Environments inside the Nucleus 


In Chapter 6, we shall describe the function of a variety of subcompartments that 
are present within the nucleus. The largest and most obvious of these is the nucle- 
olus, a structure well known to microscopists even in the nineteenth century (see 
Figure 4-9). The nucleolus is the cell’s site of ribosome subunit formation, as well 
as the place where many other specialized reactions occur (see Figure 6-42): it 
consists of a network of RNAs and proteins concentrated around ribosomal RNA 
genes that are being actively transcribed. In eukaryotes, the genome contains 
multiple copies of the ribosomal RNA genes, and although they are typically clus- 
tered together in a single nucleolus, they are often located on several separate 
chromosomes. 

A variety of less obvious organelles are also present inside the nucleus. For 
example, spherical structures called Cajal bodies and interchromatin granule 
clusters are present in most plant and animal cells (Figure 4-57). Like the nucle- 
olus, these organelles are composed of selected protein and RNA molecules that 
bind together to create networks that are highly permeable to other protein and 
RNA molecules in the surrounding nucleoplasm. 

Structures such as these can create distinct biochemical environments by 
immobilizing select groups of macromolecules, as can other networks of proteins 
and RNA molecules associated with nuclear pores and with the nuclear envelope. 
In principle, this allows other molecules that enter these spaces to be processed 
with great efficiency through complex reaction pathways. Highly permeable, 
fibrous networks of this sort can thereby impart many of the kinetic advantages of 
compartmentalization (see p. 164) to reactions that take place in subregions of the 
nucleus (Figure 4-58A). However, unlike the membrane-bound compartments in 
the cytoplasm (discussed in Chapter 12), these nuclear subcompartments—lack- 
ing a lipid bilayer membrane—can neither concentrate nor exclude specific small 
molecules. 

The cell has a remarkable ability to construct distinct environments to per- 
form complex biochemical tasks efficiently. Those that we have mentioned in the 
nucleus facilitate various aspects of gene expression, and will be further discussed 
in Chapter 6. These subcompartments, including the nucleolus, appear to form 
Figure 4-56 An effect of high levels of 
gene expression on the intranuclear 
location of chromatin. (A) Fluorescence 
micrographs of human nuclei showing 
how the position of a gene changes when 
it becomes highly transcribed. The region 
of the chromosome adjacent to the gene 
(red) is seen to leave its chromosomal 
territory (green) only when it is highly 
active. (B) Schematic representation of 

a large loop of chromatin that expands 
when the gene is on, and contracts when 
the gene is off. Other genes that are less 
actively expressed can be shown by the 
same methods to remain inside their 
chromosomal territory when transcribed. 
(From J.R. Chubb and W.A. Bickmore, Cell 
112:403-406, 2003. With permission from 
Elsevier.) 





Figure 4-57 Electron micrograph 
showing two very common fibrous 
nuclear subcompartments. The large 
sphere here is a Cajal body. The smaller 
darker sphere is an interchromatin granule 
cluster, also known as a speckle (see 
also Figure 6-46). These “subnuclear 
organelles” are from the nucleus of a 
Xenopus oocyte. (From K.E. Handwerger 
and J.G. Gall, Trends Cell Biol. 16:19-26, 
2006. With permission from Elsevier.) 


214 Chapter 4: DNA, Chromosomes, and Genomes 


only as needed, and they create a high local concentration of the many different 
enzymes and RNA molecules needed for a particular process. In an analogous 
way, when DNA is damaged by irradiation, the set of enzymes needed to carry out 
DNA repair are observed to congregate in discrete foci inside the nucleus, creating 
“repair factories” (see Figure 5-52). And nuclei often contain hundreds of discrete 
foci representing factories for DNA or RNA synthesis (see Figure 6-47). 

It seems likely that all of these entities make use of the type of tethering illus- 
trated in Figure 4-58B, where long flexible lengths of polypeptide chain and/or 
long noncoding RNA molecules are interspersed with specific binding sites that 
concentrate the multiple proteins and other molecules that are needed to catalyze 
a particular process. Not surprisingly, tethers are similarly used to help to speed 
biological processes in the cytoplasm, increasing specific reaction rates there (for 
example, see Figure 16-18). 

Is there also an intranuclear framework, analogous to the cytoskeleton, on 
which chromosomes and other components of the nucleus are organized? The 
nuclear matrix, or scaffold, has been defined as the insoluble material left in the 
nucleus after a series of biochemical extraction steps. Many of the proteins and 
RNA molecules that form this insoluble material are likely to be derived from the 
fibrous subcompartments of the nucleus just discussed, while others may be pro- 
teins that help to form the base of chromosomal loops or to attach chromosomes 
to other structures in the nucleus. 


Mitotic Chromosomes Are Especially Highly Condensed 


Having discussed the dynamic structure of interphase chromosomes, we now 
turn to mitotic chromosomes. The chromosomes from nearly all eukaryotic cells 
become readily visible by light microscopy during mitosis, when they coil up to 
form highly condensed structures. This condensation reduces the length of a 
typical interphase chromosome only about tenfold, but it produces a dramatic 
change in chromosome appearance. 

Figure 4-59 depicts a typical mitotic chromosome at the metaphase stage 
of mitosis (for the stages of mitosis, see Figure 17-3). The two DNA molecules 
produced by DNA replication during interphase of the cell-division cycle are 
separately folded to produce two sister chromosomes, or sister chromatids, held 
together at their centromeres, as mentioned earlier. These chromosomes are nor- 
mally covered with a variety of molecules, including large amounts of RNA-protein 


Figure 4-58 Effective compartmentalization 
without a bilayer membrane. (A) Schematic 
illustration of the organization of a spherical 
subnuclear organelle (left) and of a postulated 
similarly organized subcompartment just 
beneath the nuclear envelope (right). In 

both cases, RNAs and/or proteins (gray) 
associate to form highly porous, gel-like 
structures that contain binding sites for other 
specific proteins and RNA molecules (colored 
objects). (B) How the tethering of a selected 
set of proteins and RNA molecules to long 
flexible polymer chains, as in (A), can create 
“staging areas” that greatly speed the rates of 
reactions in subcompartments of the nucleus. 
The reactions catalyzed will depend on the 
particular macromolecules that are localized 
by the tethering. The same strategy for 
accelerating complex sets of reactions is also 
employed in subcompartments elsewhere in 
the cell (see also Figure 3-78). 



Figure 4-59 A typical mitotic 
chromosome at metaphase. Each sister 
chromatid contains one of two identical 
sister DNA molecules generated earlier in 
the cell cycle by DNA replication (see also 
Figure 17-21). 
Figure 4-60 A scanning electron micrograph of a region near one end 
of a typical mitotic chromosome. Each knoblike projection is believed to 
represent the tip of a separate looped domain. Note that the two identical 
paired chromatids (drawn in Figure 4-59) can be clearly distinguished. 
(From M.P. Marsden and U.K. Laemmli, Cell 17:849-858, 1979. With 
permission from Elsevier.) 


complexes. Once this covering has been stripped away, each chromatid can be 
seen in electron micrographs to be organized into loops of chromatin emanating 
from a central scaffolding (Figure 4-60). Experiments using DNA hybridization 
to detect specific DNA sequences demonstrate that the order of visible features 
along a mitotic chromosome at least roughly reflects the order of genes along the 
DNA molecule. Mitotic chromosome condensation can thus be thought of as the 
final level in the hierarchy of chromosome packaging (Figure 4-61). 

The compaction of chromosomes during mitosis is a highly organized and 
dynamic process that serves at least two important purposes. First, when conden- 
sation is complete (in metaphase), sister chromatids have been disentangled from 
each other and lie side by side. Thus, the sister chromatids can easily separate 
when the mitotic apparatus begins pulling them apart. Second, the compaction 
of chromosomes protects the relatively fragile DNA molecules from being broken 
as they are pulled to separate daughter cells. 

The condensation of interphase chromosomes into mitotic chromosomes 
begins in early M phase, and it is intimately connected with the progression of 
the cell cycle. During M phase, gene expression shuts down, and specific mod- 
ifications are made to histones that help to reorganize the chromatin as it com- 
pacts. Two classes of ring-shaped proteins, called cohesins and condensins, aid 
this compaction. How they help to produce the two separately folded chromatids 
of a mitotic chromosome will be discussed in Chapter 17, along with the details 
of the cell cycle. 
Figure 4-61 Chromatin packing. This 
model shows some of the many levels 
of chromatin packing postulated to give 
rise to the highly condensed mitotic 
chromosome. 
Summary 


Chromosomes are generally decondensed during interphase, so that the details 
of their structure are difficult to visualize. Notable exceptions are the specialized 
lampbrush chromosomes of vertebrate oocytes and the polytene chromosomes in 
the giant secretory cells of insects. Studies of these two types of interphase chromo- 
somes suggest that each long DNA molecule in a chromosome is divided into a large 
number of discrete domains organized as loops of chromatin that are compacted by 
further folding. When genes contained in a loop are expressed, the loop unfolds and 
allows the cell’s machinery access to the DNA. 

Interphase chromosomes occupy discrete territories in the cell nucleus; that is, 
they are not extensively intertwined. Euchromatin makes up most of interphase 
chromosomes and, when not being transcribed, it probably exists as tightly folded 
fibers of compacted nucleosomes. However, euchromatin is interrupted by stretches 
of heterochromatin, in which the nucleosomes are subjected to additional packing 
that usually renders the DNA resistant to gene expression. Heterochromatin exists in 
several forms, some of which are found in large blocks in and around centromeres 
and near telomeres. But heterochromatin is also present at many other positions on 
chromosomes, where it can serve to help regulate developmentally important genes. 

The interior of the nucleus is highly dynamic, with heterochromatin often posi- 
tioned near the nuclear envelope and loops of chromatin moving away from their 
chromosome territory when genes are very highly expressed. This reflects the exis- 
tence of nuclear subcompartments, where different sets of biochemical reactions 
are facilitated by an increased concentration of selected proteins and RNAs. The 
components involved in forming a subcompartment can self-assemble into discrete 
organelles such as nucleoli or Cajal bodies; they can also be tethered to fixed struc- 
tures such as the nuclear envelope. 

During mitosis, gene expression shuts down and all chromosomes adopt a 
highly condensed conformation in a process that begins early in M phase to pack- 
age the two DNA molecules of each replicated chromosome as two separately folded 
chromatids. The condensation is accompanied by histone modifications that facil- 
itate chromatin packing, but satisfactory completion of this orderly process, which 
reduces the end-to-end distance of each DNA molecule from its interphase length by 
an additional factor of ten, requires additional proteins. 


HOW GENOMES EVOLVE 


In this final section of the chapter, we provide an overview of some of the ways 
that genes and genomes have evolved over time to produce the vast diversity of 
modern-day life-forms on our planet. The sequencing of the genomes of thou- 
sands of organisms is revolutionizing our view of the process of evolution, uncov- 
ering an astonishing wealth of information about not only family relationships 
among organisms, but also about the molecular mechanisms by which evolution 
has proceeded. 

It is perhaps not surprising that genes with similar functions can be found in 
a diverse range of living things. But the great revelation of the past 30 years has 
been the extent to which the actual nucleotide sequences of many genes have 
been conserved. Homologous genes—that is, genes that are similar in both their 
nucleotide sequence and function because of a common ancestry—can often be 
recognized across vast phylogenetic distances. Unmistakable homologs of many 
human genes are present in organisms as diverse as nematode worms, fruit flies, 
yeasts, and even bacteria. In many cases, the resemblance is so close that, for 
example, the protein-coding portion of a yeast gene can be substituted with its 
human homolog—even though humans and yeast are separated by more than a 
billion years of evolutionary history. 

As emphasized in Chapter 3, the recognition of sequence similarity has 
become a major tool for inferring gene and protein function. Although a sequence 
match does not guarantee similarity in function, it has proved to be an excellent 
clue. Thus, it is often possible to predict the function of genes in humans for which 
no biochemical or genetic information is available simply by comparing their 
nucleotide sequences with the sequences of genes that have been characterized 
in other more readily studied organisms. 

In general, the sequences of individual genes are much more tightly con- 
served than is overall genome structure. Features of genome organization such 
as genome size, number of chromosomes, order of genes along chromosomes, 
abundance and size of introns, and amount of repetitive DNA are found to differ 
greatly when comparing distant organisms, as does the number of genes that each 
organism contains. 


Genome Comparisons Reveal Functional DNA Sequences by their 
Conservation Throughout Evolution 


A first obstacle in interpreting the sequence of the 3.2 billion nucleotide pairs in 
the human genome is the fact that much of it is probably functionally unimport- 
ant. The regions of the genome that code for the amino acid sequences of proteins 
(the exons) are typically found in short segments (average size about 145 nucle- 
otide pairs), small islands in a sea of DNA whose exact nucleotide sequence is 
thought to be mostly of little consequence. This arrangement makes it difficult 
to identify all the exons in a stretch of DNA, and it is often also hard to determine 
exactly where a gene begins and ends. 

One very important approach to deciphering our genome is to search for DNA 
sequences that are closely similar between different species, on the principle 
that DNA sequences that have a function are much more likely to be conserved 
than those without a function. For example, humans and mice are thought to 
have diverged from a common mammalian ancestor about 80 × 10⁶ years ago, 
which is long enough for the majority of nucleotides in their genomes to have 
been changed by random mutational events. Consequently, the only regions that 
will have remained closely similar in the two genomes are those in which muta- 
tions would have impaired function and put the animals carrying them at a dis- 
advantage, resulting in their elimination from the population by natural selection. 
Such closely similar pieces of DNA sequence are known as conserved regions. In 
addition to revealing those DNA sequences that encode functionally important 
exons and RNA molecules, these conserved regions will include regulatory DNA 
sequences as well as DNA sequences with functions that are not yet known. In 
contrast, most nonconserved regions will reflect DNA whose sequence is much 
less likely to be critical for function. 
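
The idea of scanning two genomes for conserved regions can be sketched in a few lines of code. The example below is only an illustration under simplifying assumptions: the two sequences are taken to be already aligned without gaps, the function names, window size, and identity threshold are hypothetical choices, and the sequences are invented for the example. Real genome comparisons rely on dedicated alignment software and statistical models.

```python
def window_identity(seq_a, seq_b, window=10):
    """Return (start, fraction identical) for every window of a gapless alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    scores = []
    for start in range(len(seq_a) - window + 1):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        matches = sum(x == y for x, y in zip(a, b))
        scores.append((start, matches / window))
    return scores


def conserved_windows(seq_a, seq_b, window=10, threshold=0.9):
    """Start positions of windows whose identity meets the (assumed) threshold."""
    return [start
            for start, identity in window_identity(seq_a, seq_b, window)
            if identity >= threshold]


# Invented sequences: the first 10 positions match, the last 10 do not.
species_1 = "AAAAAAAAAACCCCCCCCCC"
species_2 = "AAAAAAAAAAGGGGGGGGGG"
print(conserved_windows(species_1, species_2))
```

Windows that pass the threshold mark candidate conserved regions; windows that fail it correspond to sequence that has drifted freely.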

The power of this method can be increased by including in such comparisons 
the genomes of large numbers of species whose genomes have been sequenced, 
such as rat, chicken, fish, dog, and chimpanzee, as well as mouse and human. 
By revealing in this way the results of a very long natural “experiment,” lasting 
for hundreds of millions of years, such comparative DNA sequencing studies 
have highlighted the most interesting regions in our genome. The comparisons 
reveal that roughly 5% of the human genome consists of “multispecies conserved 
sequences.” To our great surprise, only about one-third of these sequences code 
for proteins (see Table 4-1, p. 184). Many of the remaining conserved sequences 
consist of DNA containing clusters of protein-binding sites that are involved in 
gene regulation, while others produce RNA molecules that are not translated 
into protein but are important for other known purposes. But, even in the most 
intensively studied species, the function of the majority of these highly conserved 
sequences remains unknown. This remarkable discovery has led scientists to con- 
clude that we understand much less about the cell biology of vertebrates than we 
had thought. Certainly, there are enormous opportunities for new discoveries, 
and we should expect many more surprises ahead. 


Genome Alterations Are Caused by Failures of the Normal 
Mechanisms for Copying and Maintaining DNA, as well as by 
Transposable DNA Elements 


Evolution depends on accidents and mistakes followed by nonrandom survival. 
Most of the genetic changes that occur result simply from failures in the normal 
mechanisms by which genomes are copied or repaired when damaged, although 
the movement of transposable DNA elements (discussed below) also plays an 
important part. As we will explain in Chapter 5, the mechanisms that maintain 
DNA sequences are remarkably precise—but they are not perfect. DNA sequences 
are inherited with such extraordinary fidelity that typically, along a given line of 
descent, only about one nucleotide pair in a thousand is randomly changed in the 
germ line every million years. Even so, in a population of 10,000 diploid individu- 
als, every possible nucleotide substitution will have been “tried out” on about 20 
occasions in the course of a million years—a short span of time in relation to the 
evolution of species. 
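
The figure of about 20 occasions follows from simple arithmetic, reproduced in the sketch below. The function name is hypothetical, and the calculation assumes a constant population size and counts all changes at a given site together, without dividing them among the three alternative nucleotides.

```python
def expected_changes_per_site(sites_per_change, genome_copies, million_years=1):
    """Expected number of changes at one nucleotide site, summed over the population.

    sites_per_change: one change per this many nucleotide pairs, per lineage,
                      per million years (1000 in the text).
    genome_copies:    number of genome copies in the population.
    """
    return genome_copies * million_years / sites_per_change


diploid_individuals = 10_000
genome_copies = 2 * diploid_individuals      # each diploid individual carries two copies

print(expected_changes_per_site(1000, genome_copies))   # changes per site per million years
```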

Errors in DNA replication, DNA recombination, or DNA repair can lead either 
to simple local changes in DNA sequence—so-called point mutations such as the 
substitution of one base pair for another—or to large-scale genome rearrange- 
ments such as deletions, duplications, inversions, and translocations of DNA from 
one chromosome to another. In addition to these failures of the genetic machin- 
ery, genomes contain mobile DNA elements that are an important source of 
genomic change (see Table 5-3, p. 267). These transposable DNA elements (trans- 
posons) are parasitic DNA sequences that can spread within the genomes they 
colonize. In the process, they often disrupt the function or alter the regulation 
of existing genes. On occasion, they have created altogether novel genes through 
fusions between transposon sequences and segments of existing genes. Over long 
periods of evolutionary time, DNA transposition events have profoundly affected 
genomes, so much so that nearly half of the DNA in the human genome consists 
of recognizable relics of past transposition events (Figure 4-62). Even more of our 
genome is thought to have been derived from transpositions that occurred so long 
ago (>10⁸ years) that the sequences can no longer be traced to transposons. 


The Genome Sequences of Two Species Differ in Proportion to the 
Length of Time Since They Have Separately Evolved 


The differences between the genomes of species alive today have accumulated 
over more than 3 billion years. Although we lack a direct record of changes over 
time, scientists can reconstruct the process of genome evolution from detailed 
comparisons of the genomes of contemporary organisms. 

The basic organizing framework for comparative genomics is the phyloge- 
netic tree. A simple example is the tree describing the divergence of humans from 
the great apes (Figure 4-63). The primary support for this tree comes from com- 
parisons of gene or protein sequences. For example, comparisons between the 
sequences of human genes or proteins and those of the great apes typically reveal 
the fewest differences between human and chimpanzee and the most between 
human and orangutan. 

For closely related organisms such as humans and chimpanzees, it is relatively 
easy to reconstruct the gene sequences of the extinct, last common ancestor of the 
two species (Figure 4-64). The close similarity between human and chimpanzee 
genes is mainly due to the short time that has been available for the accumulation 
of mutations in the two diverging lineages, rather than to functional constraints 

Figure 4-62 A representation of the 
nucleotide sequence content of the 
sequenced human genome. The LINEs 
(long interspersed nuclear elements), SINEs 
(short interspersed nuclear elements), 
retroviral-like elements, and DNA-only 
transposons are mobile genetic elements 
that have multiplied in our genome by 
replicating themselves and inserting the 
new copies in different positions. These 
mobile genetic elements are discussed in 
Chapter 5 (see Table 5-3, p. 267). Simple 
sequence repeats are short nucleotide 
sequences (less than 14 nucleotide pairs) 
that are repeated again and again for long 
stretches. Segmental duplications are large 
blocks of DNA sequence (1000-200,000 
nucleotide pairs) that are present at two 
or more locations in the genome. The 
most highly repeated blocks of DNA 
in heterochromatin have not yet been 
completely sequenced; therefore about 
10% of human DNA sequences are not 
represented in this diagram. (Data courtesy 
of E. Margulies.) 

that have kept the sequences the same. Evidence for this view comes from the 
observation that the human and chimpanzee genomes are nearly identical even 
where there is no functional constraint on the nucleotide sequence—such as in 
the third position of “synonymous” codons (codons specifying the same amino 
acid but differing in their third nucleotide). 
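
To make the notion of synonymous codons concrete, the following sketch counts how many codon differences between two aligned coding sequences leave the encoded amino acid unchanged. It uses the standard genetic code; the function name and the short example sequences are hypothetical, and the sequences are assumed to be aligned in frame without gaps.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_differences(cds_a, cds_b):
    """Count (synonymous, nonsynonymous) codon differences between two aligned
    coding sequences of equal length."""
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and of equal length")
    synonymous = nonsynonymous = 0
    for i in range(0, len(cds_a), 3):
        codon_a, codon_b = cds_a[i:i + 3], cds_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1      # same amino acid: no constraint from the protein
        else:
            nonsynonymous += 1   # amino acid changed
    return synonymous, nonsynonymous


# Invented example: GCT -> GCC keeps alanine; GAT -> GAA changes Asp to Glu.
print(classify_differences("ATGGCTGAT", "ATGGCCGAA"))
```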

For much less closely related organisms, such as humans and chickens (which 
have evolved separately for about 300 million years), the sequence conservation 
found in genes is almost entirely due to purifying selection (that is, selection that 
eliminates individuals carrying mutations that interfere with important genetic 
functions), rather than to an inadequate time for mutations to occur. 


Phylogenetic Trees Constructed from a Comparison of DNA 
Sequences Trace the Relationships of All Organisms 


Phylogenetic trees based on molecular sequence data can be compared with 
the fossil record, and we get our best view of evolution by integrating the two 
approaches. The fossil record remains essential as a source of absolute dates, 

Figure 4-63 A phylogenetic tree 
showing the relationship between 
humans and the great apes based on 
nucleotide sequence data. As indicated, 
the sequences of the genomes of all four 
species are estimated to differ from the 
sequence of the genome of their last 
common ancestor by a little over 1.5%. 
Because changes occur independently 
on both diverging lineages, pairwise 
comparisons reveal twice the sequence 
divergence from the last common 
ancestor. For example, human-orangutan 
comparisons typically show sequence 
divergences of a little over 3%, while 
human-chimpanzee comparisons show 
divergences of approximately 1.2%. 
(Modified from F.C. Chen and W.H. Li, 
Am. J. Hum. Genet. 68:444-456, 2001.) 


Figure 4-64 Tracing the ancestral 
sequence from a sequence comparison 
of the coding regions of human and 
chimpanzee leptin genes. Reading left 
to right and top to bottom, a continuous 
300-nucleotide segment of a leptin-coding 
gene is illustrated. Leptin is a hormone 
that regulates food intake and energy 
utilization in response to the adequacy of 
fat reserves. As indicated by the codons 
boxed in green, only 5 nucleotides (of 

441 total) differ between the two species. 
Moreover, in only one of the five positions 
does the difference in nucleotide lead to 

a difference in the encoded amino acid. 
For each of the five variant nucleotide 
positions, the corresponding sequence in 
the gorilla is also indicated. In two cases, 
the gorilla sequence agrees with the human 
sequence, while in three cases it agrees 
with the chimpanzee sequence. 

What was the sequence of the leptin gene 
in the last common ancestor? The most 
economical assumption is that evolution 
has followed a pathway requiring the 
minimum number of mutations consistent 
with the data. Thus, it seems likely that 
the leptin sequence of the last common 
ancestor was the same as the human and 
chimpanzee sequences when they agree; 
when they disagree, the gorilla sequence 
would be used as a tiebreaker. For 
convenience, only the first 300 nucleotides 
of the leptin-coding sequences are given. 
The remaining 141 are identical between 
humans and chimpanzees. 

Figure 4-65 The very different rates of evolution of exons and introns, as illustrated by comparing a portion of the 
mouse and human leptin genes. Positions where the sequences differ by a single nucleotide substitution are boxed in green, 
and positions that differ by the addition or deletion of nucleotides are boxed in yellow. Note that, thanks to purifying selection, 
the coding sequence of the exon is much more conserved than is the adjacent intron sequence. 


based on radioisotope decay in the rock formations in which fossils are found. 
Because the fossil record has many gaps, however, precise divergence times 
between species are difficult to establish, even for species that leave good fossils 
with distinctive morphology. 

Phylogenetic trees whose timing has been calibrated according to the fos- 
sil record suggest that changes in the sequences of particular genes or proteins 
tend to occur at a nearly constant rate, although rates that differ from the norm 
by as much as twofold are observed in particular lineages. This provides us with a 
molecular clock for evolution—or rather a set of molecular clocks corresponding 
to different categories of DNA sequence. As in the example in Figure 4-65, the 
clock runs most rapidly and regularly in sequences that are not subject to purifying 
selection. These include portions of introns that lack splicing or regulatory signals, 
the third position in synonymous codons, and genes that have been irreversibly 
inactivated by mutation (the so-called pseudogenes). The clock runs most slowly 
for sequences that are subject to strong functional constraints—for example, the 
amino acid sequences of proteins that engage in specific interactions with large 
numbers of other proteins and whose structure is therefore highly constrained, 
or the nucleotide sequences that encode the RNA subunits of the ribosome, on 
which all protein synthesis depends. 
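
A molecular clock can be turned into a rough dating tool with one line of arithmetic, sketched below. The calculation calibrates the clock with the human-chimpanzee figures quoted in this chapter (about 1.2% pairwise divergence and about 6 million years of separate evolution) and then applies it to the human-orangutan divergence of roughly 3%. The function names are hypothetical, and the result assumes a strictly constant rate for unconstrained sequence, so it should be read as an order-of-magnitude estimate only.

```python
def substitution_rate(pairwise_divergence, years_since_split):
    """Substitutions per site per year along one lineage.

    The factor of 2 reflects that changes accumulate independently on both
    lineages after they diverge."""
    return pairwise_divergence / (2 * years_since_split)


def divergence_time(pairwise_divergence, rate_per_site_per_year):
    """Years since two lineages split, assuming a constant clock."""
    return pairwise_divergence / (2 * rate_per_site_per_year)


# Calibrate with human-chimpanzee: ~1.2% divergence over ~6 million years.
rate = substitution_rate(0.012, 6e6)

# Apply the same clock to human-orangutan: ~3% divergence.
orangutan_split = divergence_time(0.03, rate)

print(f"rate per site per year: {rate:.2e}")
print(f"estimated human-orangutan split: {orangutan_split / 1e6:.1f} million years")
```

The calibrated rate, about one change per site per billion years, agrees with the figure given earlier of roughly one nucleotide pair in a thousand changing every million years.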

Occasionally, rapid change is seen in a previously highly conserved sequence. 
As discussed later in this chapter, such episodes are especially interesting because 
they are thought to reflect periods of strong positive selection for mutations that 
have conferred a selective advantage in the particular lineage where the rapid 
change occurred. 

The pace at which molecular clocks run during evolution is determined not 
only by the degree of purifying selection, but also by the mutation rate. Most 
notably, in animals, although not in plants, clocks based on functionally uncon- 
strained mitochondrial DNA sequences run much faster than clocks based on 
functionally unconstrained nuclear sequences, because the mutation rate in ani- 
mal mitochondria is exceptionally high. 

Categories of DNA for which the clock runs fast are most informative for recent 
evolutionary events; the mitochondrial DNA clock has been used, for example, to 
chronicle the divergence of the Neanderthal lineage from that of modern Homo 
sapiens. To study more ancient evolutionary events, one must examine DNA for 
which the clock runs more slowly; thus the divergence of the major branches of 
the tree of life—bacteria, archaea, and eukaryotes—has been deduced from study 
of the sequences specifying ribosomal RNA. 

In general, molecular clocks, appropriately chosen, have a finer time resolu- 
tion than the fossil record, and they are a more reliable guide to the detailed struc- 
ture of phylogenetic trees than are classical methods of tree construction, which 
are based on family resemblances in anatomy and embryonic development. For 
example, the precise family tree of great apes and humans was not settled until 
sufficient molecular sequence data accumulated in the 1980s to produce the ped- 
igree shown previously in Figure 4-63. And with huge amounts of DNA sequence 
now determined from a wide variety of mammals, much better estimates of our 
relationship to them are being obtained (Figure 4-66). 


A Comparison of Human and Mouse Chromosomes Shows How 
the Structures of Genomes Diverge 


As would be expected, the human and chimpanzee genomes are much more 
alike than are the human and mouse genomes, even though all three genomes 
are roughly the same size and contain nearly identical sets of genes. Mouse and 
human lineages have had approximately 80 million years to diverge through accu- 
mulated mutations, versus 6 million years for humans and chimpanzees. In addi- 
tion, as indicated in Figure 4-66, rodent lineages (represented by the rat and the 
mouse) have unusually fast molecular clocks, and have diverged from the human 
lineage more rapidly than otherwise expected. 

While the way that the genome is organized into chromosomes is almost iden- 
tical between humans and chimpanzees, this organization has diverged greatly 
between humans and mice. According to rough estimates, a total of about 180 
breakage-and-rejoining events have occurred in the human and mouse lineages 
since these two species last shared a common ancestor. In the process, although 
the number of chromosomes is similar in the two species (23 per haploid genome 
in the human versus 20 in the mouse), their overall structures differ greatly. None- 
theless, even after the extensive genomic shuffling, there are many large blocks 
of DNA in which the gene order is the same in the human and the mouse. These 
stretches of conserved gene order in chromosomes are referred to as regions of 
synteny. Figure 4-67 illustrates how segments of the different mouse chromo- 
somes map onto the human chromosome set. For much more distantly related 
vertebrates, such as chicken and human, the number of breakage-and-rejoining 
events has been much greater and the regions of synteny are much shorter; in 
addition, they are often hard to discern because of the divergence of the DNA 
sequences that they contain. 

An unexpected conclusion from a detailed comparison of the complete mouse 
and human genome sequences, confirmed by subsequent comparisons between 
the genomes of other vertebrates, is that small blocks of DNA sequence are being 
deleted from and added to genomes at a surprisingly rapid rate. Thus, if we 
assume that our common ancestor had a genome of human size (about 3.2 billion 
nucleotide pairs), mice would have lost a total of about 45% of that genome from 
accumulated deletions during the past 80 million years, while humans would 
have lost about 25%. However, substantial sequence gains from many small chro- 
mosome duplications and from the multiplication of transposons have compen- 
sated for these deletions. As a result, our genome size is thought to be practically 
unchanged from that of the last common ancestor of humans and mice, while the 
mouse genome is smaller by only about 0.3 billion nucleotides. 
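
The bookkeeping behind these estimates can be checked with a short calculation. The sketch below takes the figures quoted in the text (an ancestral genome of 3.2 billion nucleotide pairs, losses of 25% and 45%, and a mouse genome about 0.3 billion nucleotide pairs smaller than ours) and asks how much DNA each lineage must have gained to reach its present size. The function name is hypothetical, and the ancestral size is itself an assumption stated in the text.

```python
ANCESTOR_BP = 3.2e9   # assumed ancestral genome size (nucleotide pairs)


def required_gain(ancestor_bp, fraction_lost, present_bp):
    """DNA that must have been added to reach the present genome size,
    given the fraction of the ancestral genome lost by deletion."""
    retained = ancestor_bp * (1 - fraction_lost)
    return present_bp - retained


human_gain = required_gain(ANCESTOR_BP, 0.25, 3.2e9)           # genome size unchanged
mouse_gain = required_gain(ANCESTOR_BP, 0.45, 3.2e9 - 0.3e9)   # 0.3 billion smaller

print(f"human lineage gained about {human_gain / 1e9:.2f} billion nucleotide pairs")
print(f"mouse lineage gained about {mouse_gain / 1e9:.2f} billion nucleotide pairs")
```

On these numbers, both lineages have replaced a substantial fraction of their DNA over the past 80 million years while ending up at nearly the same total size.
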
Figure 4-66 A phylogenetic tree showing 
the evolutionary relationships of some 
present-day mammals. The length of 
each line is proportional to the number of 
“neutral substitutions” —that is, nucleotide 
changes at sites where there is assumed 

to be no purifying selection. (Adapted from 
G.M. Cooper et al., Genome Res. 
15:901-913, 2005. With permission from 
Cold Spring Harbor Laboratory Press.) 

Good evidence for the loss of DNA sequences in small blocks during evolution 
can be obtained from a detailed comparison of regions of synteny in the human 
and mouse genomes. The comparative shrinkage of the mouse genome can be 
clearly seen from such comparisons, with the net loss of sequences scattered 
throughout the long stretches of DNA that are otherwise homologous (Figure 
4-68). 

DNA is added to genomes both by the spontaneous duplication of chromo- 
somal segments that are typically tens of thousands of nucleotide pairs long 
(as will be discussed shortly) and by insertion of new copies of active transposons. 
Most transposition events are duplicative, because the original copy of the 
transposon stays where it was when a copy inserts at the new site; see, for exam- 
ple, Figure 5-63. Comparison of the DNA sequences derived from transposons 
in the human and the mouse readily reveals some of the sequence additions 
(Figure 4-69). 

It remains a mystery why all mammals have maintained genome sizes of 
roughly 3 billion nucleotide pairs that contain nearly identical sets of genes, 
even though only approximately 150 million nucleotide pairs appear to be under 
sequence-specific functional constraints. 


The Size of a Vertebrate Genome Reflects the Relative Rates of 
DNA Addition and DNA Loss in a Lineage 


In more distantly related vertebrates, genome size can vary considerably, appar- 
ently without a drastic effect on the organism or its number of genes. Thus, the 
chicken genome, at one billion nucleotide pairs, is only about one-third the size 

Figure 4-67 Synteny between human 
and mouse chromosomes. In this 
diagram, the human chromosome set 

is shown above, with each part of each 
chromosome colored according to the 
mouse chromosome with which it is 
syntenic. The color coding used for each 
mouse chromosome is shown below. 
Heterochromatic highly repetitive regions 
(such as centromeres) that are difficult to 
sequence cannot be mapped in this way; 
these are colored black. (Adapted from 
E.E. Eichler and D. Sankoff, Science 
301:793-797, 2003. With permission 
from AAAS.) 


Figure 4-68 Comparison of a syntenic 
portion of mouse and human genomes. 
About 90% of the two genomes can be 
aligned in this way. Note that while there 

is an identical order of the matched index 
sequences (red marks), there has been a 
net loss of DNA in the mouse lineage that 

is interspersed throughout the entire region. 
This type of net loss is typical for all such 
regions, and it accounts for the fact that the 
mouse genome contains 14% less DNA than 
does the human genome. (Adapted from 
Mouse Genome Sequencing Consortium, 
Nature 420:520-562, 2002. With permission 
from Macmillan Publishers Ltd.) 

of the mammalian genome. An extreme example is the puffer fish, Fugu rubripes 
(Figure 4-70), which has a tiny genome for a vertebrate (0.4 billion nucleotide 
pairs compared to 1 billion or more for many other fish). The small size of the Fugu 
genome is largely due to the small size of its introns. Specifically, Fugu introns, as 
well as other noncoding segments of the Fugu genome, lack the repetitive DNA 
that makes up a large portion of the genomes of most well-studied vertebrates. 
Nevertheless, the positions of the Fugu introns between the exons of each gene 
are almost the same as in mammalian genomes (Figure 4-71). 

Although it was initially a mystery, we now have a simple explanation for the large 
differences in genome size between similar organisms: because all vertebrates 
experience a continuous process of DNA loss and DNA addition, the size of a genome 
merely depends on the balance between these opposing processes acting over 
millions of years. Suppose, for example, that in the lineage leading to Fugu, the 
rate of DNA addition happened to slow greatly. Over long periods of time, this 
would result in a major “cleansing” from this fish genome of those DNA sequences 
whose loss could be tolerated. The result is an unusually compact genome, rela- 
tively free of junk and clutter, but retaining through purifying selection the ver- 
tebrate DNA sequences that are functionally important. This makes Fugu, with 
its 400 million nucleotide pairs of DNA, a valuable resource for genome research 
aimed at understanding humans. 
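
The balance argument can be illustrated with a deliberately simple toy model. In the sketch below, a fixed amount of DNA is added in each time step and a fixed fraction of the genome is lost. Both the form of the model (in particular, loss proportional to genome size) and every parameter value are assumptions made for illustration; they are not measurements and do not come from the text.

```python
def simulate_genome_size(start_bp, added_bp_per_myr, loss_fraction_per_myr, myr):
    """Iterate a toy model: each million years the genome gains a fixed amount
    of DNA and loses a fixed fraction of its current size."""
    size = start_bp
    for _ in range(myr):
        size = size + added_bp_per_myr - loss_fraction_per_myr * size
    return size


def equilibrium_size(added_bp_per_myr, loss_fraction_per_myr):
    """Size at which addition and loss balance in the toy model."""
    return added_bp_per_myr / loss_fraction_per_myr


# Hypothetical parameters: same loss rate, but DNA addition is much slower
# in the second lineage.
typical = equilibrium_size(added_bp_per_myr=30e6, loss_fraction_per_myr=0.01)
slow_addition = equilibrium_size(added_bp_per_myr=4e6, loss_fraction_per_myr=0.01)

print(f"equilibrium with faster addition: {typical / 1e9:.1f} billion nucleotide pairs")
print(f"equilibrium with slower addition: {slow_addition / 1e9:.1f} billion nucleotide pairs")
print(simulate_genome_size(1e9, 4e6, 0.01, myr=200))   # shrinking toward the lower equilibrium
```

In this model a lineage whose rate of DNA addition falls simply drifts to a smaller steady-state genome, which is the qualitative behavior proposed for Fugu.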


We Can Infer the Sequence of Some Ancient Genomes 


The genomes of ancestral organisms can be inferred, but most can never be 
directly observed. DNA is very stable compared with most organic molecules, 
but it is not perfectly stable, and its progressive degradation, even under the best 
circumstances, means that it is virtually impossible to extract sequence infor- 
mation from fossils that are more than a million years old. Although a modern 
organism such as the horseshoe crab looks remarkably similar to fossil ancestors 
that lived 200 million years ago, there is every reason to believe that the horse- 
shoe-crab genome has been changing during all that time in much the same way 
as in other evolutionary lineages, and at a similar rate. Selection must have main- 
tained key functional properties of the horseshoe-crab genome to account for the 
morphological stability of the lineage. However, comparisons between different 
present-day organisms show that the fraction of the genome subject to purifying 
selection is small; hence, it is fair to assume that the genome of the modern horse- 
shoe crab, while preserving features critical for function, must differ greatly from 
that of its extinct ancestors, known to us only through the fossil record. 

It is possible to get direct sequence information by examining DNA samples 
from ancient materials if these are not too old. In recent years, technical advances 
have allowed DNA sequencing from exceptionally well-preserved bone fragments 
that date from more than 100,000 years ago. Although any DNA this old will be 
imperfectly preserved, a sequence of the Neanderthal genome has been recon- 
structed from many millions of short DNA sequences, revealing—among other 
things—that our human ancestors interbred with Neanderthals in Europe and 

Figure 4-69 A comparison of the 
β-globin gene cluster in the human 

and mouse genomes, showing the 
locations of transposable elements. This 
stretch of the human genome contains five 
functional β-globin-like genes (orange); 
the comparable region from the mouse 
genome has only four. The positions of 
the human Alu sequences are indicated 
by green circles, and the human L1 
sequences by red circles. The mouse 
genome contains different but related 
transposable elements: the positions of 

B1 elements (which are related to the 
human Alu sequences) are indicated by 
blue triangles, and the positions of the 
mouse L1 elements (which are related to 
the human L1 sequences) are indicated 
by orange triangles. The absence of 
transposable elements from the globin 
structural genes can be attributed to 
purifying selection, which would have 
eliminated any insertion that compromised 
gene function. (Courtesy of Ross Hardison 
and Webb Miller.) 





Figure 4-70 The puffer fish, Fugu 
rubripes. (Courtesy of Byrappa Venkatesh.) 

that modern humans have inherited specific genes from them (Figure 4-72). The 
average difference in DNA sequence between humans and Neanderthals shows 
that our two lineages diverged somewhere between 270,000 and 440,000 years 
ago, well before the time that humans are believed to have migrated out of Africa. 
But what about deciphering the genomes of much older ancestors, those for 
which no useful DNA samples can be isolated? For organisms that are as closely 
related as human and chimpanzee, we saw that this may not be difficult: reference 
to the gorilla sequence can be used to sort out which of the few sequence differ- 
ences between human and chimpanzee are inherited from our common ancestor 
some 6 million years ago (see Figure 4-64). And for an ancestor that has produced 
a large number of different organisms alive today, the DNA sequences of many 
species can be compared simultaneously to unscramble much of the ancestral 
sequence, allowing scientists to derive DNA sequences much farther back in time. 
For example, from the genome sequences currently being obtained for dozens of 
modern placental mammals, it should be possible to infer much of the genome 
sequence of their 100 million-year-old common ancestor—the precursor of spe- 
cies as diverse as dog, mouse, rabbit, armadillo, and human (see Figure 4-66). 
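
The tiebreaker logic described for human, chimpanzee, and gorilla can be written out as a short procedure. The sketch below applies the minimum-mutation assumption position by position to three aligned sequences; the function name and the five-nucleotide example sequences are hypothetical, and positions where all three species differ are simply marked as unresolved.

```python
def infer_ancestor(human, chimp, gorilla):
    """Infer the human-chimpanzee ancestral sequence, using gorilla as outgroup.

    Where human and chimpanzee agree, that nucleotide is taken as ancestral.
    Where they disagree, the gorilla nucleotide breaks the tie if it matches
    either one; otherwise the position is marked 'N' (unresolved)."""
    if not (len(human) == len(chimp) == len(gorilla)):
        raise ValueError("sequences must be aligned to equal length")
    ancestor = []
    for h, c, g in zip(human, chimp, gorilla):
        if h == c:
            ancestor.append(h)
        elif g in (h, c):
            ancestor.append(g)
        else:
            ancestor.append("N")
    return "".join(ancestor)


# Invented sequences: position 3 is resolved by gorilla, position 5 is not.
print(infer_ancestor("ACGTA", "ACCTG", "ACCTT"))
```

With many more species, the same principle is applied over a whole phylogenetic tree rather than a single outgroup, usually with statistical rather than purely parsimony-based methods.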


Multispecies Sequence Comparisons Identify Conserved DNA 
Sequences of Unknown Function 


The mass of DNA sequence now in databases (hundreds of billions of nucleotide 
pairs) provides a rich resource that scientists can mine for many purposes. This 
information can be used not only to unscramble the evolutionary pathways that 
have led to modern organisms, but also to provide insights into how cells and 
organisms function. Perhaps the most remarkable discovery in this realm comes 
from the observation that a striking amount of DNA sequence that does not code 
for protein has been conserved during mammalian evolution (see Table 4-1, 
p. 184). This is most clearly revealed when we align and compare DNA synteny 
blocks from many different species, thereby identifying large numbers of so-called 
multispecies conserved sequences: some of these code for protein, but most of 
them do not (Figure 4-73). 


Figure 4-71 Comparison of the genomic sequences of the human and Fugu 
genes encoding the protein huntingtin. Both genes (indicated in red) 
contain 67 short exons that align in 1:1 correspondence to one another; these 
exons are connected by curved lines. The human gene is 7.5 times larger than 
the Fugu gene (180,000 versus 24,000 nucleotide pairs). The size difference is 
entirely due to larger introns in the human gene. The larger size of the human 
introns is due in part to the presence of retrotransposons (discussed in Chapter 
5), whose positions are represented by green vertical lines; the Fugu introns lack 
retrotransposons. In humans, mutation of the huntingtin gene causes Huntington’s 
disease, an inherited neurodegenerative disorder. (Adapted from S. Baxendale et 
al., Nat. Genet. 10:67-76, 1995. With permission from Macmillan Publishers Ltd.) 


Figure 4-72 The Neanderthals. (A) Map of Europe showing the location of the 
cave in Croatia where most of the bones used to isolate the DNA used to derive 
the Neanderthal genome sequence were discovered. (B) Photograph of the Vindija 
cave. (C) Photograph of the 38,000-year-old bones from Vindija. More recent 
studies have succeeded in extracting DNA sequence information from hominid 
remains that are considerably older (see Movie 8.3). (B, courtesy of Johannes 
Krause; C, from R.E. Green et al., Science 328: 710-722, 2010. Reprinted with 
permission from AAAS.) 

Most of the noncoding conserved sequences discovered in this way turn out 
to be relatively short, containing between 50 and 200 nucleotide pairs. Among the 
most mysterious are the so-called “ultraconserved” noncoding sequences, exem- 
plified by more than 5000 DNA segments over 100 nucleotides long that are exactly 
the same in human, mouse, and rat. Most have undergone little or no change 
since mammalian and bird ancestors diverged about 300 million years ago. The 
strict conservation implies that even though the sequences do not encode pro- 
teins, each nevertheless has an important function maintained by purifying selec- 
tion. The puzzle is to unravel what those functions are. 

Many of the conserved sequences that do not code for protein are now known 
to produce untranslated RNA molecules, such as the thousands of long noncoding 
RNAs (lncRNAs) that are thought to have important functions in regulating gene 
transcription. As we shall also see in Chapter 7, others are short regions of DNA 
scattered throughout the genome that directly bind proteins involved in gene reg- 
ulation. But it is uncertain how much of the conserved noncoding DNA can be 
accounted for in these ways, and the function of most of it remains a mystery. This 
enigma highlights how much more we need to learn about the fundamental bio- 
logical mechanisms that operate in animals and other complex organisms, and its 
solution is certain to have profound consequences for medicine. 

How can cell biologists tackle the mystery of noncoding conserved DNA? Tra- 
ditionally, attempts to determine the function of a puzzling DNA sequence begin 
by looking at the consequences of its experimental disruption. But many DNA 
sequences that are crucial for an organism in the wild can be expected to have no 
noticeable effect on its phenotype under laboratory conditions: what is required 
for a mouse to survive in a laboratory cage is very much less than what is required 
for it to succeed in nature. Moreover, calculations based on population genetics 
reveal that just a tiny selective advantage (less than a 0.1% difference in 
survival) can be enough to strongly favor retaining a particular DNA sequence over 
evolutionary time spans. One should therefore not be surprised to find that many 
DNA sequences that are ultraconserved can be deleted from the mouse genome 
without any noticeable effect on that mouse in a laboratory. 

Figure 4-73 The detection of multispecies conserved sequences. In this example, 
genome sequences for each of the organisms shown have been compared with the 
indicated region of the human CFTR (cystic fibrosis transmembrane conductance 
regulator) gene; this region contains one exon plus a large amount of intronic 
DNA. For each organism, the percent identity with human for each 25-nucleotide 
block is plotted in green. In addition, a computational algorithm has been used 
to detect the sequences within this region that are most highly conserved when 
the sequences from all of the organisms are taken into account. Besides the exon 
(dark blue on the line at the top of the figure), the positions of three other 
blocks of multispecies conserved sequences are indicated (pale blue). The 
function of most such sequences in the human genome is not known. (Courtesy of 
Eric D. Green.) 
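
The per-block percent identity plotted in Figure 4-73 can be illustrated with a minimal sketch. It assumes that the two sequences have already been aligned to equal length, with "-" marking gaps. The function name, the toy sequences, and the block size of 5 used in the example are illustrative assumptions, not the method used to produce the figure.

```python
def block_percent_identity(reference, other, block=25):
    """Percent identity to `reference` in consecutive blocks of aligned columns."""
    if len(reference) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    scores = []
    for start in range(0, len(reference), block):
        ref_block = reference[start:start + block]
        other_block = other[start:start + block]
        # A column counts as identical only if the bases match and are not gaps.
        matches = sum(
            1 for a, b in zip(ref_block, other_block) if a == b and a != "-"
        )
        scores.append(100.0 * matches / len(ref_block))
    return scores


# Toy aligned sequences (hypothetical, not real CFTR data).
human = "ACGTACGTAC"
other = "ACGTACGTTT"
print(block_percent_identity(human, other, block=5))  # [100.0, 60.0]
```

Blocks that score highly against many species at once are the candidates for multispecies conserved sequences; real analyses use more elaborate statistical models than this simple count.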

A second important approach for discovering the function of a mysterious 
noncoding DNA sequence uses biochemical techniques to identify proteins or 
RNA molecules that bind to it—and/or to any RNA molecules that it produces. 
Most of this task still lies before us, but a start has been made (see p. 435). 


Changes in Previously Conserved Sequences Can Help Decipher 
Critical Steps in Evolution 


Given genome sequence information, we can tackle another intriguing question: 
What alterations in our DNA have made humans so different from other ani- 
mals—or for that matter, what makes any individual species so different from its 
relatives? For example, as soon as both the human and the chimpanzee genome 
sequences became available, scientists began searching for DNA sequence 
changes that might account for the striking differences between us and chim- 
panzees. With 3.2 billion nucleotide pairs to compare in the two species, this 
might seem an impossible task. But the job was made much easier by confining 
the search to 35,000 clearly defined multispecies conserved sequences (a total 
of about 5 million nucleotide pairs), representing parts of the genome that are 
most likely to be functionally important. Though these sequences are conserved 
strongly, they are not conserved perfectly, and when the version in one species is 
compared with that in another they are generally found to have drifted apart by 
a small amount corresponding simply to the time elapsed since the last common 
ancestor. In a small proportion of cases, however, one sees signs of a sudden evo- 
lutionary spurt. For example, some DNA sequences that have been highly con- 
served in other mammalian species are found to have accumulated nucleotide 
changes exceptionally rapidly during the 6 million years of human evolution since 
we diverged from the chimpanzees. These human accelerated regions (HARs) are 
thought to reflect functions that have been especially important in making us dif- 
ferent in some useful way. 

About 50 such sites were identified in one study, one-fourth of which were 
located near genes associated with neural development. The sequence exhibiting 
the most rapid change (18 changes between human and chimpanzee, compared 
to only two changes between chimpanzee and chicken) was examined further 
and found to encode a 118-nucleotide noncoding RNA molecule, HAR1F (human 
accelerated region 1F), that is produced in the human cerebral cortex at a critical 
time during brain development. The function of this HAR1F RNA is not yet known, 
but findings of this type are stimulating research studies that may shed light on 
crucial features of the human brain. 
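
To see why 18 changes counts as exceptionally rapid, the two comparisons quoted above can be converted into rough rates. This is only a back-of-the-envelope sketch: it assumes a divergence time of about 300 million years for chimpanzee and chicken (the mammal-bird split cited earlier in this section), and it spreads the changes evenly over both lineages, which is a simplification. The function name is hypothetical.

```python
def changes_per_million_years(changes, divergence_myr):
    """Observed changes divided by the total branch time separating two species."""
    # Both lineages evolve independently after the split, hence the factor of 2.
    return changes / (2 * divergence_myr)


human_chimp = changes_per_million_years(18, 6)      # 18 changes, 6 million years
chimp_chicken = changes_per_million_years(2, 300)   # 2 changes, ~300 million years
print(human_chimp, chimp_chicken, human_chimp / chimp_chicken)
```

Under these assumptions the sequence changed several hundred times faster after the human and chimpanzee lineages separated than it did over the preceding vertebrate history.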

A related approach in the search for the important mutations that contributed 
to human evolution likewise begins with DNA sequences that have been con- 
served during mammalian evolution, but rather than screening for accelerated 
changes in individual nucleotides, it focuses instead on chromosome sites that 
have experienced deletions in the 6 million years since our lineage diverged from 
that of chimpanzees. More than 500 such sequences—conserved among other 
species but deleted in humans—have been discovered. Each deletion removes an 
average of 95 nucleotides of DNA sequence. Only one of these deletions affects a 
protein-coding region: the rest are thought to alter regions that affect how nearby 
genes are expressed, an expectation that has been experimentally confirmed 
in a few cases. A large proportion of the presumed regulatory regions identified 
in this way lie near genes that affect neural function and/or near genes involved in 
steroid signaling, suggesting that changes in the nervous system and in immune 
or reproductive functions have played an especially important role in human 
evolution. 


Mutations in the DNA Sequences That Control Gene Expression 
Have Driven Many of the Evolutionary Changes in Vertebrates 


The vast hoard of genomic sequence data now being accumulated can be explored 
in many other ways to reveal events that happened even hundreds of millions of 
years ago. For example, one can attempt to trace the origins of the regulatory 
elements in DNA that have played critical parts in vertebrate evolution. One such 
study began with the identification of nearly 3 million noncoding sequences, 
averaging 28 base pairs in length, that have been conserved in recent vertebrate 
evolution while being absent in more ancient ancestors. Each of these special 
noncoding sequences is likely to represent a functional innovation peculiar to a particular 
branch of the vertebrate family tree, and most of them are thought to consist of 
regulatory DNA that governs the expression of a neighboring gene. Given full 
genome sequences, one can identify the genes that lie closest and thus appear 
most likely to have fallen under the sway of these novel regulatory elements. By 
comparing many different species, with known divergence times, one can also 
estimate when each such regulatory element came into existence as a conserved 
feature. The findings suggest remarkable evolutionary differences between the 
various functional classes of genes (Figure 4-74). Conserved regulatory elements 
that originated early in vertebrate evolution—that is, more than about 300 million 
years ago, which is when the mammalian lineage split from the lineage leading to 
birds and reptiles—seem to be mostly associated with genes that code for tran- 
scription regulator proteins and for proteins with roles in organizing embryonic 
development. Then came an era when the regulatory DNA innovations arose next 
to genes coding for receptors for extracellular signals. Finally, over the course of 
the past 100 million years, the regulatory innovations seem to have been concen- 
trated in the neighborhood of genes coding for proteins (such as protein kinases) 
that function to modify other proteins post-translationally. 
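
The dating logic described above, in which species with known divergence times bracket the origin of a conserved element, can be sketched as follows. The function name and data layout are hypothetical, the divergence times are the round figures quoted in this chapter, and the sketch assumes that an element is never lost again once it has arisen.

```python
def origin_window(observations):
    """Bracket the age of a conserved element from its presence or absence.

    observations: list of (species, divergence_from_human_myr, has_element).
    Returns (oldest divergence at which the element is present,
             nearest more distant divergence at which it is absent, or None).
    """
    present = [t for _, t, has in observations if has]
    if not present:
        raise ValueError("element not found in any compared species")
    oldest_present = max(present)
    absent = [t for _, t, has in observations if not has and t > oldest_present]
    youngest_absent = min(absent) if absent else None
    return oldest_present, youngest_absent


observations = [
    ("chimpanzee", 6, True),
    ("mouse", 100, True),
    ("chicken", 300, False),
]
print(origin_window(observations))  # (100, 300)
```

In this toy case the element would be inferred to have arisen between about 300 and 100 million years ago, after the split from the bird lineage but before the placental mammals diverged.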

Many questions remain to be answered about these phenomena and what 
they mean. One possible interpretation is that the logic—the circuit diagram—of 
the gene regulatory network in vertebrates was established early, and that more 
recent evolutionary change has mainly occurred through the tuning of quantita- 
tive parameters. This could help to explain why, among the mammals, for exam- 
ple, the basic body plan—the topology of the tissues and organs—has been largely 
conserved. 


Gene Duplication Also Provides an Important Source of Genetic 
Novelty During Evolution 


Evolution depends on the creation of new genes, as well as on the modification 
of those that already exist. How does this occur? When we compare organisms 
that seem very different—a primate with a rodent, for example, or a mouse with 
a fish—we rarely encounter genes in the one species that have no homolog in the 
other. Genes without homologous counterparts are relatively scarce even when 
we compare such divergent organisms as a mammal and a worm. On the other 
hand, we frequently find gene families that have different numbers of members in 
different species. To create such families, genes have been repeatedly duplicated, 
and the copies have then diverged to take on new functions that often vary from 
one species to another. 

Figure 4-74 The types of changes in gene regulation inferred to have 
predominated during the evolution of our vertebrate ancestors. To produce 
the information summarized in this plot, wherever possible the type of gene 
regulated by each conserved noncoding sequence was inferred from the identity of 
its closest protein-coding gene. The fixation time for each conserved sequence was 
then used to derive the conclusions shown. (Based on C.B. Lowe et al., Science 
333:1019-1024, 2011. With permission from AAAS.) 

Gene duplication occurs at high rates in all evolutionary lineages, contributing 
to the vigorous process of DNA addition discussed previously. In a detailed study 
of spontaneous duplications in yeast, duplications of 50,000 to 250,000 nucleotide 
pairs were commonly observed, most of which were tandemly repeated. These 
appeared to result from DNA replication errors that led to the inexact repair of 
double-strand chromosome breaks. A comparison of the human and chimpanzee 
genomes reveals that, since the time that these two organisms diverged, such seg- 
mental duplications have added about 5 million nucleotide pairs to each genome 
every million years, with an average duplication size being about 50,000 nucleo- 
tide pairs (although there are some duplications five times larger). In fact, if one 
counts nucleotides, duplication events have created more differences between 
our two species than have single-nucleotide substitutions. 
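
The closing claim can be checked with rough arithmetic using figures quoted in this chapter. The sketch below assumes a genome of 3.2 billion nucleotide pairs, a divergence time of 6 million years, and a typical human-chimpanzee sequence difference of about 1%; the constant and function names are illustrative.

```python
GENOME_SIZE = 3_200_000_000   # nucleotide pairs, as quoted in this chapter
DIVERGENCE_MYR = 6            # human-chimpanzee split, in millions of years
DUPLICATION_RATE = 5_000_000  # nucleotide pairs added per genome per million years
SUBSTITUTION_PERCENT = 1      # typical human-chimpanzee sequence difference, in %


def duplicated_nucleotides(rate, myr, lineages=2):
    """Nucleotide pairs added by segmental duplication, summed over both lineages."""
    return rate * myr * lineages


def substituted_nucleotides(genome_size, percent):
    """Nucleotide pairs that differ through single-nucleotide substitution."""
    return genome_size * percent // 100


print(duplicated_nucleotides(DUPLICATION_RATE, DIVERGENCE_MYR))    # 60000000
print(substituted_nucleotides(GENOME_SIZE, SUBSTITUTION_PERCENT))  # 32000000
```

On these round numbers, duplications account for roughly twice as many differing nucleotides as substitutions do.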


Duplicated Genes Diverge 


What is the fate of newly duplicated genes? In most cases, there is presumed to 
be little or no selection—at least initially—to maintain the duplicated state since 
either copy can provide an equivalent function. Hence, many duplication events 
are likely to be followed by loss-of-function mutations in one or the other gene. 
This cycle would functionally restore the one-gene state that preceded the duplica- 
tion. Indeed, there are many examples in contemporary genomes where one copy 
of a duplicated gene can be seen to have become irreversibly inactivated by mul- 
tiple mutations. Over time, the sequence similarity between such a pseudogene 
and the functional gene whose duplication produced it would be expected to be 
eroded by the accumulation of many mutations in the pseudogene—the homolo- 
gous relationship eventually becoming undetectable. 

An alternative fate for gene duplications is for both copies to remain func- 
tional, while diverging in their sequence and pattern of expression, thus taking 
on different roles. This process of “duplication and divergence” almost certainly 
explains the presence of large families of genes with related functions in biolog- 
ically complex organisms, and it is thought to play a critical role in the evolution 
of increased biological complexity. An examination of many different eukaryotic 
genomes suggests that the probability that any particular gene will undergo a 
duplication event that spreads to most or all individuals in a species is approxi- 
mately 1 percent every million years. 

Whole-genome duplications offer particularly dramatic examples of the dupli- 
cation-divergence cycle. A whole-genome duplication can occur quite simply: all 
that is required is one round of genome replication in a germ-line cell lineage 
without a corresponding cell division. Initially, the chromosome number simply 
doubles. Such abrupt increases in the ploidy of an organism are common, par- 
ticularly in fungi and plants. After a whole-genome duplication, all genes exist 
as duplicate copies. However, unless the duplication event occurred so recently 
that there has been little time for subsequent alterations in genome structure, 
the results of a series of segmental duplications—occurring at different times— 
are hard to distinguish from the end product of a whole-genome duplication. In 
mammals, for example, the role of whole-genome duplications versus a series of 
piecemeal duplications of DNA segments is quite uncertain. Nevertheless, it is 
clear that a great deal of gene duplication has occurred in the distant past. 

Analysis of the genome of the zebrafish, in which at least one whole-genome 
duplication is thought to have occurred hundreds of millions of years ago, has cast 
some light on the process of gene duplication and divergence. Although many 
duplicates of zebrafish genes appear to have been lost by mutation, a significant 
fraction—perhaps as many as 30-50%—have diverged functionally while both 
copies have remained active. In many cases, the most obvious functional 
difference between the duplicated genes is that they are expressed in different tissues 
or at different stages of development. One attractive theory to explain such an end 
result imagines that different, mildly deleterious mutations occur quickly in both 
copies of a duplicated gene set. For example, one copy might lose expression in 
a particular tissue as a result of a regulatory mutation, while the other copy loses 
expression in a second tissue. Following such an occurrence, both gene copies 
would be required to provide the full range of functions that were once supplied 
by a single gene; hence, both copies would now be protected from loss through 
inactivating mutations. Over a longer period, each copy could then undergo 
further changes through which it could acquire new, specialized features. 

Figure 4-75 A comparison of the structure of one-chain and four-chain 
globins. The four-chain globin shown is hemoglobin, which is a complex of 
two α-globin and two β-globin chains. The one-chain globin present in some 
primitive vertebrates represents an intermediate in the evolution of the four-chain 
globin. With oxygen bound it exists as a monomer; without oxygen it dimerizes. 
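
The logic of this theory, in which complementary losses of expression make both copies indispensable, can be made concrete with a small sketch. The tissue names, copy names, and function are hypothetical and serve only to illustrate the argument.

```python
def copy_is_dispensable(copy_name, expression, required_tissues):
    """True if the remaining copies still cover every tissue that needs the gene."""
    covered = set()
    for name, tissues in expression.items():
        if name != copy_name:
            covered |= tissues
    return required_tissues <= covered


required = {"brain", "liver"}  # hypothetical tissues that need the gene product

# Immediately after duplication, both copies are expressed everywhere.
after_duplication = {"copy1": {"brain", "liver"}, "copy2": {"brain", "liver"}}
# Later, each copy has lost expression in a different tissue.
after_mutations = {"copy1": {"brain"}, "copy2": {"liver"}}

print(copy_is_dispensable("copy1", after_duplication, required))  # True
print(copy_is_dispensable("copy1", after_mutations, required))    # False
```

Once neither copy is dispensable, an inactivating mutation in either one is harmful, so selection now maintains both.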


The Evolution of the Globin Gene Family Shows How DNA 
Duplications Contribute to the Evolution of Organisms 


The globin gene family provides an especially good example of how DNA dupli- 
cation generates new proteins, because its evolutionary history has been worked 
out particularly well. The unmistakable similarities in amino acid sequence and 
structure among the present-day globins indicate that they all must derive from a 
common ancestral gene, even though some are now encoded by widely separated 
genes in the mammalian genome. 

We can reconstruct some of the past events that produced the various types 
of oxygen-carrying hemoglobin molecules by considering the different forms of 
the protein in organisms at different positions on the tree of life. A molecule like 
hemoglobin was necessary to allow multicellular animals to grow to a large size, 
since large animals cannot simply rely on the diffusion of oxygen through the 
body surface to oxygenate their tissues adequately. But oxygen plays a vital part in 
the life of nearly all living organisms, and oxygen-binding proteins homologous to 
hemoglobin can be recognized even in plants, fungi, and bacteria. In animals, the 
most primitive oxygen-carrying molecule is a globin polypeptide chain of about 
150 amino acids that is found in many marine worms, insects, and primitive fish. 
The hemoglobin molecule in more complex vertebrates, however, is composed of 
two kinds of globin chains. It appears that about 500 million years ago, during the 
continuing evolution of fish, a series of gene mutations and duplications occurred. 
These events established two slightly different globin genes in the genome of each 
individual, coding for α- and β-globin chains that associate to form a hemoglobin 
molecule consisting of two α chains and two β chains (Figure 4-75). The four 
oxygen-binding sites in the α₂β₂ molecule interact, allowing a cooperative allosteric 
change in the molecule as it binds and releases oxygen, which enables hemoglo- 
bin to take up and release oxygen more efficiently than the single-chain version. 

Still later, during the evolution of mammals, the β-chain gene apparently 
underwent duplication and mutation to give rise to a second β-like chain that 
is synthesized specifically in the fetus. The resulting hemoglobin molecule has a 
higher affinity for oxygen than adult hemoglobin and thus helps in the transfer 
of oxygen from the mother to the fetus. The gene for the new β-like chain 
subsequently duplicated and mutated again to produce two new genes, ε and γ, the ε 
chain being produced earlier in development (to form α₂ε₂) than the fetal γ chain, 
which forms α₂γ₂. A duplication of the adult β-chain gene occurred still later, 
during primate evolution, to give rise to a δ-globin gene and thus to a minor form 
of hemoglobin (α₂δ₂) that is found only in adult primates (Figure 4-76). 

Each of these duplicated genes has been modified by point mutations that 
affect the properties of the final hemoglobin molecule, as well as by changes in 
regulatory regions that determine the timing and level of expression of the gene. 
As a result, each globin is made in different amounts at different times of human 
development. 

Figure 4-76 An evolutionary scheme for the globin chains that carry oxygen 
in the blood of animals. The scheme emphasizes the β-like globin gene family. 
A relatively recent gene duplication of the γ-chain gene produced γG and γA, which 
are fetal β-like chains of identical function. The location of the globin genes in the 
human genome is shown at the top of the figure. 

The history of these gene duplications is reflected in the arrangement of 
hemoglobin genes in the genome. In the human genome, the genes that arose from the 
original β gene are arranged as a series of homologous DNA sequences located 
within 50,000 nucleotide pairs of one another on a single chromosome. A similar 
cluster of human α-globin genes is located on a separate chromosome. Not only 
other mammals, but birds too have their α- and β-globin gene clusters on 
separate chromosomes. In the frog Xenopus, however, they are together, suggesting 
that a chromosome translocation event in the lineage of birds and mammals sep- 
arated the two gene clusters about 300 million years ago, soon after our ancestors 
diverged from amphibians (see Figure 4-76). 

There are several duplicated globin DNA sequences in the α- and β-globin 
gene clusters that are not functional genes but pseudogenes. These have a close 
sequence similarity to the functional genes but have been disabled by muta- 
tions that prevent their expression as functional proteins. The existence of such 
pseudogenes makes it clear that, as expected, not every DNA duplication leads to 
a new functional gene. 


Genes Encoding New Proteins Can Be Created by the 
Recombination of Exons 


The role of DNA duplication in evolution is not confined to the expansion of 
gene families. It can also act on a smaller scale to create single genes by string- 
ing together short duplicated segments of DNA. The proteins encoded by genes 
generated in this way can be recognized by the presence of repeating similar pro- 
tein domains, which are covalently linked to one another in series. The immu- 
noglobulins (Figure 4-77), for example, as well as most fibrous proteins (such as 
collagens) are encoded by genes that have evolved by repeated duplications of a 
primordial DNA sequence. 

In genes that have evolved in this way, as well as in many other genes, each 
separate exon often encodes an individual protein folding unit, or domain. It is 
believed that the organization of DNA coding sequences as a series of such exons 
separated by long introns has greatly facilitated the evolution of new proteins. The 
duplications necessary to form a single gene coding for a protein with repeating 
domains, for example, can easily occur by breaking and rejoining the DNA any- 
where in the long introns on either side of an exon; without introns there would be 
only a few sites in the original gene at which a recombinational exchange between 
DNA molecules could duplicate the domain and not disrupt it. By enabling the 
duplication to occur by recombination at many potential sites rather than just a 
few, introns increase the probability of a favorable duplication event. 

More generally, we know from genome sequences that the various parts of 
genes—both their individual exons and their regulatory elements—have served 
as modular elements that have been duplicated and moved about the genome 
to create the great diversity of living things. Thus, for example, many present-day 
proteins are formed as a patchwork of domains from different origins, reflecting 
their complex evolutionary history (see Figure 3-17). 


Neutral Mutations Often Spread to Become Fixed in a Population, 
with a Probability That Depends on Population Size 


In comparisons between two species that have diverged from one another by millions 
of years, it makes little difference which individuals from each species are 
compared. For example, typical human and chimpanzee DNA sequences differ 
from one another by about 1%. In contrast, when the same region of the genome 
is sampled from two randomly chosen humans, the differences are typically about 
0.1%. For more distantly related organisms, the interspecies differences outshine 
intraspecies variation even more dramatically. However, each “fixed difference” 
between the human and the chimpanzee (in other words, each difference that is 
now characteristic of all or nearly all individuals of each species) started out as a 
new mutation in a single individual. If the size of the interbreeding population in 
which the mutation occurred is N, the initial allele frequency for a new mutation 
would be 1/(2N) for a diploid organism. How does such a rare mutation become 
fixed in the population, and hence become a characteristic of the species rather 
than of a few scattered individuals? 

Figure 4-77 Schematic view of an antibody (immunoglobulin) molecule. 
This molecule is a complex of two identical heavy chains and two identical 
light chains. Each heavy chain contains four similar, covalently linked 
domains, while each light chain contains two such domains. Each domain 
is encoded by a separate exon, and all of the exons are thought to have 
evolved by the serial duplication of a single ancestral exon. 

The answer to this question depends on the functional consequences of the 
mutation. If the mutation has a significantly deleterious effect, it will simply be 
eliminated by purifying selection and will not become fixed. (In the most extreme 
case, the individual carrying the mutation will die without producing progeny.) 
Conversely, the rare mutations that confer a major reproductive advantage on 
individuals who inherit them can spread rapidly in the population. Because 
humans reproduce sexually and genetic recombination occurs each time a gam- 
ete is formed (discussed in Chapter 5), the genome of each individual who has 
inherited the mutation will be a unique recombinational mosaic of segments 
inherited from a large number of ancestors. The selected mutation along with a 
modest amount of neighboring sequence—ultimately inherited from the individ- 
ual in which the mutation occurred—will simply be one piece of this huge mosaic. 

The great majority of mutations that are not harmful are not beneficial either. 
These selectively neutral mutations can also spread and become fixed in a pop- 
ulation, and they make a large contribution to evolutionary change in genomes. 
For example, as we saw earlier, they account for most of the DNA sequence dif- 
ferences between apes and humans. The spread of neutral mutations is not as 
rapid as the spread of the rare strongly advantageous mutations. It depends on 
a random variation in the number of mutation-bearing progeny produced by 
each mutation-bearing individual, causing changes in the relative frequency of 
the mutant allele in the population. Through a sort of “random walk” process, the 
mutant allele may eventually become extinct, or it may become commonplace. 
This can be modeled mathematically for an idealized interbreeding population, 
on the assumption of constant population size and random mating, as well as 
selective neutrality for the mutations. While neither of the first two assumptions 
is a good description of human population history, study of this idealized case 
reveals the general principles in a clear and simple way. 
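
The idealized case can be explored with a small simulation. The sketch below uses the standard Wright-Fisher model, which is an assumption on our part since the text does not name a specific model: each generation, all 2N gene copies are drawn at random from the previous generation, with constant population size, random mating, and no selection. The function names, the population size, and the number of trials are illustrative.

```python
import random


def simulate_neutral_allele(n_diploid, rng):
    """Follow one new neutral mutation until it is lost or fixed.

    Returns (fixed, generations).
    """
    copies_total = 2 * n_diploid
    count = 1  # the mutation starts as a single copy, at frequency 1/(2N)
    generations = 0
    while 0 < count < copies_total:
        freq = count / copies_total
        # Each copy in the next generation is drawn at random from the current pool.
        count = sum(1 for _ in range(copies_total) if rng.random() < freq)
        generations += 1
    return count == copies_total, generations


def fixation_fraction(n_diploid, trials, seed=0):
    """Fraction of independent new mutations that eventually become fixed."""
    rng = random.Random(seed)
    fixed = sum(simulate_neutral_allele(n_diploid, rng)[0] for _ in range(trials))
    return fixed / trials


if __name__ == "__main__":
    # With N = 10 the expected fraction is about 1/(2N) = 0.05.
    print(fixation_fraction(n_diploid=10, trials=2000))
```

Most runs end with the mutant allele lost within a few generations; only a small fraction wander all the way to fixation, which is the random walk behavior described above.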

When a new neutral mutation occurs in a population of constant size N that 
is undergoing random mating, the probability that it will ultimately become fixed 
is approximately 1/(2N). This is because there are 2N copies of the gene in the 
diploid population, and each of them has an equal chance of becoming the pre- 
dominant version in the long run. For those mutations that do become fixed, the 
mathematics shows that the average time to fixation is approximately 4N gener- 
ations. Detailed analyses of data on human genetic variation have suggested an 
ancestral population size of approximately 10,000 at the time when the current 
pattern of genetic variation was largely established. With a population that has 
reached this size, the probability that a new, selectively neutral mutation would 
become fixed is small (1/20,000), while the average time to fixation would be on 
the order of 800,000 years (assuming a 20-year generation time). Thus, while we 
know that the human population has grown enormously since the development 
of agriculture approximately 15,000 years ago, most of the present-day set of com- 
mon human genetic variants reflects the mixture of variants that was already pres- 
ent long before this time, when the human population was still small. 
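
The two figures quoted above follow directly from the approximations 1/(2N) and 4N. A minimal sketch, with hypothetical function names and the population size and generation time taken from the text:

```python
def neutral_fixation_probability(n_diploid):
    """Chance that a new neutral mutation is eventually fixed: about 1/(2N)."""
    return 1 / (2 * n_diploid)


def mean_fixation_time_years(n_diploid, generation_years):
    """Average time to fixation, about 4N generations, converted to years."""
    return 4 * n_diploid * generation_years


N = 10_000  # ancestral population size suggested by human variation data
print(neutral_fixation_probability(N))   # 5e-05, that is, 1/20,000
print(mean_fixation_time_years(N, 20))   # 800000
```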

Similar arguments explain another phenomenon with important practical 
implications for genetic counseling. In an isolated community descended from 
a small group of founders, such as the people of Iceland or the Jews of Eastern 
Europe, genetic variants that are rare in the human population as a whole can 
often be present at a high frequency, even if those variants are mildly deleterious 
(Figure 4-78). 
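
A simple calculation shows how a founder event can raise the frequency of a rare allele. The numbers below (an allele at a frequency of 0.1% in the source population, and 25 founders) are purely illustrative assumptions, as are the function names, and the sketch assumes that founders are a random sample of the source population.

```python
def chance_allele_among_founders(allele_freq, n_founders):
    """Probability that at least one of the 2n founder gene copies carries the allele."""
    return 1 - (1 - allele_freq) ** (2 * n_founders)


def founder_frequency(copies, n_founders):
    """Allele frequency in the founding group, given the number of copies carried."""
    return copies / (2 * n_founders)


rare = 0.001   # hypothetical frequency in the large source population
founders = 25  # hypothetical number of diploid founders

print(chance_allele_among_founders(rare, founders))
print(founder_frequency(1, founders) / rare)  # fold increase if one copy is present
```

Under these assumptions the allele is usually absent from the founders altogether, but when even a single copy is present its frequency starts out about 20 times higher than in the source population.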


A Great Deal Can Be Learned from Analyses of the Variation 
Among Humans 


Even though the common variant gene alleles among modern humans originate 
from variants present in a comparatively tiny group of ancestors, the total number 
of variants now encountered, including those that are individually rare, is very 
large. New neutral mutations are constantly occurring and accumulating, even 
though no single one of them has had enough time to become fixed in the vast 
modern human population. 

From detailed comparisons of the DNA sequences of a large number of mod- 
ern humans located around the globe, scientists can estimate how many gener- 
ations have elapsed since the origin of a particular neutral mutation. From such 
data, it has been possible to map the routes of ancient human migrations. For 
example, by combining this type of genetic analysis with archaeological findings, 
scientists have been able to deduce the most probable routes that our ancestors 
took when they left Africa 60,000 to 80,000 years ago (Figure 4-79). 

We have been focusing on mutations that affect a single gene, but these are not 
the only source of variation. Another source, perhaps even more important but 
missed for many years, lies in the many duplications and deletions of large blocks 
of human DNA. When one compares any individual human with the standard 
reference genome in the database, one will generally find roughly 100 differences 
involving gain or loss of long sequence blocks, totaling perhaps 3 million nucleo- 
tide pairs. Some of these copy number variations (CNVs) will be very common, 
presumably reflecting relatively ancient origins, while others will be present in 
only a small minority of people (Figure 4-80). On average, nearly half of the CNVs 
contain known genes. CNVs have been implicated in many human traits, includ- 
ing color blindness, infertility, hypertension, and a wide variety of disease suscep- 
tibilities. In retrospect, this type of variation is not surprising, given the prominent 
role of DNA addition and DNA loss in vertebrate evolution. 

Figure 4-78 How founder effects determine the set of genetic variants in
a population of individuals belonging to the same species. This example
illustrates how a rare allele (red) can become established in an isolated
population, even though the mutation that produced it has no selective
advantage, or is mildly deleterious.


Figure 4-79 Tracing the course of human history by analyses of genome
sequences. The map shows the routes of the earliest successful human
migrations. Dotted lines indicate two alternative routes that our ancestors are
thought to have taken out of Africa. DNA sequence comparisons suggest that
modern Europeans descended from a small ancestral population that existed
about 30,000 to 50,000 years ago. In agreement, archaeological findings
suggest that the ancestors of modern native Australians (solid red arrows), and
of modern European and Middle Eastern populations, reached their destinations
about 45,000 years ago. Even more recent studies, comparing the genome
sequences of living humans with those of Neanderthals and another extinct
population from southern Siberia (the Denisovans), suggest that our exit from
Africa was a bit more convoluted, while also revealing that a number of our
ancestors interbred with these hominid neighbors as they made their way across
the globe. (Modified from P. Forster and S. Matsumura, Science 308:965-966, 2005.)

The intraspecies variations that have been most extensively characterized,
however, are single-nucleotide polymorphisms (SNPs). These are simply points
in the genome sequence where one large fraction of the human population has
one nucleotide, while another substantial fraction has another. To qualify as
a polymorphism, the variants must be common enough to give a reasonably
high probability that the genomes of two randomly chosen individuals will differ
at the given site; a probability of 1% is commonly chosen as the cutoff. Two
human genomes sampled from the modern world population at random will differ
at approximately 2.5 × 10⁶ such sites (1 per 1300 nucleotide pairs). As will be
described in the overview of genetics in Chapter 8, SNPs in the human genome
can be extremely useful for genetic mapping analyses, in which one attempts to
associate specific traits (phenotypes) with specific DNA sequences for medical or
scientific purposes (see p. 493). But although these SNPs are useful as genetic
markers, there is good evidence that most of them have little or no effect on
human fitness. This is as expected, since deleterious variants will have been
selected against during human evolution and, unlike SNPs, should therefore be rare.
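The count of roughly 2.5 million differing sites follows from the stated density of 1 site per 1300 nucleotide pairs. A minimal sketch of that arithmetic, assuming a haploid genome of 3.2 × 10⁹ nucleotide pairs (the value quoted in the Chapter 5 summary; the variable names are illustrative only):

```python
# Haploid human genome size in nucleotide pairs (assumed from the Chapter 5 summary).
GENOME_SIZE_BP = 3.2e9
# Average spacing between sites at which two randomly chosen genomes differ.
BP_PER_DIFFERENCE = 1300

expected_differences = GENOME_SIZE_BP / BP_PER_DIFFERENCE
print(f"{expected_differences:.2e} differing sites")  # about 2.5 million
```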

Against the background of ordinary SNPs inherited from our prehistoric 
ancestors, certain sequences with exceptionally high mutation rates stand out. A 
dramatic example is provided by CA repeats, which are ubiquitous in the human 
genome and in the genomes of other eukaryotes. Sequences with the motif (CA)n 
are replicated with relatively low fidelity because of a slippage that occurs between 
the template and the newly synthesized strands during DNA replication; hence, 
the precise value of n can vary over a considerable range from one genome to the 
next. These repeats make ideal DNA-based genetic markers, since most humans 
are heterozygous, having inherited one repeat length (n) from their mother and a 
different repeat length from their father. While the value of n changes sufficiently 
rarely that most parent-child transmissions propagate CA repeats faithfully, the 
changes are sufficiently frequent to maintain high levels of heterozygosity in the 
human population. These and some other simple repeats that display exception- 
ally high variability therefore provide the basis for identifying individuals by DNA 
analysis in crime investigations, paternity suits, and other forensic applications 
(see Figure 8-39). 
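Why highly variable repeats make better markers than a typical two-allele site can be illustrated with the standard expected-heterozygosity formula, 1 − Σpᵢ². The sketch below assumes random mating, and the allele frequencies are hypothetical values chosen for illustration, not measurements:

```python
def expected_heterozygosity(allele_freqs):
    """Probability that two alleles drawn at random differ: 1 - sum(p_i^2)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)

# Hypothetical frequencies, for illustration only.
two_allele_site = [0.9, 0.1]     # one common and one less common variant
ca_repeat_locus = [0.1] * 10     # ten repeat lengths (values of n), equally common

print(round(expected_heterozygosity(two_allele_site), 3))
print(round(expected_heterozygosity(ca_repeat_locus), 3))
```

With many alleles at comparable frequency, most individuals carry two different repeat lengths, which is the property that DNA-based identification relies on.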

While most of the SNPs and CNVs in the human genome sequence are thought 
to have little or no effect on phenotype, a subset of the genome sequence varia- 
tions must be responsible for the heritable aspects of human individuality. We 
know that even a single nucleotide change that alters one amino acid in a protein 
can cause a serious disease, as for example in sickle-cell anemia, which is caused 
by such a mutation in hemoglobin (Movie 4.3). We also know that gene dosage—a 
doubling or halving of the copy number of some genes—can have a profound 
effect on development by altering the level of gene product, as can changes in 
regulatory DNA sequences. There is therefore every reason to suppose that some 
of the many differences between any two human beings will have substantial
effects on human health, physiology, behavior, and physique. A major challenge
in human genetics is to recognize those relatively few variations that are
functionally important against a large background of variation that is neutral
and of no consequence.


Figure 4-80 Detection of copy number variations on human chromosome 17.
When 100 individuals were tested by a DNA microarray analysis that detects
the copy number of DNA sequences throughout the entire length of this
chromosome, the indicated distributions of DNA additions (green bars) and DNA
losses (red bars) were observed compared with an arbitrary human reference
sequence. The shortest red and green bars represent a single occurrence among the
200 chromosomes examined, whereas the longer bars indicate that the addition or
loss was correspondingly more frequent. The results show preferred regions where
the variations occur, and these tend to be in or near regions that already contain
blocks of segmental duplications. Many of the changes include known genes.
(Adapted from J.L. Freeman et al., Genome Res. 16:949-961, 2006. With permission
from Cold Spring Harbor Laboratory Press.)


Summary 


Comparisons of the nucleotide sequences of present-day genomes have revolution- 
ized our understanding of gene and genome evolution. Because of the extremely 
high fidelity of DNA replication and DNA repair processes, random errors in main- 
taining the nucleotide sequences in genomes occur so rarely that only about one 
nucleotide in a thousand is altered in every million years in any particular eukary- 
otic line of descent. Not surprisingly, therefore, a comparison of human and chim- 
panzee chromosomes—which are separated by about 6 million years of evolution— 
reveals very few changes. Not only are our genes essentially the same, but their order 
on each chromosome is almost identical. Although a substantial number of seg- 
mental duplications and segmental deletions have occurred in the past 6 million 
years, even the positions of the transposable elements that make up a major portion 
of our noncoding DNA are mostly unchanged. 
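The expected sequence difference between humans and chimpanzees can be estimated from the rate given above. This is a rough sketch: it assumes that both lineages accumulate changes independently after the split (hence the factor of two) and it ignores repeated changes at the same site.

```python
RATE_PER_SITE_PER_MYR = 1 / 1000   # one nucleotide in a thousand per million years
DIVERGENCE_TIME_MYR = 6            # human-chimpanzee separation quoted in the text

# Assumption: each lineage changes independently after the split, so the
# difference between the two species is twice the per-lineage change.
per_lineage = RATE_PER_SITE_PER_MYR * DIVERGENCE_TIME_MYR
between_species = 2 * per_lineage
print(f"per lineage: {per_lineage:.1%}, between species: {between_species:.1%}")
# per lineage: 0.6%, between species: 1.2%
```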

When one compares the genomes of two more distantly related organisms—such 
as a human and a mouse, separated by about 80 million years—one finds many 
more changes. Now the effects of natural selection can be clearly seen: through puri- 
fying selection, essential nucleotide sequences—both in regulatory regions and in 
coding sequences (exons)—have been highly conserved. In contrast, nonessential 
sequences (for example, much of the DNA in introns) have been altered to such an 
extent that one can no longer see any family resemblance. 

Because of purifying selection, the comparison of the genome sequences of 
multiple related species is an especially powerful way to find DNA sequences with 
important functions. Although about 5% of the human genome has been conserved 
as a result of purifying selection, the function of the majority of this DNA (tens of 
thousands of multispecies conserved sequences) remains mysterious. Future exper- 
iments characterizing its functions should teach us many new lessons about verte- 
brate biology. 

Other sequence comparisons show that a great deal of the genetic complexity of 
present-day organisms is due to the expansion of ancient gene families. DNA dupli- 
cation followed by sequence divergence has clearly been a major source of genetic 
novelty during evolution. On a more recent time scale, the genomes of any two 
humans will differ from each other both because of nucleotide substitutions (sin- 
gle-nucleotide polymorphisms, or SNPs) and because of inherited DNA gains and 
DNA losses that cause copy number variations (CNVs). Understanding the effects 
of these differences will improve both medicine and our understanding of human 
biology. 


WHAT WE DON’T KNOW


• How many different types of chromatin structure are important for
cells? How is each of these structures established and maintained, and
which ones tend to be inherited following DNA replication?

• Why are there so many different chromatin remodeling complexes in
cells? What are their essential roles, and how do they get loaded onto
chromatin at specific places and at specific times?

• How do chromosomal loops form during interphase, and what happens
to these loops in condensed mitotic chromosomes?

• What genetic changes made us uniquely human? What further
aspects of our recent evolutionary development can be reconstructed
by sequencing DNA from remains of ancient hominids?

• How much of the enormous complexity that we find in cell biology
is unnecessary, having evolved by random drift?


PROBLEMS


Which statements are true? Explain why or why not.


4-1 Human females have 23 different chromosomes,
whereas human males have 24.


4-2 The four core histones are relatively small proteins
with a very high proportion of positively charged amino
acids; the positive charge helps the histones bind tightly to
DNA, regardless of its nucleotide sequence.


4-3 Nucleosomes bind DNA so tightly that they cannot
move from the positions where they are first assembled.


4-4 In a comparison between the DNAs of related
organisms such as humans and mice, identifying the conserved
DNA sequences facilitates the search for functionally
important regions.


4-5 Gene duplication and divergence is thought to
have played a critical role in the evolution of increased
biological complexity.


Discuss the following problems. 


4-6 DNA isolated from the bacterial virus M13 con- 
tains 25% A, 33% T, 22% C, and 20% G. Do these results 
strike you as peculiar? Why or why not? How might you 
explain these values? 




Figure Q4-1 Three nucleotides from the interior of a single strand of DNA
(Problem 4-7). Arrows at the ends of the DNA strand indicate that the
structure continues in both directions.


4-7 A segment of DNA from the interior of a single strand is shown in
Figure Q4-1. What is the polarity of this DNA from top to bottom?


4-8 Human DNA contains 20% C on a molar basis. What are the mole
percents of A, G, and T?


4-9 Chromosome 3 in orangutans differs from chromosome 3 in humans
by two inversion events that occurred in the human lineage (Figure Q4-2).
Draw the intermediate chromosome that resulted from the first inversion
and explicitly indicate the segments included in each inversion.


Figure Q4-2 Chromosome 3 in orangutans and humans (Problem 4-9).
Differently colored blocks indicate segments of the chromosomes that are
homologous in DNA sequence. [Diagram labels: orangutan, two inversions, human.]


4-10 Assuming that the 30-nm chromatin fiber con- 
tains about 20 nucleosomes (200 bp/nucleosome) per 50 
nm of length, calculate the degree of compaction of DNA 
associated with this level of chromatin structure. What 
fraction of the 10,000-fold condensation that occurs at 
mitosis does this level of DNA packing represent? 


4-11 In contrast to histone acetylation, which always 
correlates with gene activation, histone methylation can 
lead to either transcriptional activation or repression. How 
do you suppose that the same modification—methyla- 
tion—can mediate different biological outcomes? 


4-12 Why is a chromosome with two centromeres (a 
dicentric chromosome) unstable? Would a backup cen- 
tromere not be a good thing for a chromosome, giving it 
two chances to form a kinetochore and attach to microtu- 
bules during mitosis? Would that not help to ensure that 
the chromosome did not get left behind at mitosis? 


4-13 Look at the two yeast colonies in Figure Q4-3. Each
of these colonies contains about 100,000 cells descended
from a single yeast cell, originally somewhere in the middle
of the clump. A white colony arises when the Ade2 gene
is expressed from its normal chromosomal location. When
the Ade2 gene is moved to a location near a telomere, it
is packed into heterochromatin and inactivated in most
cells, giving rise to colonies that are mostly red. In these
largely red colonies, white sectors fan out from the middle
of the colony. In both the red and white sectors, the Ade2
gene is still located near telomeres. Explain why white sectors
have formed near the rim of the red colony. Based on
the patterns observed, what can you conclude about the
propagation of the transcriptional state of the Ade2 gene
from mother to daughter cells in this experiment?


Figure Q4-3 Position effect on expression of the yeast Ade2 gene
(Problem 4-13). The Ade2 gene codes for one of the enzymes of
adenosine biosynthesis, and the absence of the Ade2 gene product
leads to the accumulation of a red pigment. Therefore a colony of cells
that express Ade2 is white, and one composed of cells in which the
Ade2 gene is not expressed is red.


4-14 Mobile pieces of DNA—transposable elements— 
that insert themselves into chromosomes and accumulate 
during evolution make up more than 40% of the human 
genome. Transposable elements of four types—long inter- 
spersed nuclear elements (LINEs), short interspersed 
nuclear elements (SINEs), long terminal repeat (LTR) 
retrotransposons, and DNA transposons—are inserted 
more-or-less randomly throughout the human genome. 
These elements are conspicuously rare at the four homeo- 
box gene clusters, HoxA, HoxB, HoxC, and HoxD, as illus- 
trated for HoxD in Figure Q4-4, along with an equivalent 
region of chromosome 22, which lacks a Hox cluster. Each 
Hox cluster is about 100 kb in length and contains 9 to 11 
genes, whose differential expression along the anteropos- 
terior axis of the developing embryo establishes the basic 
body plan for humans (and for other animals). Why do you 
suppose that transposable elements are so rare in the Hox 
clusters? 


Figure Q4-4 Transposable elements and genes in 1-Mb regions of
chromosomes 2 and 22 (Problem 4-14). Blue lines that project upward
indicate exons of known genes. Red lines that project downward
indicate transposable elements; they are so numerous (constituting more
than 40% of the human genome) that they merge into nearly a solid
block outside the Hox clusters. (Adapted from E. Lander et al., Nature
409:860-921, 2001. With permission from Macmillan Publishers Ltd.)
[Diagram labels: chromosome 22; chromosome 2, with the HoxD cluster marked; scale bar, 100 kb.]


DNA Replication, Repair, and Recombination


The ability of cells to maintain a high degree of order in a chaotic universe depends 
upon the accurate duplication of vast quantities of genetic information carried in 
chemical form as DNA. This process, called DNA replication, must occur before 
a cell can produce two genetically identical daughter cells. Maintaining order 
also requires the continued surveillance and repair of this genetic information, 
because DNA inside cells is repeatedly damaged by chemicals and radiation from 
the environment, as well as by thermal accidents and reactive molecules gener- 
ated inside the cell. In this chapter, we describe the protein machines that repli- 
cate and repair the cell’s DNA. These machines catalyze some of the most rapid 
and accurate processes that take place within cells, and their mechanisms illus- 
trate the elegance and efficiency of cell chemistry. 

While the short-term survival of a cell can depend on preventing changes 
in its DNA, the long-term survival of a species requires that DNA sequences be 
changeable over many generations to permit evolutionary adaptation to changing 
circumstances. We shall see that despite the great efforts that cells make to pro- 
tect their DNA, occasional changes in DNA sequences do occur. Over time, these 
changes provide the genetic variation upon which selection pressures act during 
the evolution of organisms. 

We begin this chapter with a brief discussion of the changes that occur in 
DNA as it is passed down from generation to generation. Next, we discuss the cell 
mechanisms—DNA replication and DNA repair—that are responsible for mini- 
mizing these changes. Finally, we consider some of the most intriguing pathways 
that alter DNA sequences—in particular, those of DNA recombination including 
the movement within chromosomes of special DNA sequences called transpos- 
able elements. 


THE MAINTENANCE OF DNA SEQUENCES 


Although, as just pointed out, occasional genetic changes enhance the long-term 
survival of a species through evolution, the survival of the individual demands a 
high degree of genetic stability. Only rarely do the cell’s DNA-maintenance pro- 
cesses fail, resulting in permanent change in the DNA. Such a change is called a 
mutation, and it can destroy an organism if it occurs in a vital position in the DNA 
sequence. 


Mutation Rates Are Extremely Low 


The mutation rate, the rate at which changes occur in DNA sequences, can be 
determined directly from experiments carried out with a bacterium such as Esch- 
erichia coli—a resident of our intestinal tract and a commonly used laboratory 
organism (see Figure 1-24). Under laboratory conditions, E. coli divides about 
once every 30 minutes, and a single cell can generate a very large population— 
several billion—in less than a day. In such a population, it is possible to detect the 
small fraction of bacteria that have suffered a damaging mutation in a particular 
gene, if that gene is not required for the bacterium’s survival. For example, the 
mutation rate of a gene specifically required for cells to use the sugar lactose as an 
energy source can be determined by growing the cells in the presence of a different
sugar, such as glucose, and testing them subsequently to see how many have lost
the ability to survive on a lactose diet. The fraction of damaged genes
underestimates the actual mutation rate because many mutations are silent (for
example, those that change a codon but not the amino acid it specifies, or those that
change an amino acid without affecting the activity of the protein coded for by the
gene). After correcting for these silent mutations, one finds that a single gene that
encodes an average-sized protein (~10³ coding nucleotide pairs) accumulates a
mutation (not necessarily one that would inactivate the protein) approximately
once in 10⁶ bacterial cell generations. Stated differently, bacteria display
a mutation rate of about three nucleotide changes per 10¹⁰ nucleotides per cell
generation.
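The per-gene and per-nucleotide figures are two views of the same rate. The following sketch converts one into the other using only the numbers quoted above; the variable names are illustrative.

```python
RATE_PER_NT_PER_GENERATION = 3e-10   # three changes per 10^10 nucleotides
GENE_LENGTH_NT = 1e3                 # coding length of an average-sized gene

rate_per_gene = RATE_PER_NT_PER_GENERATION * GENE_LENGTH_NT
generations_per_mutation = 1 / rate_per_gene
print(f"{rate_per_gene:.1e} mutations per gene per generation")
print(f"about one mutation every {generations_per_mutation:.1e} generations")
```

The result, a few million generations per mutation, is of the same order of magnitude as the rounded per-gene figure given in the text.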

Recently, it has become possible to measure the germ-line mutation rate 
directly in more complex, sexually reproducing organisms such as humans. In 
this case, the complete genomes from a family—parents and offspring—were 
directly sequenced, and a careful comparison revealed that approximately 70 new 
single-nucleotide mutations arose in the germ lines of each offspring. Normalized
to the size of the human genome, the mutation rate is one nucleotide change
per 10⁸ nucleotides per human generation. This is a slight underestimate because
some mutations will be lethal and will therefore be absent from progeny; however, 
because relatively little of the human genome carries critical information, this 
consideration has only a small effect on the true mutation rate. It is estimated that 
approximately 100 cell divisions occur in the germ line from the time of concep- 
tion to the time of production of the eggs and sperm that go on to make the next 
generation. Thus, the human mutation rate, expressed in terms of cell divisions
(instead of human generations), is approximately 1 mutation per 10¹⁰ nucleotides
per cell division.
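The normalization can be reproduced in a few lines. This sketch assumes that "the size of the human genome" refers to the diploid genome (two copies of about 3.2 × 10⁹ nucleotide pairs, the haploid value quoted later in this chapter); the bacterial rate is taken from the preceding paragraph for comparison.

```python
NEW_MUTATIONS_PER_OFFSPRING = 70            # from the family-sequencing result
HAPLOID_GENOME_NT = 3.2e9                   # value quoted later in this chapter
DIPLOID_GENOME_NT = 2 * HAPLOID_GENOME_NT   # assumption: both inherited copies count
GERM_LINE_DIVISIONS = 100                   # approximate divisions per generation
BACTERIAL_RATE_PER_DIVISION = 3e-10         # E. coli value from the text

rate_per_generation = NEW_MUTATIONS_PER_OFFSPRING / DIPLOID_GENOME_NT
rate_per_division = rate_per_generation / GERM_LINE_DIVISIONS
fold_difference = BACTERIAL_RATE_PER_DIVISION / rate_per_division

print(f"per generation: {rate_per_generation:.1e}")  # roughly 1 per 10^8
print(f"per division:   {rate_per_division:.1e}")    # roughly 1 per 10^10
print(f"bacterial rate is {fold_difference:.1f}-fold higher")
```

Under these assumptions the two organisms differ by a little less than threefold per round of replication.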

Although E. coli and humans differ greatly in their modes of reproduction and 
in their generation times, when the mutation rates of each are normalized to a 
single round of DNA replication, they are both extremely low and within a factor 
of three of each other. We shall see later in the chapter that the basic mechanisms 
that ensure these low rates of mutation have been conserved since the very early 
history of cells on Earth. 


Low Mutation Rates Are Necessary for Life as We Know It 


Since many mutations are deleterious, no species can afford to allow them to 
accumulate at a high rate in its germ cells. Although the observed mutation fre- 
quency is low, it is nevertheless thought to limit the number of essential proteins 
that any organism can depend upon to perhaps 30,000. More than this, and the 
probability that at least one critical component will suffer a damaging mutation 
becomes catastrophically high. By an extension of the same argument, a mutation 
frequency tenfold higher would limit an organism to about 3000 essential genes. 
In this case, evolution would have been limited to organisms considerably less 
complex than a fruit fly. 
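The scaling in this argument is a simple inverse proportionality. The sketch below assumes that the tolerable total mutational load (mutation rate multiplied by the number of essential genes) stays fixed, which is the relationship the text's numbers imply; the function name and its parameters are hypothetical.

```python
def max_essential_genes(relative_mutation_rate, baseline_limit=30_000):
    """Scale the tolerable number of essential genes inversely with mutation rate.

    Assumes the product (rate x number of essential genes) is held constant.
    relative_mutation_rate is expressed relative to the observed rate (1 = observed).
    """
    return baseline_limit / relative_mutation_rate

print(max_essential_genes(1))    # 30000.0 at the observed mutation rate
print(max_essential_genes(10))   # 3000.0 at a tenfold higher rate
```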

The cells of a sexually reproducing animal or plant are of two types: germ cells 
and somatic cells. The germ cells transmit genetic information from parent to off- 
spring; the somatic cells form the body of the organism (Figure 5-1). We have 
seen that germ cells must be protected against high rates of mutation to maintain 
the species. However, the somatic cells of multicellular organisms must also be 
protected from genetic change to properly maintain the organized structure of the 
body. Nucleotide changes in somatic cells can give rise to variant cells, some of 
which, through “local” natural selection, proliferate rapidly at the expense of the 
rest of the organism. In an extreme case, the result is the uncontrolled cell prolif- 
eration that we know as cancer, a disease that causes (in Europe and North Amer- 
ica) more than 20% of human deaths each year. These deaths are due largely to 
an accumulation of changes in the DNA sequences of somatic cells, as discussed 
in Chapter 20. A significant increase in the mutation frequency would presum- 
ably cause a disastrous increase in the incidence of cancer by accelerating the rate 
at which somatic-cell variants arise. Thus, both for the perpetuation of a species
with a large number of genes (germ-cell stability) and for the prevention of cancer
resulting from mutations in somatic cells (somatic-cell stability), multicellular
organisms like ourselves depend on the remarkably high fidelity with which their 
DNA sequences are replicated and maintained. 


Summary 


In all cells, DNA sequences are maintained and replicated with high fidelity. The 
mutation rate, approximately one nucleotide change per 10¹⁰ nucleotides each time
the DNA is replicated, is roughly the same for organisms as different as bacteria and
humans. Because of this remarkable accuracy, the sequence of the human genome
(approximately 3.2 × 10⁹ nucleotide pairs) is unchanged or changed by only a few
nucleotides each time a typical human cell divides. This allows most humans to 
pass accurate genetic instructions from one generation to the next, and also to avoid 
the changes in somatic cells that lead to cancer. 


DNA REPLICATION MECHANISMS 


All organisms duplicate their DNA with extraordinary accuracy before each cell 
division. In this section, we explore how an elaborate “replication machine” 
achieves this accuracy, while duplicating DNA at rates as high as 1000 nucleotides 
per second. 


Base-Pairing Underlies DNA Replication and DNA Repair 


As introduced in Chapter 1, DNA templating is the mechanism the cell uses to copy 
the nucleotide sequence of one DNA strand into a complementary DNA sequence 
(Figure 5-2). This process requires the separation of the DNA helix into two tem- 
plate strands, and entails the recognition of each nucleotide in the DNA template 
strands by a free (unpolymerized) complementary nucleotide. The separation of
the DNA helix exposes the hydrogen-bond donor and acceptor groups on each
DNA base for base-pairing with the appropriate incoming free nucleotide, aligning
it for its enzyme-catalyzed polymerization into a new DNA chain.


Figure 5-1 Germ-line cells and somatic cells carry out fundamentally different
functions. In sexually reproducing organisms, the germ-line cells (red)
propagate genetic information into the next generation. Somatic cells (blue),
which form the body of the organism, are necessary for the survival of germ-line
cells but do not themselves leave any progeny.


Figure 5-2 The DNA double helix acts as a template for its own duplication.
Because the nucleotide A will pair successfully only with T, and G only with
C, each strand of DNA can serve as a template to specify the sequence of
nucleotides in its complementary strand by DNA base-pairing. In this way, a
double-helical DNA molecule can be copied precisely.

The first nucleotide-polymerizing enzyme, DNA polymerase, was discovered 
in 1957. The free nucleotides that serve as substrates for this enzyme were found 
to be deoxyribonucleoside triphosphates, and their polymerization into DNA 
required a single-strand DNA template. Figure 5-3 and Figure 5-4 illustrate the 
stepwise mechanism of this reaction. 
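Template-directed synthesis can be sketched as a simple loop: each template base selects its complement, and the new nucleotide is added to the 3' end of the growing strand. This is an illustration of the base-pairing logic only, not a model of the enzyme; the function and variable names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def synthesize(template_3_to_5, primer_5_to_3=""):
    """Extend a primer by base-pairing against a template.

    template_3_to_5: template bases listed from the 3' end to the 5' end.
    primer_5_to_3:   existing new strand, assumed already paired with the
                     first bases of the template, listed 5' to 3'.
    Returns the full new strand written 5' to 3'.
    """
    new_strand = list(primer_5_to_3)
    # Each incoming nucleotide pairs with the next unpaired template base
    # and is joined to the 3' end of the growing strand.
    for base in template_3_to_5[len(new_strand):]:
        new_strand.append(COMPLEMENT[base])
    return "".join(new_strand)

template = "TACGGATC"               # written 3' -> 5'
print(synthesize(template, "AT"))   # ATGCCTAG, written 5' -> 3'
```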


The DNA Replication Fork Is Asymmetrical 


During DNA replication inside a cell, each of the two original DNA strands serves 
as a template for the formation of an entire new strand. Because each of the two 
daughters of a dividing cell inherits a new DNA double helix containing one original
and one new strand (Figure 5-5), the DNA double helix is said to be replicated
“semiconservatively.” How is this feat accomplished?

Figure 5-3 The chemistry of DNA synthesis. The addition of a deoxyribonucleotide to the 

3’ end of a polynucleotide chain (the primer strand) is the fundamental reaction by which DNA is 
synthesized. As shown, base-pairing between an incoming deoxyribonucleoside triphosphate and 
an existing strand of DNA (the template strand) guides the formation of the new strand of DNA 
and causes it to have a complementary nucleotide sequence. The way in which complementary 
nucleotides base-pair is shown in Figure 4-4.

Figure 5-4 DNA synthesis catalyzed by DNA polymerase. (A) DNA polymerase catalyzes the stepwise addition of a 
deoxyribonucleotide to the 3'-OH end of a polynucleotide chain, the growing primer strand that is paired to an existing template 
strand. The newly synthesized DNA strand therefore polymerizes in the 5’-to-3’ direction as shown also in the previous figure. 
Because each incoming deoxyribonucleoside triphosphate must pair with the template strand to be recognized by the DNA 
polymerase, this strand determines which of the four possible deoxyribonucleotides (A, C, G, or T) will be added. The reaction 
is driven by a large, favorable free-energy change, caused by the release of pyrophosphate and its subsequent hydrolysis
to two molecules of inorganic phosphate. (B) Structure of DNA polymerase complexed with DNA (orange), as determined
by x-ray crystallography (Movie 5.1). The template DNA strand is the longer strand and the newly synthesized DNA is the 
shorter. (C) Schematic diagram of DNA polymerase, based on the structure in (B). The proper base-pair geometry of a correct 
incoming deoxyribonucleoside triphosphate causes the polymerase to tighten around the base pair, thereby initiating the 
nucleotide addition reaction (middle diagram (C)). Dissociation of pyrophosphate relaxes the polymerase, allowing translocation 
of the DNA by one nucleotide so the active site of the polymerase is ready to receive the next deoxyribonucleoside triphosphate.

Analyses carried out in the early 1960s on whole replicating chromosomes 
revealed a localized region of replication that moves progressively along the 
parental DNA double helix. Because of its Y-shaped structure, this active region 
is called a replication fork (Figure 5-6). At the replication fork, a multienzyme 
complex that contains the DNA polymerase synthesizes the DNA of both new 
daughter strands. 

Initially, the simplest mechanism of DNA replication seemed to be the con- 
tinuous growth of both new strands, nucleotide by nucleotide, at the replication 
fork as it moves from one end of a DNA molecule to the other. But because of 
the antiparallel orientation of the two DNA strands in the DNA double helix (see 
Figure 5-2), this mechanism would require one daughter strand to polymerize 
in the 5'-to-3’ direction and the other in the 3’-to-5’ direction. Such a replication 
fork would require two distinct types of DNA polymerase enzymes. However, as 
attractive as this model might be, the DNA polymerases at replication forks can 
synthesize only in the 5’-to-3’ direction. 

How, then, can a DNA strand grow in the 3’-to-5’ direction? The answer 
was first suggested by the results of an experiment performed in the late 1960s. 
Researchers added highly radioactive ³H-thymidine to dividing bacteria for a
few seconds, so that only the most recently replicated DNA (that just behind the
replication fork) became radiolabeled. This experiment revealed the transient
existence of pieces of DNA that were 1000-2000 nucleotides long, now commonly
known as Okazaki fragments, at the growing replication fork. (Similar replication
intermediates were later found in eukaryotes, where they are only 100-200
nucleotides long.) The Okazaki fragments were shown to be polymerized only in the
5'-to-3' chain direction and to be joined together after their synthesis to create
long DNA chains.


Figure 5-5 The semiconservative nature of DNA replication. In a round
of replication, each of the two strands of DNA is used as a template for the
formation of a complementary DNA strand. The original strands therefore
remain intact through many cell generations.

A replication fork therefore has an asymmetric structure (Figure 5-7). 
The DNA daughter strand that is synthesized continuously is known as the 
leading strand. Its synthesis slightly precedes the synthesis of the daughter strand 
that is synthesized discontinuously, known as the lagging strand. For the lagging 
strand, the direction of nucleotide polymerization is opposite to the overall direc- 
tion of DNA chain growth. The synthesis of this strand by a discontinuous “back- 
stitching” mechanism means that DNA replication requires only the 5'-to-3' type 
of DNA polymerase. 
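
The backstitching logic can be made concrete with a short sketch. The Python below is illustrative only: the function names, the toy sequence, and the fixed fragment length are assumptions introduced here, and RNA primers, ligation, and the enzymes themselves are left out. It shows that copying successive stretches of the lagging-strand template, each in the 5'-to-3' direction, yields fragments that join into the full complementary strand.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand_5to3: str) -> str:
    """Return the complementary strand, also written 5'-to-3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand_5to3))


def okazaki_fragments(template_5to3: str, fragment_len: int) -> list[str]:
    """Copy the lagging-strand template in short pieces.

    The template is written 5'-to-3' in the direction of fork movement,
    so stretches are exposed from left to right. Each exposed stretch is
    copied 5'-to-3', which runs opposite to the fork movement.
    Fragments are returned in the order they are made.
    """
    fragments = []
    for start in range(0, len(template_5to3), fragment_len):
        exposed = template_5to3[start:start + fragment_len]
        fragments.append(reverse_complement(exposed))
    return fragments


def join_fragments(fragments: list[str]) -> str:
    """Join fragments into one strand written 5'-to-3'.

    The most recently made fragment lies nearest the fork and therefore
    comes first when the finished strand is read 5'-to-3'.
    """
    return "".join(reversed(fragments))


template = "ATGCCGTAAGCT"  # hypothetical toy sequence
pieces = okazaki_fragments(template, fragment_len=4)
print(pieces)
print(join_fragments(pieces) == reverse_complement(template))
```

Although the chain as a whole grows in the direction of fork movement, every fragment is polymerized 5'-to-3', so a single type of polymerase suffices.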


The High Fidelity of DNA Replication Requires Several 
Proofreading Mechanisms 


As discussed above, the fidelity of copying DNA during replication is such that only 
about one mistake occurs for every 10¹⁰ nucleotides copied. This fidelity is much 
higher than one would expect from the accuracy of complementary base-pairing. 
The standard complementary base pairs (see Figure 4-4) are not the only ones 
possible. For example, with small changes in helix geometry, two hydrogen bonds 
can form between G and T in DNA. In addition, rare tautomeric forms of the four 
DNA bases occur transiently in ratios of 1 part to 10⁴ or 10⁵. These forms mispair 
without a change in helix geometry: the rare tautomeric form of C pairs with A 
instead of G, for example. 

If the DNA polymerase did nothing special when a mispairing occurred 
between an incoming deoxyribonucleoside triphosphate and the DNA template, 
the wrong nucleotide would often be incorporated into the new DNA chain, 
producing frequent mutations. The high fidelity of DNA replication, however, 
depends not only on the initial base-pairing but also on several “proofreading” 
mechanisms that act sequentially to correct any initial mispairings that might 
have occurred. 

Figure 5-6 Two replication forks moving in opposite directions on a circular 
chromosome. An active zone of DNA replication moves progressively along 
a replicating DNA molecule, creating a Y-shaped DNA structure known as a 
replication fork: the two arms of each Y are the two daughter DNA molecules, 
and the stem of the Y is the parental DNA helix. In this diagram, parental strands 
are orange; newly synthesized strands are red. (Micrograph courtesy of Jerome 
Vinograd.) 

Figure 5-7 The structure of a DNA replication fork. Left, replication fork with newly synthesized 
DNA in red and arrows indicating the 5'-to-3' direction of DNA synthesis. Because both daughter 
DNA strands are polymerized in the 5'-to-3' direction, the DNA synthesized on the lagging strand 
must be made initially as a series of short DNA molecules, called Okazaki fragments, named after 
the scientist who discovered them. Right, the same fork a short time later. On the lagging strand, 
the Okazaki fragments are synthesized sequentially, with those nearest the fork being the most 
recently made. 


DNA polymerase performs the first proofreading step just before a new nucle- 
otide is covalently added to the growing chain. Our knowledge of this mechanism 
comes from studies of several different DNA polymerases, including one pro- 
duced by a bacterial virus, T7, that replicates inside E. coli. The correct nucleotide 
has a higher affinity for the moving polymerase than does the incorrect nucleo- 
tide, because the correct pairing is more energetically favorable. Moreover, after 
nucleotide binding, but before the nucleotide is covalently added to the grow- 
ing chain, the enzyme must undergo a conformational change in which its “grip” 
tightens around the active site (see Figure 5-4). Because this change occurs more 
readily with correct than incorrect base-pairing, it allows the polymerase to “dou- 
ble-check” the exact base-pair geometry before it catalyzes the addition of the 
nucleotide. Incorrectly paired nucleotides are harder to add and therefore more 
likely to diffuse away before the polymerase can mistakenly add them. 

The next error-correcting reaction, known as exonucleolytic proofreading, 
takes place immediately after those rare instances in which an incorrect nucle- 
otide is covalently added to the growing chain. DNA polymerase enzymes are 
highly discriminating in the types of DNA chains they will elongate: they require 
a previously formed, base-paired 3’-OH end of a primer strand (see Figure 5-4). 
Those DNA molecules with a mismatched (improperly base-paired) nucleotide 
at the 3’-OH end of the primer strand are not effective as templates because the 
polymerase has difficulty extending such a strand. DNA polymerase molecules 
correct such a mismatched primer strand by means of a separate catalytic site 
(either in a separate subunit or in a separate domain of the polymerase molecule, 
depending on the polymerase). This 3'-to-5' proofreading exonuclease clips off any 
unpaired or mispaired residues at the primer terminus, continuing until enough 
nucleotides have been removed to regenerate a correctly base-paired 3'-OH ter- 
minus that can prime DNA synthesis. In this way, DNA polymerase functions as a 
“self-correcting” enzyme that removes its own polymerization errors as it moves 
along the DNA (Figure 5-8 and Figure 5-9). 
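
A toy simulation can illustrate why a proofreading exonuclease lowers the error rate. The sketch below is not a model of any particular polymerase: the function names, the two probabilities, and the decision to align the new strand position by position with the template (ignoring strand polarity) are all assumptions made for illustration. A mismatched 3' terminus is clipped off and the same template position is tried again, unless the mismatch "escapes" and is extended.

```python
import random

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize(template, misinsert_prob, escape_prob, rng):
    """Copy `template` one nucleotide at a time.

    misinsert_prob: chance that a wrong nucleotide is added (assumed value).
    escape_prob:    chance that a mismatched 3' end is extended anyway;
                    1.0 means no proofreading, 0.0 means every mismatch
                    is excised (assumed value).
    Returns (new_strand, number_of_excisions).
    """
    new_strand = []
    excisions = 0
    position = 0
    while position < len(template):
        correct = PAIR[template[position]]
        if rng.random() < misinsert_prob:
            added = rng.choice([b for b in "ACGT" if b != correct])
        else:
            added = correct
        new_strand.append(added)
        mismatched = added != correct
        if mismatched and rng.random() >= escape_prob:
            # 3'-to-5' exonuclease step: remove the mispaired terminus
            # and retry the same template position.
            new_strand.pop()
            excisions += 1
            continue
        position += 1
    return "".join(new_strand), excisions


def count_errors(template, new_strand):
    """Number of positions where the copy does not pair with the template."""
    return sum(PAIR[t] != n for t, n in zip(template, new_strand))


rng = random.Random(0)
template = "".join(rng.choice("ACGT") for _ in range(2000))
without, _ = synthesize(template, 0.1, 1.0, random.Random(1))
with_pr, removed = synthesize(template, 0.1, 0.0, random.Random(1))
print(count_errors(template, without), count_errors(template, with_pr), removed)
```

The misinsertion rate here is deliberately exaggerated so that the effect is visible in a short sequence.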

The self-correcting properties of the DNA polymerase depend on its require- 
ment for a perfectly base-paired primer terminus, and it is apparently not pos- 
sible for such an enzyme to start synthesis de novo, without an existing primer. 
By contrast, the RNA polymerase enzymes involved in gene transcription do not 
need such an efficient exonucleolytic proofreading mechanism: errors in making 
RNA are not passed on to the next generation, and the occasional defective RNA 
molecule that is produced has no long-term significance. RNA polymerases are 
thus able to start new polynucleotide chains without a primer. 


Figure 5-8 Exonucleolytic proofreading by DNA polymerase during DNA 
replication. In this example, a C is accidentally incorporated at the growing 
3'-OH end of a DNA chain. The part of DNA polymerase that removes the 
misincorporated nucleotide is a specialized member of a large class of 
enzymes, known as exonucleases, that cleave nucleotides one at a time from 
the ends of polynucleotides. 

Figure 5-9 Editing by DNA polymerase. DNA polymerase complexed with the DNA template in 
the polymerizing mode (left) and the editing mode (right). The catalytic sites for the exonucleolytic (E) 
and the polymerization (P) reactions are indicated. In the editing mode, the newly synthesized DNA 
transiently unpairs from the template and enters the editing site where the most recently added 
nucleotide is catalytically removed. 


There is an error frequency of about one mistake for every 10⁴ polymerization 
events both in RNA synthesis and in the separate process of translating mRNA 
sequences into protein sequences. This error rate is over 100,000 times greater 
than that in DNA replication, where, as we have seen, a series of proofreading 
processes makes the process unusually accurate (Table 5-1). 
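
The comparison of error rates is simple arithmetic, sketched below. The first part uses only the two frequencies quoted in the text. The per-step factors in the second part are assumptions chosen for illustration so that their product matches the overall figure of one error in 10¹⁰; they are not taken from the table.

```python
from fractions import Fraction

# Error frequencies quoted in the text.
dna_replication_error = Fraction(1, 10**10)
rna_or_protein_error = Fraction(1, 10**4)

# How much less accurate RNA synthesis and translation are.
ratio = rna_or_protein_error / dna_replication_error
print(ratio)  # a factor of 10**6, i.e. "over 100,000 times greater"

# Sequential error-reducing steps multiply. These three values are
# ASSUMED for illustration only.
assumed_steps = {
    "5'-to-3' polymerization": Fraction(1, 10**5),
    "3'-to-5' exonucleolytic proofreading": Fraction(1, 10**2),
    "strand-directed mismatch repair": Fraction(1, 10**3),
}

combined = Fraction(1)
for step, error in assumed_steps.items():
    combined *= error
print(combined == dna_replication_error)
```

Because each step acts on the errors left by the previous one, the individual probabilities multiply rather than add.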


Only DNA Replication in the 5’-to-3’ Direction Allows Efficient Error 
Correction 


The need for accuracy probably explains why DNA replication occurs only in the 
5'-to-3' direction. If there were a DNA polymerase that added deoxyribonucleo- 
side triphosphates in the 3'-to-5’ direction, the growing 5’ end of the chain, rather 
than the incoming mononucleotide, would have to provide the activating triphos- 
phate needed for the covalent linkage. In this case, the mistakes in polymeriza- 
tion could not be simply hydrolyzed away, because the bare 5’ end of the chain 
thus created would immediately terminate DNA synthesis (see Figure 5-3). It is 
therefore possible to correct a mismatched base only if it has been added to the 
3’ end of a DNA chain. Although the backstitching mechanism for DNA replica- 
tion seems complex, it preserves the 5’-to-3’ direction of polymerization that is 
required for exonucleolytic proofreading. 

Despite these safeguards against DNA replication errors, DNA polymerases 
occasionally make mistakes. However, as we shall see later, cells have yet another 
chance to correct these errors by a process called strand-directed mismatch repair. 
Before discussing this mechanism, however, we describe the other types of 
proteins that function at the replication fork. 

TABLE 5-1 

The third step, strand-directed mismatch repair, is described later in this chapter. For the 
polymerization step, “errors per nucleotide added” describes the probability that an incorrect 
nucleotide will be added to the growing chain. For the other two steps, “errors per nucleotide 
added” describes the probability that an error will not be corrected. Each step therefore reduces 
the chance of a final error by the factor shown. 

Figure 5-10 RNA primer synthesis. A schematic view of the reaction 
catalyzed by DNA primase, the enzyme that synthesizes the short RNA 
primers made on the lagging strand using DNA as a template. Unlike DNA 
polymerase, this enzyme can start a new polynucleotide chain by joining 
two nucleoside triphosphates together. The primase synthesizes a short 
polynucleotide in the 5'-to-3' direction and then stops, making the 
3' end of this primer available for the DNA polymerase. 


A Special Nucleotide-Polymerizing Enzyme Synthesizes Short 
RNA Primer Molecules on the Lagging Strand 


For the leading strand, a primer is needed only at the start of replication: once 
a replication fork is established, the DNA polymerase is continuously presented 
with a base-paired chain end on which to add new nucleotides. On the lagging 
side of the fork, however, each time the DNA polymerase completes a short DNA 
Okazaki fragment (which takes a few seconds), it must start synthesizing a com- 
pletely new fragment at a site further along the template strand (see Figure 5-7). 
A special mechanism produces the base-paired primer strand required by the 
DNA polymerase molecules. The mechanism depends on an enzyme called 
DNA primase, which uses ribonucleoside triphosphates to synthesize short 
RNA primers on the lagging strand (Figure 5-10). In eukaryotes, these primers 
are about 10 nucleotides long and are made at intervals of 100-200 nucleotides on 
the lagging strand. 

The chemical structure of RNA was introduced in Chapter 1 and is described 
in detail in Chapter 6. Here, we note only that RNA is very similar in structure to 
DNA. A strand of RNA can form base pairs with a strand of DNA, generating a 
DNA-RNA hybrid double helix if the two nucleotide sequences are complemen- 
tary. Thus, the same templating principle used for DNA synthesis guides the syn- 
thesis of RNA primers. Because an RNA primer contains a properly base-paired 
nucleotide with a 3'-OH group at one end, it can be elongated by the DNA poly- 
merase at this end to begin an Okazaki fragment. The synthesis of each Okazaki 
fragment ends when this DNA polymerase runs into the RNA primer attached to 
the 5’ end of the previous fragment. To produce a continuous DNA chain from the 
many DNA fragments made on the lagging strand, a special DNA repair system 
acts quickly to erase the old RNA primer and replace it with DNA. An enzyme 
called DNA ligase then joins the 3’ end of the new DNA fragment to the 5’ end of 
the previous one to complete the process (Figure 5-11 and Figure 5-12). 

Why might an erasable RNA primer be preferred to a DNA primer that would 
not need to be erased? The argument that a self-correcting polymerase cannot 
start chains de novo also implies the converse: an enzyme that starts chains anew 
cannot be efficient at self-correction. Thus, any enzyme that primes the synthesis 
of Okazaki fragments will of necessity make a relatively inaccurate copy (at least 
one error in 10⁵). Even if the copies retained in the final product constituted as 
little as 5% of the total genome (for example, 10 nucleotides per 200-nucleotide 
DNA fragment), the resulting increase in the overall mutation rate would be enor- 
mous. It therefore seems likely that the use of RNA rather than DNA for priming 
brings a powerful advantage to the cell: the ribonucleotides in the primer auto- 
matically mark these sequences as “suspect copy” to be efficiently removed and 
replaced. 
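
The size of this effect can be checked with a short calculation. The sketch below uses the figures given in the text (10 primer nucleotides per 200-nucleotide fragment, a priming error rate of one in 10⁵, and an overall replication error rate of one in 10¹⁰); the variable names and the simple weighted-average model are assumptions made here for illustration.

```python
from fractions import Fraction

primer_len = 10            # nucleotides laid down by the priming enzyme
fragment_len = 200         # nucleotides per lagging-strand fragment
priming_error = Fraction(1, 10**5)    # priming enzyme, no self-correction
normal_error = Fraction(1, 10**10)    # fully proofread DNA synthesis

# Fraction of the lagging strand that would be primer-derived copy.
primer_fraction = Fraction(primer_len, fragment_len)

# Weighted average error rate if primer copies were kept in the product.
error_if_kept = (primer_fraction * priming_error
                 + (1 - primer_fraction) * normal_error)

fold_increase = error_if_kept / normal_error
print(float(primer_fraction))   # 0.05
print(float(fold_increase))     # roughly 5000-fold
```

Even a small retained fraction dominates the total, because the priming error rate is five orders of magnitude higher than the final replication error rate.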


Figure 5-11 The synthesis of one of many DNA fragments on the 
lagging strand. In eukaryotes, RNA primers are made at intervals spaced 
by about 200 nucleotides on the lagging strand, and each RNA primer is 
approximately 10 nucleotides long. This primer is erased by a special DNA 
repair enzyme (an RNAse H) that recognizes an RNA strand in an RNA/DNA 
helix and fragments it; this leaves gaps that are filled in by DNA polymerase 
and DNA ligase. 
forming the new bond (step 2). In this way, the energetically unfavorable nick-sealing 
reaction is driven by being coupled to the energetically favorable process of ATP 
hydrolysis. 

ahead of the replication fork so that the incoming deoxyribonucleoside 
triphosphates can form base pairs with the template strands. However, the DNA double 
helix is very stable under physiological conditions; the base pairs are locked in 
place so strongly that it requires temperatures approaching that of boiling water to 
separate the two strands in a test tube. For this reason, two additional types of 
replication proteins, DNA helicases and single-strand DNA-binding proteins, are 
needed to open the double helix and provide the appropriate single-strand DNA 
templates for the DNA polymerase to copy. 

DNA helicases were first isolated as proteins that hydrolyze ATP when they are 
bound to single strands of DNA. As described in Chapter 3, the hydrolysis of ATP 
can change the shape of a protein molecule in a cyclical manner that allows the 
protein to perform mechanical work. DNA helicases use this principle to propel 
themselves rapidly along a DNA single strand. When they encounter a region of 
double helix, they continue to move along their strand, thereby prying apart the 
helix at rates of up to 1000 nucleotide pairs per second (Figure 5-13 and Figure 
5-14). 

The two strands of DNA have opposite polarities, and, in principle, a helicase 
could unwind the DNA double helix by moving in the 5'-to-3' direction along one 
strand or in the 3'-to-5' direction along the other. In fact, both types of DNA 
helicase exist. In the best-understood replication systems in bacteria, a helicase 
moving 5' to 3' along the lagging-strand template appears to have the predominant 
role, for reasons that will become clear shortly. 

Single-strand DNA-binding (SSB) proteins, also called helix-destabilizing 
proteins, bind tightly and cooperatively to exposed single-strand DNA without 
covering the bases, which therefore remain available as templates. These proteins 
are unable to open a long DNA helix directly, but they aid helicases by stabiliz- 
ing the unwound, single-strand conformation. In addition, through cooperative 
binding, they coat and straighten out the regions of single-strand DNA, which 
occur routinely in the lagging-strand template, thereby preventing the formation 
of the short hairpin helices that readily form in single-strand DNA (Figure 5-15 
and Figure 5-16). If not removed, these hairpin helices can impede the DNA syn- 
thesis catalyzed by DNA polymerase. 
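
To see why bare single-strand DNA readily forms hairpins, one can search a sequence for short stretches that are followed, a few nucleotides later, by their own reverse complement. The sketch below is a deliberately naive scan: the function names and the `stem` and `min_loop` parameters are hypothetical choices for illustration, and no attempt is made to estimate the stability of the helices found.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_hairpins(seq: str, stem: int = 4, min_loop: int = 3):
    """Return (left_start, right_start) pairs of potential hairpin stems.

    A potential stem is a run of `stem` nucleotides whose reverse
    complement occurs downstream, separated by at least `min_loop`
    unpaired nucleotides.
    """
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        target = reverse_complement(seq[i:i + stem])
        for j in range(i + stem + min_loop, len(seq) - stem + 1):
            if seq[j:j + stem] == target:
                hits.append((i, j))
    return hits


print(find_hairpins("GGGCAAAAGCCC"))   # one stem: GGGC pairs with GCCC
print(find_hairpins("AAAAAAAAAAAA"))   # no self-complementary stretch
```

Chance matches of this kind become common in long random sequences, which is consistent with the need for SSB proteins to keep the lagging-strand template extended.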







A Sliding Ring Holds a Moving DNA Polymerase Onto the DNA 


On their own, most DNA polymerase molecules will synthesize only a short string 
of nucleotides before falling off the DNA template. The tendency to dissociate 
quickly from a DNA molecule allows a DNA polymerase molecule that has just 
finished synthesizing one Okazaki fragment on the lagging strand to be recycled 
quickly, so as to begin the synthesis of the next Okazaki fragment on the same 
strand. This rapid dissociation, however, would make it difficult for the 
polymerase to synthesize the long DNA strands produced at a replication fork were it 
not for an accessory protein (called PCNA in eukaryotes) that functions as a 
regulated sliding clamp. This clamp keeps the polymerase firmly on the DNA when 
it is moving, but releases it as soon as the polymerase runs into a double-strand 
region of DNA. 

Figure 5-13 An assay for DNA helicase enzymes. A short DNA fragment 
is annealed to a long DNA single strand to form a region of DNA double helix. 
The double helix is melted as the helicase runs along the DNA single strand, 
releasing the short DNA fragment in a reaction that requires the presence 
of both the helicase protein and ATP. The rapid stepwise movement of the 
helicase is powered by its ATP hydrolysis (shown schematically in Figure 
3-75A). As indicated, many DNA helicases are composed of six subunits. 

Figure 5-14 The structure of a DNA helicase. (A) Diagram of the protein 
as a hexameric ring drawn to scale with a replication fork. (B) Detailed 
structure of the bacteriophage T7 replicative helicase, as determined by 
x-ray diffraction. Six identical subunits bind and hydrolyze ATP in an ordered 
fashion to propel this molecule, like a rotary engine, along a DNA single 
strand that passes through the central hole. Red indicates bound ATP 
molecules in the structure (Movie 5.2). (PDB code: 1E0J.) 

How can a sliding clamp prevent the polymerase from dissociating without at 
the same time impeding the polymerase’s rapid movement along the DNA mole- 
cule? The three-dimensional structure of the clamp protein, determined by x-ray 
diffraction, revealed it to be a large ring around the DNA double helix. One face 
of the ring binds to the back of the DNA polymerase, and the whole ring slides 
freely along the DNA as the polymerase moves. The assembly of the clamp around 
the DNA requires ATP hydrolysis by a special protein complex, the clamp loader, 
which hydrolyzes ATP as it loads the clamp on to a primer-template junction 
(Figure 5-17). 

On the leading-strand template, the moving DNA polymerase is tightly bound 
to the clamp, and the two remain associated for a very long time. The DNA poly- 
merase on the lagging-strand template also makes use of the clamp, but each 
time the polymerase reaches the 5’ end of the preceding Okazaki fragment, the 
polymerase releases itself from the clamp and dissociates from the template. This 
polymerase molecule then associates with a new clamp that is assembled on the 
RNA primer of the next Okazaki fragment. 

Figure 5-15 The effect of single-strand DNA-binding proteins (SSB proteins) on the structure 
of single-strand DNA. Because each protein molecule prefers to bind next to a previously 
bound molecule, long rows of this protein form on a DNA single strand. This cooperative binding 
straightens out the DNA template and facilitates the DNA polymerization process. The “hairpin 
helices” shown in the bare, single-strand DNA result from a chance matching of short regions of 
complementary nucleotide sequence; they are similar to the short helices that typically form in RNA 
molecules (see Figure 1-6). 

Figure 5-16 Human single-strand binding protein bound to DNA. (A) Front view of the two 
DNA-binding domains of the protein (called RPA) which cover a total of eight nucleotides. Note 
that the DNA bases remain exposed in this protein-DNA complex. (B) Diagram showing the 
three-dimensional structure, with the DNA strand (orange) viewed end-on. (PDB code: 1JMC.) 


Figure 5-17 The regulated sliding clamp that holds DNA polymerase on the DNA. (A) The 
structure of the clamp protein from E. coli, as determined by x-ray crystallography, with a DNA helix 
added to indicate how the protein fits around DNA (Movie 5.3). (B) Schematic illustration showing 
how the clamp (with red and yellow subunits) is loaded onto DNA to serve as a tether for a moving 
DNA polymerase molecule. The structure of the clamp loader (dark green) resembles a screw nut, 
with its threads matching the grooves of double-stranded DNA. The loader binds to a free clamp 
molecule, forcing a gap in its ring of subunits so that this ring is able to slip around DNA. The clamp 
loader, thanks to its screw-nut structure, recognizes the region of DNA that is double-stranded 
and latches onto it, tightening around the complex of a template strand with a freshly synthesized 
elongating (primer) strand. It carries the clamp along this double-stranded region until it encounters 
the 3' end of the primer, at which point the loader hydrolyzes ATP and releases the clamp, allowing 
it to close around the DNA and bind to DNA polymerase. In the simplified reaction shown here, the 
clamp loader dissociates into solution once the clamp has been assembled. At a true replication 
fork, the clamp loader remains close to the polymerase so that, on the lagging strand, it is ready to 
assemble a new clamp at the start of each new Okazaki fragment (see Figure 5-18). (A, from 
X.P. Kong et al., Cell 69:425-437, 1992. With permission from Elsevier; B, adapted from B.A. Kelch 
et al., Science 334:1675-1680, 2011. With permission from AAAS. PDB code: 3BEP.) 


The Proteins at a Replication Fork Cooperate to Form a 
Replication Machine 


Although we have discussed DNA replication as though it were performed by a 
mix of proteins all acting independently, in reality most of the proteins are held 
together in a large and orderly multienzyme complex that rapidly synthesizes 
DNA. This complex can be likened to a tiny sewing machine composed of protein 
parts and powered by nucleoside triphosphate hydrolysis. Like a sewing machine, 
the replication complex probably remains stationary with respect to its immedi- 
ate surroundings; the DNA can be thought of as a long strip of cloth being rapidly 
threaded through it. Although the replication complex has been most intensively 
studied in E. coli and several of its viruses, a very similar complex also operates in 
eukaryotes, as we see below. 

Figure 5-18 summarizes the functions of the subunits of the replication 
machine. At the front of the replication fork, DNA helicase opens the DNA helix. 
Two DNA polymerase molecules work at the fork, one on the leading strand and 
one on the lagging strand. Whereas the DNA polymerase molecule on the leading 
strand can operate in a continuous fashion, the DNA polymerase molecule on the 
lagging strand must restart at short intervals, using a short RNA primer made by 
a DNA primase molecule. The close association of all these protein components 
increases the efficiency of replication and is made possible by a folding back of 
the lagging strand as shown in Figure 5-18A. This arrangement also facilitates the 
loading of the polymerase clamp each time that an Okazaki fragment is synthe- 
sized: the clamp loader and the lagging-strand DNA polymerase molecule are 
kept in place as a part of the protein machine even when they detach from their 
DNA template. The replication proteins are thus linked together into a single large 
unit (total molecular mass >10⁶ daltons), enabling DNA to be synthesized on both 
sides of the replication fork in a coordinated and efficient manner. 

Figure 5-18 A bacterial replication fork. (A) This schematic diagram shows a current view of the arrangement of replication proteins at a replication 
fork when DNA is being synthesized. The lagging-strand DNA is folded to bring the lagging-strand DNA polymerase molecule into a complex with the 
leading-strand DNA polymerase molecule. This folding also brings the 3' end of each completed Okazaki fragment close to the start site for the next 
Okazaki fragment. Because the lagging-strand DNA polymerase molecule remains bound to the rest of the replication proteins, it can be reused to 
synthesize successive Okazaki fragments. In this diagram, it is about to let go of its completed DNA fragment and move to the RNA primer that is just 
being synthesized. Additional proteins (not shown) help to hold the different protein components of the fork together, enabling them to function as a 
well-coordinated protein machine (Movie 5.4 and Movie 5.5). (B) An electron micrograph showing the replication machine from the bacteriophage 
T4 as it moves along a template synthesizing DNA behind it. (C) An interpretation of the micrograph is given in the sketch: note especially the 
DNA loop on the lagging strand. Apparently, the replication proteins became partly detached from the very front of the replication fork during the 
preparation of this sample for electron microscopy. (B, courtesy of Jack Griffith; see P.D. Chastain et al., J. Biol. Chem. 278:21276-21285, 2003.) 

On the lagging strand, the DNA replication machine leaves behind a series of 
unsealed Okazaki fragments, which still contain the RNA that primed their syn- 
thesis at their 5’ ends. As discussed earlier, this RNA is removed and the resulting 
gap is filled in by DNA repair enzymes that operate behind the replication fork 
(see Figure 5-11). 


A Strand-Directed Mismatch Repair System Removes Replication 
Errors That Escape from the Replication Machine 


As stated previously, bacteria such as E. coli are capable of dividing once every 30 
minutes, making it relatively easy to screen large populations to find a rare mutant 
cell that is altered in a specific process. One interesting class of mutants consists of 
those with alterations in so-called mutator genes, which greatly increase the rate 
of spontaneous mutation. Not surprisingly, one such mutant makes a defective 
form of the 3’-to-5’ proofreading exonuclease that is a part of the DNA polymerase 
enzyme (see Figures 5-8 and 5-9). The mutant DNA polymerase no longer proof- 
reads effectively, and many replication errors that would otherwise have been 
removed accumulate in the DNA. 

The study of other E. coli mutants exhibiting abnormally high mutation rates 
has uncovered a proofreading system that removes replication errors made by the 
polymerase that have been missed by the proofreading exonuclease. This strand- 
directed mismatch repair system detects the potential for distortion in the DNA 
helix from the misfit between noncomplementary base pairs. 

If the proofreading system simply recognized a mismatch in newly replicated 
DNA and randomly corrected one of the two mismatched nucleotides, it would 
mistakenly “correct” the original template strand to match the error exactly half 
the time, thereby failing to lower the overall error rate. To be effective, such a proof- 
reading system must be able to distinguish and remove the mismatched nucleo- 
tide only on the newly synthesized strand, where the replication error occurred. 
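
The argument that random correction gains nothing can be checked with a small expected-value calculation. The sketch below rests on an assumption added here for illustration: an unrepaired mismatch is resolved at the next round of replication, when the old strand gives rise to a normal daughter duplex and the new strand to a mutant one. The function name and its parameters are hypothetical.

```python
from fractions import Fraction


def expected_mutant_fraction(p_fix_new_strand, p_fix_old_strand):
    """Expected fraction of mutant daughter duplexes per initial mismatch.

    p_fix_new_strand: chance the erroneous nucleotide on the new strand
                      is replaced (the error is removed).
    p_fix_old_strand: chance the template nucleotide is 'corrected' to
                      match the error (the mutation becomes permanent).
    Any remaining mismatches are left unrepaired; after the next round
    of replication, half of the resulting duplexes carry the mutation.
    """
    p_unrepaired = 1 - p_fix_new_strand - p_fix_old_strand
    return (p_fix_new_strand * 0
            + p_fix_old_strand * 1
            + p_unrepaired * Fraction(1, 2))


half = Fraction(1, 2)
print(expected_mutant_fraction(0, 0))        # no repair at all
print(expected_mutant_fraction(half, half))  # random choice of strand
print(expected_mutant_fraction(1, 0))        # strand-directed repair
```

Under these assumptions, random correction gives the same outcome as no repair, whereas always choosing the new strand eliminates the error.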

The strand-distinction mechanism used by the mismatch proofreading system 
in E. coli depends on the methylation of selected A residues in the DNA. Methyl 
groups are added to all A residues in the sequence GATC, but not until some 
time after the A has been incorporated into a newly synthesized DNA chain. As 
a result, the only GATC sequences that have not yet been methylated are in the 
new strands just behind a replication fork. The recognition of these unmethylated 
GATCs allows the new DNA strands to be transiently distinguished from old ones, 
as required if their mismatches are to be selectively removed. The three-step pro- 
cess involves recognition of a newly synthesized strand, excision of the portion 
containing the mismatch, and resynthesis of the excised segment using the old 
strand as a template. This strand-directed mismatch repair system reduces the 
number of errors made during DNA replication by an additional factor of 100 to 
1000 (see Table 5-1, p. 244). 
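
The strand-distinction logic can be sketched in a few lines. This is a schematic illustration, not a description of the actual enzymes: the `Strand` class, the function names, and the toy sequences are assumptions introduced here, mismatch detection itself is not modeled, and the whole new strand is resynthesized rather than only the excised portion.

```python
from dataclasses import dataclass, field

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq))


@dataclass
class Strand:
    seq: str                                         # written 5'-to-3'
    methylated_a: set = field(default_factory=set)   # indices of methylated A


def gatc_adenines(seq: str) -> list:
    """Indices of the A in every GATC site."""
    return [i + 1 for i in range(len(seq) - 3) if seq[i:i + 4] == "GATC"]


def has_unmethylated_gatc(strand: Strand) -> bool:
    return any(i not in strand.methylated_a for i in gatc_adenines(strand.seq))


def choose_new_strand(a: Strand, b: Strand):
    """Step 1: the strand with unmethylated GATC is taken to be the new one."""
    flags = (has_unmethylated_gatc(a), has_unmethylated_gatc(b))
    if flags == (True, False):
        return a
    if flags == (False, True):
        return b
    return None  # no usable signal: both or neither strand is methylated


def mismatch_repair(a: Strand, b: Strand) -> bool:
    """Steps 2 and 3: excise the new strand and recopy it from the old one."""
    new = choose_new_strand(a, b)
    if new is None:
        return False
    old = b if new is a else a
    new.seq = reverse_complement(old.seq)
    return True


old = Strand("AAGATCTTGC", methylated_a={3})
new = Strand("GCAAGATCTG")          # last base mismatched; not yet methylated
print(mismatch_repair(old, new), new.seq)
```

Once the new strand has itself been methylated, the two strands can no longer be told apart, which is why the text describes the distinction as transient.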

A similar mismatch proofreading system functions in eukaryotic cells but uses 
a different strategy to distinguish the new strand from the old (Figure 5-19). Newly 
synthesized lagging-strand DNA transiently contains nicks (before they are sealed 
by DNA ligase) and such nicks (also called single-strand breaks) provide the sig- 
nal that directs the mismatch proofreading system to the appropriate strand. This 
strategy also requires that the newly synthesized DNA on the leading strand be 
transiently nicked; how this occurs is uncertain. 

The importance of mismatch proofreading in humans is seen in individuals 
who inherit one defective copy of a mismatch repair gene (along with a functional 
gene on the other copy of the chromosome). These people have a marked predis- 
position for certain types of cancers. For example, in a type of colon cancer called 
hereditary nonpolyposis colon cancer (HNPCC), spontaneous mutation of the one 
functional gene produces a clone of somatic cells that, because they are deficient 
in mismatch proofreading, accumulate mutations unusually rapidly. Most 
cancers arise in cells that have accumulated multiple mutations (see pp. 1096-1097), 
and cells deficient in mismatch proofreading therefore have a greatly enhanced 
chance of becoming cancerous. Fortunately, most of us inherit two good 
copies of each gene that encodes a mismatch proofreading protein; this protects us, 
because it is highly unlikely for both copies to become mutated in the same cell. 


DNA Topoisomerases Prevent DNA Tangling During Replication 


As a replication fork moves along double-strand DNA, it creates what has been 
called the “winding problem.” The two parental strands, which are wound around 
each other, must be unwound and separated for replication to occur. For every 10 
nucleotide pairs replicated at the fork, one complete turn of the parental double 
helix must be unwound. In principle, this unwinding can be achieved by rapidly 
rotating the entire chromosome ahead of a moving fork; however, this is energet- 
ically highly unfavorable (particularly for long chromosomes) and, instead, the 
DNA in front of a replication fork becomes overwound (Figure 5-20). The over- 
winding, in turn, is continually relieved by proteins known as DNA topoisomer- 
ases. 
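
The scale of the winding problem follows directly from the helix geometry. The short calculation below uses the figure of 10 nucleotide pairs per helical turn given above; the function name is a hypothetical choice, and the fork speeds are taken from values quoted elsewhere in this chapter.

```python
NUCLEOTIDE_PAIRS_PER_TURN = 10  # one full turn unwound per 10 pairs replicated


def revolutions_per_second(fork_speed_pairs_per_s: float) -> float:
    """Rotation rate needed ahead of a fork moving at the given speed."""
    return fork_speed_pairs_per_s / NUCLEOTIDE_PAIRS_PER_TURN


print(revolutions_per_second(500))    # bacterial fork at 500 pairs per second
print(revolutions_per_second(1000))   # upper helicase rate of 1000 pairs per second
```

Rotation at tens of revolutions per second over the whole length of a chromosome is what makes a local swivel, rather than rotation of the entire molecule, necessary.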

A DNA topoisomerase can be viewed as a reversible nuclease that adds itself 
covalently to a DNA backbone phosphate, thereby breaking a phosphodiester 
bond in a DNA strand. This reaction is reversible, and the phosphodiester bond 
re-forms as the protein leaves. 

One type of topoisomerase, called topoisomerase I, produces a transient sin- 
gle-strand break; this break in the phosphodiester backbone allows the two sec- 
tions of DNA helix on either side of the nick to rotate freely relative to each other, 
using the phosphodiester bond in the strand opposite the nick as a swivel point 
(Figure 5-21). Any tension in the DNA helix will drive this rotation in the direction 
that relieves the tension. As a result, DNA replication can occur with the rotation 
of only a short length of helix—the part just ahead of the fork. Because the cova- 
lent linkage that joins the DNA topoisomerase protein to a DNA phosphate retains 
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Figure 5-19 Strand-directed mismatch 
repair. (A) The two proteins shown are 
present in both bacteria and eukaryotic 
cells: MutS binds specifically to a 
mismatched base pair, while MutL scans 
the nearby DNA for a nick. Once MutL 
finds a nick, it triggers the degradation of 
the nicked strand all the way back through 
the mismatch. Because nicks are largely 
confined to newly replicated strands in 
eukaryotes, replication errors are selectively 
removed. In bacteria, an additional protein 
in the complex (MutH) nicks unmethylated 
(and therefore newly replicated) GATC 
sequences, thereby beginning the process 
illustrated here. In eukaryotes, MutL 
contains a DNA nicking activity that aids in 
the removal of the damaged strand. 

(B) The structure of the MutS protein bound 
to a DNA mismatch. This protein is a 

dimer, which grips the DNA double helix as 
shown, kinking the DNA at the mismatched 
base pair. It seems that the MutS protein 
scans the DNA for mismatches by testing 
for sites that can be readily kinked, which 
are those with an abnormal base pair. 

(PDB code: 1EWQ.) 


Figure 5-20 The “winding problem” 
that arises during DNA replication. 
(A) For a bacterial replication fork moving at 
500 nucleotides per second, the parental 
DNA helix ahead of the fork must rotate 
at 50 revolutions per second. (B) If the 
ends of the DNA double helix remain fixed 
(or difficult to rotate), tension builds up in 
front of the replication fork as it becomes 
overwound. Some of this tension can be 
taken up by supercoiling, whereby the DNA 
double helix twists around itself (see Figure 
6-19). However, if the tension continues to 
build up, the replication fork will eventually 
stop because further unwinding requires 
more energy than the helicase can provide. 
Note that in (A), the dotted line represents 
about 20 turns of DNA. 
the energy of the cleaved phosphodiester bond, resealing is rapid and does not 
require additional energy input. In this respect, the rejoining mechanism differs 
from that catalyzed by the enzyme DNA ligase, discussed previously (see Figure 
5-12). 

A second type of DNA topoisomerase, topoisomerase II, forms a covalent 
linkage to both strands of the DNA helix at the same time, making a transient 


Figure 5-21 The reversible DNA nicking 
reaction catalyzed by a eukaryotic DNA 
topoisomerase I enzyme. As indicated, 
these enzymes transiently form a single 
covalent bond with DNA; this allows free 
rotation of the DNA around the covalent 
backbone bonds linked to the blue 
phosphate. 
Figure 5-22 The DNA-helix-passing reaction catalyzed by DNA 
topoisomerase II. Unlike type I topoisomerases, type II enzymes hydrolyze 
ATP (red), which is needed to release and reset the enzyme after each cycle. 
Type II topoisomerases are largely confined to proliferating cells in eukaryotes; 
partly for that reason, they have been effective targets for anticancer drugs. 
Some of these drugs inhibit topoisomerase II at the third step in the figure 
and thereby produce high levels of double-strand breaks that kill rapidly 
dividing cells. The small yellow circles represent the phosphates in the DNA 
backbone that become covalently bonded to the topoisomerase (see 
Figure 5-21). 


double-strand break in the helix. These enzymes are activated by sites on 
chromosomes where two double helices cross over each other, such as those generated by 
supercoiling in front of a replication fork (see Figure 5-20). Once a topoisomerase 
II molecule binds to such a crossing site, the protein uses ATP hydrolysis to per- 
form the following set of reactions efficiently: (1) it breaks one double helix revers- 
ibly to create a DNA “gate”; (2) it causes the second, nearby double helix to pass 
through this opening; and (3) it then reseals the break and dissociates from the 
DNA. At crossover points generated by supercoiling, passage of the double helix 
through the gate occurs in the direction that will reduce supercoiling. In this way, 
type II topoisomerases can relieve the overwinding tension generated in front of 
a replication fork. Their reaction mechanism also allows type II DNA topoisomer- 
ases to efficiently separate two interlocked DNA circles (Figure 5-22). 

Topoisomerase II also prevents the severe DNA tangling problems that would 
otherwise arise during DNA replication. This role is nicely illustrated by mutant 
yeast cells that produce, in place of the normal topoisomerase II, a version that is 
inactive above 37°C. When the mutant cells are warmed to this temperature, their 
daughter chromosomes remain intertwined after DNA replication and are unable 
to separate. The enormous usefulness of topoisomerase II for untangling chromo- 
somes can readily be appreciated by anyone who has struggled to remove a tangle 
from a fishing line without the aid of scissors. 


DNA Replication Is Fundamentally Similar in Eukaryotes and 
Bacteria 


Much of what we know about DNA replication was first derived from studies 
of purified bacterial and bacteriophage multienzyme systems capable of DNA 
replication in vitro. The development of these systems in the 1970s was greatly 
facilitated by the prior isolation of mutants in a variety of replication genes; these 
mutants were exploited to identify and purify the corresponding replication pro- 
teins. The first mammalian replication system that accurately replicated DNA in 
vitro was described in the mid-1980s, and mutations in genes encoding nearly all 
of the replication components have now been isolated and analyzed in the yeast 
Saccharomyces cerevisiae. As a result, much is known about the detailed enzymol- 
ogy of DNA replication in eukaryotes, and it is clear that the fundamental features 
of DNA replication—including replication-fork geometry and the use of a multi- 
protein replication machine—have been conserved during the long evolutionary 
process that separated bacteria from eukaryotes. 

There are more protein components in eukaryotic replication machines than 
there are in the bacterial analogs, even though the basic functions are the same. 
Thus, for example, the eukaryotic single-strand binding (SSB) protein is formed 
from three subunits, whereas only a single subunit is found in bacteria. Similarly, 
the eukaryotic DNA primase is incorporated into a multisubunit enzyme that also 
contains a polymerase called DNA polymerase α-primase. This protein complex 
begins each Okazaki fragment on the lagging strand with RNA and then extends 
the RNA primer with a short length of DNA. At this point, the two main eukaryotic 
replicative DNA polymerases, Polδ and Polε, come into play: Polδ completes 
each Okazaki fragment on the lagging strand and Polε extends the leading strand. 
The increased complexity of eukaryotic replication machinery probably reflects 
more elaborate controls. For example, the orderly maintenance of different cell 
types and tissues in animals and plants requires that DNA replication be tightly 
regulated. Moreover, eukaryotic DNA replication must be coordinated with the 
elaborate process of mitosis, as we discuss in Chapter 17. 

As we see in the next section, the eukaryotic replication machinery has the 
added complication of having to replicate through nucleosomes, the repeating 
structural unit of chromosomes discussed in Chapter 4. Nucleosomes are spaced 
at intervals of about 200 nucleotide pairs along the DNA, which, as we will see, 
explains why new Okazaki fragments are synthesized on the lagging strand at 
intervals of 100-200 nucleotides in eukaryotes, instead of 1000-2000 nucleotides 
as in bacteria. Nucleosomes may also act as barriers that slow down the move- 
ment of DNA polymerase molecules, which may be why eukaryotic replication 
forks move only about one-tenth as fast as bacterial replication forks. 


Summary 


DNA replication takes place at a Y-shaped structure called a replication fork. A 
self-correcting DNA polymerase enzyme catalyzes nucleotide polymerization in a 
5'-to-3' direction, copying a DNA template strand with remarkable fidelity. Since 
the two strands of a DNA double helix are antiparallel, this 5'-to-3' DNA synthesis 
can take place continuously on only one of the strands at a replication fork (the 
leading strand). On the lagging strand, short DNA fragments must be made by a 
“backstitching” process. Because the self-correcting DNA polymerase cannot start 
a new chain, these lagging-strand DNA fragments are primed by short RNA primer 
molecules that are subsequently erased and replaced with DNA. 

DNA replication requires the cooperation of many proteins. These include (1) 
DNA polymerase and DNA primase to catalyze nucleoside triphosphate polymer- 
ization; (2) DNA helicases and single-strand DNA-binding (SSB) proteins to help in 
opening up the DNA helix so that it can be copied; (3) DNA ligase and an enzyme 
that degrades RNA primers to seal together the discontinuously synthesized lagging- 
strand DNA fragments; and (4) DNA topoisomerases to help to relieve helical wind- 
ing and DNA tangling problems. Many of these proteins associate with each other 
at a replication fork to form a highly efficient “replication machine,” through which 
the activities and spatial movements of the individual components are coordinated. 


THE INITIATION AND COMPLETION OF DNA 
REPLICATION IN CHROMOSOMES 


We have seen how a set of replication proteins rapidly and accurately generates 
two daughter DNA double helices behind a replication fork. But how is this rep- 
lication machinery assembled in the first place, and how are replication forks 
created on an intact, double-strand DNA molecule? In this section, we discuss 
how cells initiate DNA replication and how they carefully regulate this process to 
ensure that it takes place not only at the proper positions on the chromosome but 
also at the appropriate time in the life of the cell. We also discuss a few of the spe- 
cial problems that the replication machinery in eukaryotic cells must overcome. 
These include the need to replicate the enormously long DNA molecules found in 
eukaryotic chromosomes, as well as the difficulty of copying DNA molecules that 
are tightly complexed with histones in nucleosomes. 


DNA Synthesis Begins at Replication Origins 


As discussed previously, the DNA double helix is normally very stable: the two 
DNA strands are locked together firmly by many hydrogen bonds formed between 
the bases on each strand. To begin DNA replication, the double helix must first 
be opened up and the two strands separated to expose unpaired bases. As we 
shall see, the process of DNA replication is begun by special initiator proteins that 
bind to double-strand DNA and pry the two strands apart, breaking the hydrogen 
bonds between the bases. 
Figure 5-23 A replication bubble formed 
by replication-fork initiation. This diagram 
outlines the major steps in the initiation of 
replication forks at replication origins. The 
structure formed at the last step, in which 
both strands of the parental DNA helix have 
been separated from each other and serve 
as templates for DNA synthesis, is called a 
replication bubble. 
Figure 5-24 DNA replication of a bacterial genome. It takes E. coli 
about 30 minutes to duplicate its genome of 4.6 × 10⁶ nucleotide pairs. For 
simplicity, no Okazaki fragments are shown on the lagging strand. What 
happens as the two replication forks approach each other and collide at the 
end of the replication cycle is not well understood, although the replication 
machines are disassembled as part of the process. 


The positions at which the DNA helix is first opened are called replication ori- 
gins (Figure 5-23). In simple cells like those of bacteria or yeast, origins are spec- 
ified by DNA sequences several hundred nucleotide pairs in length. This DNA 
contains both short sequences that attract initiator proteins and stretches of DNA 
that are especially easy to open. We saw in Figure 4-4 that an A-T base pair is held 
together by fewer hydrogen bonds than a G-C base pair. Therefore, DNA rich in 
A-T base pairs is relatively easy to pull apart, and regions of DNA enriched in A-T 
base pairs are typically found at replication origins. 

Although the basic process of replication-fork initiation depicted in Figure 
5-23 is fundamentally the same for bacteria and eukaryotes, the detailed way in 
which this process is performed and regulated differs between these two groups 
of organisms. We first consider the simpler and better-understood case in bacte- 
ria and then turn to the more complex situation found in yeasts, mammals, and 
other eukaryotes. 


Bacterial Chromosomes Typically Have a Single Origin of DNA 
Replication 


The genome of E. coli is contained in a single circular DNA molecule of 4.6 × 10⁶ 
nucleotide pairs. DNA replication begins at a single origin of replication, and the 
two replication forks assembled there proceed (at approximately 1000 nucleotides 
per second) in opposite directions until they meet up roughly halfway around the 
chromosome (Figure 5-24). The only point at which E. coli can control DNA rep- 
lication is initiation: once the forks have been assembled at the origin, they syn- 
thesize DNA at relatively constant speed until replication is finished. Therefore, 
it is not surprising that the initiation of DNA replication is highly regulated. The 
process begins when initiator proteins (in their ATP-bound state) bind in multiple 
copies to specific DNA sites located at the replication origin, wrapping the DNA 
around the proteins to form a large protein-DNA complex that destabilizes the 
adjacent double helix. This complex then attracts two DNA helicases, each bound 
to a helicase loader, and these are placed around adjacent DNA single strands 
whose bases have been exposed by the assembly of the initiator protein-DNA 
complex. The helicase loader is analogous to the clamp loader we encountered 
above; it has the additional job of keeping the helicase in an inactive form until it 
is properly loaded onto a nascent replication fork. Once the helicases are loaded, 
the loaders dissociate and the helicases begin to unwind DNA, exposing enough 
single-strand DNA for DNA primase to synthesize the first RNA primers (Figure 
5-25). This quickly leads to the assembly of remaining proteins to create two rep- 
lication forks, with replication machines that move, with respect to the replication 
origin, in opposite directions. They continue to synthesize DNA until all of the 
DNA template downstream of each fork has been replicated. 

In E. coli, the interaction of the initiator protein with the replication origin is 
carefully regulated, with initiation occurring only when sufficient nutrients are 
available for the bacterium to complete an entire round of replication. Initiation is 
also controlled to ensure that only one round of DNA replication occurs for each 
cell division. After replication is initiated, the initiator protein is inactivated by 
hydrolysis of its bound ATP molecule, and the origin of replication experiences a 
“refractory period.” The refractory period is caused by a delay in the methylation 
of newly incorporated A nucleotides in the origin (Figure 5-26). Initiation cannot 
occur again until the A’s are methylated and the initiator protein is restored to its 
ATP-bound state. 
Eukaryotic Chromosomes Contain Multiple Origins of Replication 


We have seen how two replication forks begin at a single replication origin in bac- 
teria and proceed in opposite directions, moving away from the origin until all of 
the DNA in the single circular chromosome is replicated. The bacterial genome is 
sufficiently small for these two replication forks to duplicate the genome in about 
30 minutes. Because of the much greater size of most eukaryotic chromosomes, a 
different strategy is required to allow their replication in a timely manner. 

A method for determining the general pattern of eukaryotic chromosome 
replication was developed in the early 1960s. Human cells growing in culture are 
labeled for a short time with ³H-thymidine so that the DNA synthesized during 
this period becomes highly radioactive. The cells are then gently lysed, and the 
DNA is stretched on the surface of a glass slide coated with a photographic emul- 
sion. Development of the emulsion reveals the pattern of labeled DNA through 
a technique known as autoradiography. The time allotted for radioactive label- 
ing is chosen to allow each replication fork to move several micrometers along 
the DNA, so that the replicated DNA can be detected in the light microscope as 
lines of silver grains, even though the DNA molecule itself is too thin to be visible. 


Figure 5-25 The proteins that initiate 
DNA replication in bacteria. The 
mechanism shown was established 
by studies in vitro with mixtures of 
highly purified proteins. For E. coli DNA 
replication, the major initiator protein, the 
helicase, and the primase are the dnaA, 
dnaB, and dnaG proteins, respectively. 
In the first step, several molecules of 
the initiator protein bind to specific DNA 
sequences at the replication origin and 
destabilize the double helix by forming a 
compact structure in which the DNA is 
tightly wrapped around the protein. Next, 
two helicases are brought in by helicase- 
loading proteins (the dnaC proteins), 
which inhibit the helicases until they are 
properly loaded at the replication origin. 
Helicase-loading proteins prevent the 
replicative DNA helicases from inappropriately 
entering other single-strand stretches of 
DNA in the bacterial genome. Aided by 
single-strand binding protein (not shown), 
the loaded helicases open up the DNA, 
thereby enabling primases to enter and 
synthesize initial primers. In subsequent 
steps, two complete replication forks are 
assembled at the origin and move off in 
opposite directions. The initiator proteins 
are displaced as the left-hand fork moves 
through them (not shown). 
In this way, both the rate and the direction of replication-fork movement can 
be determined (Figure 5-27). From the rate at which tracks of replicated DNA 
increase in length with increasing labeling time, the eukaryotic replication forks 
are estimated to travel at about 50 nucleotides per second. This is approximately 
twentyfold slower than the rate at which bacterial replication forks move, possibly 
reflecting the increased difficulty of replicating DNA that is packaged tightly in 
chromatin. 

An average-size human chromosome contains a single linear DNA molecule 
of about 150 million nucleotide pairs. It would take 0.02 seconds/nucleotide × 150 
× 10⁶ nucleotides = 3.0 × 10⁶ seconds (about 35 days) to replicate such a DNA 
molecule from end to end with a single replication fork moving at a rate of 50 
nucleotides per second. As expected, therefore, the autoradiographic experiments just 
described reveal that many forks, belonging to separate replication bubbles, are 
moving simultaneously on each eukaryotic chromosome. 
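
The arithmetic above can be checked with a short calculation. The sketch below is illustrative only: the function and variable names are hypothetical, and the 8-hour S phase used for the final estimate is the typical mammalian figure quoted later in this chapter, not a measurement for any particular cell.

```python
# Back-of-the-envelope check of the single-fork replication time.
# All names are illustrative; the numbers are those quoted in the text.

FORK_RATE_NT_PER_S = 50        # eukaryotic fork speed estimated by autoradiography
CHROMOSOME_LENGTH_NT = 150e6   # average-size human chromosome
SECONDS_PER_DAY = 24 * 60 * 60


def single_fork_replication_time(length_nt, rate_nt_per_s):
    """Seconds needed for one fork to copy the molecule from end to end."""
    return length_nt / rate_nt_per_s


seconds = single_fork_replication_time(CHROMOSOME_LENGTH_NT, FORK_RATE_NT_PER_S)
days = seconds / SECONDS_PER_DAY

# Assumption: S phase lasts about 8 hours and every fork works throughout it.
S_PHASE_SECONDS = 8 * 60 * 60
min_forks = seconds / S_PHASE_SECONDS

print(f"{seconds:.1e} seconds, about {days:.0f} days")
print(f"at least {min_forks:.0f} forks needed to finish within S phase")
```

Under these assumptions, roughly a hundred forks per chromosome is only a lower bound, because in practice no fork is active for the whole of S phase.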

Much faster and more sophisticated methods now exist for monitoring DNA 
replication initiation and tracking the movement of DNA replication forks across 
whole genomes. One approach uses DNA microarrays—grids the size of a post- 
age stamp studded with hundreds of thousands of fragments of known DNA 
sequence. As we will see in detail in Chapter 8, each different DNA fragment is 
placed at a unique position on the microarray, and whole genomes can thereby 
be represented in an orderly manner. If a DNA sample from a group of replicating 
cells is broken up and hybridized to a microarray representing that organism’s 
genome, the amount of each DNA sequence can be determined. Because a seg- 
ment of a genome that has been replicated will contain twice as much DNA as 
an unreplicated segment, replication-fork initiation and fork movement can be 
accurately monitored across an entire genome (Figure 5-28). 
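
The logic of this copy-number measurement can be sketched in a few lines of Python. This is a toy illustration only: the signal values are invented, the function names and the threshold of 1.5 are assumptions, and real microarray data are noisy and require normalization that is not shown here.

```python
# Toy model of a copy-number readout from a microarray time course.
# Signals are normalized so that unreplicated DNA reads 1.0 and
# replicated DNA reads 2.0 (hypothetical, noise-free values).

def replicated_positions(signals, threshold=1.5):
    """Return the indices of array spots whose signal indicates replicated DNA."""
    return [i for i, s in enumerate(signals) if s >= threshold]


def infer_origin(time_course, threshold=1.5):
    """Return the spot that is replicated first, or None if nothing replicated.

    time_course is a list of signal lists ordered from early to late.
    If several spots are already replicated in the earliest informative
    sample, the middle one is taken as the best guess for the origin.
    """
    for signals in time_course:
        hits = replicated_positions(signals, threshold)
        if hits:
            return hits[len(hits) // 2]
    return None


# Hypothetical data: nine consecutive chromosome segments, three time points.
time_course = [
    [1.0, 1.0, 1.0, 1.0, 2.0, 1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 1.0, 1.0, 1.0],
    [1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 1.0],
]

origin = infer_origin(time_course)
spans = [replicated_positions(s) for s in time_course]
print("inferred origin at segment", origin)
print("replicated segments over time:", spans)
```

In this invented data set the replicated region widens symmetrically around segment 4, which is the pattern expected for two forks moving away from a single origin.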

Experiments of this type have shown the following: (1) Approximately 30,000- 
50,000 origins of replication are used each time a human cell divides. (2) The 
human genome has many more (perhaps tenfold more) potential origins than 
this, and different cell types use different sets of origins. This may allow a cell to 
coordinate its active origins with other features of its chromosomes such as which 
Figure 5-26 Methylation of the E. coli 
replication origin creates a refractory 
period for DNA initiation. DNA 
methylation occurs at GATC sequences, 
11 of which are found in the origin of 
replication (spanning approximately 250 
nucleotide pairs). In its hemimethylated 
state, the origin of replication is bound 
by an inhibitor protein (SeqA, not 
shown), which blocks the ability of the 
initiator proteins to unwind the origin 
DNA. Eventually (about 15 minutes after 
replication is initiated), the hemimethylated 
origins become fully methylated by a DNA 
methylase enzyme; SeqA then dissociates. 
A single enzyme, the Dam methylase, 
is responsible for methylating all E. coli 
GATC sequences. A lag in methylation after 
the replication of GATC sequences is also 
used by the E. coli mismatch proofreading 
system to distinguish the newly synthesized 
DNA strand from the parental DNA strand; 
in that case, the relevant GATC sequences 
are scattered throughout the chromosome, 
and they are not bound by SeqA. 


Figure 5-27 The experiments that 
demonstrated the pattern in which 
replication forks are formed and move 
on eukaryotic chromosomes. The new 
DNA made in human cells in culture 
was labeled briefly with a pulse of highly 
radioactive thymidine (³H-thymidine). 
(A) In this experiment, the cells were 
lysed, and the DNA was stretched out 
on a glass slide that was subsequently 
covered with a photographic emulsion. 
After several months, the emulsion was 
developed, revealing a line of silver grains 
over the radioactive DNA. The brown DNA 
in this figure is shown only to help with 
the interpretation of the autoradiograph; 
the unlabeled DNA is invisible in such 
experiments. (B) This experiment was the 
same except that a further incubation in 
unlabeled medium allowed additional DNA, 
with a lower level of radioactivity, to be 
replicated. The pairs of dark tracks in (B) 
were found to have silver grains tapering 
off in opposite directions, demonstrating 
bidirectional fork movement from a central 
replication origin where a replication bubble 
forms (see Figure 5-23). A replication fork 
is thought to stop only when it encounters 
a replication fork moving in the opposite 
direction or when it reaches the end of the 
chromosome; in this way, all the DNA is 
eventually replicated. 
genes are being expressed. The excess origins also provide “backups” in case a 
primary origin fails. (3) As in bacteria, replication forks are formed in pairs and create 
a replication bubble as they move in opposite directions away from a common 
point of origin, stopping only when they collide head-on with a replication fork 
moving in the opposite direction or when they reach a chromosome end. In this 
way, many replication forks operate independently on each chromosome and yet 
form two complete daughter DNA helices. 


In Eukaryotes, DNA Replication Takes Place During Only One Part 
of the Cell Cycle 


When growing rapidly, bacteria replicate their DNA nearly continuously. In con- 
trast, DNA replication in most eukaryotic cells occurs only during a specific part of 
the cell-division cycle, called the DNA synthesis phase or S phase (Figure 5-29). In 
a mammalian cell, the S phase typically lasts for about 8 hours; in simpler eukary- 
otic cells such as yeasts, the S phase can be as short as 40 minutes. By its end, each 
chromosome has been replicated to produce two complete copies, which remain 
joined together at their centromeres until the M phase (M for mitosis), which soon 
follows. In Chapter 17, we describe the control system that runs the cell cycle, and 
we explain why entry into each phase of the cycle requires the cell to have suc- 
cessfully completed the previous phase. 

In the following sections, we explore how chromosome replication is coordi- 
nated within the S phase of the cell cycle. 


Different Regions on the Same Chromosome Replicate at Distinct 
Times in S Phase 


In mammalian cells, the replication of DNA in the region between one replica- 
tion origin and the next should normally require only about an hour to complete, 
given the rate at which a replication fork moves and the largest distances mea- 
sured between replication origins. Yet S phase usually lasts for about 8 hours in 
a mammalian cell. This implies that the replication origins are not all activated 
simultaneously; indeed, replication origins are activated in clusters of about 50 
adjacent replication origins, each of which is replicated during only a small part of 
the total S-phase interval. 


Figure 5-28 Use of DNA microarrays 
to monitor the formation and progress 
of replication forks. For this experiment, 
a population of cells is synchronized so 
that they all begin replication at the same 
time. DNA is collected and hybridized 
to the microarray; DNA that has been 
replicated once gives a hybridization 
signal (dark green squares) twice as high 
as that of unreplicated DNA (light green 
squares). The spots on these microarrays 
represent consecutive sequences along a 
segment of a chromosome arranged left 
to right, top to bottom. Only 81 spots are 
shown here, but the actual arrays contain 
hundreds of thousands of sequences 
that span an entire genome. As can be 
seen, replication begins at an origin and 
proceeds bidirectionally. For simplicity, only 
one origin is shown here. In human cells, 
replication begins at 30,000-50,000 origins 
located throughout the genome. Using 
this approach it is possible to observe the 
formation and progress of every replication 
fork across a genome. 
Figure 5-29 The four successive phases 
of a standard eukaryotic cell cycle. 
During the G1, S, and G2 phases, the 
cell grows continuously. During M phase 
growth stops, the nucleus divides, and 
the cell divides in two. DNA replication is 
confined to the part of the cell cycle known 
as S phase. G1 is the gap between 
M phase and S phase; G2 is the gap 
between S phase and M phase. 
It seems that the order in which replication origins are activated depends, in 
part, on the chromatin structure in which the origins reside. We saw in Chap- 
ter 4 that heterochromatin is a particularly condensed state of chromatin, while 
euchromatin, where most transcription occurs, has a less condensed conforma- 
tion. Heterochromatin tends to be replicated very late in S phase, suggesting that 
the timing of replication is related to the packing of the DNA in chromatin. 

Once initiated, however, replication forks seem to move at comparable rates 
throughout S phase, so the extent of chromosome condensation seems to influ- 
ence the time at which replication forks are initiated, rather than their speed once 
formed. 


A Large Multisubunit Complex Binds to Eukaryotic Origins of 
Replication 


Having seen that a eukaryotic chromosome is replicated using many origins of 
replication, each of which “fires” at a characteristic time in S phase of the cell 
cycle, we turn to the nature of these origins of replication. We saw earlier in this 
chapter that replication origins have been precisely defined in bacteria as specific 
DNA sequences that attract initiator proteins, which then assemble the DNA rep- 
lication machinery. We shall see that this is the case for the single-cell budding 
yeast S. cerevisiae, but it appears not to be strictly true for most other eukaryotes. 

For budding yeast, the location of every origin of replication on each chromo- 
some has been determined. The particular chromosome shown in Figure 5-30— 
chromosome III from S. cerevisiae—is one of the smallest chromosomes known, 
with a length less than 1/100 that of a typical human chromosome. Its major ori- 
gins are spaced an average of 30,000 nucleotide pairs apart, but only a subset of 
these origins is used by a given cell. Nonetheless, this chromosome can be repli- 
cated in about 15 minutes. 

The minimal DNA sequence required for directing DNA replication initiation 
in S. cerevisiae has been determined by taking a segment of DNA that spans an 
origin of replication and testing smaller and smaller DNA fragments for their abil- 
ity to function as origins. Most DNA sequences that can serve as an origin of rep- 
lication are found to contain (1) a binding site for a large, multisubunit initiator 
protein called ORC, for origin recognition complex; (2) a stretch of DNA that is 
rich in As and Ts and therefore easy to melt; and (3) at least one binding site for 
proteins that facilitate ORC binding, probably by adjusting chromatin structure. 

In bacteria, once the initiator protein is properly bound to the single origin 
of replication, the assembly of the replication forks seems to follow more or less 
automatically. In eukaryotes, the situation is significantly different because of a 
profound problem eukaryotes have in replicating chromosomes: with so many 
places to begin replication, how is the process regulated to ensure that all the DNA 
is copied once and only once? 

The answer lies in the sequential manner in which the replicative helicase is 
first loaded onto origins and is then activated to initiate DNA replication. This 
matter is discussed in detail in Chapter 17, where we consider the machinery that 
underlies the cell-division cycle. In brief, during G1 phase, the replicative 
helicases are loaded onto DNA next to ORC to create a prereplicative complex. Then, 
upon passage from G1 phase to S phase, specialized protein kinases come into 
play to activate the helicases. The resulting opening of the double helix allows the 
loading of the remaining replication proteins, including the DNA polymerases. 

Figure 5-30 The origins of DNA 
replication on chromosome III of the 
yeast S. cerevisiae. This chromosome, 
one of the smallest eukaryotic 
chromosomes known, carries a total of 
180 genes. As indicated, it contains 18 
replication origins, although they are used 
with different frequencies. Those in red 
are typically used in less than 10% of cell 
divisions, while those in green are used 
about 90% of the time. 
[Diagram: chromosome III, showing the two telomeres, the centromere, and the origins of replication along a scale of 0 to 300 thousand nucleotide pairs.] 

The protein kinases that trigger DNA replication simultaneously prevent 
assembly of new prereplicative complexes until the next M phase resets the entire 
cycle (for details, see pp. 974-975). They do this, in part, by phosphorylating ORC, 
rendering it unable to accept new helicases. This strategy provides a single 
window of opportunity for prereplicative complexes to form (G1 phase, when kinase 
activity is low) and a second window for them to be activated and subsequently 
disassembled (S phase, when kinase activity is high). Because these two phases of 
the cell cycle are mutually exclusive and occur in a prescribed order, each origin 
of replication can fire once and only once during each cell cycle. 
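
The two-window logic described above can be mimicked with a toy state model. The sketch below is not a description of the real biochemistry: the class, its method names, and the single Boolean standing in for kinase activity are all hypothetical simplifications, chosen only to show why separating helicase loading from helicase activation prevents re-replication.

```python
# Toy model of origin licensing; every name here is hypothetical.

class Origin:
    def __init__(self):
        self.licensed = False   # is a prereplicative complex present?
        self.fired_count = 0    # how many times this origin has initiated

    def try_load_helicase(self, kinase_high):
        """A prereplicative complex can form only when kinase activity is low."""
        if not kinase_high:
            self.licensed = True

    def try_fire(self, kinase_high):
        """Firing needs both a loaded helicase and high kinase activity."""
        if kinase_high and self.licensed:
            self.fired_count += 1
            self.licensed = False   # the complex is disassembled after activation


def run_cell_cycle(origin):
    # G1 phase: kinase activity is low, so loading is allowed but firing is not.
    origin.try_load_helicase(kinase_high=False)
    origin.try_fire(kinase_high=False)
    # S phase: kinase activity is high; repeated attempts cannot re-license.
    for _ in range(3):
        origin.try_fire(kinase_high=True)
        origin.try_load_helicase(kinase_high=True)


origin = Origin()
run_cell_cycle(origin)
print("firings in one cell cycle:", origin.fired_count)
```

However many firing attempts are made during the simulated S phase, the origin initiates once, because loading and activation never overlap in time.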


Features of the Human Genome That Specify Origins of 
Replication Remain to Be Discovered 


Compared with the situation in budding yeast, the determinants of replication 
origins in other eukaryotes have been difficult to discover. It has been possible to 
identify specific human DNA sequences, each several thousand nucleotide pairs 
in length, that are sufficient to serve as replication origins. These origins continue 
to function when moved to a different chromosomal region by recombinant DNA 
methods, as long as they are placed in a region where the chromatin is relatively 
uncondensed. However, comparisons of such DNA sequences have not revealed 
specific DNA sequences that mark origins of replication. 


Figure 5-31 DNA replication initiation in 
eukaryotes. This mechanism ensures that 
each origin of replication is activated only 
once per cell cycle. An origin of replication 
can be used only if a prereplicative 
complex forms there in G1 phase. At the 
beginning of S phase, specialized kinases 
phosphorylate Mcm and ORC, activating 
the former and inactivating the latter. A new 
orereplicative complex cannot form at the 
origin until the cell progresses to the next 
G1 phase, when the bound ORC has been 
dephosphorylated. Note that the eukaryotic 
Mcm helicase moves along the leading- 
strand template, whereas the bacterial 
helicase moves along the lagging-strand 
template (see Figure 5-25). As the forks 
begin to move, ORC is displaced, and new 
ORCs rapidly bind to the newly replicated 
origins.


Despite this, a human ORC that is very similar to the yeast ORC binds to origins
of replication and initiates DNA replication in humans. Many of the other
proteins that function in the initiation process in yeast likewise have central roles 
in humans. It therefore seems likely that the yeast and human initiation mecha- 
nisms are similar in outline, but chromatin structure, transcriptional activity, or 
some property of the genome other than a specific DNA sequence has the cen- 
tral role in attracting ORC and specifying mammalian origins of replication. These 
ideas could also help to explain how a given mammalian cell chooses which of the 
many possible origins to use when it replicates its genome and how this choice 
could differ from cell to cell. Clearly, we have a great deal to discover about the 
fundamental process of DNA replication initiation. 


New Nucleosomes Are Assembled Behind the Replication Fork 


Several additional aspects of DNA replication are specific to eukaryotes. As dis- 
cussed in Chapter 4, eukaryotic chromosomes are composed of roughly equal 
mixtures of DNA and protein. Chromosome duplication therefore requires not 
only the replication of DNA, but also the synthesis and assembly of new chro- 
mosomal proteins onto the DNA behind each replication fork. Although we are 
far from understanding this process in detail, we are beginning to learn how the 
fundamental unit of chromatin packaging, the nucleosome, is duplicated. The cell 
requires a large amount of new histone protein, approximately equal in mass to 
the newly synthesized DNA, to make the new nucleosomes in each cell cycle. For 
this reason, most eukaryotic organisms possess multiple copies of the gene for 
each histone. Vertebrate cells, for example, have about 20 repeated gene sets, most 
containing the genes that encode all five histones (H1, H2A, H2B, H3, and H4). 

Unlike most proteins, which are made continuously, histones are synthesized 
mainly in S phase, when the level of histone mRNA increases about fiftyfold as 
a result of both increased transcription and decreased mRNA degradation. The 
major histone mRNAs are degraded within minutes when DNA synthesis stops at 
the end of S phase. The mechanism depends on special properties of the 3’ ends 
of these mRNAs, as discussed in Chapter 7. In contrast, the histone proteins them- 
selves are remarkably stable and may survive for the entire life of a cell. The tight 
linkage between DNA synthesis and histone synthesis appears to reflect a feed- 
back mechanism that monitors the level of free histone to ensure that the amount 
of histone made exactly matches the amount of new DNA synthesized. 

As a replication fork advances, it must pass through the parental nucleosomes.
In the cell, efficient replication requires chromatin remodeling complexes (dis- 
cussed in Chapter 4) to destabilize the DNA-histone interfaces. Aided by such 
complexes, replication forks can transit even highly condensed chromatin effi- 
ciently. 

As a replication fork passes through chromatin, the histones are transiently 
displaced leaving about 600 nucleotide pairs of non-nucleosomal DNA in its 
wake. The reestablishment of nucleosomes behind a moving fork occurs in an 
intriguing way. When a nucleosome is traversed by a replication fork, the histone 
octamer appears to be broken into an H3-H4 tetramer and two H2A-H2B dimers 
(discussed in Chapter 4). The H3-H4 tetramer remains loosely associated with 
DNA and is distributed at random to one or the other daughter duplex, but the 
H2A-H2B dimers are released completely from DNA. Freshly made H3-H4 tetram- 
ers are added to the newly synthesized DNA to fill in the “spaces,” and H2A-H2B 
dimers—half of which are old and half new—are then added at random to com- 
plete the nucleosomes (Figure 5-32). The formation of new nucleosomes behind 
a replication fork has an important consequence for the process of DNA replication
itself. As DNA polymerase δ discontinuously synthesizes the lagging strand
(see pp. 253-254), the length of each Okazaki fragment is determined by the point
at which DNA polymerase δ is blocked by a newly formed nucleosome. This tight
coupling between nucleosome duplication and DNA replication explains why the 
length of Okazaki fragments in eukaryotes (~200 nucleotides) is approximately 
the same as the nucleosome repeat length.

The orderly and rapid addition of new H3-H4 tetramers and H2A-H2B dimers
behind a replication fork requires histone chaperones (also called chromatin 
assembly factors). These multisubunit complexes bind the highly basic histones 
and release them for assembly only in the appropriate context. The histone chap- 
erones, along with their cargoes, are directed to newly replicated DNA through 
a specific interaction with the eukaryotic sliding clamp called PCNA (see Figure 
5-32B). These clamps are left behind moving replication forks and remain on the 
DNA long enough for the histone chaperones to complete their tasks. 


Telomerase Replicates the Ends of Chromosomes 


We saw earlier that synthesis of the lagging strand at a replication fork must occur 
discontinuously through a backstitching mechanism that produces short DNA 
fragments. This mechanism encounters a special problem when the replication 
fork reaches an end of a linear chromosome. The final RNA primer synthesized 
on the lagging-strand template cannot be replaced by DNA because there is no 
3’-OH end available for the repair polymerase. Without a mechanism to deal with 
this problem, DNA would be lost from the ends of all chromosomes each time a 
cell divides. 

Bacteria solve this “end-replication” problem by having circular DNA mole- 
cules as chromosomes (see Figure 5-24). Eukaryotes solve it in a different way: 
they have specialized nucleotide sequences at the ends of their chromosomes 
that are incorporated into structures called telomeres (discussed in Chapter 4). 
Telomeres contain many tandem repeats of a short sequence that is similar in 
organisms as diverse as protozoa, fungi, plants, and mammals. In humans, the 
sequence of the repeat unit is GGGTTA, and it is repeated roughly a thousand 
times at each telomere. 

Telomere DNA sequences are recognized by sequence-specific DNA-bind- 
ing proteins that attract an enzyme, called telomerase, that replenishes these 
sequences each time a cell divides. Telomerase recognizes the tip of an existing 
telomere DNA repeat sequence and elongates it in the 5’-to-3’ direction, using 
an RNA template that is a component of the enzyme itself to synthesize new cop- 
ies of the repeat (Figure 5-33). The enzymatic portion of telomerase resembles 
other reverse transcriptases, proteins that synthesize DNA using an RNA template, 
although, in this case, the telomerase RNA also contributes functional groups to 
make the catalysis more efficient. After extension of the parental DNA strand by 
telomerase, replication of the lagging strand at the chromosome end can be com- 
pleted by the conventional DNA polymerases, using these extensions as a tem- 
plate to synthesize the complementary strand (Figure 5-34). 
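
The two steps just described, RNA-templated extension of the 3' end followed by conventional synthesis of the complementary strand, can be illustrated with a short string-based sketch. It is a simplification under stated assumptions: the template `UAACCC` is a hypothetical one-repeat template chosen to encode the GGGTTA repeat quoted above, the starting sequence is invented, and the function names are illustrative.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna_template):
    """DNA (written 5'->3') copied from an RNA template written 5'->3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_template))


def extend_3prime_end(dna, rna_template, rounds):
    """Append `rounds` copies of the templated repeat to the 3' end."""
    return dna + reverse_transcribe(rna_template) * rounds


def complementary_strand(dna):
    """Complementary DNA strand, also written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna))


# Hypothetical template: the RNA complement of one GGGTTA repeat.
template = "UAACCC"
parental = "ACGT" + "GGGTTA" * 2           # invented chromosome end
extended = extend_3prime_end(parental, template, rounds=3)

print(extended)                        # parental strand with 5 repeats
print(complementary_strand(extended))  # strand filled in by DNA polymerase
```

The extended parental strand gains repeats without any DNA template, and only afterward does it serve as a template itself for the complementary strand.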


Figure 5-32 Formation of nucleosomes 
behind a replication fork. Parental 

H3-H4 tetramers are distributed at random 
to the daughter DNA molecules, with 
roughly equal numbers inherited by each 
daughter. In contrast, H2A-H2B dimers 
are released from the DNA as the 
replication fork passes. This release 

begins just in front of the replication fork 
and is facilitated by chromatin remodeling 
complexes that move with the fork. 
Histone chaperones (NAP1 and
CAF1) restore the full complement of
histones to daughter molecules using both 
parental and newly synthesized histones. 
Although some daughter nucleosomes 
contain only parental histones or only newly 
synthesized histones, most are hybrids of 
old and new. For simplicity, the DNA double
helix is shown as a single red line. (Adapted
from J.D. Watson et al., Molecular Biology 
of the Gene, 5th ed. Cold Spring Harbor: 
Cold Spring Harbor Laboratory Press, 
2004.)


Telomeres Are Packaged Into Specialized Structures That Protect
the Ends of Chromosomes 


The ends of chromosomes present cells with an additional problem. As we will 
see in the next part of this chapter, when a chromosome is accidentally broken,
the break is rapidly repaired (see Figure 5-45). Telomeres must clearly be distin- 
guished from these accidental breaks; otherwise the cell will attempt to “repair” 
telomeres, causing chromosome fusions and other genetic abnormalities. Telo- 
meres have several features to prevent this from happening. 

A specialized nuclease chews back the 5’ end of a telomere leaving a protrud- 
ing single-strand end. This protruding end—in combination with the GGGTTA 
repeats in telomeres—attracts a group of proteins that form a protective chromo- 
some cap known as shelterin. In particular, shelterin “hides” telomeres from the 
cell’s damage detectors that continually monitor DNA. When human telomeres 
are artificially cross-linked and viewed by electron microscopy, structures known 
as “t-loops” are observed in which the protruding end of the telomere loops back 
and tucks itself into the duplex DNA of the telomere repeat sequence (Figure 
5-35). It is believed that t-loops are regulated by shelterin and provide additional 
protection for the ends of chromosomes.


Figure 5-33 Structure of a portion of
telomerase. Telomerase is a large protein- 
RNA complex. The RNA (blue) contains 

a templating sequence for synthesizing 
new DNA telomere repeats. The synthesis 
reaction itself is carried out by the reverse 
transcriptase domain of the protein, shown 
in green. A reverse transcriptase is a 
special form of polymerase enzyme that 
uses an RNA template to make a DNA 
strand; telomerase is unique in carrying 

its own RNA template with it. Telomerase 
also has several additional protein domains 
(not shown) that are needed to assemble 
the enzyme at the ends of chromosomes. 
(Modified from J. Lingner and T.R. Cech, 
Curr. Opin. Genet. Dev. 8:226-232, 1998. 
With permission from Elsevier.) 


Figure 5-34 Telomere replication. Shown 
here are the reactions that synthesize 

the repeating sequences that form the 
ends of the chromosomes (telomeres) of 
diverse eukaryotic organisms. The 3’ end 
of the parental DNA strand is extended 

by RNA-templated DNA synthesis; this 
allows the incomplete daughter DNA 
strand that is paired with it to be extended 
in its 5’ direction. This incomplete, lagging 
strand is presumed to be completed by 
DNA polymerase α, which carries a DNA
primase as one of its subunits (Movie 5.6). 
The telomere sequence illustrated is that 
of the ciliate Tetrahymena, in which these 
reactions were first discovered.


Telomere Length Is Regulated by Cells and Organisms


Because the processes that grow and shrink each telomere sequence are only 
approximately balanced, a chromosome end contains a variable number of telo- 
meric repeats. Not surprisingly, many cells have homeostatic mechanisms that 
maintain the number of these repeats within a limited range (Figure 5-36). 

In most of the dividing somatic cells of humans, however, telomeres gradually 
shorten, and it has been proposed that this provides a counting mechanism that 
helps prevent the unlimited proliferation of wayward cells in adult tissues. In its 
simplest form, this idea holds that our somatic cells start off in the embryo with a 
full complement of telomeric repeats. These are then eroded to different extents in 
different cell types. Some stem cells, notably those in tissues that must be replen- 
ished at a high rate throughout life—bone marrow or gut lining, for example— 
retain full telomerase activity. However, in many other types of cells, the level of 
telomerase is turned down so that the enzyme cannot quite keep up with chromo- 
some duplication. Such cells lose 100-200 nucleotides from each telomere every 
time they divide. After many cell generations, the descendant cells will inherit 
chromosomes that lack telomere function, and, as a result of this defect, activate 
a DNA-damage response causing them to withdraw permanently from the cell 
cycle and cease dividing—a process called replicative cell senescence (discussed 
in Chapter 17). In theory, such a mechanism could provide a safeguard against 
the uncontrolled cell proliferation of abnormal cells in somatic tissues, thereby 
helping to protect us from cancer. 
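
The proposed counting mechanism can be made concrete with a rough simulation. The loss of 100-200 nucleotides per division comes from the text, and the starting length of 6000 nucleotides follows from about a thousand six-nucleotide repeats. The assumptions are the threshold of zero at which telomere function is taken to be lost, the uniform random loss, and the function name, all chosen for illustration only.

```python
import random


def divisions_until_senescence(start_length, critical_length=0,
                               loss_range=(100, 200), seed=0):
    """Count divisions until the telomere reaches `critical_length`.

    Each division removes a random number of nucleotides drawn uniformly
    from `loss_range` (inclusive). Telomerase activity is assumed absent.
    """
    rng = random.Random(seed)
    length = start_length
    divisions = 0
    while length > critical_length:
        length -= rng.randint(*loss_range)
        divisions += 1
    return divisions


# About 1000 repeats of a 6-nucleotide unit, as quoted for human telomeres.
n = divisions_until_senescence(start_length=1000 * 6)
print(n)  # somewhere between 30 and 60 under these assumptions
```

Under these assumptions the count must fall between 30 divisions (200 nucleotides lost every time) and 60 divisions (100 lost every time), which is the sense in which telomere length could act as a division counter.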


Figure 5-35 A t-loop at the end of a
mammalian chromosome. (A) Electron 
micrograph of the DNA at the end of an 
interphase human chromosome. The 
chromosome was fixed, deproteinated, and 
artificially thickened before viewing. The 
loop seen here is approximately 15,000 
nucleotide pairs in length. (B) Structure 

of a t-loop. The insertion of the single- 
strand 3’ end into the duplex repeats is 
carried out, and the structure maintained, 
by specialized proteins. (From J.D. Griffith 
et al., Cell 97:503-514, 1999. With 
permission from Elsevier.) 


Figure 5-36 A demonstration that 

yeast cells control the length of their 
telomeres. In this experiment, the telomere 
at one end of a particular chromosome 

is artificially made either longer (left) or 
shorter (right) than average. After many 
cell divisions, the chromosome recovers, 
showing an average telomere length and 

a length distribution that is typical of the 
other chromosomes in the yeast cell. A 
similar feedback mechanism for controlling 
telomere length has been proposed for the 
germ-line cells of animals.


The idea that telomere length acts as a “measuring stick” to count cell divisions
and thereby regulate the lifetime of the cell lineage has been tested in several 
ways. For certain types of human cells grown in tissue culture, the experimental 
results support such a theory. Human fibroblasts normally proliferate for about 60 
cell divisions in culture before undergoing replicative cell senescence. Like most 
other somatic cells in humans, fibroblasts produce only low levels of telomerase, 
and their telomeres gradually shorten each time they divide. When telomerase is 
provided to the fibroblasts by inserting an active telomerase gene, telomere length 
is maintained and many of the cells now continue to proliferate indefinitely. 

It has been proposed that this type of control on cell proliferation may con- 
tribute to the aging of animals like ourselves. These ideas have been tested by 
producing transgenic mice that lack telomerase entirely. The telomeres in mouse 
chromosomes are about five times longer than human telomeres, and the mice 
must therefore be bred through three or more generations before their telomeres 
have shrunk to the normal human length. It is therefore perhaps not surprising 
that the first generations of mice develop normally. However, the mice in later 
generations develop progressively more defects in some of their highly prolifera- 
tive tissues. In addition, these mice show signs of premature aging and have a 
pronounced tendency to develop tumors. In these and other respects these mice 
resemble humans with the genetic disease dyskeratosis congenita. Individuals 
afflicted with this disease carry one functional and one nonfunctional copy of the 
telomerase RNA gene; they have prematurely shortened telomeres and typically 
die of progressive bone marrow failure. They also develop lung scarring and liver 
cirrhosis and show abnormalities in various epidermal structures including skin, 
hair follicles, and nails. 

The above observations demonstrate that controlling cell proliferation by telo- 
mere shortening poses a risk to an organism, because not all of the cells that begin 
losing the ends of their chromosomes will stop dividing. Some apparently become 
genetically unstable, but continue to divide, giving rise to variant cells that can 
lead to cancer. Clearly, the use of telomere shortening as a regulating mechanism 
is not foolproof and, like many mechanisms in the cell, seems to strike a balance 
between benefit and risk. 


Summary 


The proteins that initiate DNA replication bind to DNA sequences at a replication 
origin to catalyze the formation of a replication bubble with two outward-moving 
replication forks. The process begins when an initiator protein-DNA complex is 
formed that subsequently loads a DNA helicase onto the DNA template. Other pro- 
teins are then added to form the multienzyme “replication machine” that catalyzes 
DNA synthesis at each replication fork. 

In bacteria and some simple eukaryotes, replication origins are specified by spe- 
cific DNA sequences that are only several hundred nucleotide pairs long. In other 
eukaryotes, such as humans, the sequences needed to specify an origin of DNA 
replication seem to be less well defined, and the origin can span several thousand 
nucleotide pairs. 

Bacteria typically have a single origin of replication in a circular chromosome. 
With fork speeds of up to 1000 nucleotides per second, they can replicate their 
genome in less than an hour. Eukaryotic DNA replication takes place in only one 
part of the cell cycle, the S phase. The replication fork in eukaryotes moves about 10 
times more slowly than the bacterial replication fork, and the much longer eukary- 
otic chromosomes each require many replication origins to complete their replica- 
tion in an S phase, which typically lasts for 8 hours in human cells. The different 
replication origins in these eukaryotic chromosomes are activated in a sequence, 
determined in part by the structure of the chromatin, with the most condensed 
regions of chromatin typically beginning their replication last. After the replication 
fork has passed, chromatin structure is re-formed by the addition of new histones to 
the old histones that are directly inherited by each daughter DNA molecule. 

Eukaryotes solve the problem of replicating the ends of their linear chromosomes 
with a specialized end structure, the telomere, maintained by a special nucleotide
polymerizing enzyme called telomerase. Telomerase extends one of the DNA strands
at the end of a chromosome by using an RNA template that is an integral part of the 
enzyme itself, producing a highly repeated DNA sequence that typically extends for 
thousands of nucleotide pairs at each chromosome end. Telomeres have specialized 
structures that distinguish them from broken ends of chromosomes, ensuring that 
they are not mistakenly repaired. 


DNA REPAIR 


Maintaining the genetic stability that an organism needs for its survival requires 
not only an extremely accurate mechanism for replicating DNA, but also mecha- 
nisms for repairing the many accidental lesions that DNA continually suffers. Most 
such spontaneous changes in DNA are temporary because they are immediately 
corrected by a set of processes that are collectively called DNA repair. Of the tens 
of thousands of random changes created every day in the DNA of a human cell by 
heat, metabolic accidents, radiation of various sorts, and exposure to substances 
in the environment, only a few (less than 0.02%) accumulate as permanent muta- 
tions in the DNA sequence. The rest are eliminated with remarkable efficiency by 
DNA repair. 
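
A small calculation shows what the 0.02% figure implies. The text gives only "tens of thousands" of changes per day, so the three daily totals below are assumed sample values within that range, and the function name is illustrative.

```python
def max_permanent_changes(lesions_per_day, unrepaired_per_10000=2):
    """Upper bound on daily lesions that escape repair.

    0.02% is written as 2 in 10,000 so the arithmetic stays exact.
    """
    return lesions_per_day * unrepaired_per_10000 / 10_000


# Assumed sample values spanning "tens of thousands" of lesions per day.
for lesions in (10_000, 50_000, 90_000):
    print(lesions, max_permanent_changes(lesions))
```

Even at the top of the assumed range, fewer than about 20 changes per cell per day would persist; the rest are removed by repair.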

The importance of DNA repair is evident from the large investment that cells 
make in the enzymes that carry it out: several percent of the coding capacity of 
most genomes is devoted solely to DNA repair functions. The importance of DNA 
repair is also demonstrated by the increased rate of mutation that follows the 
inactivation of a DNA repair gene. Many DNA repair proteins and the genes that 
encode them—which we now know operate in a wide range of organisms, includ- 
ing humans—were originally identified in bacteria by the isolation and charac- 
terization of mutants that displayed an increased mutation rate or an increased 
sensitivity to DNA-damaging agents. 

Recent studies of the consequences of a diminished capacity for DNA repair 
in humans have linked many human diseases with decreased repair (Table 5-2). 
Thus, we saw previously that defects in a human gene whose product normally 
functions to repair the mismatched base pairs resulting from DNA replication 
errors can lead to an inherited predisposition to cancers of the colon and some 
other organs, reflecting an increased mutation rate. In another human disease, 


TABLE 5-2


Name | Phenotype | Enzyme or process affected
Xeroderma pigmentosum (XP) groups A-G | Skin cancer, UV sensitivity, neurological abnormalities | Nucleotide excision repair
Cockayne syndrome | UV sensitivity; developmental abnormalities | Coupling of nucleotide excision repair to transcription
XP variant | UV sensitivity, skin cancer | Translesion synthesis by DNA polymerase η
Ataxia telangiectasia (AT) | Leukemia, lymphoma, γ-ray sensitivity, genome instability | ATM protein, a protein kinase activated by double-strand breaks
BRCA1 | Breast and ovarian cancer | Repair by homologous recombination
BRCA2 | Breast, ovarian, and prostate cancer | Repair by homologous recombination
Werner syndrome | Premature aging, cancer at several sites, genome instability | Accessory 3’-exonuclease and DNA helicase used in repair
Bloom syndrome | Cancer at several sites, stunted growth, genome instability | DNA helicase needed for recombination
Fanconi anemia groups A-G | Congenital abnormalities, leukemia, genome instability | DNA interstrand cross-link repair
46 BR patient | Hypersensitivity to DNA-damaging agents, genome instability | DNA ligase I

xeroderma pigmentosum (XP), the afflicted individuals have an extreme sensitivity
to ultraviolet radiation because they are unable to repair certain DNA photoproducts.
This repair defect results in an increased mutation rate that leads to serious
skin lesions and an increased susceptibility to skin cancers. Finally, mutations
in the Brca1 and Brca2 genes compromise a type of DNA repair known as homologous
recombination and are a cause of hereditary breast and ovarian cancer.


Figure 5-37 A summary of spontaneous
alterations that require DNA repair.
The sites on each nucleotide modified
by spontaneous oxidative damage (red
arrows), hydrolytic attack (blue arrows),
and methylation (green arrows) are shown,
with the width of each arrow indicating the
relative frequency of each event (see
Table 5-3). (After T. Lindahl, Nature
362:709-715, 1993. With permission from
Macmillan Publishers Ltd.)


Without DNA Repair, Spontaneous DNA Damage Would Rapidly
Change DNA Sequences


Although DNA is a highly stable material—as required for the storage of genetic 
information—it is a complex organic molecule that is susceptible, even under 
normal cell conditions, to spontaneous changes that would lead to mutations if 
left unrepaired (Figure 5-37 and see Table 5-3). For example, the DNA of each 


TABLE 5-3


Hydrolysis

Oxidation
8-oxoG
Ring-saturated pyrimidines (thymine glycol, cytosine hydrates) | 2000
Lipid peroxidation products (M1G, etheno-A, etheno-C) | 1000

Nonenzymatic methylation by nitrosated polyamines and peptides
O⁶-Methylguanine | 20-100

The DNA lesions listed in the table are the result of the normal chemical reactions that take 
place in cells. Cells that are exposed to external chemicals and radiation suffer greater and more 
diverse forms of DNA damage. (From T. Lindahl and D.E. Barnes, Cold Spring Harb. Symp. 
Quant. Biol. 65:127-133, 2000.)


human cell loses about 18,000 purine bases (adenine and guanine) every day
because their N-glycosyl linkages to deoxyribose hydrolyze, a spontaneous reac- 
tion called depurination. Similarly, a spontaneous deamination of cytosine to 
uracil in DNA occurs at a rate of about 100 bases per cell per day (Figure 5-38). 
DNA bases are also occasionally damaged by an encounter with reactive metab- 
olites produced in the cell, including reactive forms of oxygen and the high-en- 
ergy methyl donor S-adenosylmethionine, or by exposure to chemicals in the 
environment. Likewise, ultraviolet radiation from the sun can produce a covalent 
linkage between two adjacent pyrimidine bases in DNA to form, for example, 
thymine dimers (Figure 5-39). If left uncorrected when the DNA is replicated, 
most of these changes would be expected to lead either to the deletion of one or 
more base pairs or to a base-pair substitution in the daughter DNA chain (Figure 
5-40). The mutations would then be propagated throughout subsequent cell gen- 
erations. Such a high rate of random changes in the DNA sequence would have 
disastrous consequences. 
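
How an unrepaired deamination becomes a permanent base-pair substitution can be traced through two rounds of copying. This is a minimal sketch: the eight-base sequence is invented, both strands are written in the same left-to-right register so that position i of one pairs with position i of the other, and the function names are illustrative.

```python
# Uracil pairs with adenine, which is what makes deamination mutagenic.
PAIRS_WITH = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def replicate(template):
    """Return the new strand synthesized against `template`."""
    return "".join(PAIRS_WITH[base] for base in template)


def deaminate(strand, position):
    """Convert the cytosine at `position` to uracil."""
    if strand[position] != "C":
        raise ValueError("only cytosine is deaminated in this sketch")
    return strand[:position] + "U" + strand[position + 1:]


original = "ATGCCGTA"                   # invented sequence
damaged = deaminate(original, 3)        # C at index 3 becomes U
daughter = replicate(damaged)           # A is inserted opposite the U
granddaughter = replicate(daughter)     # T is inserted opposite that A

print(original)       # ATGCCGTA
print(granddaughter)  # ATGTCGTA
```

After two rounds, the original C-G pair has been replaced by a T-A pair in this lineage, and every later round copies the change faithfully.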


The DNA Double Helix Is Readily Repaired 


The double-helical structure of DNA is ideally suited for repair because it carries 
two separate copies of all the genetic information—one in each of its two strands. 
Thus, when one strand is damaged, the complementary strand retains an intact 
copy of the same information, and this copy is generally used to restore the correct 
nucleotide sequences to the damaged strand. 

An indication of the importance of a double-strand helix to the safe storage of 
genetic information is that all cells use it; only a few small viruses use single-strand 
DNA or RNA as their genetic material. The types of repair processes described in 
this section cannot operate on such nucleic acids, and once damaged, the chance 
of a permanent nucleotide change occurring in these single-strand genomes of 
viruses is thus very high. It seems that only organisms with tiny genomes (and 
therefore tiny targets for DNA damage) can afford to encode their genetic information
in any molecule other than a DNA double helix.


Figure 5-38 Depurination and deamination. These reactions are two of the most frequent spontaneous chemical reactions that create serious
DNA damage in cells. Depurination can release guanine (shown here), as well as adenine, from DNA. The major type of deamination reaction
converts cytosine to an altered DNA base, uracil (shown here), but deamination occurs on other bases as well. These reactions normally take place
in double-helical DNA; for convenience, only one strand is shown.


Figure 5-39 The most common type of thymine dimer. This type of
damage occurs in the DNA of cells exposed to ultraviolet irradiation (as in 
sunlight). A similar dimer will form between any two neighboring pyrimidine 
bases (C or T residues) in DNA. 


DNA Damage Can Be Removed by More Than One Pathway 


Cells have multiple pathways to repair their DNA using different enzymes that act 
upon different kinds of lesions. Figure 5-41 shows two of the most common path- 
ways. In both, the damage is excised, the original DNA sequence is restored by a 
DNA polymerase that uses the undamaged strand as its template, and a remain- 
ing break in the double helix is sealed by DNA ligase (see Figure 5-12). 

The two pathways differ in the way in which they remove the damage from 
DNA. The first pathway, called base excision repair, involves a battery of enzymes 
called DNA glycosylases, each of which can recognize a specific type of altered 
base in DNA and catalyze its hydrolytic removal. There are at least six types of 
these enzymes, including those that remove deaminated Cs, deaminated As, dif- 
ferent types of alkylated or oxidized bases, bases with opened rings, and bases in 
which a carbon-carbon double bond has been accidentally converted to a car- 
bon-carbon single bond. How is an altered base detected within the context of 
the double helix? A key step is an enzyme-mediated “flipping-out” of the altered 
nucleotide from the helix, which allows the DNA glycosylase to probe all faces of 
the base for damage (Figure 5-42). It is thought that these enzymes travel along 
DNA using base-flipping to evaluate the status of each base. Once an enzyme 
finds the damaged base that it recognizes, it removes that base from its sugar. 

The “missing tooth” created by DNA glycosylase action is recognized by an 
enzyme called AP endonuclease (AP for apurinic or apyrimidinic, endo to signify 
that the nuclease cleaves within the polynucleotide chain), which cuts the phos- 
phodiester backbone, after which the resulting gap is repaired (see Figure 5-41A). 
Depurination, which is by far the most frequent type of damage suffered by DNA, 
also leaves a deoxyribose sugar with a missing base. Depurinations are directly 
repaired beginning with AP endonuclease, following the bottom half of the pathway
in Figure 5-41A.


Figure 5-40 How chemical modifications of nucleotides produce mutations. (A) Deamination of cytosine, if uncorrected, results in the
substitution of one base for another when the DNA is replicated. As shown in Figure 5-38, deamination of cytosine produces uracil. Uracil differs 
from cytosine in its base-pairing properties and preferentially base-pairs with adenine. The DNA replication machinery therefore adds an adenine 
when it encounters a uracil on the template strand. (B) Depurination can lead to the loss of a nucleotide pair. When the replication machinery 
encounters a missing purine on the template strand, it may skip to the next complete nucleotide as illustrated here, thus producing a nucleotide 
deletion in the newly synthesized strand. Many other types of DNA damage (see Figure 5-37), if left uncorrected, also produce mutations when the 
DNA is replicated.


Figure 5-41 A comparison of two major DNA repair pathways. (A) Base excision repair. This pathway starts with a DNA
glycosylase. Here, the enzyme uracil DNA glycosylase removes an accidentally deaminated cytosine in DNA. After the action
of this glycosylase (or another DNA glycosylase that recognizes a different kind of damage), the sugar phosphate with the
missing base is cut out by the sequential action of AP endonuclease and a phosphodiesterase. (These same enzymes begin
the repair of depurinated sites directly.) The gap of a single nucleotide is then filled by DNA polymerase and DNA ligase. The
net result is that the U that was created by accidental deamination is restored to a C. AP endonuclease is so-named because
it recognizes any site in the DNA helix that contains a deoxyribose sugar with a missing base; such sites can arise either by
the loss of a purine (apurinic sites) or by the loss of a pyrimidine (apyrimidinic sites). (B) Nucleotide excision repair. In bacteria,
after a multienzyme complex has recognized a lesion such as a pyrimidine dimer (see Figure 5-39), one cut is made on each
side of the lesion, and an associated DNA helicase then removes the entire portion of the damaged strand. The excision repair
machinery in bacteria leaves the gap of 12 nucleotides shown. In humans, once the damaged DNA is recognized, a helicase is
recruited to unwind the DNA duplex locally. Next, the excision nuclease enters and cleaves on either side of the damage, leaving
a gap of about 30 nucleotides. The nucleotide excision repair machinery in both bacteria and humans can recognize and repair
many different types of DNA damage.


The second major repair pathway is called nucleotide excision repair. This 
mechanism can repair the damage caused by almost any large change in the 
structure of the DNA double helix. Such “bulky lesions” include those created by 
the covalent reaction of DNA bases with large hydrocarbons (such as the carcino- 
gen benzopyrene, found in tobacco smoke, coal tar, and diesel exhaust), as well as 
the various pyrimidine dimers (T-T, T-C, and C-C) caused by sunlight. In this path- 
way, a large multienzyme complex scans the DNA for a distortion in the double 
helix, rather than for a specific base change. Once it finds a lesion, it cleaves the 
phosphodiester backbone of the abnormal strand on both sides of the distortion, 
and a DNA helicase peels away the single-strand oligonucleotide containing the 
lesion. The large gap produced in the DNA helix is then repaired by DNA poly- 
merase and DNA ligase (see Figure 5-41B). 
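
The cut, remove, and refill logic of nucleotide excision repair can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of the enzymes themselves: strands are plain strings, the character "X" is a hypothetical marker for a damaged base, position i of one strand is taken to pair with position i of the other (antiparallel orientation is ignored), and the function name and the `flank` parameter are invented for illustration.

```python
# Toy model only: "X" marks a damaged base, and position i of one strand
# pairs with position i of the other (strand polarity is ignored).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-paired partner of an undamaged strand."""
    return "".join(COMPLEMENT[base] for base in strand)


def nucleotide_excision_repair(damaged: str, template: str, flank: int = 5):
    """Excise the lesion plus `flank` bases on each side, then refill the gap.

    Returns the repaired strand and the length of the gap that was filled.
    """
    lesion_start = damaged.find("X")
    if lesion_start == -1:
        return damaged, 0  # no distortion detected, nothing to repair
    lesion_end = damaged.rfind("X") + 1

    # One cut is made on each side of the distortion.
    cut_left = max(0, lesion_start - flank)
    cut_right = min(len(damaged), lesion_end + flank)

    # The oligonucleotide between the cuts is discarded. The gap is refilled
    # by copying the undamaged strand (the polymerase step); joining the
    # pieces back into one string stands in for ligation.
    patch = complement(template[cut_left:cut_right])
    repaired = damaged[:cut_left] + patch + damaged[cut_right:]
    return repaired, cut_right - cut_left


original = "ATGCCGTTAGCTAGGCTA"
template = complement(original)               # the undamaged partner strand
damaged = original[:6] + "XX" + original[8:]  # a T-T dimer at positions 6 and 7
repaired, gap_length = nucleotide_excision_repair(damaged, template)
print(repaired == original, gap_length)
```

With a two-base lesion and five bases removed on each side, the gap is 12 nucleotides, the size quoted for bacteria in Figure 5-41B. The sketch recovers the original sequence only because the complementary strand is intact, which is the point of the pathway.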

An alternative to base and nucleotide excision repair processes is direct
chemical reversal of DNA damage, and this strategy is selectively employed for the rapid
removal of certain highly mutagenic or cytotoxic lesions. For example, the
alkylation lesion O⁶-methylguanine has its methyl group removed by direct transfer to
a cysteine residue in the repair protein itself, which is destroyed in the reaction.
In another example, methyl groups in the alkylation lesions 1-methyladenine and 
3-methylcytosine are “burnt off” by an iron-dependent demethylase, with release 
of formaldehyde from the methylated DNA and regeneration of the native base. 


Coupling Nucleotide Excision Repair to Transcription Ensures That 
the Cell’s Most Important DNA Is Efficiently Repaired 


All of a cell’s DNA is under constant surveillance for damage, and the repair
mechanisms we have described act on all parts of the genome. However, cells have a
way of directing DNA repair to the DNA sequences that are most urgently needed. 
They do this by linking RNA polymerase, the enzyme that transcribes DNA into 
RNA as the first step in gene expression, to the nucleotide excision repair pathway. 
As discussed above, this repair system can correct many different types of DNA 
damage. RNA polymerase stalls at DNA lesions and, through the use of coupling 
proteins, directs the excision repair machinery to these sites. In bacteria, where 
genes are relatively short, the stalled RNA polymerase can be dissociated from the 
DNA; the DNA is repaired, and the gene is transcribed again from the beginning. 
In eukaryotes, where genes can be enormously long, a more complex reaction is 
used to “back up” the RNA polymerase, repair the damage, and then restart the 
polymerase. 

The importance of transcription-coupled excision repair is seen in people 
with Cockayne syndrome, which is caused by a defect in this coupling. These indi- 
viduals suffer from growth retardation, skeletal abnormalities, progressive neural 
retardation, and severe sensitivity to sunlight. Most of these problems are thought 
to arise from RNA polymerase molecules that become permanently stalled at sites 
of DNA damage that lie in important genes. 


The Chemistry of the DNA Bases Facilitates Damage Detection 


The DNA double helix seems optimal for repair. As noted above, it contains a 
backup copy of all genetic information. Equally importantly, the nature of the 
four bases in DNA makes the distinction between undamaged and damaged 
bases very clear. For example, every possible deamination event in DNA yields 
an “unnatural” base, which can be directly recognized and removed by a specific 
DNA glycosylase. Hypoxanthine, for example, is the simplest purine base capable 
of pairing specifically with C, but hypoxanthine is the direct deamination
product of A (Figure 5-43A). The addition of a second amino group to hypoxanthine
produces G, which cannot be formed from A by spontaneous deamination, and
whose deamination product (xanthine) is likewise unique.

Figure 5-42 The recognition of an unusual nucleotide in DNA by base-flipping.
The DNA glycosylase family of enzymes recognizes specific inappropriate
bases in the conformation shown. Each of these enzymes cleaves the glycosyl bond
that connects a particular recognized base (yellow) to the backbone sugar, removing it
from the DNA. (A) Stick model; (B) space-filling model.

Figure 5-43 The deamination of DNA nucleotides. In each case, the oxygen atom that is
added in this reaction with water is colored red. (A) The spontaneous deamination products of
A and G are recognizable as unnatural when they occur in DNA and thus are readily found and
repaired. The deamination of C to U was also illustrated in Figure 5-38; T has no amino group to
remove. (B) About 3% of the C nucleotides in vertebrate DNAs are methylated to help in controlling
gene expression (discussed in Chapter 7). When these 5-methyl C nucleotides are accidentally
deaminated, they form the natural nucleotide T. However, this T will be paired with a G on the
opposite strand, forming a mismatched base pair.

As discussed in Chapter 6, RNA is thought, on an evolutionary time scale, 
to have served as the genetic material before DNA, and it seems likely that the 
genetic code was initially carried in the four nucleotides A, C, G, and U. This raises 
the question of why the U in RNA was replaced in DNA by T (which is 5-methyl U). 
We have seen that the spontaneous deamination of C converts it to U, but that this 
event is rendered relatively harmless by uracil DNA glycosylase. However, if DNA 
contained U as a natural base, the repair system would not be able to distinguish 
a deaminated C from a naturally occurring U. 
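
The argument can be made concrete with a small Python sketch. It is only an illustration: the two function names are hypothetical, a strand is a plain string, and the "glycosylase" is reduced to the single rule that every U it meets is treated as damage.

```python
from typing import List


def deaminate_c(strand: str, position: int) -> str:
    """Convert the C at `position` to U, mimicking spontaneous deamination."""
    if strand[position] != "C":
        raise ValueError("only C can be deaminated to U in this sketch")
    return strand[:position] + "U" + strand[position + 1:]


def uracil_glycosylase_targets(strand: str) -> List[int]:
    """Positions a uracil-removing repair enzyme would act on: every U."""
    return [i for i, base in enumerate(strand) if base == "U"]


# Real DNA uses T, so the only U present is the deaminated C.
dna_with_t = deaminate_c("GATTCCA", 4)
targets_in_t_dna = uracil_glycosylase_targets(dna_with_t)

# A hypothetical DNA that used U as a natural base: the same lesion is
# now hidden among legitimate U residues.
dna_with_u = deaminate_c("GAUUCCA", 4)
targets_in_u_dna = uracil_glycosylase_targets(dna_with_u)

print(targets_in_t_dna, targets_in_u_dna)
```

In the T-based strand the enzyme's rule picks out exactly the damaged position. In the hypothetical U-based strand the same rule also flags the two natural U residues, so the repair system has no way to tell which one arose by deamination.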

A special situation occurs in vertebrate DNA, in which selected C nucleo- 
tides are methylated at specific CG sequences that are associated with inactive 
genes (discussed in Chapter 7). The accidental deamination of these methylated 
C nucleotides produces the natural nucleotide T (Figure 5-43B) in a mismatched 
base pair with a G on the opposite DNA strand. To help in repairing deaminated 
methylated C nucleotides, a special DNA glycosylase recognizes a mismatched 
base pair involving T in the sequence T-G and removes the T. This DNA repair 
mechanism must be relatively ineffective, however, because methylated C nucle- 
otides are exceptionally common sites for mutations in vertebrate DNA. It is strik- 
ing that, even though only about 3% of the C nucleotides in human DNA are meth- 
ylated, mutations in these methylated nucleotides account for about one-third of 
the single-base mutations that have been observed in inherited human diseases. 
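
A rough calculation shows how strong this bias is. The sketch below takes the two figures quoted above (about 3% of C nucleotides methylated, about one-third of observed single-base mutations) and adds one assumption that is not from the text: that C makes up roughly 20% of the bases in human DNA. It also treats the disease-associated mutations as if they were a fair sample of all single-base mutations, which is a simplification.

```python
# Figures quoted in the text.
FRACTION_OF_C_METHYLATED = 0.03   # about 3% of C nucleotides
SHARE_OF_MUTATIONS_AT_MEC = 1 / 3  # about one-third of single-base mutations

# Assumption for illustration only (not from the text).
ASSUMED_C_FRACTION_OF_BASES = 0.20


def relative_mutation_rate(c_fraction: float = ASSUMED_C_FRACTION_OF_BASES) -> float:
    """Per-site mutation frequency at methylated C relative to all other sites."""
    mec_sites = FRACTION_OF_C_METHYLATED * c_fraction  # fraction of all bases
    other_sites = 1.0 - mec_sites
    rate_at_mec = SHARE_OF_MUTATIONS_AT_MEC / mec_sites
    rate_elsewhere = (1.0 - SHARE_OF_MUTATIONS_AT_MEC) / other_sites
    return rate_at_mec / rate_elsewhere


ratio = relative_mutation_rate()
print(round(ratio))
```

Under these assumptions a methylated C is on the order of 80 times more likely to appear as a mutated site than an average position elsewhere in the genome. The exact number depends on the assumed base composition, but the conclusion that these sites are strongly overrepresented does not.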


Special Translesion DNA Polymerases Are Used in Emergencies 


If a cell’s DNA suffers heavy damage, the repair mechanisms that we have dis- 
cussed are often insufficient to cope with it. In these cases, a different strategy is 
called into play, one that entails some risk to the cell. The highly accurate replica- 
tive DNA polymerases stall when they encounter damaged DNA, and in emer- 
gencies cells employ versatile, but less accurate, backup polymerases, known as 
translesion polymerases, to replicate through the DNA damage. 

Human cells have seven translesion polymerases, some of which can recog- 
nize a specific type of DNA damage and correctly add the nucleotide required to 
restore the initial sequence. Others make only “good guesses,” especially when the
template base has been extensively damaged. These enzymes are not as accurate 
as the normal replicative polymerases when they copy a normal DNA sequence. 
For one thing, the translesion polymerases lack exonucleolytic proofreading 
activity; in addition, many are much less discriminating than the replicative poly- 
merase in choosing which nucleotide to incorporate initially. Presumably for this 
reason, each such translesion polymerase is given a chance to add only one or a 
few nucleotides before the highly accurate replicative polymerase resumes DNA 
synthesis. 

Despite their usefulness in allowing heavily damaged DNA to be replicated, 
these translesion polymerases do, as noted above, pose risks to the cell. They are 
probably responsible for most of the base-substitution and single-nucleotide 
deletion mutations that accumulate in genomes; although they generally produce 
mutations when copying damaged DNA (see Figure 5-40), they probably also cre- 
ate mutations—at a low level—on undamaged DNA. Clearly, it is important for 
the cell to tightly regulate these polymerases, releasing them only at sites of DNA 
damage. Exactly how this happens for each translesion polymerase remains to 
be discovered, but a conceptual model is given in Figure 5-44. The principle of 
this model applies to many of the DNA repair processes discussed in this chapter: 
because the enzymes that carry out these reactions are potentially dangerous to 
the genome, they must be brought into play only at sites of damage. 


Double-Strand Breaks Are Efficiently Repaired 


An especially dangerous type of DNA damage occurs when both strands of the 
double helix are broken, leaving no intact template strand to enable accurate
repair. Ionizing radiation, replication errors, oxidizing agents, and other
metabolites produced in the cell cause breaks of this type. If these lesions were left
unrepaired, they would quickly lead to the breakdown of chromosomes into smaller
fragments and to loss of genes when the cell divides. However, two distinct mecha- 
nisms have evolved to deal with this type of damage (Figure 5-45). The simplest to 
understand is nonhomologous end joining, in which the broken ends are simply 
brought together and rejoined by DNA ligation, generally with the loss of nucleo- 
tides at the site of joining (Figure 5-46). This end-joining mechanism, which can 
be seen as a “quick and dirty” solution to the repair of double-strand breaks, is 
common in mammalian somatic cells. Although a change in the DNA sequence 
(a mutation) results at the site of breakage, so little of the mammalian genome is 
essential for life that this mechanism is apparently an acceptable solution to the 
problem of rejoining broken chromosomes. By the time a human reaches the age 
of 70, the typical somatic cell contains over 2000 such “scars,” distributed through- 
out its genome, representing places where DNA has been inaccurately repaired by 
nonhomologous end joining. But nonhomologous end joining presents another 
danger: because there seems to be no mechanism to ensure that two ends being 
joined were originally next to each other in the genome, nonhomologous end 
joining can occasionally generate rearrangements in which one broken chromo- 
some becomes covalently attached to another. This can result in chromosomes 
with two centromeres and chromosomes lacking centromeres altogether; both
types of aberrant chromosomes are missegregated during cell division. As
previously discussed, the specialized structure of telomeres prevents the natural ends
of chromosomes from being mistaken for broken DNA and “repaired” in this way.

Figure 5-44 Translesion DNA polymerases can use damaged templates.
According to this model, a replicative polymerase stalled at a site of
DNA damage is recognized by the cell as needing rescue. Specialized enzymes
covalently modify the sliding clamp (typically, it is ubiquitylated; see Figure
3-69), which releases the replicative DNA polymerase and, together with damaged
DNA, attracts a translesion polymerase specific to that type of damage. Once the
damaged DNA is bypassed, the covalent modification of the clamp is removed, the
translesion polymerase dissociates, and the replicative polymerase is brought back
into play.

A much more accurate type of double-strand break repair occurs in newly rep- 
licated DNA (Figure 5-45B). Here, the DNA is repaired using the sister chromatid 
as a template. This reaction is an example of homologous recombination, and we 
consider its mechanism later in this chapter. Most organisms employ both non- 
homologous end joining and homologous recombination to repair double-strand 
breaks in DNA. Nonhomologous end joining predominates in humans; homol- 
ogous recombination is used only during and shortly after DNA replication (in 
S and G2 phases), when sister chromatids are available to serve as templates.

Figure 5-45 Two ways to repair double-strand breaks. (A) Nonhomologous end
joining alters the original DNA sequence when repairing a broken chromosome. The
initial degradation of the broken DNA ends is important because the nucleotides at the
site of the initial break are often damaged and cannot be ligated. Nonhomologous
end joining usually takes place when cells have not yet duplicated their DNA.
(B) Repairing double-strand breaks by homologous recombination is more difficult
to accomplish but restores the original DNA sequence. It typically takes place after the
DNA has been duplicated (when a duplex template is available) but before the cell
has divided. Details of the homologous recombination pathway are presented in
the following section (see Figure 5-48).


Figure 5-46 Nonhomologous end joining. (A) A central role is played by the
Ku protein, a heterodimer that grasps the broken chromosome ends. The
additional proteins shown are needed to hold the broken ends together while
they are processed and eventually joined covalently. (B) Three-dimensional structure
of a Ku heterodimer bound to the end of a duplex DNA fragment. The Ku protein is
also essential for V(D)J joining, a specific recombination process through which
antibody and T cell receptor diversity is generated in developing B and T cells
(discussed in Chapter 24). V(D)J joining and nonhomologous end joining show many
similarities in mechanism, but the former relies on specific double-strand breaks
produced deliberately by the cell. (B, from J.R. Walker, R.A. Corpina, and
J. Goldberg, Nature 412:607-614, 2001. With permission from Macmillan
Publishers Ltd.)


DNA Damage Delays Progression of the Cell Cycle


We have just seen that cells contain multiple enzyme systems that can recognize 
and repair many types of DNA damage (Movie 5.7). Because of the importance 
of maintaining intact, undamaged DNA from generation to generation, eukary- 
otic cells have an additional mechanism that maximizes the effectiveness of their 
DNA repair enzymes: they delay progression of the cell cycle until DNA repair is 
complete. As discussed in detail in Chapter 17, the orderly progression of the cell 
cycle is stopped if damaged DNA is detected, and it restarts when the damage has 
been repaired. Thus, in mammalian cells, the presence of DNA damage can block 
entry from G1 into S phase, it can slow S phase once it has begun, and it can block
the transition from G2 phase to M phase. These delays facilitate DNA repair by
providing the time needed for the repair to reach completion. 

DNA damage also results in an increased synthesis of some DNA repair 
enzymes. This response depends on special signaling proteins that sense DNA 
damage and up-regulate the appropriate DNA repair enzymes. The importance 
of this mechanism is revealed by the phenotype of humans who are born with 
defects in the gene that encodes the ATM protein. These individuals have the 
disease ataxia telangiectasia (AT), the symptoms of which include neurodegen- 
eration, a predisposition to cancer, and genome instability. The ATM protein is 
a large kinase needed to generate the intracellular signals that sound the alarm 
in response to many types of spontaneous DNA damage (see Figure 17-62), and 
individuals with defects in this protein therefore suffer from the effects of unre- 
paired DNA lesions. 


Summary 


Genetic information can be stored stably in DNA sequences only because a large 
set of DNA repair enzymes continuously scan the DNA and replace any damaged 
nucleotides. Most types of DNA repair depend on the presence of a separate copy 
of the genetic information in each of the two strands of the DNA double helix. An 
accidental lesion on one strand can therefore be cut out by a repair enzyme and a 
corrected strand resynthesized by reference to the information in the undamaged 
strand. 

Most of the damage to DNA bases is excised by one of two major DNA repair 
pathways. In base excision repair, the altered base is removed by a DNA glycosylase 
enzyme, followed by excision of the resulting sugar phosphate. In nucleotide exci- 
sion repair, a small section of the DNA strand surrounding the damage is removed 
from the DNA double helix as an oligonucleotide. In both cases, the gap left in the 
DNA helix is filled in by the sequential action of DNA polymerase and DNA ligase, 
using the undamaged DNA strand as the template. Some types of DNA damage can 
be repaired by a different strategy—the direct chemical reversal of the damage— 
which is carried out by specialized repair proteins. When DNA damage is excessive, 
a special class of inaccurate DNA polymerases, called translesion polymerases, is 
used to bypass the damage, allowing the cell to survive but sometimes creating per- 
manent mutations at the sites of damage. 

Other critical repair systems—based on either nonhomologous end joining or 
homologous recombination—reseal the accidental double-strand breaks that occur 
in the DNA helix. In most cells, an elevated level of DNA damage causes a delay in 
the cell cycle, which ensures that DNA damage is repaired before a cell divides. 


HOMOLOGOUS RECOMBINATION 


In the two preceding sections, we discussed the mechanisms that allow the DNA 
sequences in cells to be maintained from generation to generation with very little 
change. In this section, we further explore one of the DNA repair mechanisms, 
a diverse set of reactions known collectively as homologous recombination. The 
key feature of homologous recombination (also known as general
recombination) is an exchange of DNA strands between a pair of homologous duplex DNA
sequences, that is, segments of double helix that are very similar or identical in
nucleotide sequence. This exchange allows one stretch of duplex DNA to act as 
a template to restore lost or damaged information on a second stretch of duplex 
DNA. Because the template for repair is not limited to the strand complementary 
to that containing the damage, homologous recombination can repair many types 
of DNA damage. It is, for example, the main way to accurately repair double-strand 
breaks, as introduced in the previous section (see Figure 5-45B). Double-strand 
breaks can result from radiation and reactive chemicals, but most of the time they 
arise from DNA replication forks that become stalled or broken independently 
of any such external cause. Homologous recombination accurately corrects these 
accidents and, because they occur during nearly every round of DNA replica- 
tion, this repair mechanism is essential for every proliferating cell. Homologous 
recombination is perhaps the most versatile DNA repair mechanism available to 
the cell; the “all-purpose” nature of recombinational repair probably explains why 
its mechanism and the proteins that carry it out have been conserved in virtually 
all cells on Earth. 

Additionally, we shall see that homologous recombination plays a special role 
in sexually reproducing organisms. During meiosis, a key step in gamete (sperm 
and egg) production, it catalyzes the orderly exchange of bits of genetic informa- 
tion between corresponding (homologous) maternal and paternal chromosomes 
to create new combinations of DNA sequences in the chromosomes passed to the 
offspring. 


Homologous Recombination Has Common Features in All Cells 


The current view of homologous recombination as a critical DNA repair mecha- 
nism in all cells evolved slowly from its original discovery as a key component in 
the specialized process of meiosis in plants and animals. The subsequent recogni- 
tion that homologous recombination also occurs in unicellular organisms made 
it much more amenable to molecular analyses. Thus, most of what we know about 
the biochemistry of genetic recombination was originally derived from studies of 
bacteria, especially of E. coli and its viruses, as well as from experiments with sim- 
ple eukaryotes such as yeasts. For these organisms with short generation times 
and relatively small genomes, it was possible to isolate a large set of mutants with 
defects in their recombination processes. The protein altered in each mutant was 
then identified and, ultimately, studied biochemically. Close relatives of these 
proteins have been found in more complex eukaryotes including flies, mice, and 
humans, and more recently, it has been possible to directly analyze homologous 
recombination in these species as well. These studies reveal that the fundamental 
processes that catalyze homologous recombination are common to all cells. 


DNA Base-Pairing Guides Homologous Recombination 


The hallmark of homologous recombination is that it takes place only between 
DNA duplexes that have extensive regions of sequence similarity (homology). Not 
surprisingly, base-pairing underlies this requirement, and two DNA duplexes that 
are undergoing homologous recombination “sample” each other’s DNA sequence 
by engaging in extensive base-pairing between a single strand from one DNA 
duplex and the complementary single strand from the other. The match need not 
be perfect, but it must be very close for homologous recombination to succeed. 

In its simplest form, this type of base-pairing interaction can be mimicked in 
a test tube by allowing a DNA double helix to re-form from its separated single 
strands. This process, called DNA renaturation or hybridization, occurs when a 
rare random collision juxtaposes complementary nucleotide sequences on two 
matching DNA single strands, allowing the formation of a short stretch of double 
helix between them. This relatively slow helix-nucleation step is followed by a very 
rapid “zippering” step, as the region of double helix is extended to maximize the 
number of base-pairing interactions (Figure 5-47).

DNA hybridization can create a region of DNA double helix consisting of
strands that originate from two different duplex DNA molecules as long as they 
are complementary, or nearly so. As we will see shortly, the formation of such a 
hybrid molecule, known as a heteroduplex, is an essential feature of homologous 
recombination. DNA hybridization and heteroduplex formation are also the basis
for many of the methods used to study cells, and we will discuss these uses in 
Chapter 8. 

The DNA in a living cell is almost all in the stable double-helical form, so the 
reaction depicted in Figure 5-47 rarely occurs in vivo. Instead, as we shall see, 
homologous recombination is brought about through a carefully controlled set of 
reactions that allow two DNA duplexes to sample each other’s sequences without 
fully dissociating into single strands. 


Homologous Recombination Can Flawlessly Repair Double-Strand 
Breaks in DNA 


We saw in the previous section that nonhomologous end joining occurs without a
template and usually leaves a mutation at the site at which a double-strand break 
is repaired. In contrast, homologous recombination can repair double-strand 
breaks accurately, without any loss or alteration of nucleotides at the site of repair. 
For homologous recombination to do this repair job, the broken DNA has to be 
brought into proximity with homologous but unbroken DNA, which can serve as a 
template for repair. For this reason, homologous recombination often occurs just 
after DNA replication, when the two daughter DNA molecules lie close together 
and one can serve as a template for repair of the other. As we shall see, the process 
of DNA replication itself creates a special risk of accidents requiring this sort of 
repair. 
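
The contrast between the two double-strand break repair strategies can be illustrated with a short Python sketch. It is a deliberately crude model, and every name in it is hypothetical: a duplex is represented by a single string, the nucleotides next to the break are simply deleted to stand for damaged ends, and the "homology search" is an exact string match against an identical sister copy.

```python
def make_double_strand_break(sequence: str, position: int, damaged: int = 2):
    """Break the sequence and lose `damaged` nucleotides on each side of the break."""
    left = sequence[:position - damaged]
    right = sequence[position + damaged:]
    return left, right


def nonhomologous_end_joining(left: str, right: str) -> str:
    """Rejoin the two ends directly; the lost nucleotides are not restored."""
    return left + right


def homologous_recombination(left: str, right: str, template: str, anchor: int = 6) -> str:
    """Use an intact homologous copy to restore what was lost at the break."""
    # Each broken end locates its matching position on the template.
    start = template.find(left[-anchor:])
    if start == -1:
        raise ValueError("no homology found for the left end")
    end = template.find(right[:anchor], start + anchor)
    if end == -1:
        raise ValueError("no homology found for the right end")
    # The missing stretch is copied from the template between the two anchors.
    missing = template[start + anchor:end]
    return left + missing + right


chromatid = "ATGGCTAACGTTCAGGATCCTGAAGTCTGA"
sister = chromatid  # newly replicated DNA: an identical copy lies nearby

left_end, right_end = make_double_strand_break(chromatid, position=15)
joined = nonhomologous_end_joining(left_end, right_end)
restored = homologous_recombination(left_end, right_end, sister)
print(len(chromatid) - len(joined), restored == chromatid)
```

End joining leaves a four-nucleotide deletion in this example, whereas copying from the sister restores the original sequence exactly. The sketch also makes the dependence on a nearby template explicit: without the `sister` argument there is nothing from which the missing nucleotides could be recovered.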

The simplest pathway through which homologous recombination can repair 
double-strand breaks is shown in Figure 5-48. In essence, the broken DNA duplex 
and the template duplex carry out a “strand dance” so that one of the damaged 
strands can use the complementary strand of the intact DNA duplex as a tem- 
plate for repair. First, the ends of the broken DNA are chewed back, or “resected,” 
by specialized nucleases to produce overhanging, single-strand 3’ ends. The next 
step is strand exchange (also called strand invasion), during which one of the 
single-strand 3’ ends from the damaged DNA molecule worms its way into the 
template duplex and searches it for homologous sequences through base-pair- 
ing. We describe this remarkable reaction in detail in the next section. Once sta- 
ble base-pairing is established (which completes the strand exchange step), an 
accurate DNA polymerase extends the invading strand by using the information 
provided by the undamaged template molecule, thus restoring the damaged 
DNA. The last steps (strand displacement, further repair synthesis, and
ligation) restore the two original DNA double helices and complete the repair
process. Homologous recombination resembles other DNA repair reactions in that a
DNA polymerase utilizes a pristine template to restore damaged DNA. However,
instead of using the partner complementary strand as a template, as occurs in
most DNA repair pathways, homologous recombination exploits a complementary
strand from a separate DNA duplex.

Figure 5-47 DNA hybridization. DNA double helices can re-form from their
separated strands in a reaction that depends on the random collision of
two complementary DNA strands. The vast majority of such collisions are not
productive, as shown on the left, but a few result in a short region where
complementary base pairs have formed (helix nucleation). A rapid zippering then
leads to the formation of a complete double helix. Through this trial-and-error process,
a DNA strand will find its complementary partner even in the midst of millions of
nonmatching DNA strands.


Strand Exchange Is Carried Out by the RecA/Rad51 Protein 


Of all the steps of homologous recombination, strand exchange is the most diffi- 
cult to imagine. How does the invading single strand rapidly sample a DNA duplex 
for homology? Once the homology is found, how does the exchange occur? How 
is the inherent stability of the template double helix overcome? 

The answers to these questions came from biochemical and structural stud- 
ies of the protein that carries out these feats, called RecA in E. coli and Rad51 in 
virtually all eukaryotic organisms. To catalyze strand exchange, RecA first binds 
cooperatively to the invading single strand, forming a protein-DNA filament 
that forces the DNA into an unusual configuration: groups of three consecutive 
nucleotides are held as though they were in a conventional DNA double helix 
but, between adjacent triplets, the DNA backbone is untwisted and stretched out 
(Figure 5-49). This unusual protein-DNA filament then binds to duplex DNA
in a way that stretches the duplex, destabilizing it and making it easy to pull the
strands apart. The invading single strand then can sample the sequence of the
duplex by conventional base-pairing. This sampling occurs in triplet nucleotide
blocks: if a triplet match is found, the adjacent triplet is sampled, and so on. In
this way, mismatches quickly lead to dissociation and only an extended stretch of
base-pairing (at least 15 nucleotides) stabilizes the invading strand and leads to
strand exchange.

Figure 5-48 Mechanism of double-strand break repair by homologous
recombination. This is the preferred method for repairing DNA double-strand
breaks that arise shortly after the DNA has been replicated, while the daughter DNA
molecules are still held close together. In general, homologous recombination can be
regarded as a flexible series of reactions, with the exact pathway differing from one
case to the next. For example, the length of the repair “patch” can vary considerably
depending on the extent of 5’ processing and new DNA synthesis, indicated in green.

RecA hydrolyzes ATP, and the steps described above require that each RecA 
monomer along the filament be in the ATP-bound state. However, the search- 
ing itself does not require ATP hydrolysis; instead, the process occurs by simple 
molecular collision, allowing many potential sequences to be rapidly sampled. 
Once the strand-exchange reaction is completed, however, ATP hydrolysis is nec- 
essary to disassemble RecA from the complex of DNA molecules. At this point, 
repair DNA polymerases and DNA ligase can complete the repair process, as 
shown in Figure 5-48. 
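
The triplet-by-triplet sampling described above lends itself to a small Python sketch. Several simplifications are assumed and should not be read as features of RecA itself: the search is written as a left-to-right scan rather than as random molecular collisions, position i of the invading strand is taken to pair with position i of the duplex strand (polarity is ignored), and the function names and the `min_paired` parameter are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-paired partner of a strand (polarity ignored)."""
    return "".join(COMPLEMENT[base] for base in strand)


def triplet_pairs(probe: str, target: str) -> bool:
    """True if every base in the probe block pairs with the target block."""
    return len(probe) == len(target) and all(
        COMPLEMENT[p] == t for p, t in zip(probe, target)
    )


def find_exchange_site(invading: str, duplex_strand: str, min_paired: int = 15, block: int = 3):
    """Return the first offset where at least `min_paired` bases pair, or None.

    Pairing is tested one triplet at a time; the first mismatched triplet
    ends the attempt at that offset, and the search moves on.
    """
    for offset in range(len(duplex_strand) - min_paired + 1):
        paired = 0
        while paired < min_paired:
            probe = invading[paired:paired + block]
            target = duplex_strand[offset + paired:offset + paired + block]
            if len(probe) < block or not triplet_pairs(probe, target):
                break  # mismatch: dissociate and sample elsewhere
            paired += block
        if paired >= min_paired:
            return offset  # stable pairing: strand exchange can proceed
    return None


duplex = "GATTACAGGCTTAACGGTCATCCGATAGGCTTGACCTAGA"
invading = complement(duplex[12:30])              # homologous to positions 12-29
mismatched = invading[:7] + "G" + invading[8:]    # one wrong base in the third triplet

site = find_exchange_site(invading, duplex)
failed = find_exchange_site(mismatched, duplex)
print(site, failed)
```

The fully homologous strand is accepted at the position it was derived from, while a single mismatch within the first 15 nucleotides is enough to reject the pairing in this model. Because each attempt is abandoned at the first mismatched triplet, most wrong positions are discarded after comparing only three bases, which is why this kind of sampling can be fast.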


Homologous Recombination Can Rescue Broken DNA Replication 
Forks 


Although accurately repairing double-strand breaks, which can arise from radi- 
ation or chemical reactions, is a crucial function of homologous recombination, 
perhaps its most important role is in rescuing stalled or broken DNA replication 
forks. Many types of events can cause a replication fork to break, and here we con- 
sider just one example: a single-strand nick or gap in the parental DNA helix just 
ahead of a replication fork. When the fork reaches this lesion, it falls apart—result- 
ing in one broken and one intact daughter chromosome. The broken fork can be 
flawlessly repaired (Figure 5-50) using the same basic homologous recombina- 
tion reactions we discussed above for the repair of double-strand breaks. With 
slight modifications, the set of reactions depicted in Figures 5-48 and 5-50— 
known collectively as homologous recombination—can accurately repair many 
different types of DNA damage. 


Cells Carefully Regulate the Use of Homologous Recombination in 
DNA Repair 


Although homologous recombination neatly solves the problem of accurately 
repairing double-strand breaks and other types of DNA damage, it does present 


Figure 5-49 Strand invasion catalyzed 
by the RecA protein. Our understanding 
of this reaction is based in part on 
structures determined by x-ray diffraction 
studies of RecA bound to single- and 
double-strand DNA. These DNA structures 
(shown without the RecA protein) are 
on the left side of the diagram. Starting 
at the top, ATP-bound RecA associates 
with single-strand DNA, holding it in an 
elongated form where groups of three 
bases are separated from each other by 
a stretched and twisted backbone. In the 
next step, the RecA-bound single strand 
then binds to duplex DNA, destabilizing it 
and allowing the single strand to sample 
its sequence through base-pairing, three 
bases at a time. If no match is found, the 
RecA-bound single strand of DNA rapidly 
dissociates and begins a new search. If 
an extensive match is found, the structure 
is disassembled through ATP hydrolysis, 
resulting in the dissociation of RecA 
and the exchange of one single strand 
of DNA for another, thereby forming a 
heteroduplex. (PDB code: 3CMX.) 




Figure 5-50 Repair of a broken replication fork by homologous 
recombination. When a moving replication fork encounters a single-strand 
break, it will collapse, but can be repaired by homologous recombination. 
The process uses many of the same reactions shown in Figure 5-48 and 
proceeds through the same basic steps. Green strands represent the new 
DNA synthesis that takes place after the replication fork has broken. This 
pathway allows the fork to move past the site that was nicked on the original 
template by using the undamaged duplex as a template to synthesize DNA. 
(Adapted from M.M. Cox, Proc. Natl Acad. Sci. USA 98:8173-8180, 2001. 
With permission from National Academy of Sciences.) 


some dangers to the cell as it sometimes “repairs” damage using the wrong bit 
of the genome as the template. For example, sometimes a broken human chro- 
mosome is “repaired” using the homolog from the other parent instead of the 
sister chromatid as the template. Because maternal and paternal chromosomes 
differ in DNA sequence at many positions along their lengths, this type of repair 
can convert the sequence of the repaired DNA from the maternal to the paternal 
sequence or vice versa. The result of this type of errant recombination is known as 
loss of heterozygosity. It can have severe consequences if the homolog used for 
repair contains a deleterious mutation, because the recombination event destroys 
the “good” copy. Loss of heterozygosity, although rare, is a critical step in the for- 
mation of many cancers (discussed in Chapter 20). 
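
As a toy illustration of loss of heterozygosity, the sketch below represents one locus by its two alleles and models recombinational repair as copying the template's allele over the broken copy. The function name and the allele labels ("good", "mutant") are illustrative assumptions, not terms from the text.

```python
def repair_break(broken_allele, template_allele):
    """Recombinational repair copies the template sequence into the broken copy."""
    return template_allele

# A heterozygous cell: the maternal copy is functional ("good"),
# the paternal copy carries a deleterious mutation ("mutant").
maternal, paternal = "good", "mutant"

# Usual case: the broken maternal chromosome is repaired from its sister
# chromatid, which is identical to it, so the genotype is unchanged.
after_sister_repair = (repair_break(maternal, template_allele=maternal), paternal)

# Errant case: the homolog from the other parent serves as the template,
# so the "good" copy is overwritten and heterozygosity is lost.
after_homolog_repair = (repair_break(maternal, template_allele=paternal), paternal)

print(after_sister_repair)   # ('good', 'mutant')
print(after_homolog_repair)  # ('mutant', 'mutant')
```

In the errant case both copies end up carrying the mutation, which is why the choice of template matters so much.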

Cells go to great lengths to minimize the risk of mishaps of these types; indeed, 
nearly every step of homologous recombination is carefully regulated. For exam- 
ple, the first step, processing of the broken ends, is coordinated with the cell cycle: 
the nuclease enzymes that carry out this process are activated (in part, by phos- 
phorylation) only in the S and G2 phases of the cell cycle, when a daughter duplex 
(either as a partially replicated chromosome or a fully replicated sister chromatid) 
can serve as a template for repair (see Figure 5-50). The close proximity of the two 
daughter chromosomes disfavors the use of other genome sequences in the repair 
process. 

The loading of RecA or Rad51 onto the processed DNA ends and the subse- 
quent strand-exchange reaction are also tightly controlled. Although these pro- 
teins alone can carry out these steps in vitro, a series of accessory proteins, includ- 
ing Rad52, is needed in eukaryotic cells to ensure that homologous recombination 
is efficient and accurate (Figure 5-51). There are many such accessory proteins, 
and exactly how they coordinate and control homologous recombination remains 
a mystery. We do know that the enzymes that catalyze recombinational repair are 
made at relatively high levels in eukaryotes and are dispersed throughout the 
nucleus in an inactive form. In response to DNA damage, they rapidly converge 
on the sites of DNA damage, become activated, and form “repair factories” where 
many lesions are apparently brought together and repaired (Figure 5-52). 

In Chapter 20, we shall see that both too much and too little homologous 
recombination can lead to cancer in humans, the former through repair using the 
“wrong” template (as described above) and the latter through an increased muta- 
tion rate caused by inefficient DNA repair. Clearly, a delicate balance has evolved 
that keeps this process in check on undamaged DNA, while still allowing it to act 
efficiently and rapidly on DNA lesions as soon as they arise. 

Not surprisingly, mutations in the components that carry out and regulate 
homologous recombination are responsible for several inherited forms of can- 
cer. Two of these, the Brca1 and Brca2 proteins, were first discovered because 


Figure 5-51 Structure of a portion of the Rad52 protein. This doughnut- 
shaped structure is composed of 11 subunits. Single-strand DNA has been 
modeled into the deep groove running along the protein surface. Rad52 
helps load Rad51 onto single-strand DNA to form the nucleoprotein filament 
that carries out strand exchange. Rad52 also acts later to re-form the double 
helix and complete the homologous recombination reaction. (From 
M.R. Singleton et al., Proc. Natl Acad. Sci. USA 99:13492-13497, 2002. 
With permission from National Academy of Sciences.) 

mutations in their genes lead to a greatly increased frequency of breast cancer. 
Because these mutations cause inefficient repair by homologous recombination, 
accumulation of DNA damage can, in a small proportion of cells, give rise to a 
cancer. Brca1 regulates an early step in broken-end processing; without it, such 
ends are not processed correctly for homologous recombination and instead are 
repaired inaccurately by the nonhomologous end-joining pathway (see Figure 
5-45). Brca2 binds to the Rad51 protein, preventing its polymerization on DNA, 
and thereby maintaining it in an inactive form until it is needed. Normally, upon 
DNA damage, Brca2 helps to bring Rad51 protein rapidly to sites of damage and, 
once in place, to release it in its active form onto single-strand DNA. 


Homologous Recombination Is Crucial for Meiosis 


We have seen that homologous recombination comprises a group of reactions— 
including broken-end processing, strand exchange, limited DNA synthesis, and 
ligation—to exchange DNA sequences between two double helices of similar 
nucleotide sequence. Having discussed its role in accurately repairing damaged 
DNA, we now turn to homologous recombination as a means to generate DNA 
molecules that carry novel combinations of genes as a result of the deliberate 
exchange of material between different chromosomes. Although this occasionally 
occurs by accident in mitotic cells (and is often detrimental), it is a frequent and 
necessary part of meiosis, which occurs in sexually reproducing organisms such 
as fungi, plants, and animals. 

Here, homologous recombination occurs as an integral part of the process 
whereby chromosomes are parceled out to germ cells (sperm and eggs in ani- 
mals). We discuss the process of meiosis in detail in Chapter 17; in the following 
sections, we discuss how homologous recombination during meiosis produces 
chromosome crossing-over and gene conversion, resulting in hybrid chromosomes 
that contain genetic information from both the maternal and paternal homologs 
(Figure 5-53). Crossing-over and gene conversion are both generated by homolo- 
gous recombination mechanisms that, at their core, resemble those used to repair 
double-strand breaks. 


Meiotic Recombination Begins with a Programmed Double-Strand 
Break 


Homologous recombination in meiosis starts with a bold stroke: a specialized 
protein (called Spo11 in budding yeast) breaks both strands of the DNA double 
helix in one of the recombining chromosomes (Figure 5-54). Like a topoisomer- 
ase, Spo11, after catalyzing this reaction, remains covalently bound to the broken 

Figure 5-52 Experiment demonstrating 
the rapid localization of repair proteins 
to DNA double-strand breaks. Human 
fibroblasts were x-irradiated to produce DNA 
double-strand breaks. Before the x-rays 
struck the cells, they were passed through 
a microscopic grid with x-ray-absorbing 
“bars” spaced 1 μm apart. This produced a 
striped pattern of DNA damage, allowing a 
comparison of damaged and undamaged 
DNA in the same nucleus. (A) Total DNA in 
a fibroblast nucleus stained with the dye 
DAPI. (B) Sites of new DNA synthesis due 
to repair of DNA damage, indicated by 
incorporation of BudR (a thymidine analog) 
and subsequent staining with fluorescently 
labeled antibodies to BudR (green). 
(C) Localization of the Mre11 complex to 
damaged DNA as visualized by antibodies 
against the Mre11 subunit (red). Mre11 is a 
nuclease that processes damaged DNA in 
preparation for homologous recombination 
(see Figure 5-48). (A), (B), and (C) were 
processed 30 minutes after x-irradiation. 
(From B.E. Nelms et al., Science 280:590- 
592, 1998. With permission from AAAS.) 


Figure 5-53 Chromosome crossing-over 
occurs in meiosis. Meiosis is the process 
by which a diploid cell gives rise to four 
haploid germ cells, as described in detail 
in Chapter 17. Meiosis produces germ 
cells in which the paternal and maternal 
genetic information (red and blue) has 
been reassorted through chromosome 
crossovers. In addition, many short regions 
of gene conversion occur, as indicated. 


Figure 5-54 Homologous recombination 
during meiosis can generate 
chromosome crossovers. Once the 
meiosis-specific protein Spo11 and the 
Mre11 complex break the duplex DNA 
and process the ends, homologous 
recombination can proceed along 
alternative pathways. One (right side of 
figure) closely resembles the double-strand 
break repair reaction shown in Figure 5-48 
and results in chromosomes that have been 
“repaired” but have not crossed over. The 
other (left side with strand breaks as shown 
by the blue arrows) proceeds through a 
double Holliday junction and produces 
two chromosomes that have crossed over. 
During meiosis, homologous recombination 
takes place between maternal and paternal 
chromosome homologs when they are held 
tightly together (see Figure 17-54). 

DNA (see Figure 5-21). A specialized nuclease then rapidly degrades the ends 
bound by Spo11, removing the protein along with the DNA and leaving protrud- 
ing 3’ single-strand ends. 

At this point, many of the recombination reactions resemble those described 
above for the repair of double-strand breaks; indeed, some of the same proteins are 
used for both processes. However, several meiosis-specific proteins direct them 
to perform their tasks somewhat differently, resulting in the distinctive outcomes 
observed for meiosis. Another important difference is that, in meiosis, recombi- 
nation occurs preferentially between maternal and paternal chromosomal homo- 
logs rather than between the newly replicated, identical DNA duplexes that pair 
in double-strand break repair. In the sections that follow, we describe in more 
detail those aspects of homologous recombination that are especially important 
for meiosis. 


Holliday Junctions Are Formed During Meiosis 


Of special importance in meiosis is an intermediate known as a Holliday junc- 
tion or cross-strand exchange (Figure 5-55). Each Holliday junction can adopt 
multiple conformations and a special set of recombination proteins binds to, and 
thereby stabilizes, the open, symmetric isomer. 

Specialized proteins that bind to Holliday junctions can catalyze a reaction 
known as branch migration (Figure 5-56), whereby DNA is spooled through 
the Holliday junction by continually breaking and re-forming base pairs (Figure 
5-57). In this way, the Holliday junction proteins use ATP hydrolysis to expand the 
region of heteroduplex DNA initially created by the strand-exchange reaction. In 
meiosis, heteroduplex regions often “migrate” thousands of nucleotides from the 
original site of the double-strand break. As shown in Figure 5-54, Holliday junc- 
tions usually occur in pairs, known as double Holliday junctions. 


Homologous Recombination Produces Both Crossovers and 
Non-Crossovers During Meiosis 


As shown in Figure 5-54, there are two basic outcomes of homologous recom- 
bination during meiosis. In humans, approximately 90% of the double-strand 
breaks produced during meiosis are resolved as non-crossovers (see right side of 
Figure 5-54). Here, the two original DNA duplexes separate from each other in a 
form unaltered except for a region of heteroduplex that formed near the site of the 
original double-strand break. This set of reactions resembles that described above 
for the repair of double-strand breaks (see Figure 5-48). 

The other outcome is more profound: a double Holliday junction is formed 
and is cleaved by specialized enzymes to create a crossover (see left side of Figure 
5-54). The two original portions of each chromosome upstream and downstream 

Figure 5-55 A Holliday junction. The initially 
formed structure (A) is usually drawn with 
two strands crossing, as in Figure 5-54. 
An isomerization of the Holliday junction 
(B) produces an open, symmetrical structure 
that is bound by specialized proteins. 
(C) These proteins “move” the Holliday 
junctions by a coordinated set of branch- 
migration reactions (see Figure 5-57 and 
Movie 5.8). (D) Structure of the Holliday 
junction in the open form depicted in (B). 
The Holliday junction is named for the 
scientist who first proposed its formation. 
(PDB code: 1DCW.) 


Figure 5-56 Simplified view of branch 
migration. In branch migration, base pairs 
are continually broken and formed as the 
branch point moves. Although branch 
migration can happen spontaneously on 
naked DNA molecules, the process is 
inefficient and the branch moves back and 
forth at random. In the cell, branch migration 
is carried out using specialized proteins and 
ATP hydrolysis to ensure that, as shown, the 
branch moves rapidly and in one direction. 
As shown in Figure 5-57, branch migrations 
often occur at Holliday junctions, where two 
branch-migration reactions are coupled. 

from the two Holliday junctions are thereby swapped, creating two chromosomes 
that have crossed over. 
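
The swap that defines a crossover can be sketched in a few lines. In this simplified model, which is an assumption made for illustration, each homolog is a string of site labels ("M" for maternal, "P" for paternal) and the double Holliday junction is collapsed to a single exchange point; the function name is hypothetical.

```python
def crossover(maternal, paternal, position):
    """Exchange the portions of two homologs that lie downstream of `position`."""
    recombinant_1 = maternal[:position] + paternal[position:]
    recombinant_2 = paternal[:position] + maternal[position:]
    return recombinant_1, recombinant_2

maternal = "MMMMMMMM"
paternal = "PPPPPPPP"

# A single crossover after the third site yields two reciprocal hybrid chromosomes.
print(crossover(maternal, paternal, 3))  # ('MMMPPPPP', 'PPPMMMMM')
```

A non-crossover event, by contrast, would leave both strings unchanged apart from a short patch near the exchange point.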

How does the cell decide which Spo11-induced double-strand breaks to 
resolve as crossovers? The answer is not yet known, but we know the decision 
is an important one. The relatively few crossovers that do form are distributed 
along chromosomes in such a way that a crossover in one position inhibits cross- 
ing-over in neighboring regions. Termed crossover control, this fascinating but 
poorly understood regulatory mechanism ensures the roughly even distribution 
of crossover points along chromosomes. It also ensures that each chromosome— 
no matter how small—undergoes at least one crossover every meiosis. For many 
organisms, roughly two crossovers per chromosome occur during each meiosis, 
one on each arm. As discussed in detail in Chapter 17, these crossovers play an 
important mechanical role in the proper segregation of chromosomes during 
meiosis. 

Whether a meiotic recombination event is resolved as a crossover or a 
non-crossover, the recombination machinery leaves behind a heteroduplex region 
where a strand with the DNA sequence of the paternal homolog is base-paired 
with a strand from the maternal homolog (Figure 5-58). These heteroduplex 
regions can tolerate a small percentage of mismatched base pairs, and because of 
branch migration, they often extend for thousands of nucleotide pairs. The many 
non-crossover events that occur in meiosis thereby produce scattered sites in the 
germ cells where short DNA sequences from one homolog have been pasted into 
the other homolog. Heteroduplex regions mark sites of potential gene conver- 
sion—where the four haploid chromosomes produced by meiosis contain three 
copies of a DNA sequence from one homolog and only one copy of this sequence 
from the other homolog (see Figure 5-53), as explained next. 

Figure 5-58 Heteroduplexes formed during meiosis. Heteroduplex DNA is present at sites 
of recombination that are resolved either as crossovers or non-crossovers. Because the DNA 
sequences of maternal and paternal chromosomes differ at many positions along their lengths, 
heteroduplexes often contain a small number of base-pair mismatches. 


Figure 5-57 Enzyme-catalyzed branch 
movement at a Holliday junction by 
branch migration. In E. coli, a tetramer of 
the RuvA protein (green) and two hexamers 
of the RuvB protein (yellow) bind to the 
open form of the junction. The RuvB 
protein, which resembles the hexameric 
helicases used in DNA replication (Figure 
5-14), uses the energy of ATP hydrolysis 
to spool DNA rapidly through the Holliday 
junction, extending the heteroduplex region 
as shown. The RuvA protein coordinates 
this movement, threading the DNA strands 
to avoid tangling. (PDB codes: 1IXR, 1C7Y.) 


Figure 5-59 Gene conversion caused by mismatch correction. In 
this process, heteroduplex DNA is formed at the sites of homologous 
recombination between maternal and paternal chromosomes. If the maternal 
and paternal DNA sequences are slightly different, the heteroduplex region 
will include some mismatched base pairs, which may then be corrected 
by the DNA mismatch repair machinery (see Figure 5-19). Such repair can 
“erase” nucleotide sequences on either the paternal or the maternal strand. 
The consequence of this mismatch repair is gene conversion, detected as 
a deviation from the segregation of equal copies of maternal and paternal 
alleles that normally occurs in meiosis. 


Homologous Recombination Often Results in Gene Conversion 


In sexually reproducing organisms, it is a fundamental law of genetics that—aside 
from mitochondrial DNA, which is inherited only through the mother—each 
parent makes an equal genetic contribution to an offspring. One complete set of 
nuclear genes is inherited from the father and one complete set is inherited from 
the mother. Underlying this law is the accurate parceling out of chromosomes to 
the germ cells (eggs and sperm) that takes place during meiosis. Thus, when a dip- 
loid cell in a parent undergoes meiosis to produce four haploid germ cells, exactly 
half of the genes distributed among these four cells should be maternal (genes 
inherited from the mother of this parent) and the other half paternal (genes inher- 
ited from the father of this parent). In some organisms (fungi, for example), it is 
possible to recover and analyze all four of the haploid gametes produced from a 
single cell by meiosis. Studies in such organisms have revealed rare cases in which 
the parceling out of genes violates the standard genetic rules. Occasionally, for 
example, meiosis yields three copies of the maternal version of a gene and only 
one copy of the paternal allele. Alternative versions of the same gene are called 
alleles, and it is the divergence from their expected distribution during meiosis 
that is known as gene conversion. Genetic studies show that only small sections 
of DNA typically undergo gene conversion, and in many cases only a part of a 
gene is changed. 

Several pathways in the cell can lead to gene conversion, but one of the most 
important arises from a particular consequence of recombination during meio- 
sis. We have seen that both crossovers and non-crossovers produce heteroduplex 
regions of DNA. If the two strands that make up a heteroduplex region do not have 
identical nucleotide sequences, mismatched base pairs are formed, and these are 
often repaired by the cell’s mismatch repair system (see Figure 5-19). However, the 
mismatch repair system cannot distinguish between the paternal and maternal 
strands and will randomly choose the strand to be used as a template. As a con- 
sequence, one allele will be lost and the other duplicated (Figure 5-59), resulting 
in net “conversion” of one allele to the other. Thus, gene conversion, originally 
regarded as a mysterious deviation from the rules of genetics, can be seen as a 
straightforward consequence of the mechanisms of homologous recombination. 
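
The link between mismatch repair and the 3:1 segregation pattern can be made concrete with a small simulation. The sketch below is a deliberately simplified model under stated assumptions: one polymorphic site, four chromatids each written as a pair of strands ("M" or "P"), a single heteroduplex on one paternal chromatid, and a repair step that picks its template strand at random. The function names are hypothetical.

```python
import random

def mismatch_repair(duplex, rng):
    """Resolve a heteroduplex by copying one randomly chosen strand onto the other."""
    template = rng.choice(duplex)
    return (template, template)

def meiotic_products(rng):
    """Return the alleles carried by the four haploid products at one site."""
    # Four chromatids after replication; the last one carries a heteroduplex
    # in which a maternal strand is paired with a paternal strand.
    chromatids = [("M", "M"), ("M", "M"), ("P", "P"), ("P", "M")]
    repaired = [mismatch_repair(c, rng) if c[0] != c[1] else c for c in chromatids]
    return tuple(sorted(strand for strand, _ in repaired))

rng = random.Random(0)  # fixed seed so the run is repeatable
outcomes = sorted({meiotic_products(rng) for _ in range(100)})
print(outcomes)
```

Over many trials the model yields only two outcomes: the expected 2:2 segregation when repair restores the paternal sequence, and 3:1 gene conversion when the maternal strand is used as the template.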


Summary 


Homologous recombination describes a flexible set of reactions resulting in the 
exchange of DNA sequences between a pair of identical or nearly identical duplex 
DNA molecules. In all cells, this process is essential for the error-free repair of chro- 
mosome damage, particularly double-strand breaks and broken or stalled replica- 
tion forks. Homologous recombination is also responsible for the crossing-over of 
chromosomes that occurs during meiosis. Homologous recombination takes place 
through a variety of pathways, but they have in common a strand-exchange step 
whereby a single strand from one DNA duplex invades a second duplex and base- 
pairs with one strand while displacing the other. This reaction, catalyzed by the 
RecA/Rad51 family of proteins, can only occur if the invading strand can form a 
short stretch of consecutive nucleotide pairs with one of the strands of the duplex. 
This requirement ensures that homologous recombination occurs only between 
identical or very similar DNA sequences. 


When used as a repair mechanism, homologous recombination occurs between 
a damaged DNA molecule and its recently duplicated sister molecule, with the 
undamaged duplex acting as a template to repair the damaged copy flawlessly. 

In meiosis, homologous recombination is initiated by deliberate, carefully regu- 
lated double-strand breaks and occurs preferentially between the homologous chro- 
mosomes rather than the newly replicated sister chromatids. The outcome can be 
either two chromosomes that have crossed over (that is, chromosomes in which the 
DNA on either side of the site of DNA pairing originates from two different homo- 
logs) or two non-crossover chromosomes. In the latter case, the two chromosomes 
that result are identical to the original two homologs, except for relatively minor 
DNA sequence changes at the site of recombination. 


TRANSPOSITION AND CONSERVATIVE SITE-SPECIFIC 
RECOMBINATION 


We have seen that homologous recombination can result in the exchange of DNA 
sequences between chromosomes. However, the order of genes on the interact- 
ing chromosomes typically remains the same following homologous recom- 
bination, inasmuch as the recombining sequences must be very similar for the 
process to occur. In this section, we describe two very different types of recombi- 
nation—transposition (also called transpositional recombination) and conserva- 
tive site-specific recombination—that do not require substantial regions of DNA 
homology. These two types of recombination reactions can alter gene order along 
a chromosome and can cause unusual types of mutations that introduce whole 
blocks of DNA sequence into the genome. 

Transposition and conservative site-specific recombination are largely dedi- 
cated to moving a wide variety of specialized segments of DNA—collectively 
termed mobile genetic elements—from one position in a genome to another. We 
will see that mobile genetic elements can range in size from a few hundred to tens 
of thousands of nucleotide pairs, and each typically carries a unique set of genes. 
Often, one of these genes encodes a specialized enzyme that catalyzes the move- 
ment of only that element, thereby making this type of recombination possible. 

Virtually all cells contain mobile genetic elements (known informally as 
“jumping genes”). As explained in Chapter 4, over evolutionary time scales, they 
have had a profound effect on the shaping of modern genomes. For example, 
nearly half of the human genome can be traced to these elements (see Figure 
4-62). Over time, random mutation has altered their nucleotide sequences, and, 
as a result, only a few of the many copies of these elements in our DNA are still 
active and capable of movement. The remainder are molecular fossils whose exis- 
tence provides striking clues to our evolutionary history. 

Mobile genetic elements are often considered to be molecular parasites (they 
are also termed “selfish DNA”) that persist because cells cannot get rid of them; 
they certainly have come close to overrunning our own genome. However, mobile 
DNA elements can provide benefits to the cell. For example, the genes they carry 
are sometimes advantageous, as in the case of antibiotic resistance in bacterial 
cells, discussed below. The movement of mobile genetic elements also produces 
many of the genetic variants upon which evolution depends, because, in addition 
to moving themselves, mobile genetic elements occasionally rearrange neighbor- 
ing sequences of the host genome. Thus, spontaneous mutations observed in Dro- 
sophila, humans, and other organisms are often due to the movement of mobile 
genetic elements. While many of these mutations will be deleterious to the organ- 
ism, some will be advantageous and may spread throughout the population. It is 
almost certain that much of the variety of life we see around us originally arose 
from the movement of mobile genetic elements. 

In this section, we introduce mobile genetic elements and describe the mech- 
anisms that enable them to move around a genome. We shall see that some of 
these elements move through transposition mechanisms and others through con- 
servative site-specific recombination. We begin with transposition, as there are 
many more known examples of this type of movement. 

Through Transposition, Mobile Genetic Elements Can Insert Into 
Any DNA Sequence 


Mobile elements that move by way of transposition are called transposons, or 
transposable elements. In transposition, a specific enzyme, usually encoded 
by the transposon itself and typically called a transposase, acts on specific DNA 
sequences at each end of the transposon, causing it to insert into a new target 
DNA site. Most transposons are only modestly selective in choosing their target 
site, and they can therefore insert themselves into many different locations in a 
genome. In particular, there is no general requirement for sequence similarity 
between the ends of the element and the target sequence. Most transposons move 
only rarely. In bacteria, where it is possible to measure the frequency accurately, 
transposons typically move once every 10^5 cell divisions. More frequent move- 
ment would probably destroy the host cell’s genome. 

On the basis of their structure and transposition mechanism, transposons 
can be grouped into three large classes: DNA-only transposons, retroviral-like ret- 
rotransposons, and nonretroviral retrotransposons. The differences among them 
are briefly outlined in Table 5-4, and each class will be discussed in turn. 


DNA-Only Transposons Can Move by a Cut-and-Paste 
Mechanism 


DNA-only transposons, so named because they exist only as DNA during their 
movement, predominate in bacteria, and they are largely responsible for the 
spread of antibiotic resistance in bacterial strains. When antibiotics like penicillin 
and streptomycin first became widely available in the 1950s, most bacteria that 
caused human disease were susceptible to them. Now, the situation is different— 
antibiotics such as penicillin (and its modern derivatives) are no longer effective 
against many modern bacterial strains, including those causing gonorrhea and 
bacterial pneumonia. The spread of antibiotic resistance is due largely to genes 


TABLE 5-4 


DNA-only transposons 
  Structure: short inverted repeats at each end 
  Enzymes required for movement: transposase 
  Mode of movement: moves as DNA, either by cut-and-paste or replicative pathways 
  Examples: P element (Drosophila), Ac-Ds (maize), Tn3 and Tn10 (E. coli), Tam3 (snapdragon) 

Retroviral-like retrotransposons 
  Structure: directly repeated long terminal repeats (LTRs) at each end 
  Enzymes required for movement: reverse transcriptase and integrase 
  Mode of movement: moves via an RNA intermediate whose production is driven by a promoter in the LTR 
  Examples: Copia (Drosophila), Ty1 (yeast), THE1 (human), Bs1 (maize) 

Nonretroviral retrotransposons 
  Structure: poly A at 3’ end of RNA transcript; 5’ end is often truncated 
  Enzymes required for movement: reverse transcriptase and endonuclease 
  Mode of movement: moves via an RNA intermediate that is often synthesized from a neighboring promoter 
  Examples: F element (Drosophila), L1 (human), Cin4 (maize) 

These elements range in length from 1000 to about 12,000 nucleotide pairs. Each family contains many members, only a few of which are listed 
here. Some viruses can also move in and out of host-cell chromosomes by transpositional mechanisms. These viruses are related to the first two 
classes of transposons. 

that encode antibiotic-inactivating enzymes that are carried on transposons (Fig- 
ure 5-60). Although these mobile elements can transpose only within cells that 
already carry them, they can be moved from one cell to another through other 
mechanisms known collectively as horizontal gene transfer (see Figure 1-19). 
Once introduced into a new cell, a transposon can insert itself into the genome 
and be faithfully passed on to all progeny cells through the normal processes of 
DNA replication and cell division. 

DNA-only transposons can relocate from a donor site to a target site by cut- 
and-paste transposition (Figure 5-61). Here, the transposon is literally excised 
from one spot on a genome and inserted into another. This reaction produces a 
short duplication of the target DNA sequence at the insertion site; these direct 
repeat sequences that flank the transposon serve as convenient records of prior 
transposition events. Such “signatures” often provide valuable clues in identifying 
transposons in genome sequences. 
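
The way staggered cuts leave a direct-repeat "signature" can be sketched as follows. This is a minimal model, assuming the target is written as a single strand of text, the transposon is a placeholder string, and the stagger between the two cuts is 5 nucleotides; that value and both function names are illustrative assumptions rather than figures from the text.

```python
def insert_transposon(target, transposon, site, stagger):
    """Insert `transposon` at `site`; the bases between the two staggered
    cuts end up copied on both sides once the gaps are filled in."""
    duplicated = target[site:site + stagger]
    return target[:site] + duplicated + transposon + duplicated + target[site + stagger:]

def flanking_direct_repeat(sequence, start, end, length):
    """Return the repeat flanking sequence[start:end], or None if the flanks differ."""
    left = sequence[start - length:start]
    right = sequence[end:end + length]
    return left if left == right else None

target = "AAGGCTACGTTT"
transposon = "[TRANSPOSON]"

result = insert_transposon(target, transposon, site=4, stagger=5)
print(result)  # AAGGCTACG[TRANSPOSON]CTACGTTT

start = result.index(transposon)
print(flanking_direct_repeat(result, start, start + len(transposon), 5))  # CTACG
```

The second function mirrors how such signatures are used in practice: a stretch of DNA bounded by identical short direct repeats is a candidate for a past insertion event.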

When a cut-and-paste DNA-only transposon is excised from its original loca- 
tion, it leaves behind a “hole” in the chromosome. This lesion can be perfectly 
healed by recombinational double-strand break repair (see Figure 5-48), pro- 
vided that the chromosome has just been replicated and an identical copy of the 
damaged host sequence is available. Alternatively, a nonhomologous end-join- 
ing reaction can reseal the break; in this case, the DNA sequence that originally 
flanked the transposon is altered, producing a mutation at the chromosomal site 
from which the transposon was excised (see Figure 5-45). 

Remarkably, the same mechanism used to excise cut-and-paste trans- 
posons from DNA has been found to operate in developing immune systems of 

Figure 5-60 Three of the many DNA-only 
transposons found in bacteria. Each 
of these mobile DNA elements contains 
a gene that encodes a transposase, an 
enzyme that carries out the DNA breakage 
and joining reactions needed for the 
element to move. Each transposon also 
carries short DNA sequences (indicated 
in red) that are recognized only by the 
transposase encoded by that element 
and are necessary for movement of the 
element. In addition, two of the three 
mobile elements shown carry genes 
that encode enzymes that inactivate the 
antibiotics ampicillin (AmpR)—a penicillin 
derivative—and tetracycline (TetR). The 
transposable element Tn10, shown in the 
bottom diagram, is thought to have evolved 
from the chance landing of two much 
shorter mobile elements on either side of a 
tetracycline-resistance gene. 


Figure 5-61 Cut-and-paste 
transposition. DNA-only transposons can 
be recognized in chromosomes by the 
“inverted repeat DNA sequences” (red) 
present at their ends. These sequences, 
which can be as short as 20 nucleotides, 
are all that is necessary for the DNA 
between them to be transposed by the 
particular transposase enzyme associated 
with the element. The cut-and-paste 
movement of a DNA-only transposable 
element from one chromosomal site to 
another begins when the transposase 
brings the two inverted DNA sequences 
together, forming a DNA loop. Insertion 
into the target chromosome, also catalyzed 
by the transposase, occurs at a random 
site through the creation of staggered 
breaks in the target chromosome (purple 
arrowheads). Following the transposition 
reaction, the single-strand gaps created by 
the staggered breaks are repaired by DNA 
polymerase and ligase (black). As a result, 
the insertion site is marked by a short direct 
repeat of the target DNA sequence, as 
shown. Although the break in the donor 
chromosome (green) is repaired, this 
process often alters the DNA sequence, 
causing a mutation at the original site of the 
excised transposable element (not shown). 
vertebrates, catalyzing the DNA rearrangements that produce antibody and T cell 
receptor diversity. Known as V(D)J recombination, this process will be discussed 
in Chapter 24. Found only in vertebrates, V(D)J recombination is a relatively 
recent evolutionary novelty, but it is believed to be derived from the much more 
ancient cut-and-paste transposons. 
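
The short direct repeat that marks a cut-and-paste insertion site (see Figure 5-61) follows from the geometry of the staggered breaks, and this can be sketched in a few lines of Python. The function name, the example sequences, and the 5-nucleotide stagger are illustrative assumptions; the size of the stagger differs from one transposase to another, and only the top strand is tracked here.

```python
def insert_transposon(target, transposon, cut_pos, stagger):
    """Return the top strand of the target after cut-and-paste insertion.

    target     -- top strand of the target DNA, written 5' to 3'
    transposon -- top strand of the excised element (any string)
    cut_pos    -- index of the cut in the top strand
    stagger    -- offset, in nucleotide pairs, of the cut in the bottom strand
    """
    left = target[:cut_pos]
    between_cuts = target[cut_pos:cut_pos + stagger]
    right = target[cut_pos + stagger:]
    # Joining the element to the staggered ends leaves the region between the
    # two cuts single-stranded on each side of the element. Gap filling by
    # DNA polymerase copies that region, so it ends up present twice.
    return left + between_cuts + transposon + between_cuts + right


target = "GGATCCTGAACT"   # hypothetical target sequence
element = "<tn>"          # placeholder standing in for the transposon
product = insert_transposon(target, element, cut_pos=4, stagger=5)
print(product)            # GGATCCTGA<tn>CCTGAACT
```

In the printed product, the five nucleotides CCTGA appear on both sides of the element. These are the direct repeats of target DNA described in the figure legend.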


Some Viruses Use a Transposition Mechanism to Move 
Themselves Into Host-Cell Chromosomes 


Certain viruses are considered mobile genetic elements because they use trans- 
position mechanisms to integrate their genomes into that of their host cell. How- 
ever, unlike transposons, these viruses encode proteins that package their genetic 
information into virus particles that can infect other cells. Many of the viruses that 
insert themselves into a host chromosome do so by employing one of the first two 
mechanisms listed in Table 5-4; namely, by behaving like DNA-only transposons 
or like retroviral-like retrotransposons. Indeed, much of our knowledge of these 
mechanisms has come from studies of particular viruses that employ them. 

Transposition has a key role in the life cycle of many viruses. Most notable are 
the retroviruses, which include the human AIDS virus, HIV. Outside the cell, a ret- 
rovirus exists as a single-strand RNA genome packed into a protein shell or capsid 
along with a virus-encoded reverse transcriptase enzyme. During the infection 
process, the viral RNA enters a cell and is converted to a double-strand DNA mol- 
ecule by the action of this crucial enzyme, which is able to polymerize DNA on 
either an RNA or a DNA template (Figure 5-62). The term retrovirus refers to the 
virus’s ability to reverse the usual flow of genetic information, which normally is 
from DNA to RNA (see Figure 1-4). 
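
The two-step copying carried out by reverse transcriptase can be sketched as a pair of string operations. This is a deliberately simplified illustration: it ignores primers, degradation of the RNA template, and the repeated sequences at the ends of a real retroviral genome, and the function name and example sequence are assumptions made for the sketch.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna):
    """Return (first_strand, second_strand), each written 5' to 3'."""
    # Step 1: a DNA strand complementary to the RNA template (DNA/RNA hybrid).
    first = "".join(RNA_TO_DNA[base] for base in reversed(rna))
    # Step 2: a second DNA strand copied from the first (DNA/DNA double helix).
    second = "".join(DNA_TO_DNA[base] for base in reversed(first))
    return first, second


first, second = reverse_transcribe("AUGGCUUAA")
print(first)   # TTAAGCCAT
print(second)  # ATGGCTTAA
```

The second strand carries the same sequence as the original RNA, with T in place of U. This is the sense in which information has flowed from RNA back to DNA.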

Once the reverse transcriptase has produced a double-strand DNA mole- 
cule, specific sequences near its two ends are recognized by a virus-encoded 


[Figure 5-62 diagram. Labeled steps: entry into cell and loss of envelope; reverse transcriptase makes DNA/RNA and then DNA/DNA double helix; integration of DNA copy into host chromosome; transcription; translation; assembly of many new infectious virus particles. Labeled components: RNA, DNA, integrated DNA, envelope, envelope protein, capsid, capsid protein, reverse transcriptase.]

Figure 5-62 The life cycle of a retrovirus. The retrovirus genome consists of an RNA molecule (blue) that is typically between 7000 and 12,000 
nucleotides in length. It is packaged inside a protein capsid, which is surrounded by a lipid-based envelope that contains virus-encoded envelope 
proteins (green). Inside an infected cell, the enzyme reverse transcriptase (red circle) first makes a DNA copy of the viral RNA molecule and then a 
second DNA strand, generating a double-strand DNA copy of the RNA genome. The integration of this DNA double helix into the host chromosome 
is then catalyzed by a virus-encoded integrase enzyme. This integration is required for the synthesis of new viral RNA molecules by the host-cell RNA 
polymerase, the enzyme that transcribes DNA into RNA (discussed in Chapter 6). 
transposase called integrase. Integrase then inserts the viral DNA into the chromosome 
by a mechanism similar to that used by the cut-and-paste DNA-only 
transposons (see Figure 5-61). 


Retroviral-like Retrotransposons Resemble Retroviruses, but Lack 
a Protein Coat 


A large family of transposons called retroviral-like retrotransposons (see Table 
5-4) move themselves in and out of chromosomes by a mechanism that is similar 
to that used by retroviruses. These elements are present in organisms as diverse as 
yeasts, flies, and mammals; unlike viruses, they have no intrinsic ability to leave 
their resident cell but are passed along to all descendants of that cell through the 
normal processes of DNA replication and cell division. The first step in their trans- 
position is the transcription of the entire transposon, producing an RNA copy of 
the element that is typically several thousand nucleotides long. This transcript, 
which is translated as a messenger RNA by the host cell, encodes a reverse tran- 
scriptase enzyme. This enzyme makes a double-strand DNA copy of the RNA mol- 
ecule via an RNA-DNA hybrid intermediate, precisely mirroring the early stages 
of infection by a retrovirus (see Figure 5-62). Like a retrovirus, the linear, dou- 
ble-strand DNA molecule then integrates into a site on the chromosome using an 
integrase enzyme that is also encoded by the element. The structure and mecha- 
nisms of these integrases closely resemble those of the transposases of DNA-only 
transposons. 


A Large Fraction of the Human Genome Is Composed of 
Nonretroviral Retrotransposons 


A significant fraction of many vertebrate chromosomes is made up of repeated 
DNA sequences. In human chromosomes, these repeats are mostly mutated and 
truncated versions of nonretroviral retrotransposons, the third major type of 
transposon (see Table 5-4). Although most of these transposons in the human 
genome are immobile, a few retain the ability to move. Relatively recent move- 
ments of the L1 element (sometimes referred to as a LINE or long interspersed 
nuclear element) have been identified, some of which result in human disease; 
for example, a particular type of hemophilia results from an L1 insertion into the 
gene encoding the blood-clotting protein Factor VIII (see Figure 6-24). 

Nonretroviral retrotransposons are found in many organisms and move via a 
distinct mechanism that requires a complex of an endonuclease and a reverse 
transcriptase. As illustrated in Figure 5-63, the RNA and reverse transcriptase 
have a much more direct role in the recombination event than they do in the ret- 
roviral-like retrotransposons described above. 

Inspection of the human genome sequence reveals that the bulk of nonretro- 
viral retrotransposons—for example, the many copies of the Alu element, a mem- 
ber of the SINE (short interspersed nuclear element) family—do not carry their 
own endonuclease or reverse transcriptase genes. Nonetheless, they have suc- 
cessfully amplified themselves to become major constituents of our genome, pre- 
sumably by pirating enzymes encoded by other transposons. Together the LINEs 
and SINEs make up over 30% of the human genome (see Figure 4-62); there are 
500,000 copies of the former and over a million of the latter. 


Figure 5-63 Transposition by a nonretroviral retrotransposon. 
Transposition of the L1 element (red) begins when an endonuclease attached 
to the L1 reverse transcriptase (green) and the L1 RNA (blue) nicks the target 
DNA at the point at which insertion will occur. This cleavage releases a 3’-OH 
DNA end in the target DNA, which is then used as a primer for the reverse 
transcription step shown. This generates a single-strand DNA copy of the 
element that is directly linked to the target DNA. In subsequent reactions, 
further processing of the single-strand DNA copy results in the generation of 
a new double-strand DNA copy of the L1 element that is inserted at the site 
of the initial nick. 
Different Transposable Elements Predominate in Different 
Organisms 


We have described several types of transposable elements: (1) DNA-only transpo- 
sons, the movement of which is based on DNA breaking and joining reactions; (2) 
retroviral-like retrotransposons, which also move via DNA breakage and joining, 
but where RNA has a key role as a template to generate the DNA recombination 
substrate; and (3) nonretroviral retrotransposons, in which an RNA copy of the 
element is central to the incorporation of the element into the target DNA, acting 
as a direct template for a DNA target-primed reverse transcription event. 

Intriguingly, different types of transposons predominate in different organ- 
isms. For example, the vast majority of bacterial transposons are DNA-only types, 
with a few related to the nonretroviral retrotransposons also present. In yeasts, the 
main mobile elements are retroviral-like retrotransposons. In Drosophila, DNA- 
based, retroviral, and nonretroviral transposons are all found. Finally, the human 
genome contains all three types of transposon, but as discussed below, their evo- 
lutionary histories are strikingly different. 


Genome Sequences Reveal the Approximate Times at Which 
Transposable Elements Have Moved 


The nucleotide sequence of the human genome provides a rich fossil record of 
the activity of transposons over evolutionary time spans. By carefully comparing 
the nucleotide sequences of the approximately 3 million transposable element 
remnants in the human genome, it has been possible to broadly reconstruct the 
movements of transposons in our ancestors’ genomes over the past several hun- 
dred million years. For example, the DNA-only transposons appear to have been 
very active well before the divergence of humans and Old World monkeys (25-35 
million years ago), but because they gradually accumulated inactivating muta- 
tions, they have been dormant in the human lineage since that time. Likewise, 
although our genome is littered with relics of retroviral-like retrotransposons, 
none appear to be active today. Only a single family of retroviral-like retrotrans- 
posons is believed to have transposed in the human genome since the divergence 
of human and chimpanzee approximately 6 million years ago. The nonretroviral 
retrotransposons are also ancient, but in contrast to other types, some are still 
moving in our genome, as mentioned previously. For example, it is estimated 
that de novo movement of an Alu element is seen once in every 100-200 human 
births. The movement of nonretroviral retrotransposons is responsible for a small 
but significant fraction of new human mutations—perhaps two mutations out of 
every thousand. 

The situation in mice is significantly different. Although the mouse and human 
genomes contain roughly the same density of the three types of transposons, both 
types of retrotransposons are still actively transposing in the mouse genome, 
being responsible for approximately 10% of new mutations. 

Although we are only beginning to understand how the movements of trans- 
posons have shaped the genomes of present-day mammals, it has been proposed 
that bursts in transposition activity could have been responsible for critical spe- 
ciation events during the radiation of the mammalian lineages from a common 
ancestor, a process that began approximately 170 million years ago. At present, we 
can only wonder how many of our uniquely human qualities arose from the past 
activity of the many mobile genetic elements whose remnants are found today 
scattered throughout our chromosomes. 


Conservative Site-Specific Recombination Can Reversibly 
Rearrange DNA 


A different kind of recombination mechanism, known as conservative site-specific 
recombination, rearranges other types of mobile DNA elements. In this pathway, 
breakage and joining occur at two special sites, one on each participating DNA 
Figure 5-64 Two types of DNA rearrangement produced by conservative site-specific 
recombination. The only difference between the reactions in (A) and (B) is the relative orientation of 
the two short DNA sites (indicated by arrows) at which a site-specific recombination event occurs. 
(A) Through an integration reaction, a circular DNA molecule can become incorporated into a 
second DNA molecule; by the reverse reaction (excision), it can exit to re-form the original DNA 
circle. Many bacterial viruses move in and out of their host chromosomes in this way. 

(B) Conservative site-specific recombination can also invert a specific segment of DNA in a 
chromosome. A well-studied example of DNA inversion through site-specific recombination occurs 
in the bacterium Salmonella typhimurium, an organism that is a major cause of food poisoning in 
humans; as described in the following section, the inversion of a DNA segment changes the type of 
flagellum that is produced by the bacterium. 


molecule. Depending on the positions and relative orientations of the two recom- 
bination sites, DNA integration, DNA excision, or DNA inversion can occur (Fig- 
ure 5-64). Conservative site-specific recombination is carried out by specialized 
enzymes that break and rejoin two DNA double helices at specific sequences on 
each DNA molecule. The same enzyme system that joins two DNA molecules can 
often take them apart again, precisely restoring the sequence of the two original 
DNA molecules (see Figure 5-64A). 
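
The dependence of the outcome on site orientation can be illustrated with a toy model in which a DNA molecule is a list of labeled segments and the two recombination sites are written as ">" or "<" according to their orientation. The representation and the function name are assumptions made for this sketch, and the rule it encodes (same orientation gives excision, opposite orientation gives inversion) is the behavior known for systems such as Cre/LoxP. Integration, the reverse of excision, is not modeled.

```python
def recombine(dna):
    """Recombine between the two sites in dna; return (linear, circle).

    dna is a list of strings. Exactly two entries are sites (">" or "<").
    Other entries are segments: upper case means the original orientation,
    lower case means the segment has been inverted.
    """
    i, j = [k for k, item in enumerate(dna) if item in (">", "<")]
    if dna[i] == dna[j]:
        # Sites in the same orientation: the DNA between them is excised as
        # a circle that carries one site; one site stays behind.
        return dna[:i + 1] + dna[j + 1:], dna[i + 1:j + 1]
    # Sites in opposite orientation: the DNA between them is inverted.
    flipped = [segment.swapcase() for segment in reversed(dna[i + 1:j])]
    return dna[:i + 1] + flipped + dna[j:], None


print(recombine(["A", ">", "B", "C", ">", "D"]))
# (['A', '>', 'D'], ['B', 'C', '>'])

inverted, _ = recombine(["A", ">", "B", "C", "<", "D"])
print(inverted)                 # ['A', '>', 'c', 'b', '<', 'D']
print(recombine(inverted)[0])   # ['A', '>', 'B', 'C', '<', 'D']
```

Applying the inversion a second time restores the starting arrangement exactly, which mirrors the reversibility emphasized in the text.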

Conservative site-specific recombination is often used by DNA viruses to move 
their genomes in and out of the genomes of their host cells. When integrated into 
its host genome, the viral DNA is replicated along with the host DNA and is faith- 
fully passed on to all descendent cells. If the host cell suffers damage (for example, 
by UV irradiation), the virus can reverse the site-specific recombination reaction, 
excise its genome, and package it into a virus particle. In this way, many viruses 
can replicate themselves passively as a component of the host genome, but can 
also “leave the sinking ship” by excising their genomes and packaging them in a 
protective coat until a new, healthy host cell is encountered. 

Several features distinguish conservative site-specific recombination from 
transposition. First, conservative site-specific recombination requires specialized 
DNA sequences on both the donor and recipient DNA (hence the term site-spe- 
cific). These sequences contain recognition sites for the particular recombinase 
that will catalyze the rearrangement. In contrast, transposition requires only that 
the transposon have a specialized sequence; for most transposons, the recipient 
DNA can be of any sequence. Second, the reaction mechanisms are fundamen- 
tally different. The recombinases that catalyze conservative site-specific recombi- 
nation resemble topoisomerases in the sense that they form transient high-energy 
covalent bonds with the DNA and use this energy to complete the DNA rearrange- 
ments (see Figure 5-21). Thus, all the phosphate bonds that are broken during a 
recombination event are restored upon its completion (hence the term conser- 
vative). Transposition, in contrast, does not proceed through a covalently joined 
protein-DNA intermediate, and this process leaves gaps in the DNA that must be 
repaired by DNA polymerases. 
Conservative Site-Specific Recombination Can Be Used to Turn 
Genes On or Off 


Many bacteria use conservative site-specific recombination to control the expres- 
sion of particular genes. A well-studied example occurs in Salmonella bacteria 
and is known as phase variation. The switch in gene expression results from the 
occasional inversion of a specific 1000-nucleotide-pair piece of DNA, brought 
about by a conservative site-specific recombinase encoded in the Salmonella 
genome. This change alters the expression of the cell-surface protein flagellin, for 
which the bacterium has two different genes (Figure 5-65). The DNA inversion 
changes the orientation of a promoter (a DNA sequence that directs transcription 
of a gene) that is located within the inverted DNA segment. With the promoter in 
one orientation, the bacteria synthesize one type of flagellin; with the promoter 
in the other orientation, they synthesize the other type. The recombination reac- 
tion is reversible, allowing bacterial populations to switch back and forth between 
the two types of flagellin. Inversions occur only rarely, and because such changes 
in the genome will be copied faithfully during all subsequent replication cycles, 
entire clones of bacteria will have one type of flagellin or the other. 

Phase variation helps protect the bacterial population against the immune 
response of its vertebrate host. If the host makes antibodies against one type of 
flagellin, a few bacteria whose flagellin has been altered by gene inversion will still 
be able to survive and multiply. 
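
The logic of the flagellin switch, as laid out in Figure 5-65, can be written as a small truth table. The sketch below assumes only what the figure legend states: in one promoter orientation both H2 and a repressor of H1 are transcribed, and in the other orientation neither is. The function names are hypothetical.

```python
def flagellin_expressed(promoter_forward):
    """Return the flagellin gene expressed for a given promoter orientation."""
    h2_on = promoter_forward         # the promoter drives H2 ...
    repressor_on = promoter_forward  # ... and the repressor of H1
    h1_on = not repressor_on         # H1 is expressed only when unrepressed
    if h2_on:
        return "H2"
    return "H1" if h1_on else None


def flagellin_after(inversions, promoter_forward=True):
    """Each inversion event flips the promoter; report the resulting type."""
    for _ in range(inversions):
        promoter_forward = not promoter_forward
    return flagellin_expressed(promoter_forward)


print([flagellin_after(n) for n in range(4)])  # ['H2', 'H1', 'H2', 'H1']
```

Because each inversion is inherited until the next rare inversion occurs, every cell in a clone reports the same flagellin type in this model.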


Bacterial Conservative Site-Specific Recombinases Have Become 
Powerful Tools for Cell and Developmental Biologists 


Like many of the mechanisms used by cells and viruses, site-specific recombina- 
tion has been put to work by scientists to study a wide variety of problems. To 
decipher the roles of specific genes and proteins in complex multicellular organ- 
isms, genetic engineering techniques are used to produce worms, flies, and mice 
carrying a gene encoding a site-specific recombination enzyme plus a carefully 
designed target DNA with the DNA sites that this enzyme recognizes. At an appro- 
priate time, the gene encoding the enzyme can be activated to rearrange the target 
DNA sequence. Such a rearrangement is widely used to delete a specific gene in a 
particular tissue of a multicellular organism (Figure 5-66). It is particularly useful 
when the gene of interest plays a key role in the early development of many tis- 
sues, and a complete deletion of the gene from the germ line would cause death 


Figure 5-65 Switching gene expression 
by DNA inversion in bacteria. 
Alternating transcription of two flagellin 
genes in a Salmonella bacterium is 
caused by a conservative site-specific 
recombination event that inverts a small 
DNA segment containing a promoter. 

(A) In one orientation, the promoter 
activates transcription of the H2 flagellin 
gene as well as that of a repressor protein 
that blocks the expression of the H1 
flagellin gene. Promoters and repressors 
are described in detail in Chapter 7; here 
we note simply that a promoter is needed 
to express a gene into protein and that 
a repressor blocks this from happening. 
(B) When the promoter is inverted, it no 
longer turns on H2 or the repressor, and 
the H1 gene, which is thereby released 
from repression, is expressed instead. 
The inversion reaction requires specific 
DNA sequences (red) and a recombinase 
enzyme that is encoded in the invertible 
DNA segment. This site-specific 
recombination mechanism is activated 
only rarely (about once in every 10⁵ cell 
divisions). Therefore, the production of one 
or the other flagellin tends to be faithfully 
inherited in each clone of cells. 
Figure 5-66 How a conservative site-specific recombination enzyme from bacteria is used 
to delete specific genes from particular mouse tissues. This approach requires the insertion 
of two specially engineered DNA molecules into the animal’s germ line. The first contains the gene 
for a recombinase (in this case, the Cre recombinase from the bacteriophage P1) under the control 
of a tissue-specific promoter, which ensures that the recombinase is expressed only in that tissue. 
The second DNA molecule contains the gene of interest flanked by recognition sites (in this case, 
LoxP sites) for the recombinase. The mouse is engineered so that this is the only copy of this gene. 
Therefore, if the recombinase is expressed only in the liver, the gene of interest will be deleted there, 
and only there. The reaction that excises the gene is the same as that shown in Figure 5-64A. 

As described in Chapter 7, many tissue-specific promoters are known; moreover, many of these 
promoters are active only at specific times in development. Thus, it is possible to study the effect 
of deleting specific genes at different times during the development of each tissue. 


very early in development. The same strategy can also be used to inappropriately 
express any specific gene in a tissue of interest; here, the triggered deletion joins 
a strong transcriptional promoter to the gene of interest. With this tool one can in 
principle determine the influence of any protein in any desired tissue of an intact 
animal. 
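
The tissue-specific deletion strategy of Figure 5-66 can be summarized as follows. This is a minimal sketch under stated assumptions: the locus is a list of labels, the recombinase is taken to act wherever its promoter is active, and the tissue names and function names are hypothetical.

```python
LOXP = "loxP"


def cre_excise(locus):
    """Delete everything between the first two loxP sites, leaving one site."""
    first = locus.index(LOXP)
    second = locus.index(LOXP, first + 1)
    return locus[:first + 1] + locus[second + 1:]


def tissue_genotypes(locus, tissues, promoter_active_in):
    """Return the locus found in each tissue after Cre has had time to act."""
    result = {}
    for tissue in tissues:
        cre_expressed = tissue in promoter_active_in  # tissue-specific promoter
        result[tissue] = cre_excise(locus) if cre_expressed else list(locus)
    return result


locus = ["upstream", LOXP, "gene_of_interest", LOXP, "downstream"]
genotypes = tissue_genotypes(locus, ["liver", "brain", "muscle"], {"liver"})
for tissue, dna in genotypes.items():
    print(tissue, dna)
```

With a liver-specific promoter driving the recombinase, the gene of interest is missing from the liver entry and intact in the other two.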


Summary 


The genomes of nearly all organisms contain mobile genetic elements that can move 
from one position in the genome to another by either transpositional or conservative 
site-specific recombination processes. In most cases, this movement is random 
and happens at a very low frequency. Mobile genetic elements include transposons, 
which move within a single cell (and its descendants), plus those viruses whose 
genomes can integrate into the genomes of their host cells. 

There are three classes of transposons: the DNA-only transposons, the retrovi- 
ral-like retrotransposons, and the nonretroviral retrotransposons. All but the last 
have close relatives among the viruses. Although viruses and transposable elements 
can be viewed as parasites, many of the new arrangements of DNA sequences that 
their site-specific recombination events produce have played an important part in 
creating the genetic variation crucial for the evolution of cells and organisms. 
WHAT WE DON’T KNOW 


• How does DNA replication contend 
with all the other processes that occur 
simultaneously on chromosomes, 
including DNA repair and gene 
transcription? 


• What is the basis for the low 
frequency of errors in DNA replication 
observed in all cells? Is this the best 
that cells can do given the speed of 
replication and the limits of molecular 
diffusion? Was this mutation rate 
selected in evolution to provide genetic 
variation? 


• Cells have only one fundamental way 
of replicating DNA but many different 
ways of repairing it. Are there still 
other, undiscovered ways that cells 
have for repairing DNA? 


• Do the many “dead” transposons 
in the human genome provide any 
benefits to humans? 
PROBLEMS 


Which statements are true? Explain why or why not. 


5-1 The different cells in your body rarely have 
genomes with the identical nucleotide sequence. 


5-2 In E. coli, where the replication fork travels at 500 
nucleotide pairs per second, the DNA ahead of the fork— 
in the absence of topoisomerase—would have to rotate at 
nearly 3000 revolutions per minute. 


5-3 In a replication bubble, the same parental DNA 
strand serves as the template strand for leading-strand 
synthesis in one replication fork and as the template for 
lagging-strand synthesis in the other fork. 


5-4 When bidirectional replication forks from adja- 
cent origins meet, a leading strand always runs into a lag- 
ging strand. 


5-5 DNA repair mechanisms all depend on the exis- 
tence of two copies of the genetic information, one in each 
of the two homologous chromosomes. 


Discuss the following problems. 


5-6 To determine the reproducibility of mutation fre- 
quency measurements, you do the following experiment. 
You inoculate each of 10 cultures with a single E. coli bacterium, 
allow the cultures to grow until each contains 10⁶ 
cells, and then measure the number of cells in each culture 
that carry a mutation in your gene of interest. You were so 
surprised by the initial results that you repeated the experi- 
ment to confirm them. Both sets of results display the same 
extreme variability, as shown in Table Q5-1. Assuming that 
the rate of mutation is constant, why do you suppose there 
is so much variation in the frequencies of mutant cells in 
different cultures? 


TABLE Q5-1 
5-7 DNA repair enzymes preferentially repair mismatched 
bases on the newly synthesized DNA strand, 
using the old DNA strand as a template. If mismatches 
were instead repaired without regard for which strand 
served as template, would mismatch repair reduce repli- 
cation errors? Would such a mismatch repair system result 
in fewer mutations, more mutations, or the same number 
of mutations as there would have been without any repair 
at all? Explain your answers. 


5-8 Discuss the following statement: “Primase is a 
sloppy enzyme that makes many mistakes. Eventually, the 
RNA primers it makes are replaced with DNA made by a 
polymerase with higher fidelity. This is wasteful. It would 
be more energy-efficient if a DNA polymerase made an 
accurate copy in the first place.” 


5-9 If DNA polymerase requires a perfectly paired 
primer in order to add the next nucleotide, how is it that 
any mismatched nucleotides “escape” this requirement 
and become substrates for mismatch repair enzymes? 


5-10 The laboratory you joined is studying the life 
cycle of an animal virus that uses circular, double-strand 
DNA as its genome. Your project is to define the location 
of the origin(s) of replication and to determine whether 
replication proceeds in one or both directions away from 
an origin (unidirectional or bidirectional replication). To 
accomplish your goal, you broke open cells infected with 
the virus, isolated replicating viral genomes, cleaved them 
with a restriction nuclease that cuts the genome at only 
one site to produce a linear molecule from the circle, and 
examined the resulting molecules in the electron micro- 
scope. Some of the molecules you observed are illustrated 
schematically in Figure Q5-1. (Note that it is impossible to 
distinguish the orientation of one DNA molecule relative 
to another in the electron microscope.) 

You must present your conclusions to the rest of 
the lab tomorrow. How will you answer the two questions 
your advisor posed for you? Is there a single, unique origin 
of replication or several origins? Is replication unidirec- 
tional or bidirectional? 


[Figure Q5-1 diagram. Labeled forms: original molecule, bubbles, “H”-forms.]


5-11 You are investigating DNA synthesis in tissue-culture 
cells, using ³H-thymidine to radioactively label the 
replication forks. By breaking open the cells in a way that 
allows some of the DNA strands to be stretched out, very 
long DNA strands can be isolated intact and examined. 
You overlay the DNA with a photographic emulsion, and 
expose it for 3 to 6 months, a procedure known as autoradiography. 
Because the emulsion is sensitive to radioactive 
emissions, the ³H-labeled DNA shows up as tracks of 
silver grains. Because the stretching collapses replication 


Figure Q5-1 Parental 
and replicating forms of 
an animal virus (Problem 
5-10). 
Figure Q5-2 Autoradiographic investigation of DNA replication in cultured 
cells (Problem 5-11). (A) Addition of ³H-labeled thymidine immediately 
after release from the synchronizing block. (B) Addition of ³H-labeled 
thymidine 30 minutes after release from the synchronizing block. 


bubbles, the daughter duplexes lie side by side and cannot 
be distinguished from each other. 

You pretreat the cells to synchronize them at the 
beginning of S phase. In the first experiment, you release 
the synchronizing block and add ³H-thymidine immediately. 
After 30 minutes, you wash the cells and change the 
medium so that the total concentration of thymidine is the 
same as it was, but only one-third of it is radioactive. After 
an additional 15 minutes, you prepare DNA for autoradi- 
ography. The results of this experiment are shown in Fig- 
ure Q5-2A. In the second experiment, you release the syn- 
chronizing block and then wait 30 minutes before adding 
³H-thymidine. After 30 minutes in the presence of ³H-thymidine, 
you once again change the medium to reduce the 
concentration of radioactive thymidine and incubate the 
cells for an additional 15 minutes. The results of the second 
experiment are shown in Figure Q5-2B. 

A. Explain why, in both experiments, some regions of 
the tracks are dense with silver grains (dark), whereas oth- 
ers are less dense (light). 

B. In the first experiment, each track has a central 
dark section with light sections at each end. In the second 
experiment, the dark section of each track has a light sec- 
tion at only one end. Explain the reason for this difference. 
C. Estimate the rate of fork movement (μm/min) in 
these experiments. Do the estimates from the two exper- 
iments agree? Can you use this information to gauge how 
long it would take to replicate the entire genome? 


5-12 If you compare the frequency of the sixteen pos- 
sible dinucleotide sequences in the E. coli and human 
genomes, there are no striking differences except for one 
dinucleotide, 5′-CG-3′. The frequency of CG dinucleotides 
in the human genome is significantly lower than in E. coli 
and significantly lower than expected by chance. Why do 
you suppose that CG dinucleotides are underrepresented 
in the human genome? 


5-13 With age, somatic cells are thought to accumulate 
genomic “scars” as a result of the inaccurate repair of dou- 
ble-strand breaks by nonhomologous end joining (NHEJ). 
Estimates based on the frequency of breaks in primary 
human fibroblasts suggest that by age 70, each human 
somatic cell may carry some 2000 NHEJ-induced muta- 
tions due to inaccurate repair. If these mutations were 
distributed randomly around the genome, how many pro- 
tein-coding genes would you expect to be affected? Would 
you expect cell function to be compromised? Why or why 
not? (Assume that 2% of the genome, 1.5% protein-coding 
and 0.5% regulatory, is crucial information.) 


5-14 Draw the structure of the double Holliday junction 
that would result from strand invasion by both ends of the 
broken duplex into the intact homologous duplex shown 
in Figure Q5-3. Label the left end of each strand in the Hol- 
liday junction 5’ or 3’ so that the relationship to the paren- 
tal and recombinant duplexes is clear. Indicate how DNA 
synthesis would be used to fill in any single-strand gaps in 
your double Holliday junction. 








Figure Q5-3 A broken duplex with 
single-strand tails ready to invade 
an intact homologous duplex. 


5-15 In addition to correcting DNA mismatches, the 
mismatch repair system functions to prevent homologous 
recombination from taking place between similar but not 
identical sequences. Why would recombination between 
similar, but nonidentical sequences pose a problem for 
human cells? 


5-16 Cre recombinase is a site-specific enzyme that 
catalyzes recombination between two LoxP DNA sites. 
Cre recombinase pairs two LoxP sites in the same orienta- 
tion, breaks both duplexes at the same point in each LoxP 
site, and joins the ends with new partners so that each 
LoxP site is regenerated, as shown schematically in Figure 
Q5-4A. Based on this mechanism, predict the arrange- 
ment of sequences that will be generated by Cre-medi- 
ated site-specific recombination for each of the two DNAs 
shown in Figure Q5-4B. 
Figure Q5-4 Cre recombinase-mediated site-specific recombination 
(Problem 5-16). (A) Schematic representation of Cre/LoxP site-specific 
recombination. The LoxP sequences in the DNA are represented by 
triangles that are colored so that the site-specific recombination event 
can be followed more readily. In reality their DNA sequences 
are identical. (B) DNA substrates containing two arrangements of 
LoxP sites. 


298 Chapter 5: DNA Replication, Repair, and Recombination 




How Cells Read the Genome: 
From DNA to Protein 


Since the structure of DNA was discovered in the early 1950s, progress in cell and 
molecular biology has been astounding. We now know the complete genome 
sequences for thousands of different organisms, revealing fascinating details of 
their biochemistry as well as important clues as to how these organisms evolved. 
Complete genome sequences have also been obtained for thousands of individ- 
ual humans, as well as for a few of our now-extinct relatives, such as the Neander- 
thals. Knowing the maximum amount of information that is required to produce 
a complex organism like ourselves puts constraints on the biochemical and struc- 
tural features of cells and makes it clear that biology is not infinitely complex. 

As discussed in Chapter 1, the DNA in genomes does not direct protein synthe- 
sis itself, but instead uses RNA as an intermediary. When the cell needs a particu- 
lar protein, the nucleotide sequence of the appropriate portion of the immensely 
long DNA molecule in a chromosome is first copied into RNA (a process called 
transcription). It is these RNA copies of segments of the DNA that are used directly 
as templates to direct the synthesis of the protein (a process called translation). 
The flow of genetic information in cells is therefore from DNA to RNA to protein 
(Figure 6-1). All cells, from bacteria to humans, express their genetic informa- 
tion in this way—a principle so fundamental that it is termed the central dogma 
of molecular biology. Despite the universality of the central dogma of molecular 
biology, there are important variations between organisms in the way in which 
information flows from DNA to protein. Principal among these is that RNA tran- 
scripts in eukaryotic cells are subject to a series of processing steps in the nucleus, 
including RNA splicing, before they are permitted to exit from the nucleus and be 
translated into protein. As we discuss in this chapter, these processing steps can 
critically change the “meaning” of an RNA molecule and are therefore crucial for 
understanding how eukaryotic cells read their genome. 
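
The flow of information from DNA to RNA to protein can be illustrated with a minimal sketch. It assumes the standard genetic code (covered later in this chapter), starts from the coding (non-template) strand, and ignores RNA processing entirely; the function names and the example sequence are hypothetical and chosen only for illustration.

```python
# Toy illustration of information flow: DNA -> RNA -> protein.
# Assumptions: standard genetic code, no splicing, reading frame starts at position 0.

BASES = "UCAG"
# Standard genetic code written in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(coding_strand: str) -> str:
    """Return the RNA whose sequence matches the coding DNA strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from the first nucleotide until a stop codon or the end of the RNA."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCTTGGTAA"        # hypothetical 12-nucleotide coding sequence
mrna = transcribe(dna)      # DNA -> RNA (transcription)
protein = translate(mrna)   # RNA -> protein (translation)
print(mrna, protein)
```

In a eukaryotic cell, the processing steps described above would act on the RNA between the two function calls, which is why they can change the meaning of the final message.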

Although we focus on the production of the proteins encoded by the genome 
in this chapter, we see that for many genes, RNA is the final product. Like pro- 
teins, some of these RNAs fold into precise three-dimensional structures that have 
structural and catalytic roles in the cell. Other RNAs, as we discuss in the next 
chapter, act primarily as regulators of gene expression. But the roles of many non- 
coding RNAs are not yet known. 

One might have predicted that the information present in genomes would 
be arranged in an orderly fashion, resembling a dictionary or a telephone direc- 
tory. But it turns out that the genomes of most multicellular organisms are sur- 
prisingly disorderly, reflecting their chaotic evolutionary histories. The genes in 
these organisms largely consist of a long string of alternating short exons and long 
introns, as discussed in Chapter 4 (see Figure 4-15D). Moreover, small bits of DNA 
sequence that code for protein are interspersed with large blocks of seemingly 
meaningless DNA. Some sections of the genome contain many genes and oth- 
ers lack genes altogether. Proteins that work closely with one another in the cell 
often have their genes located on different chromosomes, and adjacent genes typ- 
ically encode proteins that have little to do with each other in the cell. Decoding 
genomes is therefore no simple matter. Even with the aid of powerful computers, 
it is difficult for researchers to locate definitively the beginning and end of genes, 
much less to decipher when and where each gene is expressed in the life of the 
organism. Yet the cells in our body do this automatically, thousands of times a 
second. 

IN THIS CHAPTER 
FROM DNA TO RNA 
FROM RNA TO PROTEIN 
THE RNA WORLD AND THE ORIGINS OF LIFE 

Figure 6-1 The pathway from DNA to protein. The flow of genetic information 
from DNA to RNA (transcription) and from RNA to protein (translation) occurs in all 
living cells. 

Figure 6-2 Schematic depiction of a small portion of the human X chromosome. As summarized in the key, the known 
protein-coding genes (starting with Abcd1 and ending with F8) are shown in dark gray, with coding regions (exons) indicated by 
bars that extend above and below the central line. Noncoding RNAs with known functions are indicated by purple diamonds. 
Yellow triangles indicate positions within protein-coding regions where the Neanderthal genome sequence codes for a different 
amino acid than the human genome. The stretch of yellow triangles in the Tktl1 gene appears to have been positively selected 
for since the divergence of Homo sapiens from Neanderthals some 200,000 years ago. Note that most of the proteins are 
identical between us and our extinct relative. The blue histogram indicates the extent to which portions of the human genome 
are conserved with other vertebrate species. It is likely that additional genes, currently unrecognized, also lie within this portion 
of the human genome. 

Genes whose mutation causes an inherited human condition are indicated by red brackets. The Abcd1 gene codes for a 
protein that imports fatty acids into the peroxisome; mutations in the gene cause demyelination of nerves, which can result in 
cognition and movement disorders. Incontinentia pigmenti is a disease of the skin, hair, nails, teeth, and eyes. Hemophilia A is 
a bleeding disorder caused by mutations in the Factor VIII gene, which codes for a blood-clotting protein. Because males have 
only a single copy of the X chromosome, most of the conditions shown here affect only males; females that inherit one of these 
defective genes are often asymptomatic because a functional protein is made from their other X chromosome. (Courtesy of Alex 
Williams, obtained from the University of California, Genome Browser, http://genome.ucsc.edu) 
(Figure labels: human X chromosome, 155 million nucleotide pairs, ~5% of genome; total length of this section = 1.25 million nucleotide pairs.) 

The problems that cells face in decoding genomes can be appreciated by con- 
sidering a tiny portion of the human genome (Figure 6-2). The region illustrated 
represents less than 1/2000th of our genome and includes at least 48 genes that 
encode proteins and 6 genes for noncoding RNAs. When we consider the entire 
human genome, we can only marvel at the capacity of our cells to rapidly and 
accurately handle such large amounts of information. 

In this chapter, we explain how cells decode and use the information in their 
genomes. Much has been learned about how the genetic instructions written in 
an alphabet of just four “letters” —the four different nucleotides in DNA—direct 
the formation of a bacterium, a fruit fly, or a human. Nevertheless, we still have a 
great deal to discover about how the information stored in an organism’s genome 
produces even the simplest unicellular bacterium with 500 genes, let alone how 
it directs the development of a human with approximately 30,000 genes. An enor- 
mous amount of ignorance remains; many fascinating challenges therefore await 
the next generation of cell biologists. 


FROM DNA TO RNA 


Transcription and translation are the means by which cells read out, or express, 
the genetic instructions in their genes. Because many identical RNA copies can 
be made from the same gene, and each RNA molecule can direct the synthesis of 
many identical protein molecules, cells can synthesize a large amount of protein 
from a gene when necessary. But genes can be transcribed and translated with 
different efficiencies, allowing the cell to make vast quantities of some proteins 
and tiny amounts of others (Figure 6-3). Moreover, as we see in the next chapter, 
a cell can change (or regulate) the expression of each of its genes according to its 
needs, most commonly by controlling the production of its RNA. 

Figure 6-3 Genes can be expressed with different efficiencies. In this example, 
gene A is transcribed much more efficiently than gene B and each RNA molecule that it 
produces is also translated more frequently. This causes the amount of protein A in the cell 
to be much greater than that of protein B. 
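
The two-stage amplification described in this section (many RNA copies per gene, many protein molecules per RNA) multiplies out as in the short sketch below. All numbers are hypothetical and were picked only to show how two genes can differ greatly in protein output; the model also ignores RNA and protein degradation.

```python
def protein_output(transcripts_made: float, proteins_per_transcript: float) -> float:
    """Total protein molecules made, ignoring RNA and protein turnover."""
    return transcripts_made * proteins_per_transcript


# Hypothetical values: gene A is transcribed and translated more efficiently than gene B.
protein_a = protein_output(transcripts_made=100, proteins_per_transcript=50)
protein_b = protein_output(transcripts_made=2, proteins_per_transcript=5)

print(protein_a, protein_b, protein_a / protein_b)
```

Because the two efficiencies multiply, a 50-fold difference in transcription combined with a 10-fold difference in translation gives a 500-fold difference in protein.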


RNA Molecules Are Single-Stranded 


The first step a cell takes in reading out a needed part of its genetic instructions 
is to copy a particular portion of its DNA nucleotide sequence—a gene—into an 
RNA nucleotide sequence (Figure 6-4). The information in RNA, although copied 
into another chemical form, is still written in essentially the same language as it 
is in DNA—the language of a nucleotide sequence. Hence the name given to pro- 
ducing RNA molecules on DNA is transcription. 

Like DNA, RNA is a linear polymer made of four different types of nucleotide 
subunits linked together by phosphodiester bonds (see Figure 6-4). It differs from 
DNA chemically in two respects: (1) the nucleotides in RNA are ribonucleotides— 
that is, they contain the sugar ribose (hence the name ribonucleic acid) rather 
than deoxyribose; (2) although, like DNA, RNA contains the bases adenine (A), 
guanine (G), and cytosine (C), it contains the base uracil (U) instead of the thy- 
mine (T) in DNA (Figure 6-5). Since U, like T, can base-pair by hydrogen-bonding 
with A (Figure 6-6), the complementary base-pairing properties described for 
DNA in Chapters 4 and 5 apply also to RNA (in RNA, G pairs with C, and A pairs 
with U). We also find other types of base pairs in RNA: for example, G occasionally 
pairs with U. 

Although these chemical differences are slight, DNA and RNA differ quite 
dramatically in overall structure. Whereas DNA always occurs in cells as a dou- 
ble-stranded helix, RNA is single-stranded. An RNA chain can therefore fold up 
into a particular shape, just as a polypeptide chain folds up to form the final shape 
of a protein (Figure 6-7). As we see later in this chapter, the ability to fold into 
complex three-dimensional shapes allows some RNA molecules to have precise 
structural and catalytic functions. 
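
The pairing rules given above (G with C, A with U, and the occasional G-U pair) are what allow one RNA chain to fold back on itself. The sketch below checks whether two segments of the same RNA could pair along their whole length to form a short double-helical stem. It is a deliberate simplification: it tests complementarity only, says nothing about stability or three-dimensional structure, and the function names and sequences are hypothetical.

```python
# Conventional base pairs in RNA, plus the G-U pair mentioned in the text.
CONVENTIONAL_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
GU_PAIRS = {("G", "U"), ("U", "G")}


def can_pair(x: str, y: str, allow_gu: bool = True) -> bool:
    """True if bases x and y can pair under the rules above."""
    pair = (x.upper(), y.upper())
    return pair in CONVENTIONAL_PAIRS or (allow_gu and pair in GU_PAIRS)


def can_form_stem(segment_1: str, segment_2: str, allow_gu: bool = True) -> bool:
    """Both segments are written 5'-to-3'; pairing strands run antiparallel,
    so the first base of segment_1 faces the last base of segment_2."""
    if len(segment_1) != len(segment_2):
        return False
    return all(
        can_pair(x, y, allow_gu)
        for x, y in zip(segment_1, reversed(segment_2))
    )


print(can_form_stem("GGAC", "GUCC"))                   # all conventional pairs
print(can_form_stem("GGAC", "GUUC"))                   # needs one G-U pair
print(can_form_stem("GGAC", "GUUC", allow_gu=False))   # fails without G-U
```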


Transcription Produces RNA Complementary to One Strand 
of DNA 


The RNA in a cell is made by DNA transcription, a process that has certain sim- 
ilarities to the process of DNA replication discussed in Chapter 5. Transcription 
begins with the opening and unwinding of a small portion of the DNA double 
helix to expose the bases on each DNA strand. One of the two strands of the DNA 
double helix then acts as a template for the synthesis of an RNA molecule. As in 
DNA replication, the nucleotide sequence of the RNA chain is determined by 
the complementary base-pairing between incoming nucleotides and the DNA 
template. When a good match is made (A with T, U with A, G with C, and C with 
G), the incoming ribonucleotide is covalently linked to the growing RNA chain in 
an enzymatically catalyzed reaction. The RNA chain produced by transcription 
(the transcript) is therefore elongated one nucleotide at a time, and it has a 
nucleotide sequence that is exactly complementary to the strand of DNA used as the 
template (Figure 6-8). 

Figure 6-4 A short length of RNA. The phosphodiester chemical linkage between 
nucleotides in RNA is the same as that in DNA. 

Figure 6-5 The chemical structure of RNA. (A) RNA contains the sugar ribose, 
which differs from deoxyribose, the sugar used in DNA, by the presence of an 
additional -OH group. (B) RNA contains the base uracil, which differs from thymine, the 
equivalent base in DNA, by the absence of a -CH3 group. 

Figure 6-6 Uracil forms base pairs with adenine. The absence of a 
methyl group in U has no effect on base-pairing; thus, U-A base pairs closely 
resemble T-A base pairs (see Figure 4-4). 
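
The matching rule for transcription can be written out as a small lookup, sketched below. The template strand is given in the 3'-to-5' direction so that the RNA comes out 5'-to-3'; the function names and the nine-nucleotide sequence are hypothetical. The final check confirms the relationship noted for Figure 6-8: the RNA has the same sequence as the non-template DNA strand, with U in place of T.

```python
# DNA template base -> ribonucleotide added to the growing RNA chain.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# DNA base -> DNA base on the opposite strand.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_3_to_5: str) -> str:
    """Return the RNA (5'-to-3') complementary to a template strand read 3'-to-5'."""
    return "".join(TEMPLATE_TO_RNA[base] for base in template_3_to_5.upper())


def non_template_strand(template_3_to_5: str) -> str:
    """Return the other DNA strand (5'-to-3') of the double helix."""
    return "".join(DNA_COMPLEMENT[base] for base in template_3_to_5.upper())


template = "TACGGATTC"                       # hypothetical template, written 3'-to-5'
rna = transcribe_template(template)          # complementary RNA, 5'-to-3'
coding = non_template_strand(template)       # non-template DNA strand, 5'-to-3'

print(rna, coding)
print(rna == coding.replace("T", "U"))
```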

Transcription, however, differs from DNA replication in several crucial ways. 
Unlike a newly formed DNA strand, the RNA strand does not remain hydro- 
gen-bonded to the DNA template strand. Instead, just behind the region where 
the ribonucleotides are being added, the RNA chain is displaced and the DNA 
helix re-forms. Thus, the RNA molecules produced by transcription are released 
from the DNA template as single strands. In addition, because they are copied 
from only a limited region of the DNA, RNA molecules are much shorter than 
DNA molecules. A DNA molecule in a human chromosome can be up to 250 mil- 
lion nucleotide-pairs long; in contrast, most RNAs are no more than a few thou- 
sand nucleotides long, and many are considerably shorter. 


RNA Polymerases Carry Out Transcription 


The enzymes that perform transcription are called RNA polymerases. Like the 
DNA polymerase that catalyzes DNA replication (discussed in Chapter 5), RNA 
polymerases catalyze the formation of the phosphodiester bonds that link the 
nucleotides together to form a linear chain. The RNA polymerase moves step- 
wise along the DNA, unwinding the DNA helix just ahead of the active site for 
polymerization to expose a new region of the template strand for complementary 
base-pairing. In this way, the growing RNA chain is extended by one nucleotide 
at a time in the 5’-to-3’ direction (Figure 6-9). The substrates are ribonucleoside 
triphosphates (ATP, CTP, UTP, and GTP); as in DNA replication, the hydrolysis of 
high-energy bonds provides the energy needed to drive the reaction forward (see 
Figure 5-4 and Movie 6.2). 

Figure 6-7 RNA can fold into specific structures. RNA is largely single-stranded, 
but it often contains short stretches of nucleotides that can form conventional 
base pairs with complementary sequences found elsewhere on the same molecule. 
These interactions, along with additional “nonconventional” base-pair interactions, 
allow an RNA molecule to fold into a three-dimensional structure that is determined by 
its sequence of nucleotides (Movie 6.1). (A) Diagram of a folded RNA structure 
showing only conventional base-pair interactions. (B) Structure with both 
conventional (red) and nonconventional (green) base-pair interactions. 
(C) Structure of an actual RNA, one that catalyzes its own splicing (see p. 324). 
Each conventional base-pair interaction is indicated by a “rung” in the double helix. 
Bases in other configurations are indicated by broken rungs. 

The almost immediate release of the RNA strand from the DNA as it is synthe- 
sized means that many RNA copies can be made from the same gene in a relatively 
short time, with the synthesis of additional RNA molecules being started before 
the previous RNA molecules are completed (Figure 6-10). When RNA polymerase 
molecules follow hard on each other’s heels in this way, each moving at about 
50 nucleotides per second, over a thousand transcripts can be synthesized in an 
hour from a single gene. 
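
The figures quoted above can be checked with simple arithmetic. The sketch below takes the elongation rate of about 50 nucleotides per second from the text; the gene length of 1500 nucleotides is a hypothetical value used only for illustration, and the model assumes evenly spaced polymerases that never pause.

```python
ELONGATION_RATE_NT_PER_S = 50      # from the text
SECONDS_PER_HOUR = 3600


def seconds_to_transcribe(gene_length_nt: float) -> float:
    """Time for one polymerase to traverse the gene."""
    return gene_length_nt / ELONGATION_RATE_NT_PER_S


def transcripts_per_hour_one_at_a_time(gene_length_nt: float) -> float:
    """Output if a new polymerase could start only after the previous one finished."""
    return SECONDS_PER_HOUR / seconds_to_transcribe(gene_length_nt)


def polymerases_needed(target_per_hour: float, gene_length_nt: float) -> float:
    """Average number of polymerases on the gene at once to reach the target output."""
    return target_per_hour * seconds_to_transcribe(gene_length_nt) / SECONDS_PER_HOUR


gene_length = 1500                 # hypothetical gene length in nucleotides
print(seconds_to_transcribe(gene_length))
print(transcripts_per_hour_one_at_a_time(gene_length))
print(polymerases_needed(1000, gene_length))
```

Under these assumptions a single polymerase working alone would fall well short of a thousand transcripts per hour, which is why several polymerases must transcribe the same gene simultaneously.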

Although RNA polymerase catalyzes essentially the same chemical reaction as 
DNA polymerase, there are some important differences between the activities of 
the two enzymes. First, and most obviously, RNA polymerase catalyzes the link- 
age of ribonucleotides, not deoxyribonucleotides. Second, unlike the DNA poly- 
merases involved in DNA replication, RNA polymerases can start an RNA chain 
without a primer. This difference is thought possible because transcription need 
not be as accurate as DNA replication (see Table 5-1, p. 244). RNA polymerases 
make about one mistake for every 10⁴ nucleotides copied into RNA (compared 
with an error rate for direct copying by DNA polymerase of about one in 10⁷ nucle- 
otides); and the consequences of an error in RNA transcription are much less sig- 
nificant as RNA does not permanently store genetic information in cells. Finally, 
unlike DNA polymerases, which make their products in segments that are later 
stitched together, RNA polymerases are absolutely processive; that is, the same 
RNA polymerase that begins an RNA molecule must finish it without dissociating 
from the DNA template. 
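
To get a feel for what these error rates mean, the sketch below uses the two rates from the text (about one mistake per 10⁴ nucleotides for RNA polymerase and about one per 10⁷ for direct copying by DNA polymerase) to estimate how many errors to expect in a copy of a given length. The 3000-nucleotide length is a hypothetical example, and the calculation assumes that errors occur independently at each position.

```python
RNA_POLYMERASE_ERROR_RATE = 1e-4   # about one mistake per 10^4 nucleotides (from the text)
DNA_POLYMERASE_ERROR_RATE = 1e-7   # about one mistake per 10^7 nucleotides (from the text)


def expected_errors(length_nt: int, error_rate: float) -> float:
    """Average number of mistakes in a copy of the given length."""
    return length_nt * error_rate


def fraction_error_free(length_nt: int, error_rate: float) -> float:
    """Probability that a copy contains no mistakes, assuming independent errors."""
    return (1.0 - error_rate) ** length_nt


length = 3000                      # hypothetical transcript length in nucleotides
print(expected_errors(length, RNA_POLYMERASE_ERROR_RATE))
print(fraction_error_free(length, RNA_POLYMERASE_ERROR_RATE))
print(fraction_error_free(length, DNA_POLYMERASE_ERROR_RATE))
```

With these assumptions, roughly a quarter of such transcripts would carry at least one mistake, whereas a DNA copy of the same stretch would almost always be error-free.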

Although not nearly as accurate as the DNA polymerases that replicate DNA, 
RNA polymerases nonetheless have a modest proofreading mechanism. If an 
incorrect ribonucleotide is added to the growing RNA chain, the polymerase can 
back up, and the active site of the enzyme can perform an excision reaction that 
resembles the reverse of the polymerization reaction, except that a water mole- 
cule replaces the pyrophosphate and a nucleoside monophosphate is released. 

Given that DNA and RNA polymerases both carry out template-dependent 
nucleotide polymerization, it might be expected that the two types of enzymes 
would be structurally related. However, x-ray crystallographic studies reveal that, 
other than containing a critical Mg²⁺ ion at the catalytic site, the two enzymes are 
quite different. Template-dependent nucleotide-polymerizing enzymes seem to 
have arisen at least twice during the early evolution of cells. One lineage led to the 
modern DNA polymerases and reverse transcriptases discussed in Chapter 5, as 
well as to a few RNA polymerases from viruses. The other lineage formed all of the 
modern RNA polymerases that we discuss in this chapter. 

Figure 6-8 DNA transcription produces a single-stranded RNA molecule that 
is complementary to one strand of the DNA double helix. Note that the sequence 
of bases in the RNA molecule produced is the same as the sequence of bases in the 
non-template DNA strand, except that a U replaces every T base in the DNA. 

Figure 6-9 DNA is transcribed by the enzyme RNA polymerase. The RNA 
polymerase (pale blue) moves stepwise along the DNA, unwinding the DNA helix 
at its active site indicated by the Mg²⁺ (red), which is required for catalysis. 
As it progresses, the polymerase adds nucleotides one by one to the RNA chain at 
the polymerization site, using an exposed DNA strand as a template. The RNA 
transcript is thus a complementary copy of one of the two DNA strands. A short region 
of DNA/RNA helix (approximately nine nucleotide pairs in length) is formed only 
transiently, and a “window” of DNA/RNA helix therefore moves along the DNA with 
the polymerase as the DNA double helix reforms behind it. The incoming nucleotides 
are in the form of ribonucleoside triphosphates (ATP, UTP, CTP, and GTP), 
and the energy stored in their phosphate-phosphate bonds provides the driving 
force for the polymerization reaction (see Figure 5-4). The figure, based on an x-ray 
crystallographic structure, shows a cutaway view of the polymerase: the part 
facing the viewer has been sliced away to reveal the interior (Movie 6.3). 
(Adapted from P. Cramer et al., Science 288:640-649, 2000; PDB code: 1HQM.) 


Cells Produce Different Categories of RNA Molecules 


The majority of genes carried in a cell’s DNA specify the amino acid sequence of 
proteins; the RNA molecules that are copied from these genes (which ultimately 
direct the synthesis of proteins) are called messenger RNA (mRNA) molecules. The 
final product of other genes, however, is the RNA molecule itself. These RNAs are 
known as noncoding RNAs because they do not code for protein. In a well-stud- 
ied, single-celled eukaryote, the yeast Saccharomyces cerevisiae, over 1200 genes 
(more than 15% of the total) produce RNA as their final product. Humans may 
produce on the order of ten thousand noncoding RNAs. These RNAs, like proteins, 
serve as enzymatic, structural, and regulatory components for a wide variety of 
processes in the cell. In Chapter 5, we encountered one of them as the template 
carried by the enzyme telomerase. Although many of the noncoding RNAs are still 
mysterious, we shall see in this chapter that small nuclear RNA (snRNA) mole- 
cules direct the splicing of pre-mRNA to form mRNA, that ribosomal RNA (rRNA) 
molecules form the core of ribosomes, and that transfer RNA (tRNA) molecules 
form the adaptors that select amino acids and hold them in place on a ribosome 
for incorporation into protein. In Chapter 7, we shall see that microRNA (miRNA) 
molecules and small interfering RNA (siRNA) molecules serve as key regulators 
of eukaryotic gene expression, and that piwi-interacting RNAs (piRNAs) pro- 
tect animal germ lines from transposons; we also discuss the long noncoding 
RNAs (lncRNAs), a diverse set of RNAs whose functions are just being discovered 
(Table 6-1). 


TABLE 6-1 

Type of RNA   Function 
snoRNAs       Small nucleolar RNAs, help to process and chemically modify rRNAs 
miRNAs        MicroRNAs, regulate gene expression by blocking translation of specific mRNAs and causing their degradation 
siRNAs        Small interfering RNAs, turn off gene expression by directing the degradation of selective mRNAs and the 
              establishment of compact chromatin structures 
piRNAs        Piwi-interacting RNAs, bind to piwi proteins and protect the germ line from transposable elements 
lncRNAs       Long noncoding RNAs, many of which serve as scaffolds; they regulate diverse cell processes, including 
              X-chromosome inactivation 

Figure 6-10 Transcription of two genes as observed under the electron 
microscope. The micrograph shows many molecules of RNA polymerase 
simultaneously transcribing each of two adjacent genes. Molecules of RNA 
polymerase are visible as a series of dots along the DNA with the newly synthesized 
transcripts (fine threads) attached to them. The RNA molecules (ribosomal RNAs) 
shown in this example are not translated into protein but are instead used directly as 
components of ribosomes, the machines on which translation takes place. The 
particles at the 5’ end (the free end) of each rRNA transcript are believed to reflect the 
beginnings of ribosome assembly. From the relative lengths of the newly synthesized 
transcripts, it can be deduced that the RNA polymerase molecules are transcribing from 
left to right. (Courtesy of Ulrich Scheer.) 

Each transcribed segment of DNA is called a transcription unit. In eukaryotes, 
a transcription unit typically carries the information of just one gene, and there- 
fore codes for either a single RNA molecule or a single protein (or group of related 
proteins if the initial RNA transcript is spliced in more than one way to produce 
different mRNAs). In bacteria, a set of adjacent genes is often transcribed as a 
unit; the resulting mRNA molecule therefore carries the information for several 
distinct proteins. 

Overall, RNA makes up a few percent of a cell’s dry weight, whereas proteins 
comprise about 50%. Most of the RNA in cells is rRNA; mRNA comprises only 
3-5% of the total RNA in a typical mammalian cell. The mRNA population is made 
up of tens of thousands of different species, and there are on average only 10-15 
molecules of each species of mRNA present in each cell. 
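
The numbers in this paragraph can be combined into a rough estimate of how many mRNA molecules a cell contains. The sketch below uses the 10-15 copies per species from the text; the figure of 20,000 different mRNA species is a hypothetical stand-in for "tens of thousands" and is not a measured value.

```python
def total_mrna_molecules(species_count: int, copies_per_species: float) -> float:
    """Rough total, assuming every mRNA species is present at the same average copy number."""
    return species_count * copies_per_species


species = 20_000                   # hypothetical stand-in for "tens of thousands"
low_estimate = total_mrna_molecules(species, 10)
high_estimate = total_mrna_molecules(species, 15)

print(low_estimate, high_estimate)
```

On these assumptions a cell would hold a few hundred thousand mRNA molecules in total, even though any single species is represented by only a handful of copies.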


Signals Encoded in DNA Tell RNA Polymerase Where to Start and 
Stop 


To transcribe a gene accurately, RNA polymerase must recognize where on the 
genome to start and where to finish. The way in which RNA polymerases perform 
these tasks differs somewhat between bacteria and eukaryotes. Because the pro- 
cesses in bacteria are simpler, we discuss them first. 

The initiation of transcription is an especially important step in gene expres- 
sion because it is the main point at which the cell regulates which proteins are to 
be produced and at what rate. The bacterial RNA polymerase core enzyme is a 
multisubunit complex that synthesizes RNA using the DNA template as a guide. 
An additional subunit called sigma (σ) factor associates with the core enzyme and 
assists it in reading the signals in the DNA that tell it where to begin transcribing 
(Figure 6-11). Together, σ factor and core enzyme are known as the RNA poly- 
merase holoenzyme; this complex adheres only weakly to bacterial DNA when the 
two collide, and a holoenzyme typically slides rapidly along the long DNA mole- 
cule and then dissociates. However, when the polymerase holoenzyme slides into 
a special sequence of nucleotides indicating the starting point for RNA synthesis 
called a promoter, the polymerase binds tightly, because its σ factor makes spe- 
cific contacts with the edges of bases exposed on the outside of the DNA double 
helix (step 1 in Figure 6-11A). 

The tightly bound RNA polymerase holoenzyme at a promoter opens up the 
double helix to expose a short stretch of nucleotides on each strand (step 2 in 
Figure 6-11A). The region of unpaired DNA (about 10 nucleotides) is called the 
transcription bubble and it is stabilized by the binding of σ factor to the unpaired
bases on one of the exposed strands. The other exposed DNA strand then acts as 
a template for complementary base-pairing with incoming ribonucleotides, two 
of which are joined together by the polymerase to begin an RNA chain (step 3 
in Figure 6-11A). The first ten or so nucleotides of RNA are synthesized using a 
“scrunching” mechanism, in which RNA polymerase remains bound to the pro- 
moter and pulls the upstream DNA into its active site, thereby expanding the 
transcription bubble. This process creates considerable stress and the short RNAs 
are often released, thereby relieving the stress and forcing the polymerase, which 
remains in place, to begin synthesis over again. Eventually this process of abor- 
tive initiation is overcome and the stress generated by scrunching helps the core 
enzyme to break free of its interactions with the promoter DNA (step 4 in Figure 
6-11A) and discard the σ factor (step 5 in Figure 6-11A). At this point, the
polymerase begins to move down the DNA, synthesizing RNA, in a stepwise fashion:
the polymerase moves forward one base pair for every nucleotide added. During 
this process, the transcription bubble continually expands at the front of the poly- 
merase and contracts at its rear. Chain elongation continues (at a speed of approx- 
imately 50 nucleotides/sec for bacterial RNA polymerases) until the enzyme 
encounters a second signal, the terminator (step 6 in Figure 6-11A), where the 
polymerase halts and releases both the newly made RNA molecule and the DNA 
template (step 7 in Figure 6-11A). The free polymerase core enzyme then
reassociates with a free σ factor to form a holoenzyme that can begin the process of
transcription again (step 8 in Figure 6-11A). 
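
To give a feel for the elongation rate quoted above, here is a minimal worked example. It assumes a constant rate of about 50 nucleotides per second and ignores pausing, abortive initiation, and termination; the 1500-nucleotide gene length is a hypothetical value chosen only for illustration.

```python
# Approximate time spent in elongation, assuming a constant rate.
BACTERIAL_RATE_NT_PER_SEC = 50  # approximate figure quoted in the text

def elongation_time_sec(gene_length_nt: int,
                        rate_nt_per_sec: float = BACTERIAL_RATE_NT_PER_SEC) -> float:
    """Seconds needed to transcribe gene_length_nt nucleotides at a fixed rate."""
    if rate_nt_per_sec <= 0:
        raise ValueError("rate must be positive")
    return gene_length_nt / rate_nt_per_sec

# Hypothetical 1500-nucleotide gene
example_time = elongation_time_sec(1500)
print(example_time)  # 30.0
```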

The process of transcription initiation is complicated and requires that the
RNA polymerase holoenzyme and the DNA undergo a series of conformational 
changes. We can view these changes as opening up and positioning the DNA in 
the active site followed by a successive tightening of the enzyme around the DNA 
and RNA to ensure that it does not dissociate before it has finished transcribing a 
gene. If an RNA polymerase does dissociate prematurely, it must start over again 
at the promoter. 

How do the termination signals in the DNA stop the elongating polymerase? 
For most bacterial genes, a termination signal consists of a string of A-T nucleotide 
pairs preceded by a twofold symmetric DNA sequence, which, when transcribed 
into RNA, folds into a “hairpin” structure through Watson-Crick base-pairing (see 
Figure 6-11A). As the polymerase transcribes across a terminator, the formation 
of the hairpin helps to disengage the RNA transcript from the active site (step 7 
in Figure 6-11A). The process of termination provides an example of a common 
theme in this chapter: the folding of RNA into specific structures affects many 
steps in decoding the genome. 
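
The terminator description above (a twofold symmetric sequence followed by a run of A-T pairs) can be sketched as a simple search on the non-template DNA strand. This is an illustrative toy, not a validated terminator predictor: the function name and the stem length, loop range, and T-run thresholds are all assumptions, and the search requires a perfectly complementary stem.

```python
# Toy search for terminator-like signals: an inverted repeat (which would form
# an RNA hairpin) followed closely by a run of T's on the non-template strand.
# All thresholds are assumed values chosen for illustration.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]

def find_terminator_like(dna: str, stem: int = 6, loop_range=(3, 8),
                         t_run: int = 4, window: int = 8):
    """Return (hairpin_start, hairpin_end) index pairs for candidate signals."""
    dna = dna.upper()
    hits = []
    for i in range(len(dna) - 2 * stem):
        left = dna[i:i + stem]
        for loop in range(loop_range[0], loop_range[1] + 1):
            j = i + stem + loop          # start of the right arm of the stem
            right = dna[j:j + stem]
            if len(right) < stem:
                break                    # ran off the end of the sequence
            if right != reverse_complement(left):
                continue                 # arms cannot base-pair perfectly
            tail = dna[j + stem:j + stem + window]
            if "T" * t_run in tail:      # A-T rich stretch after the hairpin
                hits.append((i, j + stem))
    return hits

# Hypothetical sequence: stem GCCGCC, loop TTCG, stem GGCGGC, then a T run
example = "CCCC" + "GCCGCC" + "TTCG" + "GGCGGC" + "TTTTTTTT" + "ACGT"
print(find_terminator_like(example))
```

Because an almost unlimited number of sequences can form such a hairpin, a search this simple would be expected to report many false positives on a real genome.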


Figure 6-11 The transcription cycle of bacterial RNA polymerase. (A) In
step 1, the RNA polymerase holoenzyme (polymerase core enzyme plus σ factor)
assembles and then locates a promoter DNA sequence (see Figure 6-12). The
polymerase opens (unwinds) the DNA at the position at which transcription is to
begin (step 2) and begins transcribing (step 3). This initial RNA synthesis
(abortive initiation) is relatively inefficient as short, unproductive
transcripts are often released. However, once RNA polymerase has managed to
synthesize about 10 nucleotides of RNA, it breaks its interactions with the
promoter DNA (step 4) and eventually releases σ factor as the polymerase
tightens around the DNA and shifts to the elongation mode of RNA synthesis,
moving along the DNA (step 5). During the elongation mode, transcription is
highly processive, with the polymerase leaving the DNA template and releasing
the newly transcribed RNA only when it encounters a termination signal (steps 6
and 7). Termination signals are typically encoded in DNA, and many function by
forming an RNA hairpin-like structure that destabilizes the polymerase’s hold on
the RNA.

In bacteria, all RNA molecules are synthesized by a single type of RNA
polymerase, and the cycle depicted in the figure therefore applies to the
production of mRNAs as well as structural and catalytic RNAs. (B) Two-dimensional
image of an elongating bacterial RNA polymerase, as determined by atomic force
microscopy (see Figure 9-33). (C) Interpretation of the image in (B). (Adapted
from K.M. Herbert et al., Annu. Rev. Biochem. 77:149-176, 2008.)


Transcription Start and Stop Signals Are Heterogeneous in
Nucleotide Sequence


As we have just seen, the processes of transcription initiation and termination
involve a complicated series of structural transitions in protein, DNA, and RNA
molecules. The signals encoded in DNA that specify these transitions are often
difficult for researchers to recognize. Indeed, a comparison of many different
bacterial promoters reveals a surprising degree of variation. Nevertheless, they
all contain related sequences, reflecting aspects of the DNA that are recognized
directly by the σ factor. These common features are often summarized in the form
of a consensus sequence (Figure 6-12). A consensus nucleotide sequence is derived
by comparing many sequences with the same basic function and tallying up the
most common nucleotides found at each position. It therefore serves as a
summary or “average” of a large number of individual nucleotide sequences. A more
accurate way of displaying the range of DNA sequences recognized by a protein is
through the use of a sequence logo, which reveals the relative frequencies of each
nucleotide at each position (Figure 6-12C).
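
The tallying procedure just described can be sketched in a few lines. The example below derives a consensus from a small set of aligned sequences and also computes the per-position information content (in bits) that sets the total letter height in a sequence logo. The six input hexamers are made-up illustrative data, not real promoter sequences, and the information measure is the simple form (2 bits minus the Shannon entropy of the column) with no small-sample correction.

```python
import math
from collections import Counter

def column_counts(seqs):
    """Count the nucleotides found at each position of equal-length sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned and of equal length")
    return [Counter(s[i] for s in seqs) for i in range(length)]

def consensus(seqs):
    """Most common nucleotide at each position."""
    return "".join(c.most_common(1)[0][0] for c in column_counts(seqs))

def information_bits(seqs):
    """Per-position information content: 2 bits minus the column entropy."""
    n = len(seqs)
    bits = []
    for counts in column_counts(seqs):
        entropy = -sum((k / n) * math.log2(k / n) for k in counts.values())
        bits.append(2.0 - entropy)
    return bits

# Hypothetical aligned hexamers (illustrative only)
example_seqs = ["TATAAT", "TATGAT", "TACAAT", "TATAAT", "GATACT", "TATATT"]
example_consensus = consensus(example_seqs)
example_bits = information_bits(example_seqs)
print(example_consensus)
print([round(b, 2) for b in example_bits])
```

A position that is identical in every sequence carries the full 2 bits, whereas a position that tolerates several bases carries less. The consensus string alone hides this distinction, which is why the logo is the more informative display.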

The DNA sequences of individual bacterial promoters differ in ways that deter- 
mine their strength (the number of initiation events per unit time of the promoter). 
Evolutionary processes have fine-tuned each to initiate as often as necessary and 
have thereby created a wide spectrum of promoter strengths. Promoters for genes 
that code for abundant proteins are much stronger than those associated with 
genes that encode rare proteins, and the nucleotide sequences of their promoters 
are responsible for these differences. 

Like bacterial promoters, transcription terminators also have a wide range of 
sequences, with the potential to form a simple hairpin RNA structure being the 
most important common feature. Since an almost unlimited number of nucleo- 
tide sequences have this potential, terminator sequences are even more heteroge- 
neous than promoter sequences. 

Figure 6-12 Consensus nucleotide sequence and sequence logo for the
major class of E. coli promoters. (A) On the basis of a comparison of 300
promoters, the frequencies of each of the four nucleotides at each position in
the promoter are given. The consensus sequence, shown below the graph, reflects
the most common nucleotide found at each position in the collection of promoters.
These promoters are characterized by two hexameric DNA sequences, the -35
sequence and the -10 sequence, named for their approximate location relative to
the start point of transcription (designated +1). The sequence of nucleotides
between the -35 and -10 hexamers shows no significant similarities among
promoters. For convenience, the nucleotide sequence of a single strand of DNA is
shown; in reality, promoters are double-stranded DNA. The nucleotides shown in
the figure are recognized by σ factor, a subunit of the RNA polymerase
holoenzyme. (B) The distribution of spacing between the -35 and -10 hexamers
found in E. coli promoters. (C) A sequence logo displaying the same information
as in panel (A). Here, the height of each letter is proportional to the frequency
at which that base occurs at that position across a wide variety of promoter
sequences. The total height of all the letters at each position is proportional
to the information content (expressed in bits) at that position. For example, the
total information content of a position that can tolerate several different
bases is small (see the last three bases of the -35 sequence), but statistically
greater than random.

We have discussed bacterial promoters and terminators in some detail to
illustrate an important point regarding the analysis of genome sequences. Although
we know a great deal about bacterial promoters and terminators and can
construct consensus sequences that summarize their most salient features, their
variation in nucleotide sequence makes it difficult to definitively locate them
simply by analysis of the nucleotide sequence of a genome. It is even more
difficult to locate analogous sequences in eukaryotic genomes, due in part to the
excess DNA carried in these genomes. Often we need additional information, some
of it from direct experimentation, to locate and accurately interpret the short
DNA signals in genomes.

Figure 6-13 Directions of transcription along a short portion of a bacterial
chromosome. Some genes are transcribed using one DNA strand as a template, while
others are transcribed using the other DNA strand. The direction of transcription
is determined by the promoter at the beginning of each gene (green arrowheads).
This diagram shows approximately 0.2% (9000 base pairs) of the E. coli
chromosome. The genes transcribed from left to right use the bottom DNA strand as
the template; those transcribed from right to left use the top strand as the
template.

As shown in Figure 6-11, promoter sequences are asymmetric, ensuring that
RNA polymerase can bind in only one orientation. Because the polymerase can
synthesize RNA only in the 5’-to-3’ direction, the promoter orientation specifies
the strand to be used as a template. Genome sequences reveal that the DNA strand
that is used as the template for RNA synthesis varies from gene to gene, depending
on the orientation of the promoter (Figure 6-13).
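
The relationship between the direction of transcription and the choice of template strand can be made concrete with a short sketch. It assumes the chromosome is supplied as its top strand written 5’-to-3’ from left to right; the function name and the example sequence are hypothetical.

```python
# Which strand is the template, and what RNA results, for a gene transcribed
# in a given direction. The top strand is assumed to be written 5'-to-3'.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def transcribe(top_strand: str, start: int, end: int, direction: str):
    """Return (template_strand_name, rna) for the region [start, end)."""
    region = top_strand[start:end].upper()
    if direction == "right":
        # Polymerase moves left to right, reading the bottom strand 3'-to-5';
        # the RNA therefore matches the top strand (with U in place of T).
        template, rna_as_dna = "bottom", region
    elif direction == "left":
        # Polymerase moves right to left, reading the top strand 3'-to-5';
        # the RNA is the reverse complement of the top strand.
        template, rna_as_dna = "top", region.translate(COMPLEMENT)[::-1]
    else:
        raise ValueError("direction must be 'right' or 'left'")
    return template, rna_as_dna.replace("T", "U")

print(transcribe("ATGGCC", 0, 6, "right"))  # ('bottom', 'AUGGCC')
print(transcribe("ATGGCC", 0, 6, "left"))   # ('top', 'GGCCAU')
```

In both cases the RNA is synthesized 5’-to-3’; only the strand that is read changes.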
Having considered transcription in bacteria, we now turn to the situation in 
eukaryotes, where the synthesis of RNA molecules is a much more elaborate affair. 


Transcription Initiation in Eukaryotes Requires Many Proteins


In contrast to bacteria, which contain a single type of RNA polymerase, eukary- 
otic nuclei have three: RNA polymerase I, RNA polymerase II, and RNA polymerase 
III. The three polymerases are structurally similar to one another and share some 
common subunits, but they transcribe different categories of genes (Table 6-2). 
RNA polymerases I and III transcribe the genes encoding transfer RNA, ribo- 
somal RNA, and various small RNAs. RNA polymerase II transcribes most genes, 
including all those that encode proteins, and our subsequent discussion therefore 
focuses on this enzyme. 

Eukaryotic RNA polymerase II has many structural similarities to bacterial 
RNA polymerase (Figure 6-14). But there are several important differences in the 
way in which the bacterial and eukaryotic enzymes function, two of which con- 
cern us immediately. 


1. While bacterial RNA polymerase requires only a single transcription-initiation
factor (σ) to begin transcription, eukaryotic RNA polymerases require
many such factors, collectively called the general transcription factors. 


2. Eukaryotic transcription initiation must take place on DNA that is pack- 
aged into nucleosomes and higher-order forms of chromatin structure 
(described in Chapter 4), features that are absent from bacterial chromo- 
somes. 


TABLE 6-2

TYPE OF POLYMERASE     GENES TRANSCRIBED

RNA polymerase I       5.8S, 18S, and 28S rRNA genes

RNA polymerase II      All protein-coding genes, plus snoRNA genes, miRNA
                       genes, siRNA genes, lncRNA genes, and most snRNA genes

RNA polymerase III     tRNA genes, 5S rRNA genes, some snRNA genes, and
                       genes for other small RNAs

The rRNAs were named according to their “S” values, which refer to their rate of sedimentation
in an ultracentrifuge. The larger the S value, the larger the rRNA.


RNA Polymerase II Requires a Set of General Transcription Factors


The general transcription factors help to position eukaryotic RNA polymerase 
correctly at the promoter, aid in pulling apart the two strands of DNA to allow 
transcription to begin, and release RNA polymerase from the promoter to start its 
elongation mode. The proteins are “general” because they are needed at nearly all 
promoters used by RNA polymerase II. They consist of a set of interacting proteins 
denoted arbitrarily as TFIIA, TFIIB, TFIIC, TFIID, and so on (TFII standing for
“transcription factor for polymerase II”). In a broad sense, the eukaryotic general
transcription factors carry out functions equivalent to those of the σ factor in
bacteria; indeed, portions of TFIIF have the same three-dimensional structure as the
equivalent portions of σ.

Figure 6-14 Structural similarity between a bacterial RNA polymerase and a
eukaryotic RNA polymerase II. Regions of the two RNA polymerases that have
similar structures are indicated in green. The eukaryotic polymerase is larger
than the bacterial enzyme (12 subunits instead of 5), and some of the additional
regions are shown in gray. The blue spheres represent Zn atoms that serve as
structural components of the polymerases, and the red sphere represents the Mg
atom present at the active site, where polymerization takes place. The RNA
polymerases in all modern-day cells (bacteria, archaea, and eukaryotes) are
closely related, indicating that the basic features of the enzyme were in place
before the divergence of the three major branches of life. (Courtesy of
P. Cramer and R. Kornberg.)

Figure 6-15 illustrates how the general transcription factors assemble at
promoters used by RNA polymerase II, and Table 6-3 summarizes their
activities. The assembly process begins when TFIID binds to a short double-helical
DNA sequence primarily composed of T and A nucleotides. For this reason, this
sequence is known as the TATA sequence, or TATA box, and the subunit of TFIID
that recognizes it is called TBP (for TATA-binding protein). The TATA box is
typically located 25 nucleotides upstream from the transcription start site. It is
not the only DNA sequence that signals the start of transcription (Figure 6-16),
but for most polymerase II promoters it is the most important. The binding of
TFIID causes a large distortion in the DNA of the TATA box (Figure 6-17). This
distortion is thought to serve as a physical landmark for the location of an
active promoter in the midst of a very large genome, and it brings DNA sequences
on both sides of the distortion closer together to allow for subsequent protein
assembly steps. Other factors then assemble, along with RNA polymerase II, to
form a complete transcription initiation complex (see Figure 6-15). The most
complicated of the general transcription factors is TFIIH. Consisting of nine
subunits, it is nearly as large as RNA polymerase II itself and, as we shall see
shortly, performs several enzymatic steps needed for the initiation of
transcription.

Figure 6-15 Initiation of transcription of a eukaryotic gene by RNA
polymerase II. To begin transcription, RNA polymerase requires several
general transcription factors. (A) The promoter contains a DNA sequence
called the TATA box, which is located 25 nucleotides away from the site at
which transcription is initiated. (B) Through its subunit TBP, TFIID recognizes
and binds the TATA box, which then enables the adjacent binding of TFIIB
(C). For simplicity the DNA distortion produced by the binding of TFIID (see
Figure 6-17) is not shown. (D) The rest of the general transcription factors,
as well as the RNA polymerase itself, assemble at the promoter. (E) TFIIH
then uses energy from ATP hydrolysis to pry apart the DNA double helix at
the transcription start point, locally exposing the template strand. TFIIH also
phosphorylates RNA polymerase II, changing its conformation so that the
polymerase is released from the general factors and can begin the elongation
phase of transcription. As shown, the site of phosphorylation is a long
C-terminal polypeptide tail, also called the C-terminal domain (CTD), that
extends from the polymerase molecule. The assembly scheme shown in the
figure was deduced from experiments performed in vitro, and the exact order
in which the general transcription factors assemble on promoters probably
varies from gene to gene in vivo. The general transcription factors are highly
conserved; some of those from human cells can be replaced in biochemical
experiments by the corresponding factors from simple yeasts.

TABLE 6-3

NAME              ROLES IN TRANSCRIPTION INITIATION

TFIID
  TBP subunit     Recognizes TATA box
  TAF subunits    Recognizes other DNA sequences near the transcription start
                  point; regulates DNA-binding by TBP

TFIIB             Recognizes BRE element in promoters; accurately positions
                  RNA polymerase at the start site of transcription

TFIIF             Stabilizes RNA polymerase interaction with TBP and TFIIB;
                  helps attract TFIIE and TFIIH

TFIIE             Attracts and regulates TFIIH

TFIIH             Unwinds DNA at the transcription start point, phosphorylates
                  Ser5 of the RNA polymerase CTD; releases RNA polymerase from
                  the promoter

TFIID is composed of TBP and ~11 additional subunits called TAFs (TBP-associated factors);
CTD, C-terminal domain.

After forming a transcription initiation complex on the promoter DNA, RNA
polymerase II must gain access to the template strand at the transcription start
point. TFIIH, which contains a DNA helicase as one of its subunits, makes this
step possible by hydrolyzing ATP and unwinding the DNA, thereby exposing the
template strand. Next, RNA polymerase II, like the bacterial polymerase, remains
at the promoter synthesizing short lengths of RNA until it undergoes a series of
conformational changes that allow it to move away from the promoter and enter
the elongation phase of transcription. A key step in this transition is the
addition of phosphate groups to the “tail” of the RNA polymerase (known as the CTD
or C-terminal domain). In humans, the CTD consists of 52 tandem repeats of a
seven-amino-acid sequence, which extend from the RNA polymerase core
structure. During transcription initiation, the serine located at the fifth
position in the repeat sequence (Ser5) is phosphorylated by TFIIH, which contains
a protein kinase in one of its subunits (see Figure 6-15D and E). The polymerase
can then disengage from the cluster of general transcription factors. During this
process, it undergoes a series of conformational changes that tighten its
interaction with DNA, and it acquires new proteins that allow it to transcribe for
long distances, in some cases for many hours, without dissociating from DNA.

Figure 6-16 Consensus sequences found in the vicinity of eukaryotic RNA
polymerase II start points. [Figure: the BRE, TATA, INR, and DPE elements
positioned relative to the transcription start point on an axis running from
-35 to +30, with columns giving the element, its consensus sequence, and the
general transcription factor that recognizes it.] The name given to each
consensus sequence (first column) and the general transcription factor that
recognizes it (last column) are indicated. N indicates any nucleotide, and two
nucleotides separated by a slash indicate an equal probability of either
nucleotide at the indicated position. In reality, each consensus sequence is a
shorthand representation of a histogram similar to that of Figure 6-12. For most
RNA polymerase II transcription start points, only two or three of the four
sequences are present. For example, many polymerase II promoters have a TATA box
sequence, but those that do not typically have a “strong” INR sequence. Although
most of the DNA sequences that influence transcription initiation are located
upstream of the transcription start point, a few, such as the DPE shown in the
figure, are located in the transcribed region.
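
The shorthand used for the consensus sequences of Figure 6-16 (N for any nucleotide, a slash for either of two nucleotides) maps naturally onto a regular expression. The sketch below performs that conversion. The space-separated input format and the example pattern are assumptions made for illustration; the pattern is not a quotation of the figure's consensus sequences.

```python
import re

def consensus_to_regex(consensus: str) -> re.Pattern:
    """Convert a space-separated consensus such as 'T A A/T N' into a regex."""
    parts = []
    for token in consensus.split():
        if token == "N":
            parts.append("[ACGT]")                              # any nucleotide
        elif "/" in token:
            parts.append("[" + "".join(token.split("/")) + "]")  # either base
        else:
            parts.append(token)                                  # fixed base
    return re.compile("".join(parts))

# Hypothetical TATA-like pattern and test sequence (illustrative only)
pattern = consensus_to_regex("T A T A A/T A A/T N")
match = pattern.search("GGCTATAAAAGGC")
print(pattern.pattern)                 # TATA[AT]A[AT][ACGT]
print(match.start(), match.group())    # 3 TATAAAAG
```

A regular expression treats every allowed base as equally acceptable and every other base as forbidden, so it discards the frequency information that a histogram or sequence logo retains.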

Once the polymerase II has begun elongating the RNA transcript, most of the 
general transcription factors are released from the DNA so that they are available 
to initiate another round of transcription with a new RNA polymerase molecule. 
As we see shortly, the phosphorylation of the tail of RNA polymerase II has an 
additional function: it causes components of the RNA-processing machinery to 
load onto the polymerase and thus be positioned to modify the newly transcribed 
RNA as it emerges from the polymerase. 


Polymerase II Also Requires Activator, Mediator, and
Chromatin-Modifying Proteins


Studies of RNA polymerase II and its general transcription factors acting on DNA 
templates in purified in vitro systems established the model for transcription 
initiation just described. However, as discussed in Chapter 4, DNA in eukaryotic 
cells is packaged into nucleosomes, which are further arranged in higher-order 
chromatin structures. As a result, transcription initiation in a eukaryotic cell is 
more complex and requires more proteins than it does on purified DNA. First, 
gene regulatory proteins known as transcriptional activators must bind to spe- 
cific sequences in DNA (called enhancers) and help to attract RNA polymerase 
II to the start point of transcription (Figure 6-18). We discuss the role of these 
activators in Chapter 7, because they are one of the main ways in which cells reg- 
ulate expression of their genes. Here we simply note that their presence on DNA 
is required for transcription initiation in a eukaryotic cell. Second, eukaryotic 
transcription initiation in vivo requires the presence of a large protein complex 
known as Mediator, which allows the activator proteins to communicate prop- 
erly with the polymerase II and with the general transcription factors. Finally, 
transcription initiation in a eukaryotic cell typically requires the recruitment of
chromatin-modifying enzymes, including chromatin remodeling complexes and
histone-modifying enzymes. As discussed in Chapter 4, both types of enzymes
can increase access to the DNA in chromatin, and by doing so they facilitate the
assembly of the transcription initiation machinery onto DNA.

Figure 6-17 Three-dimensional structure of TBP (TATA-binding protein) bound to
DNA. The TBP is the subunit of the general transcription factor TFIID that is
responsible for recognizing and binding to the TATA box sequence in the DNA
(red). The unique DNA bending caused by TBP (kinks in the double helix separated
by partly unwound DNA) is thought to serve as a landmark that helps to attract
the other general transcription factors (Movie 6.4). TBP is a single polypeptide
chain that is folded into two very similar domains (blue and green). (Adapted
from J.L. Kim et al., Nature 365:520-527, 1993. With permission from
Macmillan Publishers Ltd.)

As illustrated in Figure 6-18, many proteins (well over 100 individual sub- 
units) must assemble at the start point of transcription to initiate transcription in 
a eukaryotic cell. The order of assembly of these proteins does not seem to follow 
a prescribed pathway; rather, the order differs from gene to gene. Indeed, some of 
these different protein complexes may be brought to DNA as preformed subas- 
semblies. 

To begin transcribing, RNA polymerase II must be released from this large 
complex of proteins. In addition to the steps described in Figure 6-15, this release
often requires the in situ proteolysis of the activator protein. We shall return to 
some of these issues, including the role of chromatin remodeling complexes and 
histone-modifying enzymes, in Chapter 7, where we discuss how eukaryotic cells 
regulate the process of transcription initiation. 


Transcription Elongation in Eukaryotes Requires Accessory 
Proteins 


Once RNA polymerase has initiated transcription, it moves jerkily, pausing at 
some DNA sequences and rapidly transcribing through others. Elongating RNA 
polymerases, both bacterial and eukaryotic, are associated with a series of
elongation factors, proteins that decrease the likelihood that RNA polymerase will
dissociate before it reaches the end of a gene. These factors typically associate with RNA
polymerase shortly after initiation and help the polymerase move through the 
wide variety of different DNA sequences that are found in genes. Eukaryotic RNA 
polymerases must also contend with chromatin structure as they move along a 
DNA template, and they are typically aided by ATP-dependent chromatin remod- 
eling complexes that either move with the polymerase or may simply seek out and 
rescue the occasional stalled polymerase. In addition, histone chaperones help 
by partially disassembling nucleosomes in front of a moving RNA polymerase and 
assembling them behind. 

Figure 6-18 Transcription initiation by RNA polymerase II in a eukaryotic cell.
Transcription initiation in vivo requires the presence of transcription
activator proteins. As described in Chapter 7, these proteins bind to specific
short sequences in DNA. Although only one is shown here, a typical eukaryotic
gene utilizes many transcription activator proteins, which in combination
determine its rate and pattern of transcription. Sometimes acting from a distance
of several thousand nucleotide pairs (indicated by the dashed DNA molecule),
these proteins help RNA polymerase, the general transcription factors, and
Mediator all to assemble at the promoter. In addition, activators attract
ATP-dependent chromatin remodeling complexes and histone-modifying enzymes. One
of the main roles of Mediator is to coordinate the assembly of all these proteins
at the promoter so that transcription can begin. As discussed in Chapter 4, the
“default” state of chromatin is a condensed fiber (see Figure 4-28), and this is
likely to be the form of DNA upon which most transcription is initiated. For
simplicity, the chromatin is not shown in this figure.

As RNA polymerase moves along a gene, some of the enzymes bound to it
modify the histones, leaving behind a record of where the polymerase has been.
Although it is not clear exactly how the cell uses this information, it may aid in
transcribing a gene over and over again once it has become active for the first
time. It may also be used to coordinate transcription elongation with the
processing of RNA as it emerges from RNA polymerase, a topic we discuss later in
this chapter.


Transcription Creates Superhelical Tension 


There is yet another barrier to elongating RNA polymerases, both bacterial and 
eukaryotic, one that also applies to DNA polymerases, as discussed in Chapter 5 
(see Figure 5-20). To describe this issue in more detail, we need first to consider a 
subtle property inherent in the DNA double helix called DNA supercoiling. DNA 
supercoiling is the name given to a conformation that DNA adopts in response 
to superhelical tension; alternatively, creating loops or coils in a double-helical 
DNA molecule can create such tension. Figure 6-19 illustrates why. There are 
approximately 10 nucleotide pairs for every helical turn in a DNA double helix. If 
we imagine a helix whose two ends are fixed with respect to each other (as they 
are in a DNA circle, such as a bacterial chromosome, or in a tightly clamped loop,
as is thought to exist in eukaryotic chromosomes), one large DNA supercoil will 
form to compensate for each 10 nucleotide pairs that are opened (unwound). The 
formation of this supercoil is energetically favorable because it restores a normal 
helical twist to the base-paired regions that remain, which would otherwise need 
to be overwound because of the fixed ends. 

Figure 6-19 Superhelical tension in DNA causes DNA supercoiling. (A) For
a DNA molecule with one free end (or a nick in one strand that serves as a
swivel), the DNA double helix rotates by one turn for every 10 nucleotide pairs
opened. (B) If rotation is prevented, superhelical tension is introduced into the
DNA by helix opening. In the example shown, the DNA helix contains 10 helical
turns, one of which is opened. One way of accommodating the tension created would
be to increase the helical twist from 10 to 11 nucleotide pairs per turn in the
double helix that remains. The DNA helix, however, resists such a deformation in
a springlike fashion, preferring to relieve the superhelical tension by bending
into supercoiled loops. As a result, one DNA supercoil forms in the DNA double
helix for every 10 nucleotide pairs opened. The supercoil formed in this case
is a positive supercoil. (C) Supercoiling of DNA is induced by a protein tracking
through the DNA double helix. The two ends of the DNA shown here are unable
to rotate freely relative to each other, and the protein molecule is assumed also
to be prevented from rotating freely as it moves. Under these conditions, the
movement of the protein causes an excess of helical turns to accumulate in the
DNA helix ahead of the protein and a deficit of helical turns to arise in the DNA
behind the protein, as shown.

RNA polymerase creates superhelical tension as it moves along a stretch of
DNA that is anchored at its ends (see Figure 6-19C). As long as the polymerase
is not free to rotate rapidly (and such rotation is unlikely given the size of RNA
polymerases and their attached transcripts), a moving polymerase generates
positive superhelical tension in the DNA in front of it and negative superhelical
tension behind it. For eukaryotes, this situation is thought to provide a bonus:
although the positive superhelical tension ahead of the polymerase makes the DNA
helix more difficult to open, the tension should facilitate the partial
unwrapping of the DNA in nucleosomes, inasmuch as the release of DNA from the
histone core helps to relax this tension.
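
The bookkeeping behind these statements is simple enough to sketch. The example below assumes roughly 10 nucleotide pairs per helical turn, DNA ends that cannot rotate, and a polymerase that does not rotate as it tracks along the helix. The function names are hypothetical, and the sign convention (positive ahead, negative behind) follows the text.

```python
# Idealized supercoiling arithmetic for DNA with fixed ends.
BP_PER_TURN = 10  # approximate helical repeat used in the text

def supercoils_from_opening(bp_opened: float) -> float:
    """Supercoils formed when bp_opened nucleotide pairs are unwound."""
    return bp_opened / BP_PER_TURN

def turns_from_tracking(bp_travelled: float):
    """(excess turns ahead, deficit of turns behind) for a non-rotating
    protein that has tracked bp_travelled nucleotide pairs along the helix."""
    turns = bp_travelled / BP_PER_TURN
    return turns, -turns

print(supercoils_from_opening(10))   # 1.0
print(turns_from_tracking(50))       # (5.0, -5.0)
```

In a cell, this tension does not accumulate indefinitely, because topoisomerases continually remove it.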

Any protein that propels itself alone along a DNA strand of a double helix, such 
as a DNA helicase or an RNA polymerase, tends to generate superhelical tension. 
In eukaryotes, DNA topoisomerase enzymes rapidly remove this superhelical ten- 
sion (see pp. 251-253). But in bacteria a specialized topoisomerase called DNA 
gyrase uses the energy of ATP hydrolysis to pump supercoils continuously into the 
DNA, thereby maintaining the DNA under constant tension. These are negative 
supercoils, having the opposite handedness from the positive supercoils that form 
when a region of DNA helix opens (see Figure 6-19B). Whenever a region of helix 
opens, it removes these negative supercoils from bacterial DNA, reducing the 
superhelical tension. DNA gyrase therefore makes the opening of the DNA helix 
in bacteria energetically favorable compared with helix opening in DNA that is 
not supercoiled. For this reason, it facilitates those genetic processes in bacteria, 
such as the initiation of transcription by bacterial RNA polymerase, that require 
helix opening (see Figure 6-11). 


Transcription Elongation in Eukaryotes Is Tightly Coupled to RNA 
Processing 


We have seen that bacterial mRNAs are synthesized by the RNA polymerase start- 
ing and stopping at specific spots on the genome. The situation in eukaryotes is 
substantially different. In particular, transcription is only the first of several steps 
needed to produce a mature mRNA molecule. Other critical steps are the covalent 
modification of the ends of the RNA and the removal of intron sequences that are 
discarded from the middle of the RNA transcript by the process of RNA splicing 
(Figure 6-20). 

Both ends of eukaryotic mRNAs are modified: by capping on the 5’ end and by
polyadenylation of the 3’ end (Figure 6-21). These special ends allow the cell to
assess whether both ends of an mRNA molecule are present (and if the message
is therefore intact) before it exports the RNA from the nucleus and translates it
into protein. RNA splicing joins together the different portions of a protein-coding
sequence, and it provides eukaryotes with the ability to synthesize several different
proteins from the same gene.

Figure 6-20 Comparison of the
steps leading from gene to protein in
eukaryotes and bacteria. The final level
of a protein in the cell depends on the
efficiency of each step and on the rates
of degradation of the RNA and protein
molecules. (A) In eukaryotic cells, the
mRNA molecule resulting from transcription
contains both coding (exon) and noncoding
(intron) sequences. Before it can be
translated into protein, the two ends of the
RNA are modified, the introns are removed
by an enzymatically catalyzed RNA
splicing reaction, and the resulting mRNA
is transported from the nucleus to the
cytoplasm. For convenience, the steps in
this figure are depicted as occurring one at
a time; in reality, many occur concurrently.
For example, the RNA cap is added and
splicing begins before transcription has
been completed. Because of the coupling
between transcription and RNA processing,
intact primary transcripts (the full-length
RNAs that would, in theory, be produced
if no processing had occurred) are
found only rarely. (B) In prokaryotes, the
production of mRNA is much simpler. The
5’ end of an mRNA molecule is produced
by the initiation of transcription, and the
3’ end is produced by the termination of
transcription. Since prokaryotic cells lack
a nucleus, transcription and translation
take place in a common compartment,
and the translation of a bacterial mRNA
often begins before its synthesis has been
completed.

Figure 6-21 A comparison of the structures of prokaryotic and eukaryotic mRNA molecules. (A) The 5’ and 3’
ends of a bacterial mRNA are the unmodified ends of the chain synthesized by the RNA polymerase, which initiates
and terminates transcription at those points, respectively. The corresponding ends of a eukaryotic mRNA are formed by
adding a 5’ cap and by cleavage of the pre-mRNA transcript near the 3’ end and the addition of a poly-A tail, respectively.
The figure also illustrates another difference between the prokaryotic and eukaryotic mRNAs: bacterial mRNAs can
contain the instructions for several different proteins, whereas eukaryotic mRNAs nearly always contain the information
for only a single protein. (B) The structure of the cap at the 5’ end of eukaryotic mRNA molecules. Note the unusual
5’-to-5’ linkage of the 7-methyl G to the remainder of the RNA. Many eukaryotic mRNAs carry an additional modification:
methylation of the 2’-hydroxyl group of the ribose sugar at the 5’ end of the primary transcript (see Figure 6-23).

A simple strategy has evolved to couple all of the above RNA processing steps 
to transcription elongation. As discussed previously, a key step in transcription 
initiation by RNA polymerase II is the phosphorylation of the RNA polymerase II 
tail, also called the CTD (C-terminal domain). This phosphorylation, which pro- 
ceeds gradually as the RNA polymerase initiates transcription and moves along 
the DNA, not only helps dissociate the RNA polymerase II from other proteins 
present at the start point of transcription, but also allows a new set of proteins 
to associate with the RNA polymerase tail that function in transcription elonga- 
tion and RNA processing. As discussed next, some of these processing proteins 
are thought to “hop” from the polymerase tail onto the nascent RNA molecule 
to begin processing it as it emerges from the RNA polymerase. Thus, we can view 
RNA polymerase II in its elongation mode as an RNA factory that not only moves 
along the DNA synthesizing an RNA molecule, but also processes the RNA that it 
produces (Figure 6-22). Fully extended, the CTD is nearly 10 times longer than 
the remainder of RNA polymerase. As a flexible protein domain, it serves as a scaf- 
fold or tether, holding a variety of proteins close by so that they can rapidly act 
when needed. This strategy, which greatly speeds up the overall rate of a series 
of consecutive reactions, is one that is commonly utilized in the cell (see Figures 
4-58 and 16-18). 


RNA Capping Is the First Modification of Eukaryotic Pre-mRNAs 


As soon as RNA polymerase II has produced about 25 nucleotides of RNA, the 5’
end of the new RNA molecule is modified by addition of a cap that consists of a
modified guanine nucleotide (see Figure 6-21B). Three enzymes, acting in succession,
perform the capping reaction: one (a phosphatase) removes a phosphate
from the 5’ end of the nascent RNA, another (a guanyl transferase) adds a GMP in
a reverse linkage (5’ to 5’ instead of 5’ to 3’), and a third (a methyl transferase) adds
a methyl group to the guanosine (Figure 6-23). Because all three enzymes bind to
the RNA polymerase tail phosphorylated at the Ser5 position (the modification
added by TFIIH during transcription initiation), they are poised to modify the 5’
end of the nascent transcript as soon as it emerges from the polymerase.

Figure 6-22 Eukaryotic RNA polymerase II as an “RNA factory.” As the
polymerase transcribes DNA into RNA, it carries RNA-processing proteins on
its tail that are transferred to the nascent RNA at the appropriate time. The
tail contains 52 tandem repeats of a seven-amino-acid sequence, and there
are two serines in each repeat. The capping proteins first bind to the RNA
polymerase tail when it is phosphorylated on Ser5 of the heptad repeat late in
the process of transcription initiation (see Figure 6-15). This strategy ensures
that the RNA molecule is efficiently capped as soon as its 5’ end emerges
from the RNA polymerase. As the polymerase continues transcribing, its tail
is extensively phosphorylated on the Ser2 positions by a kinase associated
with the elongating polymerase and is eventually dephosphorylated at Ser5
positions. These further modifications attract splicing and 3’-end processing
proteins to the moving polymerase, positioning them to act on the newly
synthesized RNA as it emerges from the RNA polymerase. There are many
RNA-processing enzymes, and not all travel with the polymerase. For RNA
splicing, for example, the tail carries only a few critical components; once
transferred to an RNA molecule, they serve as a nucleation site for the
remaining components.

When RNA polymerase II finishes transcribing a gene, it is released
from DNA, soluble phosphatases remove the phosphates on its tail, and
it can reinitiate transcription. Only the fully dephosphorylated form of RNA
polymerase II is competent to begin RNA synthesis at a promoter.

The 5’-methyl cap signifies the 5’ end of eukaryotic mRNAs, and this landmark
helps the cell to distinguish mRNAs from the other types of RNA molecules
present in the cell. For example, RNA polymerases I and III produce uncapped
RNAs during transcription, in part because these polymerases lack a CTD. In the
nucleus, the cap binds a protein complex called CBC (cap-binding complex),
which, as we discuss in subsequent sections, helps a future mRNA be further processed
and exported. The 5’-methyl cap also has an important role in the translation
of mRNAs in the cytosol, as we discuss later in the chapter.


RNA Splicing Removes Intron Sequences from Newly Transcribed 
Pre-mRNAs 


As discussed in Chapter 4, the protein-coding sequences of eukaryotic genes are 
typically interrupted by noncoding intervening sequences (introns). Discovered 
in 1977, this feature of eukaryotic genes came as a surprise to scientists, who had
been, until that time, familiar only with bacterial genes, which typically consist 
of a continuous stretch of coding DNA that is directly transcribed into mRNA. In 
marked contrast, eukaryotic genes were found to be broken up into small pieces 
of coding sequence (expressed sequences or exons) interspersed with much longer 
intervening sequences or introns; thus, the coding portion of a eukaryotic gene is 
often only a small fraction of the length of the gene (Figure 6-24). 

Both intron and exon sequences are transcribed into RNA. The intron 
sequences are removed from the newly synthesized RNA through the process 
of RNA splicing. The vast majority of RNA splicing that takes place in cells func- 
tions in the production of mRNA, and our discussion of splicing focuses on this 
so-called precursor-mRNA (or pre-mRNA) splicing. Only after 5'- and 3’-end pro- 
cessing and splicing have taken place is such RNA termed mRNA. 
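
The removal of introns described above can be pictured as a simple string operation. The following Python sketch is only an illustration and not a model of the cellular machinery: the function name `splice`, the toy sequence, and the exon coordinates are all invented for this example, and the exon boundaries are assumed to be known in advance.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments in order and discard everything between them.

    Exons are 0-based, half-open (start, end) intervals. The coordinates are
    assumed to be given; a real cell must infer them from signals in the RNA.
    """
    pieces = []
    previous_end = 0
    for start, end in exons:
        # Exons must be non-empty, ordered, non-overlapping, and inside the transcript.
        if not (previous_end <= start < end <= len(pre_mrna)):
            raise ValueError(f"invalid exon interval: ({start}, {end})")
        pieces.append(pre_mrna[start:end])
        previous_end = end
    return "".join(pieces)


# Toy transcript: exon 1, one intron (begins with GU, ends with AG), exon 2.
exon1, intron, exon2 = "AUGGCC", "GUAAGUCUAACAG", "UUUUAA"
pre_mrna = exon1 + intron + exon2
exons = [(0, len(exon1)), (len(exon1) + len(intron), len(pre_mrna))]

mature_mrna = splice(pre_mrna, exons)
print(mature_mrna)  # AUGGCCUUUUAA
```

Both exon and intron sequences are present in `pre_mrna`; only the exon intervals survive in `mature_mrna`.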


Figure 6-23 The reactions that cap the 5’ end of each RNA molecule
synthesized by RNA polymerase II. The final cap contains a novel 5’-to-5’
linkage between the positively charged 7-methyl G residue and the 5’ end
of the RNA transcript (see Figure 6-21B). The letter N represents any one of
the four ribonucleotides, although the nucleotide that starts an RNA chain is
usually a purine (an A or a G). (After A.J. Shatkin, BioEssays 7:275-277, 1987.
With permission from Wiley-Liss, Inc., a subsidiary of John Wiley & Sons, Inc.)

Each splicing event removes one intron, proceeding through two sequential
phosphoryl-transfer reactions known as transesterifications; these join two exons 
together while removing the intron between them as a “lariat” (Figure 6-25). The 
machinery that catalyzes pre-mRNA splicing is complex, consisting of five addi- 
tional RNA molecules and several hundred proteins, and it hydrolyzes many ATP 
molecules per splicing event. This complexity ensures that splicing is accurate, 
while at the same time being flexible enough to deal with the enormous variety of 
introns found in a typical eukaryotic cell. 

It may seem wasteful to remove large numbers of introns by RNA splicing. In 
attempting to explain why it occurs, scientists have pointed out that the exon- 
intron arrangement would seem to facilitate the emergence of new and useful 
proteins over evolutionary time scales. Thus, the presence of numerous introns 
in DNA allows genetic recombination to readily combine the exons of different 
genes, enabling genes for new proteins to evolve more easily by the combination 
of parts of preexisting genes. The observation, described in Chapter 3, that many 
proteins in present-day cells resemble patchworks composed from a common set 
of protein domains, supports this idea (see pp. 121-122). 

RNA splicing also has a present-day advantage. The transcripts of many 
eukaryotic genes (estimated at 95% of genes in humans) are spliced in more than 
one way, thereby allowing the same gene to produce a corresponding set of dif- 
ferent proteins (Figure 6-26). Rather than being the wasteful process it may have 
seemed at first sight, RNA splicing enables eukaryotes to increase the coding 
potential of their genomes. We shall return to this idea again in this chapter and 
the next, but we first need to describe the cellular machinery that performs this 
remarkable task. 
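
To make the gain in coding potential concrete, the sketch below counts the mRNAs that one gene could yield if some of its exons may be either kept or skipped. It is a deliberately simplified toy: the function `isoforms`, the exon sequences, and the choice of which exons are optional are assumptions made for this example, and real alternative splicing is regulated and includes other patterns as well.

```python
from itertools import product


def isoforms(exons: list[str], optional: set[int]) -> list[str]:
    """List every mRNA obtained by keeping or skipping each optional exon.

    `optional` holds the indices of exons that may be skipped; all other
    exons are always included. Exon order is never changed.
    """
    optional_sorted = sorted(optional)
    results = []
    for choices in product([True, False], repeat=len(optional_sorted)):
        keep = dict(zip(optional_sorted, choices))
        # Exons not listed as optional default to being kept.
        results.append("".join(e for i, e in enumerate(exons) if keep.get(i, True)))
    return results


exons = ["AUGGCC", "GAUUAC", "CCGAAA", "UGCUAA"]  # hypothetical exon sequences
variants = isoforms(exons, optional={1, 2})
print(len(variants))  # 4
```

With n independently optional exons this toy model gives 2^n possible mRNAs from a single gene.
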

Figure 6-24 Structure of two human
genes showing the arrangement of
exons and introns. (A) The relatively small
β-globin gene, which encodes a subunit of
the oxygen-carrying protein hemoglobin,
contains 3 exons (see also Figure 4-7).
(B) The much larger Factor VIII gene
contains 26 exons; it codes for a protein
(Factor VIII) that functions in the blood-clotting
pathway. The most prevalent form
of hemophilia results from mutations in
this gene.


Figure 6-25 The pre-mRNA splicing
reaction. (A) In the first step, a specific
adenine nucleotide in the intron sequence
(indicated in red) attacks the 5’ splice site
and cuts the sugar-phosphate backbone
of the RNA at this point. The cut 5’ end
of the intron becomes covalently linked to
the adenine nucleotide, as shown in detail
in (B), thereby creating a loop in the RNA
molecule. The released free 3’-OH end of
the exon sequence then reacts with the
start of the next exon sequence, joining
the two exons together and releasing the
intron sequence in the shape of a lariat.
The two exon sequences thereby become
joined into a continuous coding sequence.
The released intron sequence is eventually
broken down into single nucleotides, which
are recycled.


Nucleotide Sequences Signal Where Splicing Occurs


The mechanism of pre-mRNA splicing shown in Figure 6-25 requires that the
splicing machinery recognize three portions of the precursor RNA molecule: the 
5’ splice site, the 3’ splice site, and the branch point in the intron sequence that 
forms the base of the excised lariat. Not surprisingly, each site has a consensus 
nucleotide sequence that is similar from intron to intron and provides the cell 
with cues for where splicing is to take place (Figure 6-27). However, these con- 
sensus sequences are relatively short and can accommodate extensive sequence 
variability; as we shall see shortly, the cell incorporates additional types of infor- 
mation to ultimately choose exactly where, on each RNA molecule, splicing is to 
take place. 
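
A short Python sketch shows why such short signals are not enough on their own. It lists every stretch of a toy RNA that begins with GU and ends with AG, the only invariant nucleotides of the consensus (see Figure 6-27). The function name `candidate_introns`, the `min_len` cutoff, and the sequence are assumptions made for this illustration; this is not how gene-prediction software works.

```python
import re


def candidate_introns(rna: str, min_len: int = 10) -> list[tuple[int, int]]:
    """Return every (start, end) span that begins with GU and ends with AG.

    Spans are 0-based and half-open. `min_len` is an assumed lower bound on
    intron length chosen for this example.
    """
    # Lookahead patterns find overlapping occurrences without consuming text.
    starts = [m.start() for m in re.finditer(r"(?=GU)", rna)]
    ends = [m.start() + 2 for m in re.finditer(r"(?=AG)", rna)]
    return [(s, e) for s in starts for e in ends if e - s >= min_len]


rna = "AUGGCCGUAAGUCUAACAGUUUUAA"  # toy transcript with one intended intron
print(candidate_introns(rna))             # [(6, 19)]
print(candidate_introns(rna, min_len=4))  # [(6, 11), (6, 19), (10, 19)]
```

Even in this 25-nucleotide example, relaxing the length cutoff yields three candidates for a single intended intron; in a transcript thousands of nucleotides long, the number of GU...AG pairs becomes very large.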

The high variability of the splicing consensus sequences presents a special 
challenge for scientists attempting to decipher genome sequences. Introns range 
in size from about 10 nucleotides to over 100,000 nucleotides, and choosing the 
precise borders of each intron is a difficult task even with the aid of powerful com- 
puters. The possibility of alternative splicing compounds the problem of predict- 
ing protein sequences solely from a genome sequence. This difficulty is one of the 
main barriers to identifying all of the genes in a complete genome sequence, and 
it is one of the primary reasons why we know only the approximate number of 
different proteins produced by the human genome. 


RNA Splicing Is Performed by the Spliceosome 


Unlike the other steps of mRNA production we have discussed, key steps in RNA 
splicing are performed by RNA molecules rather than proteins. Specialized RNA 
molecules recognize the nucleotide sequences that specify where splicing is to 
occur and also catalyze the chemistry of splicing. These RNA molecules are rela- 
tively short (less than 200 nucleotides each), and there are five of them, U1, U2, U4, 
U5, and U6. Known as snRNAs (small nuclear RNAs), each is complexed with at 
least seven protein subunits to form an snRNP (small nuclear ribonucleoprotein).

Figure 6-26 Alternative splicing of
the α-tropomyosin gene from rat.
α-Tropomyosin is a coiled-coil protein (see
Figure 3-9) that carries out several tasks,
most notably the regulation of contraction
in muscle cells. The primary transcript can
be spliced in different ways, as indicated
in the figure, to produce distinct mRNAs,
which then give rise to variant proteins.
Some of the splicing patterns are specific
for certain types of cells. For example, the
α-tropomyosin made in striated muscle
is different from that made from the same
gene in smooth muscle. The arrowheads
in the top part of the figure mark the sites
where cleavage and poly-A addition form
the 3’ ends of the mature mRNAs.


Figure 6-27 The consensus nucleotide 
sequences in an RNA molecule that 
signal the beginning and the end of 
most introns in humans. The three 
blocks of nucleotide sequences shown are 
required to remove an intron sequence. 
Here A, G, U, and C are the standard RNA 
nucleotides; R stands for purines (A or
G); and Y stands for pyrimidines (C or U).
The A highlighted in red forms the branch 
point of the lariat produced by splicing (see 
Figure 6-25). Only the GU at the start of the 
intron and the AG at its end are invariant 
nucleotides in the splicing consensus 
sequences. Several different nucleotides 
can occupy the remaining positions, 
although the indicated nucleotides are 
preferred. The distances along the RNA 
between the three splicing consensus 
sequences are highly variable; however, 
the distance between the branch point and 
3’ splice junction is typically much shorter 
than that between the 5’ splice junction 
and the branch point.

These snRNPs form the core of the spliceosome, the large assembly of RNA and
protein molecules that performs pre-mRNA splicing in the cell. During the splic- 
ing reaction, recognition of the 5’ splice junction, the branch-point site, and the 
3’ splice junction is performed largely through base-pairing between the snRNAs 
and the consensus RNA sequences in the pre-mRNA substrate. 

The spliceosome is a complex and dynamic machine. When studied in vitro, a
few components of the spliceosome assemble on pre-mRNA and, as the splicing
reaction proceeds, new components enter and those that have already performed
their tasks are jettisoned (Figure 6-28). However, many scientists believe that,
inside the cell, the spliceosome is a preexisting, loose assembly of all the components,
capturing, splicing, and releasing RNA as a coordinated unit and undergoing
extensive rearrangements each time a splice is made.

Figure 6-28 The pre-mRNA splicing
mechanism. RNA splicing is catalyzed by
an assembly of snRNPs (shown as colored
circles) plus other proteins (most of which are
not shown), which together constitute the
spliceosome. The spliceosome recognizes the
splicing signals on a pre-mRNA molecule, brings
the two ends of the intron together, and provides
the enzymatic activity for the two reaction steps
required (see Figure 6-25A and Movie 6.5).
As indicated, a set of proteins called the exon
junction complex (EJC) remains on the spliced
mRNA molecule; its subsequent role will be
discussed shortly.


The Spliceosome Uses ATP Hydrolysis to Produce a Complex
Series of RNA-RNA Rearrangements


ATP hydrolysis is not required for the chemistry of RNA splicing per se since the 
two transesterification reactions preserve the high-energy phosphate bonds. 
However, extensive ATP hydrolysis is required for the assembly and rearrange- 
ments of the spliceosome. Some of the additional proteins that make up the spli- 
ceosome use the energy of ATP hydrolysis to break existing RNA-RNA interactions 
to allow the formation of new ones. Each successful splice requires approximately 
200 proteins, if we include those that form the snRNPs. 

What is the purpose of these rearrangements? First, they allow the splicing signals
on the pre-mRNA to be examined by snRNPs several times during the course of
splicing. For example, the U1 snRNP initially recognizes the 5’ splice site through 
conventional base-pairing; as splicing proceeds, these base pairs are broken 
(using the energy of ATP hydrolysis) and U1 is replaced by U6 (Figure 6-29). This 
type of RNA-RNA rearrangement (in which the formation of one RNA-RNA inter- 
action requires the disruption of another) occurs several times during splicing 
and allows the spliceosomes to check and recheck the splicing signals, thereby 
increasing the overall accuracy of splicing. Second, the rearrangements that take 
place in the spliceosome create the active sites for the two transesterification 
reactions. These two active sites are created, one after the other, and only after the 
splicing signals on the pre-mRNA have been checked several times. This orderly 
progression ensures that splicing accidents occur only rarely. 

One of the most surprising features of the spliceosome is the nature of the cat- 
alytic sites: they are formed by both protein and RNA molecules, although the 
RNA molecules catalyze the actual chemistry of splicing. In the last section of this 
chapter, we discuss in general terms the structural and chemical properties of 
RNA molecules that allow them to act as catalysts. 

Once the splicing chemistry is completed, the snRNPs remain bound to the 
lariat. The disassembly of these snRNPs from the lariat (and from each other) 
requires another series of RNA-RNA rearrangements that require ATP hydroly- 
sis, thereby returning the snRNAs to their original configuration so that they can 
be used again in a new reaction. At the completion of a splice, the spliceosome 
directs a set of proteins to bind to the mRNA near the position formerly occupied 
by the intron. Called the exon junction complex (EJC), these proteins mark the site 
of a successful splicing event and, as we shall see later in this chapter, influence 
the subsequent fate of the mRNA. 


Figure 6-29 One of the many
rearrangements that take place in the
spliceosome during pre-mRNA splicing.
This example comes from the yeast
Saccharomyces cerevisiae, in which the
nucleotide sequences involved are slightly
different from those in human cells. The
exchange of U1 snRNP for U6 snRNP
occurs just before the first phosphoryl-transfer
reaction (see Figure 6-28). This
exchange requires the 5’ splice site to be
read by two different snRNPs, thereby
increasing the accuracy of 5’ splice-site
selection by the spliceosome.


Other Properties of Pre-mRNA and Its Synthesis Help to Explain
the Choice of Proper Splice Sites


As we have seen, intron sequences vary enormously in size, with some being in
excess of 100,000 nucleotides. If splice-site selection were determined solely by
the snRNPs acting on a preformed, protein-free RNA molecule, we would expect
frequent splicing mistakes, such as exon skipping and the use of “cryptic” splice
sites (Figure 6-30). The fidelity mechanisms built into the spliceosome to suppress
errors, however, are supplemented by two additional strategies that further
increase the accuracy of splicing. The first is a simple consequence of splicing
being coupled to transcription. As transcription proceeds, the phosphorylated
tail of RNA polymerase carries several components of the spliceosome (see Figure
6-22), and these components are transferred directly from the polymerase to the
RNA as the RNA emerges from the polymerase. This strategy helps the cell keep 
track of introns and exons: for example, the snRNPs that assemble at a 5’ splice 
site are initially presented only with the single 3’ splice site that emerges next from 
the polymerase; the potential sites further downstream have not yet been synthe- 
sized. The coordination of transcription with splicing is especially important in 
preventing inappropriate exon skipping. 
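
The bookkeeping advantage of co-transcriptional splicing can be sketched in a few lines of Python. The comparison below is a toy model under stated assumptions: splice-site positions are invented, each 5’ splice site is simply paired with the first 3’ splice site synthesized after it, and the function names are hypothetical.

```python
def cotranscriptional_pairs(donors: list[int], acceptors: list[int]) -> list[tuple[int, int]]:
    """Pair each 5' splice site with the first 3' splice site made after it."""
    pairs = []
    for donor in sorted(donors):
        downstream = [a for a in sorted(acceptors) if a > donor]
        if downstream:
            pairs.append((donor, downstream[0]))
    return pairs


def preformed_pairs(donors: list[int], acceptors: list[int]) -> list[tuple[int, int]]:
    """Every 5'/3' combination available on a fully synthesized RNA."""
    return [(d, a) for d in sorted(donors) for a in sorted(acceptors) if a > d]


donors = [100, 400, 900]      # hypothetical 5' splice-site positions
acceptors = [300, 700, 1500]  # hypothetical 3' splice-site positions

print(cotranscriptional_pairs(donors, acceptors))  # [(100, 300), (400, 700), (900, 1500)]
print(len(preformed_pairs(donors, acceptors)))     # 6
```

The three extra pairings that are possible on the preformed RNA, such as (100, 700), each skip at least one exon, which is the kind of error that the coupling to transcription helps to prevent.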

A strategy called “exon definition” also helps cells choose the appropriate 
splice sites. Exon size tends to be much more uniform than intron size, averaging 
about 150 nucleotide pairs across a wide variety of eukaryotic organisms (Figure 
6-31). Through exon definition, the splicing machinery can seek out the relatively 
homogeneously sized exon sequences. As RNA synthesis proceeds, a group of 
additional components (most notably SR proteins, so-named because they con- 
tain a domain rich in serines and arginines) assemble on exon sequences and 
help to mark off each 3’ and 5’ splice site, starting at the 5’ end of the RNA (Figure 
6-32). These proteins, in turn, recruit U1 snRNA, which marks the downstream 
exon boundary, and U2 snRNA, which specifies the upstream one. By specifically 
marking the exons in this way and thereby taking advantage of the relatively uni- 
form size of exons, the cell increases the accuracy with which it deposits the initial 
splicing components on the nascent RNA and thereby avoids “near miss” splice 
sites. How the SR proteins discriminate exon sequences from intron sequences is 
not understood in detail; however, it is known that some of the SR proteins bind 
preferentially to specific RNA sequences in exons, termed splicing enhancers. 
In principle, since any one of several different codons can be used to code for a 
given amino acid, there is freedom to evolve the exon nucleotide sequence so as 
to form a binding site for an SR protein, without necessarily affecting the amino 
acid sequence that the exon specifies. 
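
The freedom provided by synonymous codons can be illustrated with a small enumeration. The sketch below lists every RNA sequence that encodes a short peptide and keeps those that contain a given motif. The codon assignments for glutamate and lysine come from the standard genetic code; the motif `GAAGAA` is only a placeholder standing in for an SR-protein binding site, and the function name is invented for this example.

```python
from itertools import product

# Partial codon table (standard genetic code) for the amino acids used below.
CODONS = {
    "E": ["GAA", "GAG"],  # glutamate
    "K": ["AAA", "AAG"],  # lysine
}


def codings_with_motif(peptide: str, motif: str) -> list[str]:
    """Return every synonymous RNA coding of `peptide` that contains `motif`."""
    all_codings = ("".join(c) for c in product(*(CODONS[aa] for aa in peptide)))
    return [rna for rna in all_codings if motif in rna]


# "GAAGAA" is a placeholder motif chosen for this example, not a sequence
# given in the text.
hits = codings_with_motif("EEK", "GAAGAA")
print(hits)  # ['GAAGAAAAA', 'GAAGAAAAG']
```

Of the eight ways to encode this three-residue peptide, two contain the motif, so in this toy case the motif could be gained or lost without changing the protein.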

Both the marking of exon and intron boundaries and the assembly of the spli- 
ceosome begin on an RNA molecule while it is still being elongated by RNA poly- 
merase at its 3’ end. However, the actual chemistry of splicing can take place later. 
This delay means that intron sequences are not necessarily removed from a
pre-mRNA molecule in the order in which they occur along the RNA chain.

Figure 6-30 Two types of splicing errors.
(A) Exon skipping. (B) Cryptic splice-site
selection. Cryptic splicing signals
are nucleotide sequences of RNA that
closely resemble true splicing signals and
are sometimes mistakenly used by the
spliceosome.


Figure 6-31 Variation in intron and exon 
lengths in the human, worm, and fly 
genomes. (A) Size distribution of exons. 
(B) Size distribution of introns. Note that 
exon length is much more uniform than 
intron length. (Adapted from International 
Human Genome Sequencing Consortium, 
Nature 409:860-921, 2001. With 
permission from Macmillan Publishers Ltd.)


Chromatin Structure Affects RNA Splicing


Although it may seem at first counterintuitive, the way a gene is packaged into 
chromatin can affect how the RNA transcript of that gene is ultimately spliced. 
Nucleosomes tend to be positioned over exons (which are, on average, close to the 
length of DNA in a nucleosome), and it has been proposed that these act as “speed
bumps,” allowing the proteins responsible for exon definition to assemble on the
RNA as it emerges from the polymerase. In addition, changes in chromatin struc- 
ture are used to alter splicing patterns. There are two ways this can happen. First, 
because splicing and transcription are coupled, the rate at which RNA polymerase 
moves along DNA can affect RNA splicing. For example, if polymerase is mov- 
ing slowly, exon skipping (see Figure 6-30A) is minimized: assembly of the ini- 
tial spliceosome may be complete before an alternative choice of splice site even 
emerges from the RNA polymerase. The nucleosomes in condensed chromatin 
can cause polymerase to pause; the pattern of pauses in turn affects the extent of 
RNA exposed at any given time to the splicing machinery. 

There is a second and more direct way that chromatin structure can affect RNA 
splicing. Although the details are not yet understood, specific histone modifica- 
tions attract components of the spliceosome, and, because the chromatin being 
transcribed is in close association with the nascent RNA, these splicing compo- 
nents can easily be transferred to the emerging RNA. In this way, certain types of 
histone modifications can affect the final pattern of splicing. 


RNA Splicing Shows Remarkable Plasticity 


We have seen that the choice of splice sites depends on such features of the pre- 
mRNA transcript as the strength of the three signals on the RNA (the 5’ and 3’ splice 
junctions and the branch point) for the splicing machinery, the co-transcriptional 
assembly of the spliceosome, chromatin structure, and the “bookkeeping” that 
underlies exon definition. We do not know exactly how accurate splicing normally 
is because, as we see later, there are several quality control systems that rapidly 
destroy mRNAs whose splicing goes awry. However, we do know that, compared 
with other steps in gene expression, splicing is unusually flexible. 

Figure 6-32 The exon definition
hypothesis. According to this idea,
SR proteins bind to each exon sequence
in the pre-mRNA and thereby help to
guide the snRNPs to the proper intron/exon
boundaries. This demarcation of
exons by the SR proteins occurs
co-transcriptionally, beginning at the CBC
(cap-binding complex) at the 5’ end.
It has been proposed that a group of
proteins known as the heterogeneous
nuclear ribonucleoproteins (hnRNPs)
may preferentially associate with intron
sequences, further helping the spliceosome
distinguish introns from exons. (Adapted
from R. Reed, Curr. Opin. Cell Biol.
12:340-345, 2000. With permission from
Elsevier.)

Thus, for example, a mutation in a nucleotide sequence critical for splicing of
a particular intron does not necessarily prevent splicing of that intron altogether.
Instead, the mutation typically creates a new pattern of splicing (Figure 6-33).
Most commonly, an exon is simply skipped (Figure 6-33B). In other cases, the
mutation causes a cryptic splice junction to be efficiently used (Figure 6-33C).
Apparently, the splicing machinery has evolved to pick out the best possible pattern
of splice junctions, and if the optimal one is damaged by mutation, it will
seek out the next best pattern, and so on. This inherent plasticity in the process of
RNA splicing suggests that changes in splicing patterns caused by random mutations
have been important in the evolution of genes and organisms. It also means
that mutations that affect splicing can be severely detrimental to the organism:
in addition to the β-thalassemia example presented in Figure 6-33, aberrant
splicing plays important roles in the development of cystic fibrosis, frontotemporal
dementia, Parkinson’s disease, retinitis pigmentosa, spinal muscular atrophy,
myotonic dystrophy, premature aging, and cancer. It has been estimated that
of the many point mutations that cause inherited human diseases, 10% produce 
aberrant splicing of the gene containing the mutation. 
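
The "next best pattern" behavior described above can be mimicked with a simple scoring scheme. In the sketch below, each candidate 5’ splice site is scored by how many positions match a six-nucleotide consensus, and the best-scoring site is chosen. This is an illustrative assumption and not the spliceosome's actual selection rule: the consensus string `GURAGU` (R = A or G) is the commonly cited intron-side 5’ splice-site consensus and is not spelled out in the text, and the sequences and function names are invented.

```python
from typing import Optional

CONSENSUS = "GURAGU"  # assumed 5' splice-site consensus; R = purine (A or G)


def site_score(window: str) -> int:
    """Count consensus matches; a window lacking the invariant GU scores 0."""
    if not window.startswith("GU"):
        return 0
    return sum(
        (base in "AG") if symbol == "R" else (base == symbol)
        for base, symbol in zip(window, CONSENSUS)
    )


def best_donor(rna: str) -> Optional[int]:
    """Return the start index of the highest-scoring candidate site, or None."""
    scores = {
        i: site_score(rna[i:i + len(CONSENSUS)])
        for i in range(len(rna) - len(CONSENSUS) + 1)
    }
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Toy transcript: a perfect site at index 3 and a weaker, cryptic site at index 14.
normal = "CAG" + "GUAAGU" + "CCUUG" + "GUGAGC" + "AA"
mutant = normal[:3] + "A" + normal[4:]  # G-to-A change destroys the invariant GU

print(best_donor(normal))  # 3
print(best_donor(mutant))  # 14
```

In the mutant, splicing is not abolished in this toy model; the choice simply shifts to the cryptic site, as in the pattern shown in Figure 6-33C.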

The plasticity of RNA splicing also means that the cell can easily regulate the 
pattern of RNA splicing. Earlier in this section we saw that alternative splicing can 
give rise to different proteins from the same gene and that this is a common strat- 
egy to enhance the coding potential of genomes. Some examples of alternative 
splicing are constitutive; that is, the alternatively spliced mRNAs are produced 
continuously by cells of an organism. However, in many cases, the cell regulates 
the splicing patterns so that different forms of the protein are produced at differ- 
ent times and in different tissues (see Figure 6-26). In Chapter 7, we return to this 
issue to discuss some specific examples of regulated RNA splicing. 


Spliceosome-Catalyzed RNA Splicing Probably Evolved from 
Self-splicing Mechanisms 


When the spliceosome was first discovered, it puzzled molecular biologists. Why 
do RNA molecules instead of proteins perform important roles in splice-site 
recognition and in the chemistry of splicing? Why is a lariat intermediate used 
rather than the apparently simpler alternative of bringing the 5’ and 3’ splice sites 
together in a single step, followed by their direct cleavage and rejoining? The 
answers to these questions reflect the way in which the spliceosome has evolved. 

As discussed briefly in Chapter 1 (and in more detail in the final section of this 
chapter), it is likely that early cells used RNA molecules rather than proteins as 
their major catalysts and that they stored their genetic information in RNA rather 
than in DNA sequences. RNA-catalyzed splicing reactions presumably had criti- 
cal roles in these early cells. As evidence, some self-splicing RNA introns (that is, 
intron sequences in RNA whose splicing out can occur in the absence of proteins 
or any other RNA molecules) remain today—for example, in the nuclear rRNA 
genes of the ciliate Tetrahymena, in a few bacteriophage T4 genes, and in some 
mitochondrial and chloroplast genes. In these cases, the RNA molecule folds 
into a specific three-dimensional structure that brings the intron/exon junctions 
together and catalyzes the two transesterification reactions. A self-splicing intron 
sequence can be identified in a test tube by incubating a pure RNA molecule that 
contains the intron sequence and observing the splicing reaction. Because the 
basic chemistry of some self-splicing reactions is so similar to pre-mRNA splicing, 
it has been proposed that the much more involved process of pre-mRNA splicing 
evolved from a simpler, ancestral form of RNA self-splicing. 


RNA-Processing Enzymes Generate the 3’ End of Eukaryotic 
mRNAs 


We have seen that the 5’ end of the pre-mRNA produced by RNA polymerase II 
is capped almost as soon as it emerges from the RNA polymerase. Then, as the 
polymerase continues its movement along a gene, the spliceosome assembles 
on the RNA and delineates the intron and exon boundaries. The long C-terminal 
tail of the RNA polymerase coordinates these processes by transferring capping 
and splicing components directly to the RNA as it emerges from the enzyme. In 
this section, we shall see that, as RNA polymerase II reaches the end of a gene, 
a similar mechanism ensures that the 3’ end of the pre-mRNA is appropriately 
processed. 

The position of the 3’ end of each mRNA molecule is specified by signals 
encoded in the genome (Figure 6-34). These signals are transcribed into RNA as 
the RNA polymerase II moves through them, and they are then recognized (as 
RNA) by a series of RNA-binding proteins and RNA-processing enzymes (Figure 
6-35). Two multisubunit proteins, called CstF (cleavage stimulation factor) and 
CPSF (cleavage and polyadenylation specificity factor), are of special importance. 

Figure 6-33 Abnormal processing of 
the β-globin primary RNA transcript in 
humans with the disease β thalassemia. 
In the examples shown, the disease (a 
severe anemia due to aberrant hemoglobin 
synthesis) is caused by splice-site 
mutations found in the genomes of affected 
patients. The dark blue boxes represent 
the three normal exon sequences; the red 
lines connect the 5’ and 3’ splice sites that 
are used. In (B), (C), and (D), the light blue 
boxes depict new nucleotide sequences 
included in the final mRNA molecule as a 
result of the mutation denoted by the 
black arrowhead. Note that when a 
mutation leaves a normal splice site 
without a partner, an exon is skipped (B) 
or one or more abnormal cryptic splice 
sites nearby is used as the partner site (C). 
[Adapted in part from S.H. Orkin, in The 
Molecular Basis of Blood Diseases 
(G. Stamatoyannopoulos et al., eds.), 
pp. 106-126. Philadelphia: Saunders, 
1987.] 


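[Figure 6-34 diagram labels: AAUAAA, followed after 10-30 nucleotides by the CA cleavage site, followed within < 30 nucleotides by the GU-rich element.] 
[Figure 6-35 diagram labels: CLEAVAGE releases a downstream RNA fragment that is degraded in the nucleus; POLY-A ADDITION then extends the new 3’ end with ~200 A nucleotides.] 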


Both of these proteins travel with the RNA polymerase tail and are transferred to 
the 3’-end processing sequence on an RNA molecule as it emerges from the RNA 
polymerase. 

Once CstF and CPSF bind to their recognition sequences on the emerging 
RNA molecule, additional proteins assemble with them to create the 3’ end of the 
mRNA. First, the RNA is cleaved from the polymerase (see Figure 6-35). Next an 
enzyme called poly-A polymerase (PAP) adds, one at a time, approximately 200 A 
nucleotides to the 3’ end produced by the cleavage. The nucleotide precursor for 
these additions is ATP, and the same type of 5’-to-3’ bonds are formed as in con- 
ventional RNA synthesis. But unlike other RNA polymerases, poly-A polymerase 
does not require a template; hence the poly-A tail of eukaryotic mRNAs is not 
directly encoded in the genome. As the poly-A tail is synthesized, proteins called 
poly-A-binding proteins assemble onto it and, by a poorly understood mecha- 
nism, help determine the final length of the tail. 
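
The cleavage and poly-A addition steps described above can be sketched as a toy string operation. This is only an illustration under stated assumptions, not a model of the real machinery: the function name is hypothetical, the 10-30 nucleotide spacing between AAUAAA and the CA cleavage site is taken from the labels of Figure 6-34, cleavage is placed immediately after the CA, and the tail is fixed at exactly 200 A nucleotides even though the text gives this only as an approximate length.

```python
import re

POLY_A_SIGNAL = "AAUAAA"    # hexamer recognized by CPSF
MIN_GAP, MAX_GAP = 10, 30   # assumed spacing between the hexamer and CA
TAIL_LENGTH = 200           # approximate number of A nucleotides added


def cleave_and_polyadenylate(pre_mrna, tail_length=TAIL_LENGTH):
    """Return (mrna, downstream_fragment), or None if no signal is usable.

    Toy rule (an assumption for illustration): cleave just after the first
    'CA' that starts 10-30 nucleotides downstream of an AAUAAA hexamer.
    """
    for signal in re.finditer(POLY_A_SIGNAL, pre_mrna):
        first = signal.end() + MIN_GAP
        last = signal.end() + MAX_GAP
        # str.find needs the whole 'CA' inside [first, last + 2)
        ca_start = pre_mrna.find("CA", first, last + 2)
        if ca_start != -1:
            cut = ca_start + 2
            # The tail is added without a template, one A at a time in the cell
            mrna = pre_mrna[:cut] + "A" * tail_length
            downstream = pre_mrna[cut:]   # this fragment is degraded
            return mrna, downstream
    return None
```

For a made-up transcript such as "GGG" + "AAUAAA" + "U" * 15 + "CA" + "GUGUGUGU" + "CCCC", the function returns the upstream portion ending in CA plus 200 A nucleotides, together with the downstream fragment "GUGUGUGUCCCC". Note that the tail never appears in the input sequence, mirroring the point that it is not encoded in the genome.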

After the 3’ end of a eukaryotic pre-mRNA molecule has been cleaved, the RNA 
polymerase II continues to transcribe, in some cases for hundreds of nucleotides. 
Once 3’-end cleavage has occurred, the newly synthesized RNA that emerges from 
the polymerase lacks a 5’ cap; this unprotected RNA is rapidly degraded by a 
5’-to-3’ exonuclease carried along on the polymerase tail. Apparently, it is this 
continued RNA degradation that eventually causes the RNA polymerase to release its 
grip on the template and terminate transcription. 


Mature Eukaryotic mRNAs Are Selectively Exported from the 
Nucleus 


Eukaryotic pre-mRNA synthesis and processing take place in an orderly fashion 
within the cell nucleus. But of the pre-mRNA that is synthesized, only a small frac- 
tion—the mature mRNA—is of further use to the cell. Most of the rest—excised 
introns, broken RNAs, and aberrantly processed pre-mRNAs—is not only useless 
but potentially dangerous. How does the cell distinguish between the relatively 
rare mature mRNA molecules it wishes to keep and the overwhelming amount of 
debris created by RNA processing? 

The answer is that, as an RNA molecule is processed, it loses certain proteins 
and acquires others. For example, we have seen that acquisition of cap-binding 
complexes, exon junction complexes, and poly-A-binding proteins marks the 
completion of capping, splicing, and poly-A addition, respectively. A properly 
completed mRNA molecule is also distinguished by the proteins it lacks. For example, 
the presence of an snRNP protein would signify incomplete or aberrant splicing. 
Only when the proteins present on an mRNA molecule collectively signify that pro- 
cessing was successfully completed is the mRNA exported from the nucleus into 
the cytosol, where it can be translated into protein. Improperly processed mRNAs 


Figure 6-35 Some of the major steps in generating the 3’ end of a 
eukaryotic mRNA. This process is much more complicated than the 
analogous process in bacteria, where the RNA polymerase simply stops at a 
termination signal and releases both the 3’ end of its transcript and the DNA 
template (see Figure 6-11). 

Figure 6-34 Consensus nucleotide 
sequences that direct cleavage and 
polyadenylation to form the 3’ end of 
a eukaryotic mRNA. These sequences 
are encoded in the genome, and specific 
proteins recognize them—as RNA—after 
they are transcribed. As shown in Figure 
6-35, the hexamer AAUAAA is bound by 
CPSF and the GU-rich element beyond 
the cleavage site is bound by CstF; the 
CA sequence is bound by a third protein 
factor required for the cleavage step. Like 
other consensus nucleotide sequences 
discussed in this chapter (see Figure 
6-12), the sequences shown in the figure 
represent a variety of individual cleavage 
and polyadenylation signals. 

and other RNA debris (excised intron sequences, for example) are retained in the 
nucleus, where they are eventually degraded by the nuclear exosome, a large pro- 
tein complex whose interior is rich in 3'-to-5' RNA exonucleases (Figure 6-36). 
Eukaryotic cells thus export only useful RNA molecules to the cytoplasm, while 
debris is disposed of in the nucleus. 
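
The export decision described above, in which an mRNA is judged by the proteins it carries and the proteins it lacks, can be sketched as a simple set test. The sketch is deliberately minimal and rests on assumptions: the function names are hypothetical, and the lists contain only the marks named in the text, whereas the cell reads a much larger collection of bound proteins.

```python
# Marks named in the text as signs of completed capping, splicing,
# and poly-A addition (a simplified list, assumed for illustration).
REQUIRED_MARKS = frozenset({
    "cap-binding complex",
    "exon junction complex",
    "poly-A-binding protein",
})

# Proteins whose presence signals incomplete or aberrant processing.
DISQUALIFYING_MARKS = frozenset({"snRNP protein"})


def is_export_ready(bound_proteins):
    """True only if every required mark is present and no disqualifying one is."""
    bound = set(bound_proteins)
    return REQUIRED_MARKS <= bound and bound.isdisjoint(DISQUALIFYING_MARKS)


def rna_fate(bound_proteins):
    """Return the fate of an RNA under this toy rule."""
    if is_export_ready(bound_proteins):
        return "exported to the cytosol"
    return "retained in the nucleus and degraded by the exosome"
```

Under this rule, an RNA carrying all three required marks is exported, while the same RNA with an snRNP protein still attached is retained and degraded.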

Of all the proteins that assemble on pre-mRNA molecules as they emerge from 
transcribing RNA polymerases, the most abundant are the hnRNPs (heterogeneous 
nuclear ribonucleoproteins). Some of these proteins (there are approximately 
30 different ones in humans) unwind the hairpin helices in the RNA so 
that splicing and other signals on the RNA can be read more easily. Others pref- 
erentially package the RNA contained in the very long intron sequences typical 
in complex organisms (see Figure 6-31) and these may play an important role in 
distinguishing mature mRNA from the debris left over from RNA processing. 

Successfully processed mRNAs are guided through the nuclear pore 
complexes (NPCs)—aqueous channels in the nuclear membrane that directly 
connect the nucleoplasm and cytosol (Figure 6-37). Small molecules (less than 
60,000 daltons) can diffuse freely through these channels. However, most of the 
macromolecules in cells, including mRNAs complexed with proteins, are far too 
large to pass through the channels without a special process. The cell uses energy 
to actively transport such macromolecules in both directions through the nuclear 
pore complexes. 

As explained in detail in Chapter 12, macromolecules are moved through 
nuclear pore complexes by nuclear transport receptors, which, depending on the 
identity of the macromolecule, escort it from the nucleus to the cytoplasm or vice 
versa. For mRNA export to occur, a specific nuclear transport receptor must be 
loaded onto the mRNA, a step that, in many organisms, takes place in concert 
with 3’ cleavage and polyadenylation. Once it helps to move an RNA molecule 
through the nuclear pore complex, the transport receptor dissociates from the 
mRNA, re-enters the nucleus, and is then used to export a new mRNA molecule. 

The export of mRNA-protein complexes from the nucleus can be readily 
observed with the electron microscope for the unusually abundant mRNA of the 
insect Balbiani Ring genes. As these genes are transcribed, the newly formed RNA 
is seen to be packaged by proteins, including hnRNPs, SR proteins, and compo- 
nents of the spliceosome. This protein-RNA complex undergoes a series of struc- 
tural transitions, probably reflecting RNA processing events, culminating in a 
curved fiber (see Figure 6-37). This curved fiber moves through the nucleoplasm 
and enters the nuclear pore complex (with its 5’ cap proceeding first), and it then 
undergoes another series of structural transitions as it moves through the pore. 
These and other observations reveal that the pre-mRNA-protein and mRNA-pro- 
tein complexes are dynamic structures that gain and lose numerous specific pro- 
teins during RNA synthesis, processing, and export (Figure 6-38). 

The analysis just described has been complemented by new methods that 
allow researchers to track the fate of more typical mRNA molecules, which can 

Figure 6-36 Structure of the core of 
human RNA exosome. RNA is fed 
into one end of the central pore and is 
degraded by RNAses that associate 
with the other end. Nine different protein 
subunits (each represented by a different 
color) make up this large ring structure. 
Eukaryotic cells have both a nuclear 
exosome and a cytoplasmic exosome; both 
forms include the core exosome shown 
here and additional subunits (including 
specialized RNAses) that differentiate the 
two forms. The nuclear exosome degrades 
aberrant RNAs before they are exported to 
the cytosol. It also processes certain types 
of RNA (for example, the ribosomal RNAs) 
to produce their final form. The cytoplasmic 
form of the exosome is responsible for 
degrading mRNAs in the cytosol, and is 
thus crucial in determining the lifetime of 
each mRNA molecule. (PDB code: 2NN6.) 

Figure 6-37 Transport of a large mRNA molecule through the nuclear pore complex. (A) The maturation of an mRNA molecule as it is 
synthesized by RNA polymerase and packaged by a variety of nuclear proteins. This drawing of an unusually large and abundant insect RNA, called 
the Balbiani Ring mRNA, is based on electron microscope micrographs such as that shown in (B). (A, adapted from B. Daneholt, Cell 88:585-588, 
1997. With permission from Elsevier; B, from B.J. Stevens and H. Swift, J. Cell Biol. 31:55-77, 1966. With permission from The Rockefeller 

be fluorescently labeled and observed individually. A typical RNA molecule is 
released from its site of transcription and spends several minutes diffusing to a 
nuclear pore complex. During this time it is likely that RNA processing events 
continue and that the RNA sheds previously bound proteins and acquires new ones. 
Once it arrives at the entrance to the pore, the “export-ready” mRNA hovers for 
several seconds, during which time the completion of processing may occur, and 
then is transported through the pore very rapidly, in tens of milliseconds. Some 
mRNA-protein complexes are very large, and how they move through the nuclear 
pore complexes so rapidly remains a mystery. 

Some of the proteins deposited on the mRNA while it is still in the nucleus can 
affect the fate of the RNA after it is transported to the cytosol. Thus, the stability of 
an mRNA in the cytosol, the efficiency with which it is translated into protein, and 
its ultimate destination in the cytosol can all be determined by proteins acquired 
in the nucleus that remain bound to the RNA after it leaves the nucleus. 

But before discussing what happens to mRNAs in the cytosol, we briefly 
consider how the synthesis and processing of some noncoding RNA molecules 
occurs. There are many types of noncoding RNAs produced by cells (see Table 
6-1, p. 305), but here we focus on the rRNAs, which are critically important for the 
translation of mRNAs into protein. 


Noncoding RNAs Are Also Synthesized and Processed in the 
Nucleus 


Only a few percent of the dry weight of a mammalian cell is RNA; of that, only 
about 3-5% is mRNA. The bulk of the RNA in cells performs structural and cata- 
lytic functions (see Table 6-1). The most abundant RNAs in cells are the ribosomal 
RNAs (rRNAs), constituting approximately 80% of the RNA in rapidly dividing 
cells. As discussed later in this chapter, these RNAs form the core of the ribosome. 
Unlike bacteria—in which a single RNA polymerase synthesizes all RNAs in the 
cell—eukaryotes have a separate, specialized polymerase, RNA polymerase I, that 
is dedicated to producing rRNAs. RNA polymerase I is similar structurally to the 
RNA polymerase II discussed previously; however, the absence of a C-terminal 
tail in polymerase I helps to explain why its transcripts are neither capped nor 
polyadenylated. 

Because multiple rounds of translation of each mRNA molecule can provide 
an enormous amplification in the production of protein molecules, many of the 
proteins that are very abundant in a cell can be synthesized from genes that are 
present in a single copy per haploid genome (see Figure 6-3). In contrast, the RNA 
components of the ribosome are final gene products, and a growing mammalian 
cell must synthesize approximately 10 million copies of each type of ribosomal 
RNA in each cell generation to construct its 10 million ribosomes. The cell can 
produce adequate quantities of ribosomal RNAs only because it contains mul- 
tiple copies of the rRNA genes that code for ribosomal RNAs (rRNAs). Even 
E. coli needs seven copies of its rRNA genes to meet the cell’s need for ribosomes. 
Human cells contain about 200 rRNA gene copies per haploid genome, spread 


Figure 6-38 Schematic illustration of 
an export-ready mRNA molecule and 
its transport through the nuclear pore. 
As indicated, some proteins travel with 
the mRNA as it moves through the pore, 
whereas others remain in the nucleus. The 
nuclear export receptor for mRNAs is a 
complex of proteins that binds to an mRNA 
molecule once it has been correctly spliced 
and polyadenylated. After the mRNA has 
been exported to the cytosol, this export 
receptor dissociates from the mRNA and is 
re-imported into the nucleus, where it can 
be used again. The final check indicated 
here, called nonsense-mediated decay, will 
be described later in the chapter. 

out in small clusters on five different chromosomes (see Figure 4-11), while cells 
of the frog Xenopus contain about 600 rRNA gene copies per haploid genome in a 
single cluster on one chromosome (Figure 6-39). 
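
The need for many rRNA gene copies can be made concrete with a little arithmetic. The sketch below uses the figures given in the text (10 million ribosomes per cell generation and about 200 rRNA gene copies per haploid genome) and adds two assumptions that are not from the text: that every gene copy in a diploid cell is equally active, and that a cell generation lasts 24 hours.

```python
RIBOSOMES_PER_GENERATION = 10_000_000   # from the text
RRNA_COPIES_PER_HAPLOID_GENOME = 200    # from the text (human)
PLOIDY = 2                              # diploid cell
GENERATION_TIME_HOURS = 24              # assumption, for illustration only


def transcripts_per_gene_copy(ribosomes, copies_per_haploid, ploidy):
    """Precursor rRNA transcripts each gene copy must supply per generation.

    Assumes one precursor rRNA per ribosome and equal use of every copy.
    """
    return ribosomes / (copies_per_haploid * ploidy)


per_copy = transcripts_per_gene_copy(
    RIBOSOMES_PER_GENERATION, RRNA_COPIES_PER_HAPLOID_GENOME, PLOIDY
)
per_copy_per_minute = per_copy / (GENERATION_TIME_HOURS * 60)

# With a single gene copy instead, that one copy would have to supply
# all 10 million transcripts in the same time.
single_copy_per_minute = RIBOSOMES_PER_GENERATION / (GENERATION_TIME_HOURS * 60)
```

Under these assumptions each gene copy must yield 25,000 precursor transcripts per generation, or roughly 17 per minute, whereas a single copy would have to yield nearly 7,000 per minute. Because an rRNA is a final product, there is no translation step to amplify its output, so gene copy number has to make up the difference.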

There are four types of eukaryotic rRNAs, each present in one copy per ribo- 
some. Three of the four rRNAs (18S, 5.8S, and 28S) are made by chemically mod- 
ifying and cleaving a single large precursor rRNA (Figure 6-40); the fourth (5S 
RNA) is synthesized from a separate cluster of genes by a different polymerase, 
RNA polymerase III, and does not require chemical modification. 

Extensive chemical modifications occur in the 13,000-nucleotide-long precur- 
sor rRNA before the rRNAs are cleaved out of it and assembled into ribosomes. 
These include about 100 methylations of the 2'-OH positions on nucleotide sugars 
and 100 isomerizations of uridine nucleotides to pseudouridine (Figure 6-41A). 
The functions of these modifications are not understood in detail, but they proba- 
bly aid in the folding and assembly of the final rRNAs, or subtly alter the function 
of ribosomes. Each modification is made at a specific position in the precursor 
rRNA, specified by “guide RNAs,” which position themselves on the precursor 
rRNA through base-pairing and thereby bring an RNA-modifying enzyme to the 
appropriate position (Figure 6-41B). Other guide RNAs promote cleavage of the 
precursor rRNAs into the mature rRNAs, probably by causing conformational 
changes in the precursor rRNA that expose these sites to nucleases. All of these 
guide RNAs are members of a large class of RNAs called small nucleolar RNAs 
(or snoRNAs), so named because these RNAs perform their functions in a sub- 
compartment of the nucleus called the nucleolus. Many snoRNAs are encoded in 

Figure 6-39 Transcription from tandemly 
arranged rRNA genes, as seen in 
the electron microscope. The pattern 
of alternating transcribed gene and 
nontranscribed spacer is readily seen. 
A higher-magnification view of rRNA genes 
is shown in Figure 6-10. (From V.E. Foe, 
Cold Spring Harb. Symp. Quant. Biol. 
42:723-740, 1978. With permission from 
Cold Spring Harbor Laboratory Press.) 


Figure 6-40 The chemical modification 
and nucleolytic processing of a 
eukaryotic 45S precursor rRNA molecule 
into three separate ribosomal RNAs. 
Two types of chemical modifications 
(color-coded as indicated in Figure 6-41) 
are made to the precursor rRNA before 
it is cleaved. Nearly half of the nucleotide 
sequences in this precursor rRNA are 
discarded and degraded in the nucleus 
by the exosome. The rRNAs are named 
according to their “S” values, which 
refer to their rate of sedimentation in an 
ultracentrifuge. The larger the S value, the 
larger the rRNA. 

the introns of other genes, especially those encoding ribosomal proteins. They are 
synthesized by RNA polymerase II and processed from excised intron sequences. 
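
The way a guide RNA picks out its target by base-pairing can be illustrated with a short search routine. This is a minimal sketch under simplifying assumptions: the function names and sequences are hypothetical, only perfect Watson-Crick pairing between antiparallel strands is considered, and nothing is modeled about how the associated enzyme chooses the exact nucleotide to modify.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Sequence (written 5' to 3') that pairs with rna in antiparallel fashion."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_guide_sites(guide, precursor):
    """Return 0-based start positions in precursor that pair perfectly with guide."""
    target = reverse_complement(guide)
    sites = []
    start = precursor.find(target)
    while start != -1:
        sites.append(start)
        # Continue one position further on so overlapping sites are also found
        start = precursor.find(target, start + 1)
    return sites
```

For example, the made-up guide "UGCAUCGA" pairs with the stretch "UCGAUGCA", so searching the toy precursor "GGGGUCGAUGCACCCC" reports a single site starting at position 4. A longer guide sequence makes a chance match elsewhere in a 13,000-nucleotide precursor correspondingly less likely, which is how base-pairing can specify one position.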


The Nucleolus Is a Ribosome-Producing Factory 


The nucleolus is the most obvious structure seen in the nucleus of a eukaryotic 
cell when viewed in the light microscope. It was so closely scrutinized by early 
cytologists that an 1898 review could list some 700 references. We now know 
that the nucleolus is the site for the processing of rRNAs and their assembly into 
ribosome subunits. Unlike many of the major organelles in the cell, the nucle- 
olus is not bound by a membrane (Figure 6-42); instead, it is a huge aggregate 

Figure 6-41 Modifications of the 
precursor rRNA by guide RNAs. (A) Two 
prominent covalent modifications made 
to rRNA; the differences from the initially 
incorporated nucleotide are indicated by 
red atoms. Pseudouridine is an isomer of 
uridine; the base has been “rotated,” and is 
attached to the red C rather than to the red 
N of the sugar (compare to Figure 6-5B). 
(B) As indicated, snoRNAs determine the 
sites of modification by base-pairing to 
complementary sequences on the precursor 
rRNA. The snoRNAs are bound to proteins, 
and the complexes are called snoRNPs 
(small nucleolar ribonucleoproteins). 
snoRNPs contain both the guide sequences 
and the enzymes that modify the rRNA. 


Figure 6-42 Electron micrograph of a 
thin section of a nucleolus in a human 
fibroblast, showing its three distinct 
zones. (A) View of entire nucleus. 
(B) Higher-power view of the nucleolus. It 
is believed that transcription of the rRNA 
genes takes place between the fibrillar center 
and the dense fibrillar component and that 
processing of the rRNAs and their assembly 
into the two subunits of the ribosome 
proceeds outward from the dense fibrillar 
component to the surrounding granular 
components. (Courtesy of E.G. Jordan and 
J. McGovern.) 




Figure 6-43 Changes in the appearance of the nucleolus in a human cell 
during the cell cycle. Only the cell nucleus is represented in this diagram. In 
most eukaryotic cells, the nuclear envelope breaks down during mitosis, as 
indicated by the dashed circles. 


of macromolecules, including the rRNA genes themselves, precursor rRNAs, 
mature rRNAs, rRNA-processing enzymes, snoRNPs, a large set of assembly fac- 
tors (including ATPases, GTPases, protein kinases, and RNA helicases), ribosomal 
proteins, and partly assembled ribosomes. The close association of all these com- 
ponents allows the assembly of ribosomes to occur rapidly and smoothly. 

Various types of RNA molecules play a central part in the chemistry and struc- 
ture of the nucleolus, suggesting that it may have evolved from an ancient struc- 
ture present in cells dominated by RNA catalysis. In present-day cells, the rRNA 
genes have an important role in forming the nucleolus. In a diploid human cell, 
the rRNA genes are distributed into 10 clusters, located near the tips of five dif- 
ferent chromosome pairs (see Figure 4-11). During interphase, these 10 chromo- 
somes contribute DNA loops (containing the rRNA genes) to the nucleolus; in 
M phase, when the chromosomes condense, the nucleolus fragments and then 
disappears. Then, in the telophase part of mitosis, as chromosomes return to 
their semi-dispersed state, the tips of the 10 chromosomes reform small nucle- 
oli, which progressively coalesce into a single nucleolus (Figure 6-43 and Figure 
6-44). As might be expected, the size of the nucleolus reflects the number of ribo- 
somes that the cell is producing. Its size therefore varies greatly in different cells 
and can change in a single cell, occupying 25% of the total nuclear volume in cells 
that are making unusually large amounts of protein. 

Ribosome assembly is a complex process, the most important features of which 
are outlined in Figure 6-45. In addition to its central role in ribosome biogenesis, 
the nucleolus is the site where other noncoding RNAs are produced and other 
RNA-protein complexes are assembled. For example, the U6 snRNP, which func- 
tions in pre-mRNA splicing (see Figure 6-28), is composed of one RNA molecule 
and at least seven proteins. The U6 snRNA is chemically modified by snoRNAs in 
the nucleolus before its final assembly there into the U6 snRNP. Other important 
RNA-protein complexes, including telomerase (encountered in Chapter 5) and the 
signal-recognition particle (which we discuss in Chapter 12), are assembled at the 
nucleolus. Finally, the tRNAs (transfer RNAs) that carry the amino acids for pro- 
tein synthesis are processed there as well; like the rRNA genes, the genes encoding 
tRNAs are clustered in the nucleolus. Thus, the nucleolus can be thought of as a 
large factory at which different noncoding RNAs are transcribed, processed, and 
assembled with proteins to form a large variety of ribonucleoprotein complexes. 

Figure 6-44 Nucleolar fusion. These light micrographs of human fibroblasts grown in culture show 
various stages of nucleolar fusion. After mitosis, each of the 10 human chromosomes that carry a 
cluster of rRNA genes begins to form a tiny nucleolus, but these rapidly coalesce as they grow to 
form the single large nucleolus typical of many interphase cells. (Courtesy of E.G. Jordan and 
J. McGovern.) 


The Nucleus Contains a Variety of Subnuclear Aggregates 


Although the nucleolus is the most prominent structure in the nucleus, sev- 
eral other nuclear bodies have been observed and studied (Figure 6-46). These 
include Cajal bodies (named for the scientist who first described them in 1906) 
and interchromatin granule clusters (also called “speckles”). Like the nucleolus, 
these other nuclear structures lack membranes and are highly dynamic depend- 
ing on the needs of the cell. Their assembly is likely mediated by the association of 
low complexity protein domains, as described in Chapter 3 (see Figure 3-36). Their 
appearance is the result of the tight association of protein and RNA components 
involved in the synthesis, assembly, and storage of macromolecules involved in 
gene expression. Cajal bodies are sites where the snRNPs and snoRNPs undergo 
their final maturation steps, and where the snRNPs are recycled and their RNAs 
are “reset” after the rearrangements that occur during splicing (see p. 321). In 
contrast, the interchromatin granule clusters have been proposed to be stockpiles 
of fully mature snRNPs and other RNA processing components that are ready to 
be used in the production of mRNA. 

Scientists have had difficulties in working out the function of these small sub- 
nuclear structures, in part because their appearances can change dramatically as 
cells traverse the cell cycle or respond to changes in their environment. Moreover, 




Figure 6-45 The function of the 
nucleolus in ribosome and other 
ribonucleoprotein synthesis. The 45S 
precursor rRNA is packaged in a large 
ribonucleoprotein particle containing 
many ribosomal proteins imported from 
the cytoplasm. While this particle remains 
at the nucleolus, selected components 
are added and others discarded as it is 
processed into immature large and small 
ribosomal subunits. The two ribosomal 
subunits attain their final functional form 
only after each is individually transported 
through the nuclear pores into the 
cytoplasm. Other ribonucleoprotein 
complexes, including telomerase shown 
here, are also assembled in the nucleolus. 

Figure 6-46 Visualization of some prominent nuclear bodies. The 
protein fibrillarin (red), a component of several snoRNPs, is present at both 
nucleoli and Cajal bodies; the latter are indicated by the arrows. The Cajal 
bodies (but not the nucleoli) are also highlighted by staining one of their 
main components, the protein coilin; the superposition of the snoRNP and 
coilin stains appears pink. Interchromatin granule clusters (green) have been 
revealed by using antibodies against a protein involved in pre-mRNA splicing. 
DNA is stained blue by the dye DAPI. (From J.R. Swedlow and A.I. Lamond, 
Gen. Biol. 2:1-7, 2001. With permission from BioMed Central. Micrograph 
courtesy of Judith Sleeman.) 


disrupting a particular type of nuclear body often has little effect on cell viabil- 
ity. It seems that the main function of these aggregates is to bring components 
together at high concentration in order to speed up their assembly. For example, 
it is estimated that assembly of the U4/U6 snRNP (see Figure 6-28) occurs ten 
times more rapidly in Cajal bodies than would be the case if the same number of 
components were dispersed throughout the nucleus. Consequently, Cajal bodies 
appear dispensable in many types of cells but are absolutely required in situations 
where cells must proliferate rapidly, such as in early vertebrate development. 
Here, protein synthesis (which depends on RNA splicing) must be especially 
rapid, and delays can be lethal. 
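
Why concentrating components speeds their assembly can be seen from a simple mass-action sketch. It is an illustration only, under assumptions that are not from the text: assembly is treated as a single bimolecular step whose rate is proportional to the product of the two concentrations, and all of the molecules are taken to sit inside the smaller volume. The function name and numbers are hypothetical.

```python
def assembly_rate(k, n_a, n_b, volume):
    """Total assembly events per unit time for n_a and n_b molecules in volume.

    Mass action: rate per unit volume = k * [A] * [B], with [A] = n_a / volume
    and [B] = n_b / volume; multiplying by volume gives k * n_a * n_b / volume.
    """
    return k * (n_a / volume) * (n_b / volume) * volume


# Same number of molecules, dispersed in 10 volume units versus confined to 1.
dispersed = assembly_rate(k=1.0, n_a=100, n_b=100, volume=10.0)
confined = assembly_rate(k=1.0, n_a=100, n_b=100, volume=1.0)
speedup = confined / dispersed
```

In this toy model the speedup is simply the ratio of the two volumes, so confining the same components to one-tenth of the space gives a tenfold faster assembly. This shows only that an effect of the size quoted in the text is plausible from concentration alone; it is not how that estimate was derived.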

Given the prominence of nuclear bodies in RNA processing, it might be 
expected that pre-mRNA splicing would occur in a particular location in the 
nucleus, as it requires numerous RNA and protein components. However, as we 
have seen, the assembly of splicing components on pre-mRNA is co-transcrip- 
tional; thus, splicing must occur at many locations along chromosomes. Although 
a typical mammalian cell may be expressing on the order of 15,000 genes, transcription 
and RNA splicing take place in only several thousand sites in the 
nucleus. These sites are highly dynamic and probably result from the association 
of transcription and splicing components to create small factories, the name given 
to specific aggregates containing a high local concentration of selected compo- 
nents that create biochemical assembly lines (Figure 6-47). Interchromatin 

Figure 6-47 A model for an mRNA production factory. mRNA production is made more efficient in the nucleus by an aggregation of the many 
components needed for transcription and pre-mRNA processing, thereby producing a specialized biochemical factory. In (A), a postulated scaffold 
protein holds various components in the proximity of a transcribing RNA polymerase. Other key components are bound directly to the RNA 
polymerase tail, which likewise serves as a scaffold (see Figure 6-22), but for simplicity these are not shown here. In (B), a large number of such 
scaffolds have been brought together to form an aggregate that is highly enriched in the many components needed for the synthesis and processing 
of pre-mRNAs. Such a scaffold model can account for the several thousand sites of active RNA transcription and processing typically observed 
in the nucleus of a mammalian cell, each of which has a diameter of roughly 100 nm and is estimated to contain, on average, about 10 RNA 
polymerase II molecules in addition to many other proteins. (C) Here, mRNA production factories and DNA replication factories have been visualized 
in the same mammalian cell by briefly incorporating differently modified nucleotides into each nucleic acid and detecting the RNA and DNA produced 
using antibodies, one (green) detecting the newly synthesized DNA and the other (red) detecting the newly synthesized RNA. (C, from D.G. Wansink 
et al., J. Cell Sci. 107:1449-1456, 1994. With permission from The Company of Biologists.) 

granule clusters—which contain stockpiles of RNA processing components—are 
often observed next to these sites of transcription, as though poised to replen- 
ish supplies. We can thus view the nucleus as organized into subdomains, with 
snRNPs, snoRNPs, and other nuclear components moving among them in an 
orderly fashion according to the needs of the cell. 


Summary 


Before the synthesis of a particular protein can begin, the corresponding mRNA 
molecule must be produced by transcription. Bacteria contain a single type of RNA 
polymerase (the enzyme that carries out the transcription of DNA into RNA). An 
mRNA molecule is produced after this enzyme initiates transcription at a promoter, 
synthesizes the RNA by chain elongation, stops transcription at a terminator, and 
releases both the DNA template and the completed mRNA molecule. In eukary- 
otic cells, the process of transcription is much more complex, and there are three 
RNA polymerases (polymerase I, II, and III) that are related evolutionarily to one 
another and to the bacterial polymerase. 

RNA polymerase II synthesizes eukaryotic mRNA. This enzyme requires a set of 
additional proteins, both the general transcription factors and specific 
transcriptional activator proteins, to initiate transcription on a DNA template. It requires 
still more proteins (including chromatin remodeling complexes and histone-mod- 
ifying enzymes) to initiate transcription on its chromatin templates inside the cell. 

During the elongation phase of transcription, the nascent RNA undergoes three 
types of processing events: a special nucleotide is added to its 5' end (capping), 
intron sequences are removed from the middle of the RNA molecule (splicing), and 
the 3' end of the RNA is generated (cleavage and polyadenylation). Each of these 
processes is initiated by proteins that travel along with RNA polymerase II by bind- 
ing to sites on its long, extended C-terminal tail. Splicing is unusual in that many 
of its key steps are carried out by specialized RNA molecules rather than proteins. 
Only properly processed mRNAs are passed through nuclear pore complexes into 
the cytosol, where they are translated into protein. 

For many genes, RNA, rather than protein, is the final product. In eukaryotes, 
these genes are usually transcribed by either RNA polymerase I or RNA polymerase 
III. RNA polymerase I makes the ribosomal RNAs. After their synthesis as a large 
precursor, the rRNAs are chemically modified, cleaved, and assembled into the two 
ribosomal subunits in the nucleolus—a distinct subnuclear structure that also helps 
to process some smaller RNA-protein complexes in the cell. Additional subnuclear 
structures (including Cajal bodies and interchromatin granule clusters) are sites 
where components involved in RNA processing are assembled, stored, and recycled. 
The high concentration of components in such “factories” ensures that the processes 
being catalyzed are rapid and efficient. 


FROM RNA TO PROTEIN 


In the preceding section, we have seen that the final product of some genes is an 
RNA molecule itself, such as the RNAs present in the snRNPs and in ribosomes. 
However, most genes in a cell produce mRNA molecules that serve as intermedi- 
aries on the pathway to proteins. In this section, we examine how the cell converts 
the information carried in an mRNA molecule into a protein molecule. This feat 
of translation was a strong focus of attention for biologists in the late 1950s, when 
it was posed as the “coding problem”: how is the information in a linear sequence 
of nucleotides in RNA translated into the linear sequence of a chemically quite 
different set of units—the amino acids in proteins? This fascinating question stim- 
ulated great excitement. Here was a cryptogram set up by nature that, after more 
than 3 billion years of evolution, could finally be solved by one of the products 
of evolution—human beings. And indeed, not only was the code cracked step by 
step, but in the year 2000 the structure of the elaborate machinery by which cells 
read this code—the ribosome—was finally revealed in atomic detail. 
GCU CGU GAU AAU UGU GAG CAG GGU CAU AUU CUU AAG AUG UUU CCU 
Ala Arg Asp Asn Cys Glu Gln Gly His Ile Leu Lys Met Phe Pro 

An mRNA Sequence Is Decoded in Sets of Three Nucleotides 


Once an mRNA has been produced by transcription and processing, the informa- 
tion present in its nucleotide sequence is used to synthesize a protein. Transcrip- 
tion is simple to understand as a means of information transfer: since DNA and 
RNA are chemically and structurally similar, the DNA can act as a direct template 
for the synthesis of RNA by complementary base-pairing. As the term transcrip- 
tion signifies, it is as if a message written out by hand is being converted, say, into 
a typewritten text. The language itself and the form of the message do not change, 
and the symbols used are closely related. 

In contrast, the conversion of the information in RNA into protein represents 
a translation of the information into another language that uses quite different 
symbols. Moreover, since there are only 4 different nucleotides in mRNA and 20 
different types of amino acids in a protein, this translation cannot be accounted 
for by a direct one-to-one correspondence between a nucleotide in RNA and an 
amino acid in protein. The nucleotide sequence of a gene, through the interme- 
diary of mRNA, is instead translated into the amino acid sequence of a protein by 
rules that are known as the genetic code. This code was deciphered in the early 
1960s. 

The sequence of nucleotides in the mRNA molecule is read in consecutive 
groups of three. RNA is a linear polymer of four different nucleotides, so there are 
4 x 4 x 4 = 64 possible combinations of three nucleotides: the triplets AAA, AUA, 
AUG, and so on. However, only 20 different amino acids are commonly found in 
proteins. Either some nucleotide triplets are never used, or the code is redundant 
and some amino acids are specified by more than one triplet. The second possi- 
bility is, in fact, the correct one, as shown by the completely deciphered genetic 
code in Figure 6-48. Each group of three consecutive nucleotides in RNA is called 
a codon, and each codon specifies either one amino acid or a stop to the transla- 
tion process. 
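
The counting argument above can be checked directly. The following Python sketch enumerates all 64 triplets and tallies how many codons specify each amino acid. It assumes the standard genetic code, written as a compact string of one-letter amino acid abbreviations in U, C, A, G order; the names BASES, AMINO_ACIDS, and CODON_TABLE are illustrative choices and are not notation used elsewhere in this book.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, in the order UUU, UUC, UUA, UUG, UCU, ...
# "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# product() varies the last position fastest, which matches the order of the string above.
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

codons_per_residue = Counter(CODON_TABLE.values())
n_stop = codons_per_residue.pop("*")  # separate the stop codons from the amino acids

print(len(CODON_TABLE), "codons in total")
print(n_stop, "stop codons")
print(len(codons_per_residue), "amino acids share",
      sum(codons_per_residue.values()), "codons")
for aa, count in sorted(codons_per_residue.items(), key=lambda item: -item[1]):
    print(aa, count)
```

The tally makes the redundancy explicit: leucine, serine, and arginine each have six codons, whereas methionine and tryptophan have only one.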

This genetic code is used universally in all present-day organisms. Although a 
few slight differences in the code have been found, these are chiefly in the DNA of 
mitochondria. Mitochondria have their own transcription and protein-synthesis 
systems that operate quite independently from those of the rest of the cell, and it 
is understandable that their tiny genomes have been able to accommodate minor 
changes to the code (discussed in Chapter 14). 

In principle, an RNA sequence can be translated in any one of three different 
reading frames, depending on where the decoding process begins (Figure 6-49). 
However, only one of the three possible reading frames in an mRNA encodes the 
required protein. We see later how a special punctuation signal at the beginning of 
each RNA message sets the correct reading frame at the start of protein synthesis. 
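
The idea of three reading frames can be illustrated with a short Python sketch. It assumes the standard genetic code and uses a short RNA sequence chosen purely for illustration; the function names reading_frames and translate are hypothetical helpers, not established terminology.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UUU, UUC, UUA, UUG, UCU, ... order; "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reading_frames(rna):
    """Return the complete codons of the three forward reading frames (5' to 3')."""
    return [
        [rna[i:i + 3] for i in range(offset, len(rna) - 2, 3)]
        for offset in range(3)
    ]


def translate(codons):
    """Translate codons into one-letter amino acids, stopping at the first stop codon."""
    protein = []
    for codon in codons:
        aa = CODON_TABLE[codon]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


rna = "CUCAGCGUUACCAU"  # illustrative sequence, written 5' to 3'
for number, codons in enumerate(reading_frames(rna), start=1):
    print(number, " ".join(codons), translate(codons))
```

Shifting the starting point by a single nucleotide changes every codon, so the three frames of this sequence give three unrelated peptides (LSVT, SALP, and QRYH in one-letter code).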


tRNA Molecules Match Amino Acids to Codons in mRNA 


The codons in an mRNA molecule do not directly recognize the amino acids they 
specify: the group of three nucleotides does not, for example, bind directly to the 
amino acid. Rather, the translation of mRNA into protein depends on adaptor 
molecules that can recognize and bind both to the codon and, at another site on 
their surface, to the amino acid. These adaptors consist of a set of small RNA mol- 
ecules known as transfer RNAs (tRNAs), each about 80 nucleotides in length. 


Ser   AGC AGU UCA UCC UCG UCU
Thr   ACA ACC ACG ACU
Trp   UGG
Tyr   UAC UAU
Val   GUA GUC GUG GUU
stop  UAA UAG UGA


Figure 6-48 The genetic code. The 
standard one-letter abbreviation for each 
amino acid is presented below its three- 
letter abbreviation (see Panel 3-1, pp. 112- 
113, for the full name of each amino acid 
and its structure). By convention, codons 
are always written with the 5'-terminal 
nucleotide to the left. Note that most amino 
acids are represented by more than one 
codon, and that there are some regularities 
in the set of codons that specifies each 
amino acid: codons for the same amino 
acid tend to contain the same nucleotides 
at the first and second positions, and 

vary at the third position. Three codons 

do not specify any amino acid but act as 
termination sites (stop codons), signaling 
the end of the protein-coding sequence. 
One codon—AUG— acts both as an 
initiation codon, signaling the start of a 
protein-coding message, and also as the 
codon that specifies methionine. 
Figure 6-49 The three possible 
reading frames in protein synthesis. 
In the process of translating a nucleotide 
sequence (blue) into an amino acid 
sequence (red), the sequence of 
nucleotides in an mRNA molecule is read 
from the 5’ end to the 3’ end in consecutive 
sets of three nucleotides. In principle, 
therefore, the same RNA sequence can 
specify three completely different amino 
acid sequences, depending on the reading 
frame. In reality, however, only one of 
these reading frames contains the actual 
message. 
Figure 6-50 A tRNA molecule. A tRNA specific for the amino acid phenylalanine (Phe) is depicted in various ways. (A) The 
cloverleaf structure showing the complementary base-pairing (red lines) that creates the double-helical regions of the molecule. 
The anticodon is the sequence of three nucleotides that base-pairs with a codon in mRNA. The amino acid matching the 
codon/anticodon pair is attached at the 3’ end of the tRNA. tRNAs contain some unusual bases, which are produced by 
chemical modification after the tRNA has been synthesized. For example, the bases denoted Ψ (pseudouridine; see Figure 
6-41) and D (dihydrouridine; see Figure 6-53) are derived from uracil. (B and C) Views of the L-shaped molecule, based on 
x-ray diffraction analysis. Although this diagram shows the tRNA for the amino acid phenylalanine, all other tRNAs have similar 
structures. (D) The tRNA icon we use in this book. (E) The linear nucleotide sequence of the molecule, color-coded to match (A), 
(B), and (C). 


We saw earlier in this chapter that RNA molecules can fold into precise 
three-dimensional structures, and the tRNA molecules provide a striking exam- 
ple. Four short segments of the folded tRNA are double-helical, producing a mol- 
ecule that looks like a cloverleaf when drawn schematically (Figure 6-50). For 
example, a 5'-GCUC-3’ sequence in one part of a polynucleotide chain can form 
a relatively strong association with a 5'-GAGC-3' sequence in another region of 
the same molecule. The cloverleaf undergoes further folding to form a compact 
L-shaped structure that is held together by additional hydrogen bonds between 
different regions of the molecule (see Figure 6-50B and C). 

Two regions of unpaired nucleotides situated at either end of the L-shaped 
molecule are crucial to the function of tRNA in protein synthesis. One of these 
regions forms the anticodon, a set of three consecutive nucleotides that pairs 
with the complementary codon in an mRNA molecule. The other is a short sin- 
gle-stranded region at the 3’ end of the molecule; this is the site where the amino 
acid that matches the codon is attached to the tRNA. 

We saw above that the genetic code is redundant; that is, several different 
codons can specify a single amino acid. This redundancy implies either that there 
is more than one tRNA for many of the amino acids or that some tRNA molecules 
can base-pair with more than one codon. In fact, both situations occur. Some 
amino acids have more than one tRNA and some tRNAs are constructed so that 
they require accurate base-pairing only at the first two positions of the codon 
and can tolerate a mismatch (or wobble) at the third position (Figure 6-51). This 
wobble base-pairing explains why so many of the alternative codons for an amino 
acid differ only in their third nucleotide (see Figure 6-48). In bacteria, wobble 
base-pairings make it possible to fit the 20 amino acids to their 61 codons with as 
Figure 6-51 Wobble base-pairing between codons and anticodons. If the 
nucleotide listed in the first column is present at the third, or wobble, position 
of the codon, it can base-pair with any of the nucleotides listed in the second 
column. Thus, for example, when inosine (I) is present in the wobble position 
of the tRNA anticodon, the tRNA can recognize any one of three different 
codons in bacteria and either of two codons in eukaryotes. The inosine in 
tRNAs is formed from the deamination of adenosine (see Figure 6-53), a 
chemical modification that takes place after the tRNA has been synthesized. 
The nonstandard base pairs, including those made with inosine, are generally 
weaker than conventional base pairs. Codon-anticodon base-pairing is 
more stringent at positions 1 and 2 of the codon, where only conventional 
base pairs are permitted. The differences in wobble base-pairing interactions 
between bacteria and eukaryotes presumably result from subtle structural 
differences between bacterial and eukaryotic ribosomes, the molecular 
machines that perform protein synthesis. (Adapted from C. Guthrie and 
J. Abelson, in The Molecular Biology of the Yeast Saccharomyces: 
Metabolism and Gene Expression, pp. 487-528. Cold Spring Harbor, New 
York: Cold Spring Harbor Laboratory Press, 1982.) 


few as 31 kinds of tRNA molecules. The exact number of different kinds of tRNAs, 
however, differs from one species to the next. For example, humans have nearly 
500 tRNA genes, and among them 48 different anticodons are represented. 
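
The regularity that wobble pairing exploits (synonymous codons differing mainly at the third position) can be verified from the code itself. The Python sketch below assumes the standard genetic code and groups the 64 codons into 16 families that share their first two nucleotides. It does not model the wobble pairing rules of Figure 6-51; it only counts how often the third nucleotide is irrelevant to the amino acid specified. The helper name third_position_families is hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UUU, UUC, UUA, UUG, UCU, ... order; "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def third_position_families():
    """Map each two-nucleotide prefix to the meaning of its four possible codons."""
    return {
        first + second: {
            third: CODON_TABLE[first + second + third] for third in BASES
        }
        for first, second in product(BASES, repeat=2)
    }


families = third_position_families()

# Families in which all four third-position nucleotides give the same amino acid.
fourfold = [p for p, f in families.items() if len(set(f.values())) == 1]
# Families in which U and C at the third position are interchangeable.
u_c_equivalent = [p for p, f in families.items() if f["U"] == f["C"]]

print(len(fourfold), "of 16 families ignore the third nucleotide entirely")
print(len(u_c_equivalent), "of 16 families do not distinguish U from C at position 3")
```

In the standard code, U and C at the third position are never distinguished, which is consistent with a single tRNA being able to read both codons of such a pair.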


tRNAs Are Covalently Modified Before They Exit from the Nucleus 


Like most other eukaryotic RNAs, tRNAs are covalently modified before they are 
allowed to exit from the nucleus. Eukaryotic tRNAs are synthesized by RNA poly- 
merase III. Both bacterial and eukaryotic tRNAs are typically synthesized as larger 
precursor tRNAs, which are then trimmed to produce the mature tRNA. In addi- 
tion, some tRNA precursors (from both bacteria and eukaryotes) contain introns 
that must be spliced out. This splicing reaction differs chemically from pre-mRNA 
splicing; rather than generating a lariat intermediate, tRNA splicing uses a cut- 
and-paste mechanism that is catalyzed by proteins (Figure 6-52). Trimming and 
splicing both require the precursor tRNA to be correctly folded in its cloverleaf 
configuration. Because misfolded tRNA precursors will not be processed prop- 
erly, the trimming and splicing reactions serve as quality-control steps in the gen- 
eration of tRNAs. 

All tRNAs are modified chemically—nearly 1 in 10 nucleotides in each mature 
tRNA molecule is an altered version of a standard G, U, C, or A ribonucleotide. 
Over 50 different types of tRNA modifications are known; a few are shown in Fig- 
ure 6-53. Some of the modified nucleotides—most notably inosine, produced by 
the deamination of adenosine—affect the conformation and base-pairing of the 
anticodon and thereby facilitate the recognition of the appropriate mRNA codon 
by the tRNA molecule (see Figure 6-51). Others affect the accuracy with which the 
tRNA is attached to the correct amino acid. 


Specific Enzymes Couple Each Amino Acid to Its Appropriate 
tRNA Molecule 


We have seen that, to read the genetic code in DNA, cells make a series of differ- 
ent tRNAs. We now consider how each tRNA molecule becomes linked to the one 
amino acid in 20 that is its appropriate partner. Recognition and attachment of the 
correct amino acid depends on enzymes called aminoacyl-tRNA synthetases, 
which covalently couple each amino acid to its appropriate set of tRNA molecules 
(Figure 6-54 and Figure 6-55). Most cells have a different synthetase enzyme for 
each amino acid (that is, 20 synthetases in all); one attaches glycine to all tRNAs 
that recognize codons for glycine, another attaches alanine to all tRNAs that rec- 
ognize codons for alanine, and so on. Many bacteria, however, have fewer than 20 
synthetases, and the same synthetase enzyme is responsible for coupling more 
than one amino acid to the appropriate tRNAs. In these cases, a single synthetase 
places the identical amino acid on two different types of tRNAs, only one of which 
Figure 6-52 Structure of a tRNA-splicing 
endonuclease docked to a precursor 
tRNA. The endonuclease (a four-subunit 
enzyme) removes the tRNA intron (dark 
blue, bottom). A second enzyme, a 
multifunctional tRNA ligase (not shown), 
then joins the two tRNA halves together. 
(Courtesy of Hong Li, Christopher Trotta, 
and John Abelson; PDB code: 2A9L.) 
has an anticodon that matches the amino acid. A second enzyme then chemically 
modifies each “incorrectly” attached amino acid so that it now corresponds to the 
anticodon displayed by its covalently linked tRNA. 

The synthetase-catalyzed reaction that attaches the amino acid to the 3’ end of 
the tRNA is one of many reactions coupled to the energy-releasing hydrolysis of 
ATP (see pp. 64-65), and it produces a high-energy bond between the tRNA and 
the amino acid. The energy of this bond is used at a later stage in protein synthesis 
to link the amino acid covalently to the growing polypeptide chain. 

The aminoacyl-tRNA synthetase enzymes and the tRNAs are equally important 
in the decoding process (Figure 6-56). This was established by an experiment in 
Figure 6-54 Amino acid activation by synthetase enzymes. An amino acid is activated for 
protein synthesis by an aminoacyl-tRNA synthetase enzyme in two steps. As indicated, the energy 
of ATP hydrolysis is used to attach each amino acid to its tRNA molecule in a high-energy linkage. 
The amino acid is first activated through the linkage of its carboxyl group directly to AMP, forming 
an adenylated amino acid; the linkage of the AMP, normally an unfavorable reaction, is driven by the 
hydrolysis of the ATP molecule that donates the AMP. Without leaving the synthetase enzyme, the 
AMP-linked carboxyl group on the amino acid is then transferred to a hydroxyl group on the sugar 
at the 3’ end of the tRNA molecule. This transfer joins the amino acid by an activated ester linkage 
to the tRNA and forms the final aminoacyl-tRNA molecule. The synthetase enzyme is not shown in 
this diagram. 
Figure 6-53 A few of the unusual 
nucleotides found in tRNA molecules. 
These nucleotides are produced by 
covalent modification of a normal 
nucleotide after it has been incorporated 
into an RNA chain. Two other types of 
modified nucleotides are shown in Figure 
6-41. In most tRNA molecules, about 
10% of the nucleotides are modified (see 
Figure 6-50). As shown in Figure 6-51, 
inosine is sometimes present at the wobble 
position in the tRNA anticodon. 
Figure 6-55 The structure of the 
aminoacyl-tRNA linkage. The carboxyl 
end of the amino acid forms an ester 
bond to ribose. Because the hydrolysis of 
this ester bond is associated with a large 
favorable change in free energy, an amino 
acid held in this way is said to be activated. 
(A) Schematic drawing of the structure. The 
amino acid is linked to the nucleotide at 
the 3’ end of the tRNA (see Figure 6-50). 
(B) Actual structure corresponding to the 
boxed region in (A). There are two major 
classes of synthetase enzymes: one links 
the amino acid directly to the 3’-OH group 
of the ribose, and the other links it initially 
to the 2’-OH group. In the latter case, a 
subsequent transesterification reaction 
shifts the amino acid to the 3’ position. 
As in Figure 6-54, the “R group” indicates 
the side chain of the amino acid. 
which one amino acid (cysteine) was chemically converted into a different amino 
acid (alanine) after it already had been attached to its specific tRNA. When such 
“hybrid” aminoacyl-tRNA molecules were used for protein synthesis in a cell-free 
system, the wrong amino acid was inserted at every point in the protein chain 
where that tRNA was used. Although, as we shall see, cells have several quality 
control mechanisms to avoid this type of mishap, the experiment did establish 
that the genetic code is translated by two sets of adaptors that act sequentially. 
Each matches one molecular surface to another with great specificity, and it is 
their combined action that associates each sequence of three nucleotides in the 
mRNA molecule—that is, each codon—with its particular amino acid. 
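
The logic of the two-adaptor scheme, and of the cysteine-to-alanine experiment, can be captured in a toy model. In the Python sketch below, a dictionary plays the role of the synthetases (it records which amino acid has been attached to the tRNA bearing each anticodon), and translation consults only the anticodon. The sketch assumes strict Watson-Crick pairing with no wobble, and the function names and the three-codon mRNA are hypothetical illustrations.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Anticodon (5' to 3') that pairs with a codon by antiparallel base-pairing."""
    return "".join(COMPLEMENT[base] for base in reversed(codon))


def charge(pairs):
    """Adaptor 1 (synthetase): attach an amino acid to the tRNA for each codon."""
    return {anticodon_for(codon): amino_acid for codon, amino_acid in pairs}


def translate(mrna, charged_trnas):
    """Adaptor 2 (tRNA): each codon selects a tRNA by its anticodon alone."""
    return [
        charged_trnas[anticodon_for(mrna[i:i + 3])]
        for i in range(0, len(mrna) - 2, 3)
    ]


normal = charge([("AUG", "Met"), ("UGU", "Cys"), ("GCU", "Ala")])

# Mimic the experiment: the cysteine already attached to its tRNA is converted
# to alanine, while the tRNA (and therefore its anticodon) is left unchanged.
hybrid = dict(normal)
hybrid[anticodon_for("UGU")] = "Ala"

mrna = "AUGUGUGCU"
print(translate(mrna, normal))   # Met, Cys, Ala
print(translate(mrna, hybrid))   # Met, Ala, Ala
```

Because the ribosome checks only the codon-anticodon match, whatever amino acid the tRNA carries is inserted, which is why the accuracy of the synthetase step matters as much as that of base-pairing.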


Editing by tRNA Synthetases Ensures Accuracy 


Several mechanisms working together ensure that an aminoacyl-tRNA synthe- 
tase links the correct amino acid to each tRNA. Most synthetase enzymes select 
the correct amino acid by a two-step mechanism. The correct amino acid has 
the highest affinity for the active-site pocket of its synthetase and is therefore 
favored over the other 19; in particular, amino acids larger than the correct one 
are excluded from the active site. However, accurate discrimination between two 
similar amino acids, such as isoleucine and valine (which differ by only a methyl 
Figure 6-56 The genetic code is translated by means of two adaptors that act one after another. The first adaptor is the aminoacyl-tRNA 
synthetase, which couples a particular amino acid to its corresponding tRNA; the second adaptor is the tRNA molecule itself, whose anticodon 
forms base pairs with the appropriate codon on the mRNA. An error in either step would cause the wrong amino acid to be incorporated into a 
protein chain (Movie 6.6). In the sequence of events shown, the amino acid tryptophan (Trp) is selected by the codon UGG on the mRNA. 
group), is very difficult to achieve in a single step. A second discrimination step 
occurs after the amino acid has been covalently linked to AMP (see Figure 6-54): 
when tRNA binds, the synthetase tries to force the adenylated amino acid into 
a second editing pocket in the enzyme. The precise dimensions of this pocket 
exclude the correct amino acid, while allowing access by closely related amino 
acids. In the editing pocket, an amino acid is removed from the AMP (or from the 
tRNA itself if the aminoacyl-tRNA bond has already formed) by hydrolysis. This 
hydrolytic editing, which is analogous to the exonucleolytic proofreading by DNA 
polymerases, increases the overall accuracy of tRNA charging to approximately 
one mistake in 40,000 couplings (Figure 6-57). 

The tRNA synthetase must also recognize the correct set of tRNAs, and exten- 
sive structural and chemical complementarity between the synthetase and the 
tRNA allows the synthetase to probe various features of the tRNA (Figure 6-58). 
Most tRNA synthetases directly recognize the matching tRNA anticodon; these 
synthetases contain three adjacent nucleotide-binding pockets, each of which is 
complementary in shape and charge to a nucleotide in the anticodon. For other 
synthetases, the nucleotide sequence of the amino acid-accepting arm (acceptor 
stem) is the key recognition determinant. In most cases, however, the synthetase 
“reads” the nucleotides at several different positions on the tRNA. 


Amino Acids Are Added to the C-terminal End of a Growing 
Polypeptide Chain 


Having seen that each amino acid is first coupled to specific tRNA molecules, we 
now turn to the mechanism that joins these amino acids together to form proteins. 
The fundamental reaction of protein synthesis is the formation of a peptide bond 
between the carboxyl group at the end of a growing polypeptide chain and a free 
amino group on an incoming amino acid. Consequently, a protein is synthesized 
stepwise from its N-terminal end to its C-terminal end. Throughout the entire 
process, the growing carboxyl end of the polypeptide chain remains activated 
by its covalent attachment to a tRNA molecule (forming a peptidyl-tRNA). Each 
Figure 6-57 Hydrolytic editing. 
(A) Aminoacyl tRNA synthetases correct 
their own coupling errors through hydrolytic 
editing of incorrectly attached amino acids. 
As described in the text, the correct amino 
acid is rejected by the editing site. 
(B) The error-correction process performed 
by DNA polymerase has similarities; 
however, it differs because the removal 
process depends strongly on a mispairing 
with the template (see Figure 5-8). 
(P, polymerization site; E, editing site.) 
Figure 6-58 The recognition of a 
tRNA molecule by its aminoacyl-tRNA 
synthetase. For this tRNA (tRNAGln), 
specific nucleotides in both the anticodon 
(dark blue) and the amino acid-accepting 
arm (green) allow the correct tRNA to be 
recognized by the synthetase enzyme 
(yellow-green). A bound ATP molecule is 
yellow. (Courtesy of Tom Steitz; 
PDB code: 1QRS.) 
Figure 6-59 The incorporation of an amino acid into a protein. A polypeptide chain grows by the stepwise addition of amino 
acids to its C-terminal end. The formation of each peptide bond is energetically favorable because the growing C-terminus 
has been activated by the covalent attachment of a tRNA molecule. The peptidyl-tRNA linkage that activates the growing end 
is regenerated during each addition. The amino acid side chains have been abbreviated as R1, R2, R3, and R4; as a reference 
point, all of the atoms in the second amino acid in the polypeptide chain are shaded gray. The figure shows the addition of the 
fourth amino acid (red) to the growing chain. 


addition disrupts this high-energy covalent linkage, but immediately replaces it 
with an identical linkage on the most recently added amino acid (Figure 6-59). 
In this way, each amino acid added carries with it the activation energy for the 
addition of the next amino acid rather than the energy for its own addition—an 
example of the “head growth” type of polymerization described in Figure 2-44. 


The RNA Message Is Decoded in Ribosomes 


The synthesis of proteins is guided by information carried by mRNA molecules. To 
maintain the correct reading frame and to ensure accuracy (about 1 mistake every 
10,000 amino acids), protein synthesis is performed in the ribosome, a complex 
catalytic machine made from more than 50 different proteins (the ribosomal 
proteins) and several RNA molecules, the ribosomal RNAs (rRNAs). A typical 
eukaryotic cell contains millions of ribosomes in its cytoplasm (Figure 6-60). The 
large and small ribosome subunits are assembled at the nucleolus, where newly 
transcribed and modified rRNAs associate with the ribosomal proteins that have 
been transported into the nucleus after their synthesis in the cytoplasm. These two 
ribosomal subunits are then exported to the cytoplasm, where they join together 
to synthesize proteins. 
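
To get a feel for what an error rate of about 1 in 10,000 means for whole proteins, the short Python sketch below estimates the fraction of chains that contain no misincorporated amino acid. It assumes, as a simplification not stated in the text, that errors occur independently at each position with the same probability; the function name is a hypothetical helper.

```python
def fraction_error_free(length, error_rate=1 / 10_000):
    """Probability that a chain of the given length has no misincorporated residue,
    assuming independent errors with the same probability at every position."""
    return (1 - error_rate) ** length


for length in (100, 400, 1000, 5000):
    print(length, "amino acids:", round(fraction_error_free(length), 3))
```

Under this assumption roughly 96% of 400-residue proteins would be made without a single error, and the error-free fraction falls steadily as chains get longer.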





Figure 6-60 Ribosomes in the cytoplasm 
of a eukaryotic cell. This electron 
micrograph shows a thin section of a small 
region of cytoplasm. The ribosomes appear 
as black dots (red arrows). Some are 

free in the cytosol; others are attached to 
membranes of the endoplasmic reticulum. 
(Courtesy of Daniel S. Friend.) 
Figure 6-61 A comparison of bacterial and eukaryotic ribosomes. Despite differences in the number and size of their 
rRNA and protein components, both bacterial and eukaryotic ribosomes have nearly the same structure and they function 
similarly. Although the 18S and 28S rRNAs of the eukaryotic ribosome contain many nucleotides not present in their bacterial 
counterparts, these nucleotides are present as multiple insertions that form extra domains and leave the basic structure of the 
rRNA largely unchanged. 


Eukaryotic and bacterial ribosomes have similar structures and functions, 
being composed of one large and one small subunit that fit together to form a 
complete ribosome with a mass of several million daltons (Figure 6-61). The small 
subunit provides the framework on which the tRNAs are accurately matched to 
the codons of the mRNA, while the large subunit catalyzes the formation of the 
peptide bonds that link the amino acids together into a polypeptide chain (see 
Figure 6-58). 

When not actively synthesizing proteins, the two subunits of the ribosome 
are separate. They join together on an mRNA molecule, usually near its 5’ end, 
to initiate the synthesis of a protein. The mRNA is then pulled through the ribo- 
some, three nucleotides at a time. As its codons enter the core of the ribosome, the 
mRNA nucleotide sequence is translated into an amino acid sequence using the 
tRNAs as adaptors to add each amino acid in the correct sequence to the growing 
end of the polypeptide chain. When a stop codon is encountered, the ribosome 
releases the finished protein, and its two subunits separate again. These subunits 
can then be used to start the synthesis of another protein on another mRNA mol- 
ecule. Ribosomes operate with remarkable efficiency: in one second, a eukaryotic 
ribosome adds 2 amino acids to a polypeptide chain; the ribosomes of bacterial 
cells operate even faster, at a rate of about 20 amino acids per second. 
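
The rates quoted above translate into synthesis times by simple division. The following Python sketch works this out for a few chain lengths; the lengths are arbitrary examples, and the calculation considers elongation only, ignoring the time needed for initiation and termination.

```python
# Elongation rates from the text, in amino acids per second.
RATES = {"eukaryotic": 2, "bacterial": 20}


def synthesis_time_seconds(length, rate):
    """Time for one ribosome to elongate a chain of the given length."""
    return length / rate


for length in (100, 400, 1000):  # illustrative protein lengths
    for kind, rate in RATES.items():
        seconds = synthesis_time_seconds(length, rate)
        print(f"{length} amino acids, {kind} ribosome: {seconds:.0f} s")
```

At these rates a 400-residue protein takes about 200 seconds on a eukaryotic ribosome and about 20 seconds on a bacterial one.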

To choreograph the many coordinated movements required for efficient trans- 
lation, a ribosome contains four binding sites for RNA molecules: one is for the 
mRNA and three (called the A site, the P site, and the E site) are for tRNAs (Figure 
6-62). A tRNA molecule is held tightly at the A and P sites only if its anticodon 
Figure 6-62 The RNA-binding sites in the ribosome. Each ribosome has one binding site for mRNA and three binding sites 
for tRNA: the A, P, and E sites (short for aminoacyl-tRNA, peptidyl-tRNA, and exit, respectively). (A) A bacterial ribosome viewed 
with the small subunit in the front (dark green) and the large subunit in the back (light green). Both the rRNAs and the ribosomal 
proteins are illustrated. tRNAs are shown bound in the E site (red), the P site (orange), and the A site (yellow). Although all three 
tRNA sites are shown occupied here, during the process of protein synthesis not more than two of these sites are thought to 
contain tRNA molecules at any one time (see Figure 6-64). (B) Large and small ribosomal subunits arranged as though the 
ribosome in (A) were opened like a book. (C) The ribosome in (A) rotated through 90° and viewed with the large subunit on top 
and small subunit on the bottom. (D) Schematic representation of a ribosome [in the same orientation as (C)], which will be used 
in subsequent figures. (A, B, and C, adapted from M.M. Yusupov et al., Science 292:883-896, 2001. With permission from 
AAAS; courtesy of Albion Baucom and Harry Noller.) 


forms base pairs with a complementary codon (allowing for wobble) on the 
mRNA molecule that is threaded through the ribosome (Figure 6-63). The A and 
P sites are close enough together for their two tRNA molecules to be forced to 
form base pairs with adjacent codons on the mRNA molecule. This feature of the 
ribosome maintains the correct reading frame on the mRNA. 

Once protein synthesis has been initiated, each new amino acid is added to the 
elongating chain in a cycle of reactions containing four major steps: tRNA binding 
(step 1), peptide bond formation (step 2), large subunit translocation (step 3), and 
small subunit translocation (step 4). As a result of the two translocation steps, the 
entire ribosome moves three nucleotides along the mRNA and is positioned to 
start the next cycle. Figure 6-64 illustrates this four-step process, beginning at a 
point at which three amino acids have already been linked together and there is a 
tRNA molecule in the P site on the ribosome, covalently joined to the C-terminal 
end of the short polypeptide. In step 1, a tRNA carrying the next amino acid in the 
chain binds to the ribosomal A site by forming base pairs with the mRNA codon 
positioned there, so that the P site and the A site contain adjacent bound tRNAs. In 
step 2, the carboxyl end of the polypeptide chain is released from the tRNA at the 
P site (by breakage of the high-energy bond between the tRNA and its amino acid) 
and joined to the free amino group of the amino acid linked to the tRNA at the A
site, forming a new peptide bond. This central reaction of protein synthesis is cat- 
alyzed by a peptidyl transferase contained in the large ribosomal subunit. In step 
3, the large subunit moves relative to the mRNA held by the small subunit, thereby 
shifting the acceptor stems of the two tRNAs to the E and P sites of the large sub- 
unit. In step 4, another series of conformational changes moves the small subunit 
and its bound mRNA exactly three nucleotides, ejecting the spent tRNA from the E 
site and resetting the ribosome so it is ready to receive the next aminoacyl-tRNA. 
Step 1 is then repeated with a new incoming aminoacyl-tRNA, and so on. 

This four-step cycle is repeated each time an amino acid is added to the poly- 
peptide chain, as the chain grows from its amino to its carboxyl end. 
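
The bookkeeping of this cycle can be sketched in a few lines of Python. This is an illustrative toy, not a model of ribosome mechanics: the function name `elongation_cycles` and the example sequence are assumptions made here. The sketch tracks only which codons occupy the P and A sites as the ribosome advances three nucleotides per cycle.

```python
def elongation_cycles(mrna, p_site_start, max_cycles):
    """List the (P-site codon, A-site codon) pair seen in each elongation cycle.

    Hypothetical helper: p_site_start is the index of the first nucleotide
    of the codon currently held in the P site.
    """
    pairs = []
    p = p_site_start
    for _ in range(max_cycles):
        a = p + 3                      # the A site holds the adjacent codon
        if a + 3 > len(mrna):          # no complete codon left for the A site
            break
        # Step 1: an aminoacyl-tRNA pairs with the codon in the A site.
        pairs.append((mrna[p:p + 3], mrna[a:a + 3]))
        # Steps 2-4: peptide bond formation, then the two translocations
        # move the ribosome exactly three nucleotides along the mRNA.
        p += 3
    return pairs


print(elongation_cycles("AUGAACUGGGCU", 0, 10))
```

Because the step size is always three, the reading frame set at initiation is preserved in every later cycle.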


Elongation Factors Drive Translation Forward and Improve Its 
Accuracy 


The basic cycle of polypeptide elongation shown in outline in Figure 6-64 has an 
additional feature that makes translation especially efficient and accurate. Two 
elongation factors enter and leave the ribosome during each cycle, each hydro- 
lyzing GTP to GDP and undergoing conformational changes in the process. These 
factors are called EF-Tu and EF-G in bacteria, and EF1 and EF2 in eukaryotes.
Under some conditions in vitro, ribosomes can be forced to synthesize proteins 


Figure 6-64 Translating an mRNA molecule. Each amino acid added to
the growing end of a polypeptide chain is selected by complementary base- 
pairing between the anticodon on its attached tRNA molecule and the next 
codon on the mRNA chain. Because only one of the many types of tRNA 
molecules in a cell can base-pair with each codon, the codon determines 
the specific amino acid to be added to the growing polypeptide chain. The 
four-step cycle shown is repeated over and over during the synthesis of a 
protein. In step 1, an aminoacyl-tRNA molecule binds to a vacant A site on 
the ribosome. In step 2, a new peptide bond is formed. In step 3, the large 
subunit translocates relative to the small subunit, leaving the two tRNAs in 
hybrid sites: P on the large subunit and A on the small, for one; E on the 
large subunit and P on the small, for the other. In step 4, the small subunit 
translocates carrying its mRNA a distance of three nucleotides through the
ribosome. This “resets” the ribosome with a fully empty A site, ready for the 
next aminoacyl-tRNA molecule to bind. As indicated, the mRNA is translated 
in the 5'-to-3' direction, and the N-terminal end of a protein is made first, with 
each cycle adding one amino acid to the C-terminus of the polypeptide chain 
(Movie 6.7 and Movie 6.8). 

Figure 6-63 The path of mRNA (blue) through the small ribosomal subunit.
The orientation is the same as that in the right-hand panel of Figure 6-62B.
(Courtesy of Harry F. Noller, based on data in G.Z. Yusupova et al.,
With permission from Elsevier.)

Figure 6-65 Detailed view of the translation cycle. The outline of
translation presented in Figure 6-64 has been expanded to show the roles
of the two elongation factors EF-Tu and EF-G, which drive translation in the
forward direction. As explained in the text, EF-Tu provides opportunities for
proofreading of the codon-anticodon match. In this way, incorrectly paired
tRNAs are selectively rejected, and the accuracy of translation is improved. 
The binding of a molecule of EF-G to the ribosome and the subsequent 
hydrolysis of GTP lead to a rearrangement of the ribosome structure, moving 
the mRNA being decoded exactly three nucleotides through it (Movie 6.9). 


without the aid of these elongation factors and GTP hydrolysis, but this synthe- 
sis is very slow, inefficient, and inaccurate. Coupling the GTP hydrolysis-driven 
changes in the elongation factors to transitions between different states of the 
ribosome speeds up protein synthesis enormously. The cycles of elongation fac- 
tor association, GTP hydrolysis, and dissociation also ensure that all such changes 
occur in the “forward” direction, helping translation to proceed efficiently (Figure 
6-65). 

In addition to moving translation forward, EF-Tu increases its accuracy. As we 
discussed in Chapter 3, EF-Tu can simultaneously bind GTP and aminoacyl-tRNAs
(see Figures 3-72 and 3-73), and it is in this form that the initial codon-anticodon
interaction occurs in the A site of the ribosome. Because of the free-energy
change associated with base-pair formation, a correct codon-anticodon match 
will bind more tightly than an incorrect interaction. However, this difference in 
affinity is relatively modest and cannot by itself account for the high accuracy of 
translation. 

To increase the accuracy of this binding reaction, the ribosome and EF-Tu 
work together in the following ways. First, the 16S rRNA in the small subunit of
the ribosome assesses the “correctness” of the codon-anticodon match by folding 
around it and probing its molecular details (Figure 6-66). When a correct match 
is found, the rRNA closes tightly around the codon-anticodon pair, causing a con- 
formational change in the ribosome that triggers GTP hydrolysis by EF-Tu. Only 
when GTP is hydrolyzed does EF-Tu release its grip on the aminoacyl-tRNA and 
allow it to be used in protein synthesis. Incorrect codon-anticodon matches do 
not readily trigger this conformational change, and these errant tRNAs mostly fall 
off the ribosome before they can be used in protein synthesis. Proofreading, how- 
ever, does not end here. 

After GTP is hydrolyzed and EF-Tu dissociates from the ribosome, there is a 
second opportunity for the ribosome to prevent an incorrect amino acid from 
being added to the growing chain. There is a short time delay as the amino acid 
carried by the tRNA moves into position on the ribosome. This time delay is 
shorter for correct than incorrect codon-anticodon pairs. Moreover, incorrectly 
matched tRNAs dissociate more rapidly than those correctly bound because their 
interaction with the codon is weaker. Thus, most incorrectly bound tRNA mole- 
cules (as well as a significant number of correctly bound molecules) will leave the 
ribosome without being used for protein synthesis. The two proofreading steps, 
acting in series, are largely responsible for the 99.99% accuracy of the ribosome in 
translating RNA into protein. 
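
A rough calculation shows why two selection steps acting in series are so effective. The sketch below assumes, purely for illustration, that the steps are independent and that each lets an incorrect tRNA through 1 time in 100. These per-step values are not measurements from the text, and `overall_error` is a name chosen here.

```python
def overall_error(step_error_rates):
    """Error rate after independent selection steps applied in series."""
    error = 1.0
    for rate in step_error_rates:
        error *= rate          # a wrong tRNA must slip past every step
    return error


# Assumed values: each step passes an incorrect tRNA 1 time in 100.
error = overall_error([0.01, 0.01])
print(f"error rate ~ {error:.0e}, accuracy ~ {100 * (1 - error):.2f}%")
```

Under these assumptions the combined error rate is the product of the per-step rates, which is how two modest filters can together approach the accuracy quoted above.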

Even if the wrong amino acid slips through the proofreading steps just 
described and is incorporated onto the growing polypeptide chain, there is still 
one more opportunity for the ribosome to detect the error and provide a solu- 
tion, albeit one that is not, strictly speaking, proofreading. An incorrect codon- 
anticodon interaction in the P site of the ribosome (which would occur after the 
misincorporation) causes an increased rate of misreading in the A site. Succes- 
sive rounds of amino acid misincorporation eventually lead to premature ter- 
mination of the protein by release factors, which are described below. Normally, 
these release factors act when translation of a protein is complete; here, they act 
early. Although this mechanism does not correct the original error, it releases the 
flawed protein for degradation, ensuring that no additional peptide synthesis is 
wasted on it. 

Figure 6-66 Recognition of correct codon-anticodon matches by
the small-subunit rRNA of the ribosome. Shown here is the interaction
between a nucleotide of the small-subunit rRNA and the first nucleotide pair
of a correctly paired codon-anticodon. Similar interactions form between
other nucleotides of the rRNA and the second and third positions of the
codon-anticodon pair. The small-subunit rRNA can form this network of hydrogen
bonds only when an anticodon is correctly matched to a codon. As explained
in the text, this codon-anticodon monitoring by the small-subunit rRNA
increases the accuracy of protein synthesis. (From J.M. Ogle et al., Science
292:897-902, 2001. With permission from AAAS.)


Many Biological Processes Overcome the Inherent Limitations of 
Complementary Base-Pairing 


We have seen in this and the previous chapter that DNA replication, repair, tran- 
scription, and translation all rely on complementary base-pairing—G with C, and 
A with T (or U). However, if only the difference in hydrogen bonding is considered, 
a correct versus incorrect match should differ in affinity only by a factor of 10- to 
100-fold. These processes have an accuracy much higher than can be accounted 
for by this difference. Although the mechanisms used to “squeeze out” additional 
specificity from complementary base-pairing differ from one process to the next, 
two principles exemplified by the ribosome appear to be general. 

The first is induced fit. We have seen that, before an amino acid is added to 
a growing polypeptide chain, the ribosome folds around the codon-anticodon 
interaction, and only when the match is correct is this folding completed and the 
reaction allowed to proceed. The codon-anticodon interaction is thereby
checked twice—once by the initial complementary base-pairing and a second 
time by the folding of the ribosome, which depends on the correctness of the 
match. This same principle of induced fit is seen in transcription by RNA poly- 
merase; here, an incoming nucleoside triphosphate initially forms a base pair 
with the template; at this point the enzyme folds around the base pair (thereby 
assessing its correctness) and, in doing so, creates the active site of the enzyme. 
The enzyme then covalently adds the nucleotide to the growing chain. Because 
their geometry is “wrong,” incorrect base pairs block this induced fit, and they are 
therefore likely to dissociate before being incorporated into the growing chain. 

A second principle used to increase the specificity of complementary 
base-pairing is called kinetic proofreading. We have seen that after the initial 
codon-anticodon pairing and conformational change of the ribosome, GTP is 
hydrolyzed. This creates an irreversible step and starts the clock on a time delay 
during which the aminoacyl-tRNA moves into the proper position for catalysis. 
During this delay, those incorrect codon-anticodon pairs that have somehow 
slipped through the induced-fit scrutiny have a higher likelihood of dissociating 
than correct pairs. There are two reasons for this: (1) the interaction of the wrong 
tRNA with the codon is weaker, and (2) the delay is longer for incorrect than cor- 
rect matches. 

In its most general form, kinetic proofreading refers to a time delay that begins 
with an irreversible step such as ATP or GTP hydrolysis, during which an incorrect 
substrate is more likely to dissociate than a correct one. In this case, kinetic proof- 
reading thus increases the specificity of complementary base-pairing above what 
is possible from simple thermodynamic associations alone. The increase in spec- 
ificity produced by kinetic proofreading comes at an energetic cost in the form of 
ATP or GTP hydrolysis. Kinetic proofreading is believed to operate in many bio- 
logical processes, but its role is understood particularly well for translation. 
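
The time-delay argument can be made concrete with a minimal sketch. It assumes that a bound tRNA dissociates at a constant rate during the delay, so the chance that it is still bound when the delay ends is exp(-rate x delay). The rates and delays below are invented for illustration. Only the qualitative relationships (weaker binding and a longer delay for incorrect matches) come from the text.

```python
import math


def survival_probability(dissociation_rate, delay):
    """Chance a tRNA is still bound at the end of the delay (first-order decay)."""
    return math.exp(-dissociation_rate * delay)


# Hypothetical numbers in arbitrary units: the incorrect tRNA binds more
# weakly (higher dissociation rate) and also faces a longer delay.
correct = survival_probability(dissociation_rate=1.0, delay=0.5)
incorrect = survival_probability(dissociation_rate=5.0, delay=1.0)

print(f"correct tRNA retained:   {correct:.3f}")
print(f"incorrect tRNA retained: {incorrect:.3f}")
print(f"discrimination gained:   {correct / incorrect:.0f}-fold")
```

With these assumed values a sizable fraction of correct tRNAs is also lost during the delay, which illustrates why the gain in specificity has a cost.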


Accuracy in Translation Requires an Expenditure of Free Energy 


Translation by the ribosome is a compromise between the opposing constraints 
of accuracy and speed. We have seen, for example, that the accuracy of translation
(1 mistake per 10⁴ amino acids joined) requires time delays each time a new
amino acid is added to a growing polypeptide chain, producing an overall speed 
of translation of 20 amino acids incorporated per second in bacteria. Mutant bacteria
with a specific alteration in the small ribosomal subunit have longer delays
and translate mRNA into protein with an accuracy considerably higher than this; 
however, protein synthesis is so slow in these mutants that the bacteria are barely 
able to survive. 

We have also seen that attaining the observed accuracy of protein synthesis 
requires the expenditure of a great deal of free energy; this is expected, since, as 
discussed in Chapter 2, there is a price to be paid for any increase in order in the 
cell. In most cells, protein synthesis consumes more energy than any other biosyn- 
thetic process. At least four high-energy phosphate bonds are split to make each 
new peptide bond: two are consumed in charging a tRNA molecule with an amino 
acid (see Figure 6-54), and two more drive steps in the cycle of reactions occur- 
ring on the ribosome during protein synthesis itself (see Figure 6-65). In addition, 
extra energy is consumed each time that an incorrect amino acid linkage is hydro- 
lyzed by a tRNA synthetase (see Figure 6-57) and each time that an incorrect tRNA 
enters the ribosome, triggers GTP hydrolysis, and is rejected (see Figure 6-65). To 
be effective, any proofreading mechanism must also allow an appreciable fraction 
of correct interactions to be removed; for this reason, proofreading is even more 
costly in energy than it might at first seem. 
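
As a worked example of the minimum cost, the sketch below multiplies the four high-energy phosphate bonds per peptide bond by the number of peptide bonds in a chain. The function name and the 300-residue example are assumptions made here. The result is only a lower bound, because it ignores initiation, termination, and the extra proofreading costs just described.

```python
BONDS_PER_PEPTIDE_BOND = 4   # minimum from the text: 2 for tRNA charging + 2 on the ribosome


def min_phosphate_bonds(n_residues):
    """Lower bound on high-energy phosphate bonds split to link n_residues amino acids."""
    if n_residues < 2:
        return 0                       # a single residue has no peptide bond
    peptide_bonds = n_residues - 1
    return BONDS_PER_PEPTIDE_BOND * peptide_bonds


print(min_phosphate_bonds(300))
```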


The Ribosome Is a Ribozyme 


The ribosome is a large complex composed of two-thirds RNA and one-third pro- 
tein. The determination, in 2000, of the entire three-dimensional conformation of 
its large and small subunits is a major triumph of modern structural biology. The 
findings confirm earlier evidence that rRNAs—and not proteins—are responsible 
for the ribosome’s overall structure, its ability to position tRNAs on the mRNA, 
and its catalytic activity in forming covalent peptide bonds. The ribosomal RNAs 
are folded into highly compact, precise three-dimensional structures that form 
the compact core of the ribosome and determine its overall shape (Figure 6-67). 
In marked contrast to the central positions of the rRNAs, the ribosomal pro- 
teins are generally located on the surface and fill in the gaps and crevices of the 
folded RNA (Figure 6-68). Some of these proteins send out extended regions 
of polypeptide chain that penetrate short distances into holes in the RNA core 
(Figure 6-69). The main role of the ribosomal proteins seems to be to stabilize the 





Figure 6-67 Structure of the rRNAs in the large subunit of a bacterial
ribosome, as determined by x-ray crystallography. (A) Three-dimensional
conformations of the large-subunit rRNAs (5S and 23S) as they appear
in the ribosome. One of the protein subunits of the ribosome (L1) is also
shown as a reference point, since it forms a characteristic protrusion on
the ribosome. (B) Schematic diagram of the secondary structure of the 23S
rRNA, showing the extensive network of base-pairing. The structure has been
divided into six “domains” whose colors correspond to those in (A). The
secondary-structure diagram is highly schematized to represent as much of
the structure as possible in two dimensions. To do this, several discontinuities
in the RNA chain have been introduced, although in reality the 23S rRNA is a
single RNA molecule. For example, the base of Domain III is continuous with
the base of Domain IV even though a gap appears in the diagram. (Adapted
from N. Ban et al., Science 289:905-920, 2000. With permission from AAAS.)

RNA core, while permitting the changes in rRNA conformation that are necessary
for this RNA to catalyze efficient protein synthesis. The proteins also aid in the 
initial assembly of the rRNAs that make up the core of the ribosome. 

Not only are the A, P, and E binding sites for tRNAs formed primarily by ribo- 
somal RNAs, but the catalytic site for peptide bond formation is also formed by 
RNA, as the nearest amino acid is located more than 1.8 nm away. This discovery 
came as a surprise to biologists because, unlike proteins, RNA does not contain 
easily ionizable functional groups that can be used to catalyze sophisticated reac- 
tions like peptide bond formation. Moreover, metal ions, which are often used by 
RNA molecules to catalyze chemical reactions (as discussed later in the chapter), 
were not observed at the active site of the ribosome. Instead, it is believed that 
the 23S rRNA forms a highly structured pocket that, through a network of hydro- 
gen bonds, precisely orients the two reactants (the growing peptide chain and an 
aminoacyl-tRNA) and thereby greatly accelerates their covalent joining. An addi- 
tional surprise came from the discovery that the tRNA in the P site contributes an 
important OH group to the active site and participates directly in the catalysis. 
This mechanism may ensure that catalysis occurs only when the P site tRNA is 
properly positioned in the ribosome. 

RNA molecules that possess catalytic activity are known as ribozymes. We saw 
earlier in this chapter that some ribozymes function in self-splicing reactions. In 
the final section of this chapter, we consider what the ability of RNA molecules to 
function as catalysts might mean for the early evolution of living cells. For now, 
we merely note that there is good reason to suspect that RNA rather than protein 
molecules served as the first catalysts for living cells. If so, the ribosome, with its 
RNA core, may be a relic of an earlier time in life’s history—when protein synthe- 
sis evolved in cells that were run almost entirely by ribozymes. 


Nucleotide Sequences in mRNA Signal Where to Start Protein
Synthesis 


The initiation and termination of translation share features of the translation 
elongation cycle described above. The site at which protein synthesis begins on 
the mRNA is especially crucial, since it sets the reading frame for the whole length 
of the message. An error of one nucleotide either way at this stage would cause 
every subsequent codon in the message to be misread, resulting in a nonfunc- 
tional protein with a garbled sequence of amino acids. The initiation step is also 
important because for most genes it is the last point at which the cell can decide 
whether the mRNA is to be translated to produce a protein. The rate of this step is 
thus one determinant of the rate at which any particular protein will be synthe- 
sized. We shall see in Chapter 7 how regulation of this step occurs. 

The translation of an mRNA begins with the codon AUG, and a special tRNA 
is required to start translation. This initiator tRNA always carries the amino acid 
methionine (in bacteria, a modified form of methionine—formylmethionine—is 
used), with the result that all newly made proteins have methionine as the first 
amino acid at their N-terminus, the end of a protein that is synthesized first. (This 
methionine is usually removed later by a specific protease.) The initiator tRNA 
is specially recognized by initiation factors because it has a nucleotide sequence 
distinct from that of the tRNA that normally carries methionine. 

In eukaryotes, the initiator tRNA-methionine complex (Met-tRNAi) is first 
loaded into the small ribosomal subunit along with additional proteins called 
eukaryotic initiation factors, or eIFs. Of all the aminoacyl-tRNAs in the cell, only 
the methionine-charged initiator tRNA is capable of tightly binding the small 
ribosome subunit without the complete ribosome being present, and unlike other 
tRNAs it binds directly to the P site (Figure 6-70). Next, the small ribosomal subunit
binds to the 5’ end of an mRNA molecule, which is recognized by virtue of its 5’
cap that has previously bound two initiation factors, eIF4E and eIF4G (see Figure 
6-38). The small ribosomal subunit then moves forward (5’ to 3’) along the mRNA, 
searching for the first AUG; additional initiation factors that act as ATP-powered 

Figure 6-68 Location of the protein
components of the bacterial large 
ribosomal subunit. The rRNAs (5S and 
23S) are shown in blue and the proteins 
of the large subunit in green. This view is 
toward the outside of the ribosome; the 
interface with the small subunit is on the 
opposite face. (PDB code: 1FFK.) 





Figure 6-69 Structure of the L15 protein in the large subunit of the bacterial
ribosome. The globular domain of the protein lies on the surface of the ribosome
and an extended region penetrates deeply into the RNA core of the ribosome. The
L15 protein is shown in green and a portion of the ribosomal RNA core is shown in
blue. (From D. Klein, P.B. Moore and T.A. Steitz, J. Mol. Biol. 340:141-177,
2004. With permission from Academic Press. PDB code: 1S72.)

Figure 6-70 The initiation of protein synthesis in eukaryotes. Only three
of the many translation initiation factors required for this process are shown.
Efficient translation initiation also requires the poly-A tail of the mRNA bound
by poly-A-binding proteins, which, in turn, interact with eIF4G (see Figure
6-38). In this way, the translation apparatus ascertains that both ends of 
the mRNA are intact before initiating protein synthesis. Although only one 
GTP-hydrolysis event is shown in the figure, a second is known to occur 
just before the large and small ribosomal subunits join. In the last two steps 
shown in the figure, the ribosome has begun the standard elongation cycle, 
depicted in Figure 6-64. 


helicases facilitate this movement. In 90% of mRNAs, translation begins at the first 
AUG encountered by the small subunit. At this point, the initiation factors dis- 
sociate, allowing the large ribosomal subunit to assemble with the complex and 
complete the ribosome. The initiator tRNA remains at the P site, leaving the A site 
vacant. Protein synthesis is therefore ready to begin (see Figure 6-70). 

The nucleotides immediately surrounding the start site in eukaryotic mRNAs 
influence the efficiency of AUG recognition during the above scanning process. If 
this recognition site differs substantially from the consensus recognition sequence 
(5'-ACCAUGG-3’), scanning ribosomal subunits will sometimes ignore the first 
AUG codon in the mRNA and skip to the second or third AUG codon instead. Cells 
frequently use this phenomenon, known as “leaky scanning,” to produce two or
more proteins, differing in their N-termini, from the same mRNA molecule. This 
mechanism allows some genes to produce the same protein with and without 
a signal sequence attached at its N-terminus, for example, so that the protein is 
directed to two different compartments in the cell. 
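
The scanning process can be illustrated with a short sketch that walks an mRNA from its 5’ end, reports every AUG, and counts how many of the four flanking positions of the consensus 5’-ACCAUGG-3’ it matches. The scoring scheme, the function name, and the test sequence are assumptions made here for illustration. The text says only that a poor match makes an AUG more likely to be skipped.

```python
def scan_for_starts(mrna):
    """Return (index, flank_matches) for each AUG, in 5'-to-3' order.

    flank_matches (0-4) counts agreement with the consensus ACCAUGG at the
    three nucleotides before the AUG and the one nucleotide after it.
    """
    # Offsets are relative to the A of the AUG.
    flanks = ((-3, "A"), (-2, "C"), (-1, "C"), (3, "G"))
    hits = []
    for i in range(len(mrna) - 2):
        if mrna[i:i + 3] != "AUG":
            continue
        matches = 0
        for offset, base in flanks:
            j = i + offset
            if 0 <= j < len(mrna) and mrna[j] == base:
                matches += 1
        hits.append((i, matches))
    return hits


# Hypothetical mRNA: the first AUG sits in a poor context, the second in a
# perfect one, so a scanning subunit might skip the first.
for index, matches in scan_for_starts("GGUUUAUGCAAACCAUGGCC"):
    print(index, matches)
```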

The mechanism for selecting a start codon in bacteria is different. Bacte- 
rial mRNAs have no 5’ caps to signal the ribosome where to begin searching for 
the start of translation. Instead, each bacterial mRNA contains a specific ribo- 
some-binding site (called the Shine-Dalgarno sequence, named after its discov- 
erers) that is located a few nucleotides upstream of the AUG at which translation 
is to begin. This nucleotide sequence, with the consensus 5'-AGGAGGU-3’, forms 
base pairs with the 16S rRNA of the small ribosomal subunit to position the initiat- 
ing AUG codon in the ribosome. A set of translation initiation factors orchestrates 
this interaction, as well as the subsequent assembly of the large ribosomal sub- 
unit to complete the ribosome. 
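
A minimal sketch of this positional rule follows. It reports only those AUG codons that have the consensus sequence a few nucleotides upstream. Two simplifications are assumed here: the match to 5’-AGGAGGU-3’ must be exact, and "a few nucleotides" is taken as a gap of 3 to 10. The `min_gap` and `max_gap` defaults are illustrative values, not figures from the text.

```python
SHINE_DALGARNO = "AGGAGGU"   # consensus given in the text


def sd_start_codons(mrna, min_gap=3, max_gap=10):
    """Indices of AUG codons preceded by the consensus within the allowed gap.

    The gap is the number of nucleotides between the end of the
    ribosome-binding site and the A of the AUG (assumed range).
    """
    starts = []
    for i in range(len(mrna) - 2):
        if mrna[i:i + 3] != "AUG":
            continue
        for gap in range(min_gap, max_gap + 1):
            site = i - gap - len(SHINE_DALGARNO)
            if site >= 0 and mrna[site:site + len(SHINE_DALGARNO)] == SHINE_DALGARNO:
                starts.append(i)
                break
    return starts


# Hypothetical mRNA with two AUGs; only the first has a binding site upstream.
print(sd_start_codons("CCAGGAGGUAACCUAUGAAAUAGCCCCCCAUGCCC"))
```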

Unlike a eukaryotic ribosome, a bacterial ribosome can readily assemble 
directly on a start codon that lies in the interior of an mRNA molecule, so long 
as a ribosome-binding site precedes it by several nucleotides. As a result, bac- 
terial mRNAs are often polycistronic—that is, they encode several different pro- 
teins, each of which is translated from the same mRNA molecule (Figure 6-71). 
In contrast, a eukaryotic mRNA generally encodes only a single protein, or more 
accurately, a single set of closely related proteins. 


Stop Codons Mark the End of Translation 


The end of the protein-coding message is signaled by the presence of one of three 
stop codons (UAA, UAG, or UGA) (see Figure 6-48). These are not recognized by 
a tRNA and do not specify an amino acid, but instead signal to the ribosome to 
stop translation. Proteins known as release factors bind to any ribosome with a 
stop codon positioned in the A site, forcing the peptidyl transferase in the ribo- 
some to catalyze the addition of a water molecule instead of an amino acid to the 
peptidyl-tRNA (Figure 6-72). This reaction frees the carboxyl end of the growing 
polypeptide chain from its attachment to a tRNA molecule, and since only this 
attachment normally holds the growing polypeptide to the ribosome, the com- 
pleted protein chain is immediately released into the cytoplasm. The ribosome 
then releases its bound mRNA molecule and separates into the large and small 
subunits. These subunits can then assemble on this or another mRNA molecule 
to begin a new round of protein synthesis. 
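
The decoding logic from start codon to stop codon can be sketched as follows. The table is the standard genetic code written compactly, with `*` marking the three stop codons. The function name and the example sequence are assumptions made here, and the sketch simply starts at the first AUG rather than modeling initiation.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna):
    """Translate from the first AUG until a stop codon reaches the A site."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":          # UAA, UAG, or UGA: release, no amino acid added
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GGAUGAACUGGUAGCGAUCG"))
```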

Figure 6-71 Structure of a typical bacterial mRNA molecule. Unlike eukaryotic ribosomes,
which typically require a capped 5’ end on the mRNA, prokaryotic ribosomes initiate translation
at ribosome-binding sites (Shine-Dalgarno sequences), which can be located anywhere along an
mRNA molecule. This property of their ribosomes permits bacteria to synthesize more than one
type of protein from a single mRNA molecule.





During translation, the nascent polypeptide moves through a large, water-filled
tunnel (approximately 10 nm × 1.5 nm) in the large subunit of the ribosome. The
walls of this tunnel, made primarily of 23S rRNA, are a patchwork of tiny hydrophobic
surfaces embedded in a more extensive hydrophilic surface. This structure
is not complementary to any peptide, and thus provides a “Teflon” coating
through which a polypeptide chain can easily slide. The dimensions of the tunnel
suggest that nascent proteins are largely unstructured as they pass through the
ribosome, although some α-helical regions of the protein can form before leaving
the ribosome tunnel. As it leaves the ribosome, a newly synthesized protein must
fold into its proper three-dimensional conformation to be useful to the cell. Later
in this chapter we discuss how this folding occurs. First, however, we describe several
additional aspects of the translation process itself.


Proteins Are Made on Polyribosomes 


The synthesis of most protein molecules takes between 20 seconds and several
minutes. During this very short period, however, it is usual for multiple initiations
to take place on each mRNA molecule being translated. As soon as the preceding
ribosome has translated enough of the nucleotide sequence to move out of the
way, the 5’ end of the mRNA is threaded into a new ribosome. The mRNA molecules
being translated are therefore usually found in the form of polyribosomes (or
polysomes): large cytoplasmic assemblies made up of several ribosomes spaced
as close as 80 nucleotides apart along a single mRNA molecule (Figure 6-73).
These multiple initiations allow the cell to make many more protein molecules in
a given time than would be possible if each protein had to be completed before
the next could start.
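
A back-of-the-envelope sketch puts numbers on this. It uses the bacterial elongation rate of 20 amino acids per second quoted earlier and the minimum spacing of 80 nucleotides. Treating the spacing as "roughly one ribosome per 80 nucleotides" is a simplification assumed here, and the function names and example sizes are illustrative.

```python
def synthesis_time_s(n_residues, rate_aa_per_s=20.0):
    """Seconds for one ribosome to polymerize n_residues at a constant rate."""
    return n_residues / rate_aa_per_s


def max_ribosomes(mrna_length_nt, min_spacing_nt=80):
    """Rough upper bound: about one ribosome per min_spacing_nt nucleotides."""
    return mrna_length_nt // min_spacing_nt


# Hypothetical 400-residue protein encoded by a 1200-nucleotide coding region.
print(synthesis_time_s(400), "seconds per chain")
print(max_ribosomes(1200), "ribosomes at closest packing")
```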

Both bacteria and eukaryotes use polysomes, and both employ additional
strategies to speed up the overall rate of protein synthesis. Because bacterial
mRNA does not need to be processed and is accessible to ribosomes while it is
being made, ribosomes can attach to the free end of a bacterial mRNA molecule
and start translating it even before the transcription of that RNA is complete, following
closely behind the RNA polymerase as it moves along DNA. In eukaryotes,
as we have seen, the 5’ and 3’ ends of the mRNA interact (see Figure 6-73A); therefore,
as soon as a ribosome dissociates, its two subunits are in an optimal position
to reinitiate translation on the same mRNA molecule.

Figure 6-72 The final phase of protein synthesis. The binding of a release
factor to an A site bearing a stop codon terminates translation. The completed
polypeptide is released and, in a series of reactions that requires additional proteins
and GTP hydrolysis (not shown), the ribosome dissociates into its two
separate subunits.

Figure 6-73 A polyribosome. (A) Schematic drawing showing how a series of ribosomes
can simultaneously translate the same eukaryotic mRNA molecule. (B) Electron
micrograph of a polyribosome from a eukaryotic cell (Movie 6.10). (B, courtesy
of John Heuser.)


There Are Minor Variations in the Standard Genetic Code


As discussed in Chapter 1, the genetic code (shown in Figure 6-48) applies to
all three major branches of life, providing important evidence for the common
ancestry of all life on Earth. Although rare, there are exceptions to this code. For
example, Candida albicans, the most prevalent human fungal pathogen, translates
the codon CUG as serine, whereas nearly all other organisms translate it
as leucine. Mitochondria (which have their own genomes and encode much of
their translational apparatus) often deviate from the standard code. For example,
in mammalian mitochondria AUA is translated as methionine, whereas in the
cytosol of the cell it is translated as isoleucine (see Table 14-3, p. 805). This type of
deviation in the genetic code is “hardwired” into the organisms or the organelles
in which it occurs.
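
One way to picture a "hardwired" deviation is as a fixed override applied to the standard codon table. The sketch below builds the standard table and derives two variants. The names are chosen here for illustration, and each variant encodes only the single deviation named above, not a complete variant code for that organism or organelle.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def variant_code(overrides):
    """Copy the standard table and apply organism-specific codon reassignments."""
    code = dict(STANDARD_CODE)
    code.update(overrides)
    return code


# Only the deviations mentioned in the text are modeled here.
CANDIDA_ALBICANS = variant_code({"CUG": "S"})          # serine instead of leucine
MAMMALIAN_MITOCHONDRIA = variant_code({"AUA": "M"})    # methionine instead of isoleucine

print(STANDARD_CODE["CUG"], CANDIDA_ALBICANS["CUG"])
print(STANDARD_CODE["AUA"], MAMMALIAN_MITOCHONDRIA["AUA"])
```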

A different type of variation, sometimes called translation recoding, occurs 
in many cells. In this case, other nucleotide sequence information present in an 
mRNA can change the meaning of the genetic code at a particular site in the mRNA 
molecule. The standard code allows cells to manufacture proteins using only 20 
amino acids. However, bacteria, archaea, and eukaryotes have available to them a 
twenty-first amino acid that can be incorporated directly into a growing polypep- 
tide chain through translation recoding. Selenocysteine, which is essential for the 
efficient function of a variety of enzymes, contains a selenium atom in place of the 
sulfur atom of cysteine. Selenocysteine is enzymatically produced from a serine 
attached to a special tRNA molecule that base-pairs with the UGA codon, a codon 
normally used to signal a translation stop. The mRNAs for proteins in which sele- 
nocysteine is to be inserted at a UGA codon carry an additional nearby nucleotide 
sequence in the mRNA that triggers this recoding event (Figure 6-74). 
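
Recoding differs from a hardwired variant in that the same codon is read differently at different sites. The sketch below stands in for the mRNA signal with an explicit set of positions. `sec_positions` is a hypothetical parameter holding the indices of the UGA codons to be read as selenocysteine (written here with the one-letter symbol U), and every other UGA still acts as a stop.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_with_recoding(mrna, sec_positions=frozenset()):
    """Translate in frame from index 0, recoding flagged UGA codons.

    A UGA whose start index is in sec_positions is read as selenocysteine
    ("U"); any other stop codon ends translation as usual.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA" and i in sec_positions:
            protein.append("U")
            continue
        amino_acid = STANDARD_CODE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate_with_recoding("AUGUGAAACUAA"))                     # UGA read as stop
print(translate_with_recoding("AUGUGAAACUAA", sec_positions={3}))  # UGA recoded
```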
Figure 6-74 Incorporation of selenocysteine into a growing polypeptide chain. A specialized tRNA is charged with serine by the normal seryl-tRNA
synthetase, and the serine is subsequently converted enzymatically to selenocysteine. A specific RNA structure in the mRNA (a stem and loop
structure with a particular nucleotide sequence) signals that selenocysteine is to be inserted at the neighboring UGA codon. As indicated, this event 
requires the participation of a selenocysteine-specific translation factor. After the addition of selenocysteine, translation continues until a conventional 
stop codon is encountered. 


Inhibitors of Prokaryotic Protein Synthesis Are Useful as Antibiotics 


Many of the most effective antibiotics used in modern medicine are compounds 
made by fungi that inhibit bacterial protein synthesis. Fungi and bacteria com- 
pete for many of the same environmental niches, and millions of years of coevo- 
lution have resulted in fungi producing potent bacterial inhibitors. Some of these 
drugs exploit the structural and functional differences between bacterial and 
eukaryotic ribosomes so as to interfere preferentially with the function of bacte- 
rial ribosomes. Thus, humans can take high dosages of some of these compounds 
without undue toxicity. Many antibiotics lodge in pockets in the ribosomal RNAs 
and simply interfere with the smooth operation of the ribosome; others block 
specific parts of the ribosome such as the exit channel (Figure 6-75). Table 6-4 
lists some common antibiotics of this kind along with several other inhibitors of 
protein synthesis, some of which act on eukaryotic cells and therefore cannot be 
used as antibiotics. 

Because they block specific steps in the processes that lead from DNA to 
protein, many of the compounds listed in Table 6-4 are useful for cell biological 
studies. Among the most commonly used drugs in such investigations are chlor- 
amphenicol, cycloheximide, and puromycin, all of which specifically inhibit pro- 
tein synthesis. In a eukaryotic cell, for example, chloramphenicol inhibits protein 
synthesis on ribosomes only in mitochondria (and in chloroplasts in plants), pre- 
sumably reflecting the prokaryotic origins of these organelles (discussed in Chap- 
ter 14). Cycloheximide, in contrast, affects only ribosomes in the cytosol. Puromy- 
cin is especially interesting because it is a structural analog of a tRNA molecule 
linked to an amino acid and is therefore another example of molecular mimicry; 
the ribosome mistakes it for an authentic amino acid and covalently incorporates 
it at the C-terminus of the growing peptide chain, thereby causing the premature 
termination and release of the polypeptide. As might be expected, puromycin 
inhibits protein synthesis in both prokaryotes and eukaryotes. 


Quality Control Mechanisms Act to Prevent Translation of 
Damaged mRNAs 


In eukaryotes, mRNA production involves both transcription and a series of elab- 
orate RNA processing steps; as we have seen, these take place in the nucleus, 
segregated from ribosomes, and only when the processing is complete are the 
mRNAs transported to the cytosol to be translated (see Figure 6-38). However, 
this scheme is not foolproof, and some incorrectly processed mRNAs are inadver- 
tently sent to the cytosol. In addition, mRNAs that were flawless when they left the 
nucleus can become broken or otherwise damaged in the cytosol. The danger of
translating damaged or incompletely processed mRNAs (which would produce
truncated or otherwise aberrant proteins) is apparently so great that the cell has
several backup measures to prevent this from happening. To avoid translating
broken mRNAs, for example, the 5’ cap and the poly-A tail are both recognized by
the translation-initiation machinery before translation begins (see Figure 6-70).

Figure 6-75 Binding sites for antibiotics on the bacterial ribosome. The small
(left) and large (right) subunits of the ribosome are arranged as though the
ribosome has been opened like a book. Antibiotic binding sites are marked with
colored spheres, and the bound tRNA molecules are shown in purple (see Figure
6-62). Most of the antibiotics shown bind directly to pockets formed by the
ribosomal RNA molecules. Hygromycin B induces errors in translation,
spectinomycin blocks the translocation of the peptidyl-tRNA from the A site to
the P site, and streptogramin B prevents elongation of nascent peptides. Table 6-4
lists the inhibitory mechanisms of the other antibiotics shown in the figure.
(Adapted from J. Poehlsgaard and S. Douthwaite, Nat. Rev. Microbiol. 3:870-881,
2005. With permission from Macmillan Publishers Ltd.)

TABLE 6-4

Inhibitor    Specific effect

The ribosomes of eukaryotic mitochondria (and chloroplasts) often resemble those
of bacteria in their sensitivity to inhibitors. Therefore, some of these
antibiotics can have a deleterious effect on human mitochondria.

The most powerful mRNA surveillance system, called nonsense-mediated 
mRNA decay, eliminates defective mRNAs before they move away from the 
nucleus. This mechanism is brought into play when the cell determines that an 
mRNA molecule has a nonsense (stop) codon (UAA, UAG, or UGA) in the “wrong” 
place. This situation is likely to arise in an mRNA molecule that has been improp- 
erly spliced, because aberrant splicing will usually result in the random intro- 
duction of a nonsense codon into the reading frame of the mRNA—especially 
in organisms, such as humans, that have a large average intron size (see Figure 
6-31B). 

The nonsense-mediated mRNA decay mechanism begins as an mRNA mol- 
ecule is being transported from the nucleus to the cytosol. As its 5’ end emerges 
from a nuclear pore, the mRNA is met by a ribosome, which begins to translate 
it. As translation proceeds, the exon junction complexes (EJCs) that are bound 
to the mRNA at each splice site are displaced by the moving ribosome. The nor- 
mal stop codon will lie within the last exon, so by the time the ribosome reaches 
it and stalls, no more EJCs will be bound to the mRNA. In this case, the mRNA 
“passes inspection” and is released to the cytosol where it can be translated in ear- 
nest (Figure 6-76). However, if the ribosome reaches a stop codon earlier, when 
EJCs remain bound, the mRNA molecule is rapidly degraded. In this way, the first 
round of translation allows the cell to test the fitness of each mRNA molecule as it
exits the nucleus. 
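
The inspection rule just described can be sketched in a few lines of Python. This
is a deliberately simplified illustration: the mRNA is assumed to begin at its
start codon, each exon junction complex is assumed to sit exactly at its junction
(given as the index of the first nucleotide of the downstream exon), and an EJC
counts as displaced once the ribosome has reached a stop codon at or beyond that
index. The function names are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def first_in_frame_stop(mrna, start=0):
    """Return the index of the first in-frame stop codon, or None if absent."""
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return i
    return None


def passes_inspection(mrna, exon_junctions, start=0):
    """Return True if no EJC remains bound when the first stop codon is reached.

    exon_junctions holds the positions of the EJCs (an assumption of this
    sketch: one EJC per junction, located exactly at the junction).
    """
    stop = first_in_frame_stop(mrna, start)
    if stop is None:
        raise ValueError("no in-frame stop codon; not covered by this sketch")
    # Any junction downstream of the stop codon still carries its EJC,
    # which marks the mRNA for nonsense-mediated decay.
    return all(junction <= stop for junction in exon_junctions)
```

For a two-exon mRNA with its junction at position 6, passes_inspection("AUGGCUGGCUAA", [6])
is true because the only stop codon lies in the last exon, whereas
passes_inspection("AUGUAAGGCUAA", [6]) is false because a stop codon is reached
while an EJC is still bound downstream.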

Nonsense-mediated decay may have been especially important in evolution, 
allowing eukaryotic cells to more easily explore new genes formed by DNA rear- 
rangements, mutations, or alternative patterns of splicing—by selecting only those 
mRNAs for translation that can produce a full-length protein. Nonsense-medi- 
ated decay is also important in cells of the developing immune system, where 
the extensive DNA rearrangements that occur (see Figure 24-28) often generate
premature termination codons. The surveillance system degrades the mRNAs
produced from such rearranged genes, thereby avoiding the potential toxic effects
of truncated proteins.

Figure 6-76 Nonsense-mediated mRNA decay. As shown on the right, the failure to
correctly splice a pre-mRNA often introduces a premature stop codon into the
reading frame for the protein. These abnormal mRNAs are destroyed by the
nonsense-mediated decay mechanism. To activate this mechanism, an mRNA molecule,
bearing exon junction complexes (EJCs) to mark successfully completed splices, is
first met by a ribosome that performs a “test” round of translation. As the mRNA
passes through the tight channel of the ribosome, the EJCs are stripped off, and
successful mRNAs are released to undergo multiple rounds of translation (left
side). However, if an in-frame stop codon is encountered before the final EJC is
reached (right side), the mRNA undergoes nonsense-mediated decay, which is
triggered by the Upf proteins (green) that bind to each EJC. Note that this
mechanism ensures that nonsense-mediated decay is triggered only when the
premature stop codon is in the same reading frame as that of the normal protein.
(Adapted from J. Lykke-Andersen et al., Cell 103:1121-1131, 2000. With permission
from Elsevier.)

The nonsense-mediated surveillance pathway also plays an important role in 
mitigating the symptoms of many inherited human diseases. As we have seen, 
inherited diseases are usually caused by mutations that spoil the function of a 
key protein, such as hemoglobin or one of the blood-clotting factors. Approximately
one-third of all genetic disorders in humans result from nonsense mutations
or from mutations (such as frameshift mutations or splice-site mutations) that
introduce nonsense codons into the gene’s reading frame. In individuals that carry
one mutant and one functional gene, nonsense-mediated decay eliminates the 
aberrant mRNA and thereby prevents a potentially toxic protein from being made. 
Without this safeguard, individuals with one functional and one mutant “disease 
gene” would likely suffer much more severe symptoms. 


Some Proteins Begin to Fold While Still Being Synthesized


The process of gene expression is not over when the genetic code has been used to 
create the sequence of amino acids that constitutes a protein. To be useful to the 
cell, this new polypeptide chain must fold up into its unique three-dimensional 
conformation, bind any small-molecule cofactors required for its activity, be 
appropriately modified by protein kinases or other protein-modifying enzymes, 
and assemble correctly with the other protein subunits with which it functions 
(Figure 6-77). 

The information needed for all of the steps listed above is ultimately contained 
in the sequence of amino acids that the ribosome produces when it translates 
an mRNA molecule into a polypeptide chain. As discussed in Chapter 3, when a
protein folds into a compact structure, it buries most of its hydrophobic residues
in an interior core. In addition, large numbers of noncovalent interactions form
between various parts of the molecule. It is the sum of all of these energetically
favorable arrangements that determines the final folding pattern of the
polypeptide chain, as the conformation of lowest free energy (see pp. 114-115).

Figure 6-77 Steps in the creation of a functional protein. As indicated,
translation of an mRNA sequence into an amino acid sequence on the ribosome is not
the end of the process of forming a protein. To function, the completed
polypeptide chain must fold correctly into its three-dimensional conformation,
bind any cofactors required, and assemble with its partner protein chains, if any.
Noncovalent bond formation drives these changes. As indicated, many proteins also
require covalent modifications of selected amino acids. Although the most frequent
modifications are protein glycosylation and protein phosphorylation, over 200
different types of covalent modifications are known (see pp. 165-166).

Through many millions of years of evolution, the amino acid sequence of each 
protein has been selected not only for the conformation that it adopts but also 
for an ability to fold rapidly. For some proteins, this folding begins immediately, 
as the protein chain emerges from the ribosome, starting from the N-terminal 
end. In these cases, as each protein domain emerges from the ribosome, within 
a few seconds it forms a compact structure that contains most of the final secondary
features (α helices and β sheets) aligned in roughly the right conformation
(Figure 6-78). For some protein domains, this unusually dynamic and flexible 
state, called a molten globule, is the starting point for a relatively slow process in 
which many side-chain adjustments occur that eventually form the correct ter- 
tiary structure. It takes several minutes to synthesize a protein of average size, and 
for some proteins much of the folding process is complete by the time the ribo- 
some releases the C-terminal end of a protein (Figure 6-79). 


Molecular Chaperones Help Guide the Folding of Most Proteins 


Most proteins probably do not fold correctly during their synthesis and require a 
special class of proteins called molecular chaperones to do so. Molecular chap- 
erones are useful for cells because there are many different folding paths avail- 
able to an unfolded or partially folded protein. Without chaperones, some of these 
pathways would not lead to the correctly folded (and most stable) form because 
the protein would become “kinetically trapped” in structures that are off-pathway.
Some of these off-pathway configurations would aggregate and be left as
irreversible dead ends of nonfunctional (and potentially dangerous) structures.

Figure 6-78 The structure of a molten globule. (A) A molten globule form of
cytochrome b562 is more open and less highly ordered than the final folded form
of the protein, shown in (B). Note that the molten globule contains most of the
secondary structure of the final form, although the ends of the α helices are
unraveled and one of the helices is only partly formed. (Courtesy of Joshua Wand,
from Y. Feng et al., Nat. Struct. Biol. 1:30-35, 1994. With permission from
Macmillan Publishers Ltd.)

Figure 6-79 Co-translational protein folding. A growing polypeptide chain is shown
acquiring its secondary and tertiary structure as it emerges from a ribosome. The
N-terminal domain folds first, while the C-terminal domain is still being
synthesized. This protein has not achieved its final conformation at the time it
is released from the ribosome. (Modified from A.N. Fedorov and T.O. Baldwin,
J. Biol. Chem. 272:32715-32718, 1997.)


Molecular chaperones specifically recognize incorrect, off-pathway configu- 
rations by their exposure of hydrophobic surfaces, which in correctly folded pro- 
teins are typically buried in the interior. The binding of these exposed hydrophobic 
surfaces to each other is what causes off-pathway conformations to irreversibly 
aggregate. We saw in Chapter 3 that in some cases of inherited human diseases, 
aggregates do form and can cause severe symptoms and even death. Chaperones 
prevent this from happening in normal proteins by binding to the exposed hydro- 
phobic surfaces using hydrophobic surfaces of their own. As we shall see shortly, 
there are several types of chaperones; once bound to an incorrectly folded pro- 
tein, they ultimately release it in a way that gives the protein another chance to 
fold correctly. 


Cells Utilize Several Types of Chaperones 


Many molecular chaperones are called heat-shock proteins (designated hsp), 
because they are synthesized in dramatically increased amounts after a brief 
exposure of cells to an elevated temperature (for example, 42°C for cells that nor- 
mally live at 37°C). This reflects the operation of a feedback system that responds 
to an increase in misfolded proteins (such as those produced by elevated tem- 
peratures) by boosting the synthesis of the chaperones that help these proteins 
refold. 

There are several major families of molecular chaperones, including the hsp60 
and hsp70 proteins. Different members of these families function in different 
organelles. Thus, as discussed in Chapter 12, mitochondria contain their own 
hsp60 and hsp70 molecules that are distinct from those that function in the cytosol;
and a special hsp70 (called BiP) helps to fold proteins in the endoplasmic
reticulum. 

The hsp60 and hsp70 proteins each work with their own small set of associ- 
ated proteins when they help other proteins to fold. These hsps share an affinity 
for the exposed hydrophobic patches on incompletely folded proteins, and they 
hydrolyze ATP, often binding and releasing their protein substrate with each cycle 
of ATP hydrolysis. In other respects, the two types of hsp proteins function
differently. The hsp70 machinery acts early in the life of many proteins (often
before the protein leaves the ribosome), with each monomer of hsp70 binding to a
string of about four or five hydrophobic amino acids (Figure 6-80). On binding ATP,
hsp70 releases the protein into solution, allowing it a chance to refold. In contrast,
hsp60-like proteins form a large barrel-shaped structure that acts after a protein 
has been fully synthesized. This type of chaperone, sometimes called a chapero- 
nin, forms an “isolation chamber” for the folding process (Figure 6-81). 
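
As a rough illustration of the kind of site hsp70 is described as recognizing, the
following Python sketch scans a one-letter protein sequence for runs of at least
four consecutive hydrophobic residues. The residue set (A, V, L, I, M, F, W) and
the function name are assumptions made for the example; the text does not specify
which amino acids count, and real hsp70 binding depends on more than sequence
alone.

```python
import re

# Assumed set of hydrophobic residues (one-letter codes); illustrative only.
HYDROPHOBIC = "AVLIMFW"


def candidate_hsp70_sites(sequence, min_length=4):
    """Return (start index, run) for each stretch of hydrophobic residues
    that is at least min_length residues long."""
    pattern = re.compile(f"[{HYDROPHOBIC}]{{{min_length},}}")
    return [(match.start(), match.group()) for match in pattern.finditer(sequence)]
```

For the hypothetical sequence "MKAVLIFGSDE", the function reports a single run,
AVLIF, beginning at index 2.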

To enter a chamber, a substrate protein is first captured via the hydropho- 
bic entrance to the chamber. The protein is then released into the interior of the 
chamber, which is lined with hydrophilic surfaces, and the chamber is sealed 
with a lid, a step requiring ATP. Here, the substrate is allowed to fold into its final 
conformation in isolation, where there are no other proteins with which to aggre- 
gate. When ATP is hydrolyzed, the lid pops off, and the substrate protein, whether 
folded or not, is released from the chamber. 

The chaperones shown in Figures 6-80 and 6-81 often need many cycles of 
ATP hydrolysis to fold a single polypeptide chain correctly. This energy is used 
to perform mechanical movements of the hsp60 and hsp70 “machines,” convert- 
ing them from binding forms to releasing forms. Just as we saw for transcription, 
splicing, and translation, the expenditure of free energy can be used by cells to 
improve the accuracy of a biological process. In the case of protein folding, ATP 
hydrolysis allows chaperones to recognize a wide variety of misfolded structures, 
to halt any further misfolding, and to recommence the folding of a protein in an 
orderly way.

Figure 6-81 The structure and function of the hsp60 family of molecular
chaperones. (A) A [...] opening. The initial binding often helps to unfold a
misfolded protein. The subsequent binding of ATP and a cap releases the substrate
protein into an enclosed space, where it has a new opportunity to fold. After
about 10 seconds, ATP hydrolysis occurs, weakening the binding of the cap.
Subsequent binding of additional ATP molecules ejects the cap, and the protein is
released. As indicated, only half of the symmetric barrel operates on a client
protein at any one time. This type of molecular chaperone is also known as a
chaperonin; it is designated as hsp60 in mitochondria, TCP1 in the cytosol of
vertebrate cells, and GroEL in bacteria. (B) The structure of GroEL bound to its
GroES cap, [...] and on the right a cross section through its center. (B, adapted
from B. Bukau and A.L. Horwich, Cell [...])

Although our discussion focuses on only two types of chaperones, the cell has
a variety of others. The enormous diversity of proteins in cells presumably requires 
a wide range of chaperones with versatile surveillance and correction capabilities. 


Exposed Hydrophobic Regions Provide Critical Signals for Protein 
Quality Control 


If radioactive amino acids are added to cells for a brief period, the newly synthe- 
sized proteins can be followed as they mature into their final functional forms. 
This type of experiment demonstrates that the hsp70 proteins act first, beginning 
when a protein is still being synthesized on a ribosome, and the hsp60-like proteins
act only later to help fold completed proteins. We have seen that the cell
distinguishes misfolded proteins, which require additional rounds of ATP-driven
refolding, from those with correct structures through the recognition of
hydrophobic surfaces.

Usually, if a protein has a sizable exposed patch of hydrophobic amino acids 
on its surface, it is abnormal: it has either failed to fold correctly after leaving the 
ribosome, suffered an accident that partly unfolded it at a later time, or failed to 
find its normal partner subunit in a larger protein complex. Such a protein is not
merely useless to the cell; it can be dangerous.

Proteins that rapidly fold correctly on their own do not display such patterns 
and generally bypass the chaperones. For the others, the chaperones can carry out 
“protein repair” by giving them additional chances to fold while, at the same time, 
preventing their aggregation. 

Figure 6-82 outlines all of the quality-control choices that a cell makes for a 
difficult-to-fold, newly synthesized protein. As indicated, when attempts to refold 
a protein fail, an additional mechanism is called into play that completely destroys 
the protein by proteolysis. This proteolytic pathway begins with the recognition of 
an abnormal hydrophobic patch on a protein’s surface, and it ends with the deliv- 
ery of the entire protein to a protein-destruction machine, a complex protease 
known as the proteasome. As described next, this process depends on an elab- 
orate protein-marking system that also carries out other central functions in the 
cell by destroying selected normal proteins. 


The Proteasome Is a Compartmentalized Protease with 
Sequestered Active Sites 


The proteolytic machinery and the chaperones compete with one another to rec- 
ognize a misfolded protein. If a newly synthesized protein folds rapidly, at most 
only a small fraction of it is degraded. In contrast, a slowly folding protein is vul- 
nerable to the proteolytic machinery for a longer time, and many more of its mol- 
ecules may be destroyed before the remainder attain the proper folded state. Due 
to mutations or to errors in transcription, RNA splicing, and translation, some 
proteins never fold properly, and it is particularly important that the cell destroy 
these potentially harmful proteins. 
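
The competition between folding and destruction can be made concrete with a simple
kinetic partitioning sketch. It rests on an assumption that the text does not
state: that folding and degradation of the unfolded protein behave as two
competing first-order processes with rate constants k_fold and k_degrade
(hypothetical parameters, in the same units). Under that assumption the fraction
of molecules destroyed before folding is k_degrade / (k_fold + k_degrade).

```python
def fraction_degraded(k_fold, k_degrade):
    """Fraction of newly made molecules destroyed before they fold.

    Assumes two competing first-order processes acting on the unfolded
    protein; both rate constants are illustrative parameters.
    """
    if k_fold < 0 or k_degrade < 0 or k_fold + k_degrade == 0:
        raise ValueError("rate constants must be non-negative and not both zero")
    return k_degrade / (k_fold + k_degrade)
```

With the same degradation rate, a protein that folds ten times faster than it is
degraded loses about 9% of its molecules, whereas one that folds ten times more
slowly loses about 91%. A protein that never folds (k_fold = 0) is destroyed
completely in this model.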

The apparatus that deliberately destroys aberrant proteins is the proteasome,
an abundant ATP-dependent protease that constitutes nearly 1% of cell protein.
Present in many copies dispersed throughout the cytosol and the nucleus, the
proteasome also destroys aberrant proteins that have entered the endoplasmic
reticulum (ER). In the latter case, an ER-based surveillance system detects
proteins that have failed either to fold or to be assembled properly after they
enter the ER, and retrotranslocates them back to the cytosol for degradation by
the proteasome (discussed in Chapter 12).

Figure 6-82 The processes that monitor protein quality following protein
synthesis. A newly synthesized protein sometimes folds correctly and assembles on
its own with its partner proteins, in which case the quality control mechanisms
leave it alone. Incompletely folded proteins are helped to properly fold by
molecular chaperones: first by a family of hsp70 proteins, and then, in some
cases, by hsp60-like proteins. For both types of chaperones, the substrate
proteins are recognized by an abnormally exposed patch of hydrophobic amino acids
on their surface. These “protein-rescue” processes compete with another mechanism
that, upon recognizing an abnormally exposed hydrophobic patch, marks the protein
for destruction by the proteasome. The combined activity of all of these processes
is needed to prevent massive protein aggregation in a cell, which can occur when
many hydrophobic regions on proteins clump together nonspecifically.

Figure 6-83 The proteasome. (A) A cut-away view of the structure of the central
20S cylinder, as determined by x-ray crystallography, with the active sites of the
proteases indicated by red dots. (B) The entire proteasome, in which the central
cylinder (yellow) is supplemented by a 19S cap (blue) at each end. The complex cap
(also called the regulatory particle) selectively binds proteins that have been
marked by ubiquitin for destruction; it then uses ATP hydrolysis to unfold their
polypeptide chains and feed them through a narrow channel (see Figure 6-85) into
the inner chamber of the 20S cylinder for digestion to short peptides. (B, from
W. Baumeister et al., Cell 92:367-380, 1998. With permission from Elsevier.)

Each proteasome consists of a central hollow cylinder (the 20S core protea- 
some) formed from multiple protein subunits that assemble as a stack of four hep- 
tameric rings (Figure 6-83). Some of the subunits are proteases whose active sites 
face the cylinder’s inner chamber, thus preventing them from running rampant 
through the cell. Each end of the cylinder is normally associated with a large pro- 
tein complex (the 19S cap) that contains a six-subunit protein ring through which 
target proteins are threaded into the proteasome core, where they are degraded 
(Figure 6-84). The threading reaction, driven by ATP hydrolysis, unfolds the target 
proteins as they move through the cap, exposing them to the proteases lining the 
proteasome core (Figure 6-85). The proteins that make up the ring structure in 
the proteasome cap belong to a large class of protein “unfoldases” known as AAA 
proteins. Many of them function as hexamers, and they share mechanistic features
with the ATP-dependent DNA helicases that unwind DNA (see Figure 5-14).

Figure 6-84 Processive protein digestion by the proteasome. (A) The proteasome cap
recognizes proteins marked by a polyubiquitin chain (see Figure 3-70), and
subsequently translocates them into the proteasome core, where they are digested.
At an early stage, the ubiquitin is cleaved from the substrate protein and is
recycled. Translocation into the core of the proteasome is mediated by a ring of
ATPases that unfold the substrate protein as it is threaded through the ring and
into the proteasome core. This unfoldase ring is depicted in Figure 6-85.
(B) Detailed structure of the proteasome cap. The cap includes a ubiquitin
receptor, which holds a ubiquitylated protein in place while it begins to be
pulled into the proteasome core, and a ubiquitin hydrolase, which cleaves
ubiquitin from the doomed protein. (A, from S. Prakash and A. Matouschek, Trends
Biochem. Sci. 29:593-600, 2004. With permission from Elsevier. B, adapted from
G.C. Lander et al., Nature 482:186-191, 2012.)

A crucial property of the proteasome, and one reason for the complexity of
its design, is the processivity of its mechanism: in contrast to a “simple” prote- 
ase that cleaves a substrate’s polypeptide chain just once before dissociating, the 
proteasome keeps the entire substrate bound until all of it is converted into short 
peptides. 

One would expect that a machine as efficient as the proteasome would be 
tightly regulated; in particular, it must be able to distinguish abnormal proteins 
from those that are properly folded. The 19S cap of the proteasome acts as a gate 
at the entrance to the inner proteolytic core, and only those proteins marked for 
destruction are threaded through the cap. The destruction “mark” is the covalent 
attachment of the small protein ubiquitin. As we saw in Chapter 3, ubiquitin mod- 
ification of proteins is used for many purposes in the cell. The particular type of 
ubiquitin linkage that concerns us here is a chain of ubiquitin molecules linked 
together at lysine 48 (see Figure 3-69); this is the distinguishing feature of the 
ubiquitin tag that marks a protein for destruction in the proteasome. 

A special set of E3 molecules (see Figure 3-70B) is responsible for the ubiq- 
uitylation of denatured or otherwise misfolded proteins, as well as proteins 
containing oxidized or other abnormal amino acids. Abnormal proteins tend to 
display on their surface hydrophobic amino acid sequences or conformational 
motifs that are recognized as degradation signals by these E3 molecules; these 
sequences are buried and therefore inaccessible in the normal, properly folded 
version. However, a proteolytic pathway that recognizes and destroys abnormal 
proteins must be able to distinguish completed proteins that have “wrong” confor- 
mations from the many growing polypeptides on ribosomes (as well as polypep- 
tides just released from ribosomes) that have not yet achieved their normal folded 
conformation. This is not a trivial problem; in the course of carrying out its main 
job, the ubiquitin-proteasome system probably destroys many nascent and newly 
formed protein molecules, not because these proteins are abnormal as such, but 
because they have transiently exposed degradation signals that are buried in their 
mature (folded) state. 


Many Proteins Are Controlled by Regulated Destruction 


One function of intracellular proteolytic mechanisms is to recognize and elimi- 
nate misfolded or otherwise abnormal proteins, as just described. Indeed, every 
protein in the cell eventually accumulates damage and is probably degraded by 
the proteasome. Yet another function of these proteolytic pathways is to confer 
short lifetimes on specific normal proteins whose concentrations must change 
promptly with alterations in the state of a cell. Some of these short-lived proteins 
are degraded rapidly at all times, while many others are conditionally short-lived; 
that is, they are metabolically stable under some conditions, but become unsta- 
ble upon a change in the cell’s state. For example, mitotic cyclins are long-lived 
throughout the cell cycle until their sudden degradation at the end of mitosis, as 
explained in Chapter 17.

Figure 6-85 A hexameric protein unfoldase. (A) The proteasome cap includes
proteins (orange) that recognize and hydrolyze ubiquitin and a hexameric ring
(blue) through which ubiquitylated proteins are threaded. The hexameric ring is
formed from six subunits, each belonging to the AAA family of proteins. (B) Model
for the ATP-dependent unfoldase activity of AAA proteins. The ATP-bound form of a
hexameric ring of AAA proteins binds a folded substrate protein that is held in
place by its ubiquitin tag. A conformational change, driven by ATP hydrolysis,
pulls the substrate into the central core and strains the ring structure. At this
point, the substrate protein, which is being tugged upon, can partially unfold and
enter further into the pore or it can maintain its structure and partially
withdraw. Very stable protein substrates may require hundreds of cycles of ATP
hydrolysis and dissociation before they are successfully pulled through the AAA
protein ring. Once unfolded (and de-ubiquitylated), the substrate protein moves
relatively quickly through the pore by successive rounds of ATP hydrolysis.
(A, adapted from G.C. Lander et al., Nature 482:186-191, 2012; B, adapted from
R.T. Sauer et al., Cell 119:9-18, 2004. With permission from Elsevier.)

How is such a regulated destruction of a protein controlled? Several general
mechanisms are illustrated in Figure 6-86. Specific examples of each mechanism 
are discussed in later chapters. In one general class of mechanism (Figure 6-86A), 
the activity of a ubiquitin ligase is turned on either by E3 phosphorylation or by 
an allosteric transition in an E3 protein caused by its binding to a specific small or 
large molecule. For example, the anaphase-promoting complex (APC) is a multi- 
subunit ubiquitin ligase that is activated by a cell-cycle-timed subunit addition at 
mitosis. The activated APC then causes the degradation of mitotic cyclins and sev- 
eral other regulators of the metaphase-anaphase transition (see Figure 17-15A). 

Alternatively, in response either to intracellular signals or to signals from the 
environment, a degradation signal can be created in a protein, causing its rapid 
ubiquitylation and destruction by the proteasome (Figure 6-86B). One common 
way to create such a signal is to phosphorylate a specific site on a protein that 
unmasks a normally hidden degradation signal. Another way to unmask such a 
signal is by the regulated dissociation of a protein subunit. Finally, powerful deg- 
radation signals can be created by cleaving a single peptide bond, provided that 
this cleavage creates a new N-terminus that is recognized by a specific E3 protein 
as a “destabilizing” N-terminal residue. This E3 protein recognizes only certain 
amino acids at the N-terminus of a protein; thus not all protein-cleavage events 
will lead to degradation of the C-terminal fragment produced. 

In humans, nearly 80% of proteins are acetylated on their N-terminal residue, 
and we now know that this modification is recognized by a specific E3 enzyme, 
which directs the ubiquitylation of the protein and sends it to the proteasome 
for degradation. Thus, the majority of human proteins carry their own signals for 
destruction. It has been proposed that when a protein is properly folded (and, 
before that, when it is in contact with a chaperone), this acetylated N-terminus is 
buried and therefore inaccessible to the E3 enzyme. According to this idea, as a 
protein ages and becomes damaged (or if it fails to fold correctly from the start), 
this destruction signal becomes exposed, and the protein is destroyed. 

Figure 6-86 Two general ways of inducing the degradation of a specific 
protein. (A) Activation of a specific E3 molecule creates a new ubiquitin 
ligase. Eukaryotic cells have many different E3 molecules, each activated 
by a different signal. (B) Creation of an exposed degradation signal in the 
protein to be degraded. This signal binds a ubiquitin ligase, causing the 
addition of a polyubiquitin chain to a nearby lysine on the target protein. 
All six pathways shown are known to be used by cells to induce the movement 
of selected proteins into the proteasome. 


There Are Many Steps From DNA to Protein 


We have seen so far in this chapter that many different types of chemical reactions 
are required to produce a properly folded protein from the information contained 
in a gene (Figure 6-87). The final level of a properly folded protein in a cell there- 
fore depends upon the efficiency with which each of the many steps is performed. 
We also now know that the cell devotes enormous resources to selectively degrad- 
ing proteins, particularly those that fail to fold properly or accumulate damage 
as they age. It is the balance between the rates of synthesis and degradation that 
determines the final amount of every protein in the cell. 
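
The balance between synthesis and degradation can be made concrete with a minimal sketch. It assumes, purely for illustration, that a protein is made at a constant rate and is degraded in proportion to its current amount (first-order decay). Neither assumption comes from the text, and the names `k_syn`, `k_deg`, `steady_state_level`, and `simulate_protein_level` are hypothetical. Under this assumed model the amount settles at `k_syn / k_deg`, whatever the starting amount.

```python
def steady_state_level(k_syn: float, k_deg: float) -> float:
    """Steady-state amount for the ASSUMED model dP/dt = k_syn - k_deg * P."""
    return k_syn / k_deg


def simulate_protein_level(k_syn: float, k_deg: float, p0: float = 0.0,
                           dt: float = 0.01, t_end: float = 100.0) -> float:
    """Integrate dP/dt = k_syn - k_deg * P with simple Euler steps.

    k_syn: molecules made per unit time (hypothetical value)
    k_deg: fraction of molecules degraded per unit time (hypothetical value)
    p0:    starting number of molecules
    """
    p = p0
    for _ in range(round(t_end / dt)):
        p += (k_syn - k_deg * p) * dt
    return p


if __name__ == "__main__":
    # The same steady state is approached from an empty or an overfilled cell.
    print(simulate_protein_level(50.0, 0.5, p0=0.0))
    print(simulate_protein_level(50.0, 0.5, p0=500.0))
    print(steady_state_level(50.0, 0.5))
```

In this sketch, doubling `k_deg` halves the final amount just as effectively as halving `k_syn`, which is one way to read the statement that both rates determine the level of a protein.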

In the following chapter, we shall see that cells have the ability to change the 
levels of their proteins according to their needs. In principle, any or all of the steps 
in Figure 6-87 could be regulated for each individual protein. As we shall see in 
Chapter 7, there are examples of regulation at each step from gene to protein. 

Figure 6-87 The production of a protein by a eukaryotic cell. The final level 
of each protein in a eukaryotic cell depends upon the efficiency of each step 
depicted. 


Summary 


The translation of the nucleotide sequence of an mRNA molecule into protein takes 
place in the cytosol on a large ribonucleoprotein assembly called a ribosome. Each 
amino acid used for protein synthesis is first attached to a tRNA molecule that rec- 
ognizes, by complementary base-pair interactions, a particular set of three nucle- 
otides (codons) in the mRNA. As an mRNA is threaded through a ribosome, its 
sequence of nucleotides is then read from one end to the other in sets of three accord- 
ing to the genetic code. 

To initiate translation, a small ribosomal subunit binds to the mRNA molecule 
at a start codon (AUG) that is recognized by a unique initiator tRNA molecule. A 
large ribosomal subunit then binds to complete the ribosome and begin protein 
synthesis. During this phase, aminoacyl-tRNAs—each bearing a specific amino 
acid—bind sequentially to the appropriate codons in mRNA through complemen- 
tary base-pairing between tRNA anticodons and mRNA codons. Each amino acid 
is added to the C-terminal end of the growing polypeptide in four sequential steps: 
aminoacyl-tRNA binding, followed by peptide bond formation, followed by two 
ribosome translocation steps. Elongation factors use GTP hydrolysis both to drive 
these reactions forward and to improve the accuracy of amino acid selection. The 
mRNA molecule progresses codon by codon through the ribosome in the 5'-to-3' 
direction until it reaches one of three stop codons. A release factor then binds to the 
ribosome, terminating translation and releasing the completed polypeptide. 
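
The reading of an mRNA in sets of three nucleotides, from a start codon to a stop codon, can be sketched in a few lines. This is a deliberately simplified illustration, not a model of the ribosome: it assumes translation begins at the first AUG in the sequence and ignores initiation factors, tRNAs, and proofreading. The function name `translate` and the example sequence are hypothetical. The codon table is the standard genetic code, written with one-letter amino acid symbols.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in the conventional U, C, A, G order for each codon
# position; "*" marks the three stop codons (UAA, UAG, UGA).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA (5'-to-3') from its first AUG to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated in this sketch
    peptide = []
    # Step through the message one codon (three nucleotides) at a time.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # a stop codon ends the polypeptide
        peptide.append(amino_acid)  # each residue is added at the C-terminal end
    return "".join(peptide)


if __name__ == "__main__":
    print(translate("GCAUGGCUGUUACCUAAGC"))  # codons read: AUG GCU GUU ACC UAA
```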

Eukaryotic and bacterial ribosomes are closely related, despite differences in the 
number and size of their rRNA and protein components. The rRNA has the domi- 
nant role in translation, determining the overall structure of the ribosome, forming 
the binding sites for the tRNAs, matching the tRNAs to codons in the mRNA, and 
creating the active site of the peptidyl transferase enzyme that links amino acids 
together during translation. 

In the final steps of protein synthesis, two distinct types of molecular chaper- 
ones guide the folding of polypeptide chains. These chaperones, known as hsp60 
and hsp70, recognize exposed hydrophobic patches on proteins and serve to prevent 
the protein aggregation that would otherwise compete with the folding of newly 
synthesized proteins into their correct three-dimensional conformations. This pro- 
tein-folding process must also compete with an elaborate quality control mecha- 
nism that destroys proteins with abnormally exposed hydrophobic patches. In this 
case, ubiquitin is covalently added to a misfolded protein by a ubiquitin ligase, and 
the resulting polyubiquitin chain is recognized by the cap on a proteasome that 
unfolds the protein and threads it into the interior of the proteasome for proteo- 
lytic degradation. A closely related proteolytic mechanism, based on special degra- 
dation signals recognized by ubiquitin ligases, is used to determine the lifetimes of 
many normally folded proteins as well as to remove selected proteins from the cell 
in response to specific signals. 


THE RNA WORLD AND THE ORIGINS OF LIFE 


We have seen that the expression of hereditary information requires extraordi- 
narily complex machinery and proceeds from DNA to protein through an RNA 
intermediate. This machinery presents a central paradox: if nucleic acids are 
required to synthesize proteins and proteins are required, in turn, to synthesize 
nucleic acids, how did such a system of interdependent components ever arise? 
One view is that an RNA world existed on Earth before modern cells arose (Fig- 
ure 6-88). According to this hypothesis, RNA both stored genetic information and 
catalyzed the chemical reactions in primitive cells. Only later in evolutionary time 
did DNA take over as the genetic material and proteins become the major cata- 
lysts and structural components of cells. If this idea is correct, then the transition 
out of the RNA world was never complete; as we have seen in this chapter, RNA 
still catalyzes several fundamental reactions in modern-day cells, which can be 
viewed as molecular fossils from an earlier world. 


Figure 6-88 Time line for the universe, suggesting the early existence of an RNA world of 
living systems. 


The RNA world hypothesis relies on the fact that, among present-day biologi- 
cal molecules, RNA is unique in being able to act as both a carrier of genetic infor- 
mation and as a ribozyme to catalyze chemical reactions. In this section, we dis- 
cuss these properties of RNA and how they may have been especially important 
in early cells. 


Single-Stranded RNA Molecules Can Fold into Highly Elaborate 
Structures 


We have seen in this chapter that RNA can carry genetic information in mRNAs, 
and we saw in Chapter 5 that the genomes of some viruses are composed solely 
of RNA. We have also seen that complementary base-pairing and other types of 
hydrogen bonds can occur between nucleotides in the same chain of RNA, caus- 
ing an RNA molecule to fold up in a unique way determined by its nucleotide 
sequence (see, for example, Figures 6-50 and 6-67). Comparisons of many RNA 
structures have revealed conserved motifs, short structural elements that are used 
over and over again as parts of larger structures (Figure 6-89). 

Protein catalysts require a surface with unique contours and chemical prop- 
erties on which a given set of substrates can react (discussed in Chapter 3). In 
exactly the same way, an RNA molecule with an appropriately folded shape can 
serve as a catalyst (Figure 6-90). Like some proteins, many of these ribozymes 
work by positioning metal ions at their active sites. This feature gives them a wider 
range of catalytic activities than provided by the limited chemical groups of a 
polynucleotide chain. 

Much of our inference about the RNA world has come from experiments in 
which large pools of RNA molecules of random nucleotide sequences are generated 
in the laboratory. Those rare RNA molecules with a property specified by the 
experimenter are then selected out and studied (Figure 6-91). Such experiments 
have created RNAs that can catalyze a wide variety of biochemical reactions (Table 
6-5), with reaction rate enhancements only a few orders of magnitude lower than 
those of the “fastest” protein enzymes. Given these findings, it is not clear why 
protein catalysts greatly outnumber ribozymes in modern cells. Experiments have 
shown, however, that RNA molecules may have more difficulty than proteins in 
binding to flexible, hydrophobic substrates. In any case, the availability of 20 types 
of amino acids presumably provides proteins with a greater number of catalytic 
strategies. 

Figure 6-89 Some common elements of RNA structure. Conventional, 
complementary base-pairing interactions are indicated by red “rungs” in 
double-helical portions of the RNA. 

Figure 6-90 A ribozyme. This simple RNA molecule catalyzes the cleavage of a 
second RNA at a specific site. This ribozyme is found embedded in larger RNA 
genomes, called viroids, which infect plants. The cleavage, which occurs in 
nature at a distant location on the same RNA molecule that contains the 
ribozyme, is a step in the replication of the viroid genome. Although not shown 
in the figure, the reaction requires a magnesium ion positioned at the active 
site. (Adapted from T.R. Cech and O.C. Uhlenbeck, Nature 372:39-40, 1994. With 
permission from Macmillan Publishers Ltd.) 

Figure 6-91 In vitro selection of a synthetic ribozyme. Beginning with a 
large pool of nucleic acid molecules synthesized in the laboratory, those rare 
RNA molecules that possess a specified catalytic activity can be isolated 
and studied. Although a specific example (that of an autophosphorylating 
ribozyme) is shown, variations of this procedure have been used to generate 
many of the ribozymes listed in Table 6-5. During the autophosphorylation 
step, the RNA molecules are kept sufficiently dilute to prevent the 
“cross”-phosphorylation of additional RNA molecules. In reality, several 
repetitions of this procedure are necessary to select the very rare RNA 
molecules with this catalytic activity. Thus, the material initially eluted 
from the column is converted back into DNA, amplified many fold (using reverse 
transcriptase and PCR, as explained in Chapter 8), transcribed back into RNA, 
and subjected to repeated rounds of selection. (Adapted from J.R. Lorsch and 
J.W. Szostak, Nature 371:31-36, 1994. With permission from Macmillan 
Publishers Ltd.) 
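
Why in vitro selection needs repeated rounds (Figure 6-91) can be seen from a small back-of-the-envelope sketch. It assumes that each round retains active molecules with one probability and inactive "background" molecules with a much smaller one, and that amplification restores the pool size without changing these proportions. All numbers and names here (`enrich`, `rounds_until`, the retention probabilities, the starting fraction) are hypothetical and chosen only for illustration; they are not taken from the experiments cited.

```python
def enrich(fraction_active: float, retain_active: float,
           retain_inactive: float) -> float:
    """Fraction of active molecules in the pool after one selection round."""
    kept_active = fraction_active * retain_active
    kept_inactive = (1.0 - fraction_active) * retain_inactive
    return kept_active / (kept_active + kept_inactive)


def rounds_until(target: float, start: float, retain_active: float,
                 retain_inactive: float, max_rounds: int = 100) -> int:
    """Number of selection rounds needed for actives to reach `target`."""
    fraction = start
    for round_number in range(1, max_rounds + 1):
        fraction = enrich(fraction, retain_active, retain_inactive)
        if fraction >= target:
            return round_number
    raise ValueError("target fraction not reached within max_rounds")


if __name__ == "__main__":
    # Hypothetical pool: one active molecule per 10^12, with a 500-fold
    # preference for active molecules in each round.
    print(rounds_until(target=0.5, start=1e-12,
                       retain_active=0.5, retain_inactive=0.001))
```

With these illustrative numbers a single round leaves the active molecules vastly outnumbered, but a handful of rounds makes them the majority of the pool.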


RNA Can Both Store Information and Catalyze Chemical 
Reactions 


RNA molecules have one property that contrasts with those of polypeptides: they 
can directly guide the formation of copies of their own sequence. This capacity 
depends on complementary base-pairing of their nucleotide subunits, which 
enables one RNA to act as a template for the formation of another. As we have seen 
in this and the preceding chapter, these complementary templating mechanisms 
lie at the heart of DNA replication and transcription in modern-day cells. But 
the efficient synthesis of RNA by such complementary templating mechanisms 
requires catalysts to promote the polymerization reaction: without catalysts, 
polymer formation is slow, error-prone, and inefficient. 


TABLE 6-5 

Activity                                       Ribozymes 
Peptide bond formation in protein synthesis    Ribosomal RNA 
RNA cleavage, RNA ligation                     Self-splicing RNAs; RNAse P; also in vitro selected RNA 
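
The complementary templating described above can be sketched as follows. The sketch shows only the base-pairing logic (A with U, G with C, strands antiparallel); it says nothing about the catalyst that would be needed to make the reaction efficient. The function name `template_copy` and the example sequence are hypothetical.

```python
# Watson-Crick pairing rules for RNA.
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def template_copy(rna: str) -> str:
    """Return the strand templated by `rna`, written 5'-to-3'.

    The new strand is antiparallel to its template, so the template is
    read in reverse while each base is replaced by its pairing partner.
    """
    return "".join(PAIRS[base] for base in reversed(rna))


if __name__ == "__main__":
    original = "AUGGCUGUUACC"
    complement = template_copy(original)     # first round: complementary strand
    regenerated = template_copy(complement)  # second round: original sequence
    print(complement)
    print(regenerated == original)
```

One round of templating yields the complementary sequence rather than a copy; a second round, using that complement as the template, is needed to regenerate the original sequence.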

Because RNA has all the properties required of a molecule that could catalyze 
a variety of chemical reactions, including those that lead to its own synthesis (Fig- 
ure 6-92), it has been proposed that RNAs served long ago as the catalysts for tem- 
plate-dependent RNA synthesis. Although self-replicating systems of RNA mol- 
ecules have not been found in nature, scientists have made significant progress 
toward constructing them in the laboratory. While such demonstrations would 
not prove that self-replicating RNA molecules were central to the origin of life on 
Earth, they would establish that such a scenario is plausible. 


How Did Protein Synthesis Evolve? 


The molecular processes underlying protein synthesis in present-day cells seem 
inextricably complex. Although we understand most of them, they do not make 
conceptual sense in the way that DNA transcription, DNA repair, and DNA rep- 
lication do. It is especially difficult to imagine how protein synthesis evolved 
because it is now performed by a complex interlocking system of protein and RNA 
molecules; obviously the proteins could not have existed until an early version of 
the translation apparatus was already in place. As attractive as the RNA world idea 
is for envisioning early life, it does not explain how the modern-day system of pro- 
tein synthesis arose. Although we can only speculate on the origins of the genetic 
code, several experimental observations have provided plausible scenarios. 

In modern cells, some short peptides (such as antibiotics) are synthesized 
without the ribosome; peptide synthetase enzymes assemble these peptides, with 
their proper sequence of amino acids, without mRNAs to guide their synthesis. It 
is plausible that this noncoded, primitive version of protein synthesis first devel- 
oped in the RNA world, where it would have been catalyzed by RNA molecules. 
This idea presents no conceptual difficulties because, as we have seen, rRNA cata- 
lyzes peptide bond formation in present-day cells. However, it leaves unexplained 
how the genetic code—which lies at the core of protein synthesis in today’s cells— 
might have arisen. We know that ribozymes created in the laboratory can perform 
specific aminoacylation reactions; that is, they can match specific amino acids to 
specific tRNAs. It is therefore possible that tRNA-like adaptors, each matched to a 
specific amino acid, could have arisen in the RNA world, marking the beginnings 
of a genetic code. 

Once coded protein synthesis evolved, the transition to a protein-dominated 
world could proceed, with proteins eventually taking over the majority of catalytic 
and structural tasks because of their greater versatility, with 20 rather than 4 dif- 
ferent subunits. Although these ideas are highly speculative, they are consistent 
with the known properties of RNA and protein molecules. 


All Present-Day Cells Use DNA as Their Hereditary Material 


If the evolutionary speculations embodied in the RNA world hypothesis are 
correct, early cells would have differed fundamentally from the cells we know 
today in having their hereditary information stored in RNA rather than in DNA 
(Figure 6-93). Evidence that RNA arose before DNA in evolution can be found 
in the chemical differences between them. Ribose, like glucose and other simple 
carbohydrates, can be formed from formaldehyde (HCHO), a simple chemical 
which is readily produced in laboratory experiments that attempt to simulate 
conditions on the primitive Earth. The sugar deoxyribose is harder to make, and 
in present-day cells it is produced from ribose in a reaction catalyzed by a 
protein enzyme, suggesting that ribose pre-dates deoxyribose in cells. Presumably, 
DNA appeared on the scene later, but then proved more suitable than RNA as 
a permanent repository of genetic information. In particular, the deoxyribose in 
its sugar-phosphate backbone makes chains of DNA chemically more stable than 
chains of RNA, so that much greater lengths of DNA can be maintained without 
breakage. 

Figure 6-92 An RNA molecule that can catalyze its own synthesis. This 
hypothetical process would require catalysis both of the production of a 
second RNA strand of complementary nucleotide sequence (not shown) and the 
use of this second RNA molecule as a template to form many molecules of RNA 
with the original sequence. The red rays represent the active site of this 
hypothetical RNA enzyme. 

Figure 6-93 The hypothesis that RNA preceded DNA and proteins in evolution. 
In the earliest cells, RNA molecules (or their close analogs) would have had 
combined genetic, structural, and catalytic functions. In present-day cells, 
DNA is the repository of genetic information, and proteins perform the vast 
majority of catalytic functions in cells. RNA primarily functions today as a 
go-between in protein synthesis, although it remains a catalyst for a small 
number of crucial reactions. 

The other differences between RNA and DNA—the double-helical structure of 
DNA and the use of thymine rather than uracil—further enhance DNA stability by 
making the many unavoidable accidents that occur to the molecule much easier 
to repair, as discussed in detail in Chapter 5 (pp. 271-273). 


Summary 


From our knowledge of present-day organisms and the molecules they contain, it 
seems likely that the development of the distinctive autocatalytic mechanisms fun- 
damental to living systems began with the evolution of families of RNA molecules 
that could catalyze their own replication. DNA is likely to have been a late addition: 
as the accumulation of protein catalysts allowed more efficient and complex cells to 
evolve, the DNA double helix replaced RNA as a more stable molecule for storing the 
increased amounts of genetic information required by such cells. 


WHAT WE DON’T KNOW 

• How did the present relationships between nucleic acids and proteins 
evolve? How did the genetic code originate? 

• The information carried in genomes specifies the sequences of all proteins 
and RNA molecules in the cell, and it determines when and where these 
molecules are synthesized. Do genomes carry other types of information that 
we have not yet discovered? 

• Cells go to great lengths to correct mistakes in the processes of DNA 
replication, transcription, splicing, and translation. Are there analogous 
strategies to correct mistakes in the selection of which genes are to be 
expressed in a given cell type? Could the great complexity of transcription 
initiation in animals and plants reflect such a strategy? 


PROBLEMS 


Which statements are true? Explain why or why not. 


6-2 Since introns are largely genetic “junk,” they do not 
have to be removed precisely from the primary transcript 
during RNA splicing. 


6-3 Wobble pairing occurs between the first position 
in the codon and the third position in the anticodon. 


6-4 During protein synthesis, the thermodynamics of 
base-pairing between tRNAs and mRNAs sets the upper 
limit for the accuracy with which protein molecules are 
made. 


6-5 Protein enzymes are thought to greatly outnum- 
ber ribozymes in modern cells because they can catalyze 
a much greater variety of reactions and all of them have 
faster rates than any ribozyme. 


Discuss the following problems. 


6-6 In which direction along the template must the 
RNA polymerase in Figure Q6-1 be moving to have gen- 
erated the supercoiled structures that are shown? Would 
you expect supercoils to be generated if the RNA poly- 
merase were free to rotate about the axis of the DNA as it 
progressed along the template? 


Figure Q6-1 Supercoils around a moving RNA polymerase (Problem 
6-6). 


6-7 You have attached an RNA polymerase molecule 
to a glass slide and have allowed it to initiate transcription 
on a template DNA that is tethered to a magnetic bead as 
shown in Figure Q6-2. If the DNA with its attached mag- 
netic bead moves relative to the RNA polymerase as indi- 
cated in the figure, in which direction will the bead rotate? 


Figure Q6-2 System for measuring the rotation of DNA caused by RNA 
polymerase (Problem 6-7). The magnet holds the bead upright (but doesn’t 
interfere with its rotation), and the attached tiny fluorescent beads allow 
the direction of motion to be visualized under the microscope. RNA 
polymerase is held in place by attachment to the glass slide. (Figure 
labels: magnet, fluorescent beads, DNA, RNA polymerase, glass slide.) 



Figure Q6-3 Alternatively spliced mRNAs from the human 
α-tropomyosin gene (Problem 6-8). (A) Exons in the human 
α-tropomyosin gene. The locations and relative sizes of exons are 
shown by the blue and red rectangles, with alternative exons in red. 
(B) Splicing patterns for four α-tropomyosin mRNAs. Splicing is 
indicated by lines connecting the exons that are included in the mRNA. 

6-8 The human α-tropomyosin gene is alternatively 
spliced to produce different forms of α-tropomyosin 
mRNA in different cell types (Figure Q6-3). For all forms of 
the mRNA, the protein sequences encoded by exon 1 are 
the same, as are the protein sequences encoded by exon 
10. Exons 2 and 3 are alternative exons used in different 
mRNAs, as are exons 7 and 8. Which of the following state- 
ments about exons 2 and 3 is the most accurate? Is that 
statement also the most accurate one for exons 7 and 8? 
Explain your answers. 


A. Exons 2 and 3 must have the same number of 
nucleotides. 

B. Exons 2 and 3 must each contain an integral number 
of codons (that is, the number of nucleotides divided 
by 3 must be an integer). 

C. Exons 2 and 3 must each contain a number of 
nucleotides that when divided by 3 leaves the same 
remainder (that is, 0, 1, or 2). 


6-9 After treating cells with a chemical mutagen, you 
isolate two mutants. One carries alanine and the other 
carries methionine at a site in the protein that normally 
contains valine (Figure Q6-4). After treating these two 
mutants again with the mutagen, you isolate mutants from 
each that now carry threonine at the site of the original 
valine (Figure Q6-4). Assuming that all mutations involve 
single-nucleotide changes, deduce the codons that are 
used for valine, methionine, threonine, and alanine at the 
affected site. Would you expect to be able to isolate valine- 
to-threonine mutants in one step? 


Figure Q6-4 Two rounds of mutagenesis and the altered 
amino acids at a single position in a protein (Problem 6-9). 
(Diagram: first treatment, Val to Ala or Met; second 
treatment, Ala or Met to Thr.) 


6-10 Which of the following mutational changes would 
you predict to be the most deleterious to gene function? 
Explain your answers. 

1. Insertion of a single nucleotide near the end of the 
coding sequence. 

2. Removal of a single nucleotide near the beginning 
of the coding sequence. 

3. Deletion of three consecutive nucleotides in the 
middle of the coding sequence. 

4. Substitution of one nucleotide for another in the 
middle of the coding sequence. 


6-11 Prokaryotes and eukaryotes both protect against 
the dangers of translating broken mRNAs. What dangers 
do partial mRNAs pose for the cell? 


6-12 Both hsp60-like and hsp70 molecular chaperones 
share an affinity for exposed hydrophobic patches on pro- 
teins, using them as indicators of incomplete folding. Why 
do you suppose hydrophobic patches serve as critical sig- 
nals for the folding status of a protein? 


6-13 Most proteins require molecular chaperones to 
assist in their correct folding. How do you suppose the 
chaperones themselves manage to fold correctly? 


6-14 What is so special about RNA that it is hypothe- 
sized to be an evolutionary precursor to DNA and protein? 
What is it about DNA that makes it a better material than 
RNA for storage of genetic information? 


6-15 If an RNA molecule could form a hairpin with a 
symmetric internal loop, as shown in Figure Q6-5, could 
the complement of this RNA form a similar structure? If so, 
would there be any regions of the two structures that are 
identical? Which ones? 


Figure Q6-5 An RNA hairpin with 
a symmetric internal loop (Problem 
6-15). 


6-16 Imagine a warm pond on the primordial Earth. 
Chance processes have just assembled a single copy of an 
RNA molecule with a catalytic site that can carry out RNA 
replication. This RNA molecule folds into a structure that 
is capable of linking nucleotides according to instructions 
in an RNA template. Given an adequate supply of nucleo- 
tides, will this single RNA molecule be able to use itself as a 
template to catalyze its own replication? Why or why not? 


Control of Gene Expression 


An organism’s DNA encodes all of the RNA and protein molecules required to 
construct its cells. Yet a complete description of the DNA sequence of an 
organism, be it the few million nucleotides of a bacterium or the few billion 
nucleotides of a human, no more enables us to reconstruct the organism than a 
list of English words enables us to reconstruct a play by Shakespeare. In both cases, the 
problem is to know how the elements in the DNA sequence or the words on the 
list are used. Under what conditions is each gene product made, and, once made, 
what does it do? 

In this chapter, we focus on the first half of this problem—the rules and mech- 
anisms that enable a subset of genes to be selectively expressed in each cell. These 
mechanisms operate at many levels, and we shall discuss each level in turn. But 
first we present some of the basic principles involved. 


AN OVERVIEW OF GENE CONTROL 


The different cell types in a multicellular organism differ dramatically in both 
structure and function. If we compare a mammalian neuron with a liver cell, for 
example, the differences are so extreme that it is difficult to imagine that the two 
cells contain the same genome (Figure 7-1). For this reason, and because cell dif- 
ferentiation often seemed irreversible, biologists originally suspected that genes 
might be selectively lost when a cell differentiates. We now know, however, that 
cell differentiation generally occurs without changes in the nucleotide sequence 
of a cell’s genome. 


The Different Cell Types of a Multicellular Organism Contain the 
Same DNA 


The cell types in a multicellular organism become different from one another 
because they synthesize and accumulate different sets of RNA and protein 
molecules. The initial evidence that they do this without altering the sequence 
of their DNA came from a classic set of experiments in frogs. When the nucleus of 
a fully differentiated frog cell is injected into a frog egg whose nucleus has been 
removed, the injected donor nucleus is capable of directing the recipient egg to 
produce a normal tadpole (Figure 7-2A). The tadpole contains a full range of 
differentiated cells that derived their DNA sequences from the nucleus of the 
original donor cell. Thus, the differentiated donor cell cannot have lost any 
important DNA sequences. A similar conclusion came from experiments performed with 
plants. When differentiated pieces of plant tissue are placed in culture and then 
dissociated into single cells, often one of these individual cells can regenerate an 
entire adult plant (Figure 7-2B). And the same principle has been more recently 
demonstrated in mammals that include sheep, cattle, pigs, goats, dogs, and mice 
(Figure 7-2C). 


IN THIS CHAPTER 

AN OVERVIEW OF GENE CONTROL 
CONTROL OF TRANSCRIPTION BY SEQUENCE-SPECIFIC DNA-BINDING PROTEINS 
TRANSCRIPTION REGULATORS SWITCH GENES ON AND OFF 
MOLECULAR GENETIC MECHANISMS THAT CREATE AND MAINTAIN SPECIALIZED CELL TYPES 
MECHANISMS THAT REINFORCE CELL MEMORY IN PLANTS AND ANIMALS 
POST-TRANSCRIPTIONAL CONTROLS 
REGULATION OF GENE EXPRESSION BY NONCODING RNAs 


Figure 7-1 A neuron and a liver cell share the same genome. The long branches 
of this neuron from the retina enable it to receive electrical signals from 
many other neurons and convey them to neighboring neurons. The liver cell, 
which is drawn to the same scale, is involved in many metabolic processes, 
including digestion and the detoxification of alcohol and other drugs. Both of 
these mammalian cells contain the same genome, but they express different sets 
of RNAs and proteins. (Neuron adapted from S. Ramon y Cajal, Histologie 
du Systeme Nerveux de l’Homme et de Vertebres, 1909-1911. Paris: Maloine; 
reprinted, Madrid: C.S.I.C., 1972.) 

Most recently, detailed DNA sequencing has confirmed the conclusion that 
the changes in gene expression that underlie the development of multicellular 
organisms do not generally involve changes in the DNA sequence of the genome. 


Figure 7-2 Differentiated cells contain all the genetic instructions necessary 
to direct the formation of a complete organism. (A) The nucleus of a skin cell 
from an adult frog transplanted into an enucleated egg can give rise to an 
entire tadpole. The broken arrow indicates that, to give the transplanted 
genome time to adjust to an embryonic environment, a further transfer step is 
required in which one of the nuclei is taken from an early embryo that begins 
to develop and is put back into a second enucleated egg. (B) In many types of 
plants, differentiated cells retain the ability to “de-differentiate,” so that a 
single cell can form a clone of progeny cells that later give rise to an entire 
plant. (C) A nucleus removed from a differentiated cell from an adult cow and 
introduced into an enucleated egg from a different cow can give rise to a calf. 
Different calves produced from the same differentiated cell donor are all 
clones of the donor and are therefore genetically identical. (A, modified from 
J.B. Gurdon, Sci. Am. 219:24-35, 1968.) 


Different Cell Types Synthesize Different Sets of RNAs and 
Proteins 


As a first step in understanding cell differentiation, we would like to know how 
many differences there are between any one cell type and another. Although we 
still do not have an exact answer to this fundamental question, we can make 
several general statements. 


1. Many processes are common to all cells, and any two cells in a single 
organism therefore have many gene products in common. These include 
the structural proteins of chromosomes, RNA and DNA polymerases, DNA 
repair enzymes, ribosomal proteins and RNAs, the enzymes that catalyze 
the central reactions of metabolism, and many of the proteins that form the 
cytoskeleton such as actin (Figure 7-3A). 


2. Some RNAs and proteins are abundant in the specialized cells in which 
they function and cannot be detected elsewhere, even by sensitive tests. 
Hemoglobin, for example, is expressed specifically in red blood cells, 
where it carries oxygen, and the enzyme tyrosine aminotransferase (which 
breaks down tyrosine in food) is expressed in liver but not in most other 
tissues (Figure 7-3B). 

3. Studies of the number of different RNAs suggest that, at any one time, a 
typical human cell expresses 30-60% of its approximately 30,000 genes at 
some level. There are about 21,000 protein-coding genes and a roughly esti- 
mated 9000 noncoding RNA genes in humans. When the patterns of RNA 
expression in different human cell lines are compared, the level of expres- 
sion of almost every gene is found to vary from one cell type to another. A 
few of these differences are striking, like those of hemoglobin and tyrosine 
aminotransferase noted above, but most are much more subtle. But even 
those genes that are expressed in all cell types usually vary in their level of 
expression from one cell type to the next. 


4. Although there are striking differences in coding RNAs (mRNAs) in special- 
ized cell types, they underestimate the full range of differences in the final 
pattern of protein production. As we discuss in this chapter, there are many 
steps after RNA production at which gene expression can be regulated. 
And, as we saw in Chapter 3, proteins are often covalently modified after 
they are synthesized. The radical differences in gene expression between 
cell types are therefore most fully revealed through methods that directly 
display the levels of proteins along with their post-translational modifications 
(Figure 7-4). 

Figure 7-3 Differences in RNA levels for two human genes in seven different 
tissues. To obtain the RNA data by the technique known as RNA-seq (see p. 447), 
RNA was collected from human cell lines grown in culture, derived from each of 
the seven indicated tissues. Millions of “sequence reads” were obtained and 
mapped across the human genome by matching RNA sequences to the DNA sequence of 
the genome. At each position along the genome, the height of the colored trace is 
proportional to the number of sequence reads that match the genome sequence at 
that point. As seen in the figure, the exon sequences in transcribed genes are 
present at high levels, reflecting their presence in mature mRNAs. Intron 
sequences are present at much lower levels and reflect pre-mRNA molecules that 
have not yet been spliced plus intron sequences that have been spliced out but 
not yet degraded. (A) The gene coding for “all-purpose” actin, a major component 
of the cytoskeleton. Note that the left-hand end of the mature β-actin mRNA is 
not translated into protein. As explained later in this chapter, many mRNAs have 
5’ untranslated regions that regulate their translation into protein. (B) The 
same type of data displayed for the enzyme tyrosine aminotransferase, which is 
highly expressed in liver cells but not in the other cell types tested. 
(Information for both panels from the University of California, Santa Cruz, 
Genome Browser (http://genome.ucsc.edu), which provides this type of information 
for every human gene. See also S. Djebali et al., Nature 489:101-108, 2012.) 
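
The read-counting step described in this caption can be sketched in a few lines of Python. This is only an illustration of the principle, not the pipeline used to produce the figure: the function name, the half-open (start, end) convention for mapped reads, and the toy coordinates are all assumptions made for the example.

```python
def read_coverage(reads, region_length):
    """Count, for each position 0..region_length-1, how many reads overlap it.

    `reads` is a list of (start, end) pairs, half-open, with
    0 <= start < end <= region_length (an assumed convention).
    """
    # Difference array: +1 where a read begins, -1 just past where it ends.
    changes = [0] * (region_length + 1)
    for start, end in reads:
        changes[start] += 1
        changes[end] -= 1

    # A running sum of the changes gives the depth at each position.
    depth, running = [], 0
    for delta in changes[:region_length]:
        running += delta
        depth.append(running)
    return depth


# Toy example: three reads pile up on an "exon"; one lies in an "intron".
toy_reads = [(0, 4), (2, 6), (2, 5), (8, 10)]
print(read_coverage(toy_reads, 10))
```

The height of the trace at each position is simply the depth value; positions inside exons of a highly expressed gene accumulate many overlapping reads, whereas intron positions accumulate few.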




Figure 7-4 panels: (A) human brain, (B) human liver. Horizontal axis: isoelectric point, acidic to basic. 


External Signals Can Cause a Cell to Change the Expression of Its 
Genes 


Although the specialized cells in a multicellular organism have characteristic 
patterns of gene expression, each cell is capable of altering its pattern of gene 
expression in response to extracellular cues. If a liver cell is exposed to a gluco- 
corticoid hormone, for example, the production of a set of proteins is dramatically 
increased. Released in the body during periods of starvation or intense exercise, 
glucocorticoids signal the liver to increase the production of energy from amino 
acids and other small molecules; the set of proteins whose production is induced 
includes the enzyme tyrosine aminotransferase, mentioned above. When the hor- 
mone is no longer present, the production of these proteins drops to its normal, 
unstimulated level in liver cells. 

Other cell types respond to glucocorticoids differently. Fat cells, for example, 
reduce the production of tyrosine aminotransferase, while some other cell types 
do not respond to glucocorticoids at all. These examples illustrate a general fea- 
ture of cell specialization: different cell types often respond very differently to the 
same extracellular signal. Other features of the gene expression pattern do not 
change and give each cell type its permanently distinctive character. 


Gene Expression Can Be Regulated at Many of the Steps in the 
Pathway from DNA to RNA to Protein 


If differences among the various cell types of an organism depend on the partic- 
ular genes that the cells express, at what level is the control of gene expression 
exercised? As we saw in the previous chapter, there are many steps in the pathway 
leading from DNA to protein. We now know that all of them can in principle be 
regulated. Thus a cell can control the proteins it makes by (1) controlling when 
and how often a given gene is transcribed (transcriptional control), (2) con- 
trolling the splicing and processing of RNA transcripts (RNA processing control), 
(3) selecting which completed mRNAs are exported from the nucleus to the cyto- 
sol and determining where in the cytosol they are localized (RNA transport and 
localization control), (4) selecting which mRNAs in the cytoplasm are translated 
by ribosomes (translational control), (5) selectively destabilizing certain mRNA 
molecules in the cytoplasm (mRNA degradation control), or (6) selectively acti- 
vating, inactivating, degrading, or localizing specific protein molecules after they 
have been made (protein activity control) (Figure 7-5). 

For most genes, transcriptional controls are paramount. This makes sense 
because, of all the possible control points illustrated in Figure 7-5, only transcrip- 
tional control ensures that the cell will not synthesize superfluous intermediates. 
In the following sections, we discuss the DNA and protein components that per- 
form this function by regulating the initiation of gene transcription. We shall then 
return to the additional ways of regulating gene expression. 


Figure 7-4 Differences in the proteins expressed by two human tissues, 
(A) brain and (B) liver. In each panel, the proteins are displayed using 
two-dimensional polyacrylamide-gel electrophoresis (see pp. 452-454). The 
proteins have been separated by molecular weight (top to bottom) and isoelectric 
point, the pH at which the protein has no net charge (right to left). The protein 
spots artificially colored red are common to both samples; those in blue are 
specific to that tissue. The differences between the two tissue samples vastly 
outweigh their similarities: even for proteins that are shared between the two 
tissues, their relative abundances are usually different. Note that this 
technique separates proteins by both size and charge; therefore a protein that 
has several different phosphorylation states will appear as a series of 
horizontal spots (see upper right-hand portion of right panel). Only a small 
portion of the complete protein spectrum is shown for each sample. 

Methods based on mass spectrometry (see pp. 455-457) provide much more detailed 
information, including the identity of each protein, the position of each 
modification, and the nature of the modification. (Courtesy of Tim Myers and 
Leigh Anderson, Large Scale Biology Corporation.) 


Summary 


The genome of a cell contains in its DNA sequence the information to make many 
thousands of different protein and RNA molecules. A cell typically expresses only a 
fraction of its genes, and the different types of cells in multicellular organisms arise 
because different sets of genes are expressed. Moreover, cells can change the pattern 
of genes they express in response to changes in their environment, such as signals 
from other cells. Although all of the steps involved in expressing a gene can in prin- 
ciple be regulated, for most genes the initiation of RNA transcription provides the 
most important point of control. 


CONTROL OF TRANSCRIPTION BY SEQUENCE- 
SPECIFIC DNA-BINDING PROTEINS 


How does a cell determine which of its thousands of genes to transcribe? Perhaps 
the most important concept, one that applies to all species on Earth, is based on 
a group of proteins known as transcription regulators. These proteins recognize 
specific sequences of DNA (typically 5-10 nucleotide pairs in length) that are 
often called cis-regulatory sequences, because they must be on the same 
chromosome (that is, in cis) as the genes they control. Transcription regulators bind 
to these sequences, which are dispersed throughout genomes, and this binding 
puts into motion a series of reactions that ultimately specify which genes are to be 
transcribed and at what rate. Approximately 10% of the protein-coding genes of 
most organisms are devoted to transcription regulators, making them one of the 
largest classes of proteins in the cell. In most cases, a given transcription regulator 
recognizes its own cis-regulatory sequence, which is different from those recog- 
nized by all the other regulators in the cell. 

Transcription of each gene is, in turn, controlled by its own collection of 
cis-regulatory sequences. These typically lie near the gene, often in the intergenic 
region directly upstream from the transcription start point of the gene. Although 
a few genes are controlled by a single cis-regulatory sequence that is recognized 
by a single transcription regulator, the majority have complex arrangements of 
cis-regulatory sequences, each of which is recognized by a different transcription 
regulator. It is therefore the positions, identity, and arrangement of cis-regulatory 
sequences—which are an important part of the information embedded in the 
genome—that ultimately determine the time and place that each gene is tran- 
scribed. 

We begin our discussion by describing how transcription regulators recognize 
cis-regulatory sequences. 


The Sequence of Nucleotides in the DNA Double Helix Can Be 
Read by Proteins 


As discussed in Chapter 4, the DNA in a chromosome consists of a very long 
double helix that has both a major and a minor groove (Figure 7-6). Transcription 
regulators must recognize short, specific cis-regulatory sequences within 
this structure. When transcription regulators were first discovered in the 1960s, 
it was thought that they might require direct access to the interior of the 
double helix to distinguish between one DNA sequence and another. It is now 
clear, however, that the outside of the double helix is studded with DNA sequence 
information that transcription regulators recognize: the edge of each base pair 
presents a distinctive pattern of hydrogen-bond donors, hydrogen-bond acceptors, 
and hydrophobic patches in both the major and minor grooves (Figure 7-7). Because 
the major groove is wider and displays more molecular features than does the 
minor groove, nearly all transcription regulators make the majority of their 
contacts with the major groove, as we shall see. 

Figure 7-5 Six steps at which eukaryotic gene expression can be controlled. 
Controls that operate at steps 1 through 5 are discussed in this chapter. Step 6, 
the regulation of protein activity, occurs largely through covalent 
post-translational modifications including phosphorylation, acetylation, and 
ubiquitylation (see Table 3-3, p. 165). Step 6 was introduced in Chapter 3 and is 
subsequently discussed in many chapters throughout the book. 

Figure 7-6 Double-helical structure of DNA. A space-filling model of DNA showing 
the major and minor grooves on the outside of the double helix (see Movie 4.1). 
The atoms are colored as follows: carbon, dark blue; nitrogen, light blue; 
hydrogen, white; oxygen, red; phosphorus, yellow. 


Transcription Regulators Contain Structural Motifs That Can Read 
DNA Sequences 


Molecular recognition in biology generally relies on an exact fit between the sur- 
faces of two molecules, and the study of transcription regulators has provided 
some of the clearest examples of this principle. A transcription regulator rec- 
ognizes a specific cis-regulatory sequence because the surface of the protein is 
extensively complementary to the special surface features of the double helix that 
displays that sequence. Each transcription regulator makes a series of contacts 
with the DNA, involving hydrogen bonds, ionic bonds, and hydrophobic interactions. 
Although each individual contact is weak, the 20 or so contacts that are 
typically formed at the protein-DNA interface add together to ensure that the 
interaction is both highly specific and very strong (Figure 7-8). In fact, 
DNA-protein interactions include some of the tightest and most specific molecular 
interactions known in biology. 

Figure 7-7 How the different base pairs in DNA can be recognized from their edges 
without the need to open the double helix. The four possible configurations of 
base pairs are shown, with potential hydrogen-bond donors indicated in blue, 
potential hydrogen-bond acceptors in red, and hydrogen bonds of the base pairs 
themselves as a series of short, parallel red lines. Methyl groups, which form 
hydrophobic protuberances, are shown in yellow, and hydrogen atoms that are 
attached to carbons, and are therefore unavailable for hydrogen-bonding, are 
white. From the major groove, each of the four base-pair configurations projects 
a unique pattern of features. (From C. Branden and J. Tooze, Introduction to 
Protein Structure, 2nd ed. New York: Garland Publishing, 1999.) 

Although each example of protein-DNA recognition is unique in detail, x-ray 
crystallographic and nuclear magnetic resonance (NMR) spectroscopic studies 
of hundreds of transcription regulators have revealed that many of them contain 
one or another of a small set of DNA-binding structural motifs (Panel 7-1). These 
motifs generally use either α helices or β sheets to bind to the major groove of 
DNA. The amino acid side chains that extend from these protein motifs make the 
specific contacts with the DNA. Thus, a given structural motif can be used to rec- 
ognize many different cis-regulatory sequences depending on the specific side 
chains present. 


Dimerization of Transcription Regulators Increases Their Affinity 
and Specificity for DNA 


A monomer of a typical transcription regulator recognizes about 6-8 nucleotide 
pairs of DNA. However, sequence-specific DNA-binding proteins do not bind 
tightly to a single DNA sequence and reject all others; rather, they recognize a 
range of closely related sequences, with the affinity of the protein for the DNA 
varying according to how closely the DNA matches the optimal sequence. Hence, 
cis-regulatory sequences are often depicted as “logos” which display the range of 
sequences recognized by a particular transcription regulator (Figure 7-9A and 
B). In Chapter 6, we saw this same representation at work for the binding of RNA 
polymerase to promoters (see Figure 6-12). 
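
To make the idea of a logo concrete, the following minimal Python sketch tabulates nucleotide frequencies at each position of a set of aligned binding sites and computes the information content of each position, which is the quantity usually plotted as letter height in a logo. The four sites are invented for illustration and are not measured sites for any real regulator; the function names are likewise assumptions, and no small-sample correction is applied.

```python
import math
from collections import Counter

# Hypothetical aligned binding sites, invented for illustration only.
SITES = ["TAATGG", "TAATTG", "CAATGG", "TAATGA"]


def position_frequencies(sites):
    """Return, for each position, a dict mapping nucleotide -> frequency."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    return [
        {base: count / len(sites) for base, count in Counter(column).items()}
        for column in zip(*sites)
    ]


def information_bits(frequencies):
    """Information content of one position: 2 bits minus its Shannon entropy."""
    return 2.0 + sum(p * math.log2(p) for p in frequencies.values())


for index, freqs in enumerate(position_frequencies(SITES), start=1):
    print(index, freqs, round(information_bits(freqs), 2))
```

A position at which every site carries the same nucleotide scores the maximum of 2 bits, whereas a position at which all four nucleotides are equally likely scores 0 bits and contributes nothing to recognition.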

The DNA sequence recognized by a monomer does not contain sufficient 
information to be picked out from the background of such sequences that would 
occur at random all over the genome. For example, an exact six-nucleotide DNA 
sequence would be expected to occur by chance approximately once every 4096 
nucleotides (4⁶), and the range of six-nucleotide sequences described by a typical 
logo would be expected to occur by chance much more often, perhaps every 
1000 nucleotides. Clearly, for a bacterial genome of 4.6 × 10⁶ nucleotide pairs, 
not to mention a mammalian genome of 3 × 10⁹ nucleotide pairs, this is 
insufficient information to accurately control the transcription of individual genes. 
Additional contributions to DNA-binding specificity must therefore be present. 
Many transcription regulators form dimers, with both monomers making nearly 
identical contacts with DNA (Figure 7-9C). This arrangement doubles the length 
of the cis-regulatory sequence recognized and greatly increases both the affinity 
and the specificity of transcription regulator binding. Because the DNA sequence 

Figure 7-8 The binding of a transcription regulator to a specific DNA sequence. 
On the left, a single contact is shown between a transcription regulator and DNA; 
such contacts allow the protein to “read” the DNA sequence. On the right, the 
complete set of contacts between a transcription regulator (a member of the 
homeodomain family; see Panel 7-1) and its cis-regulatory sequence is shown. The 
DNA-binding portion of the protein is 60 amino acids long. Although the 
interactions in the major groove are the most important, the protein is also seen 
to contact both the minor groove and phosphates in the sugar-phosphate DNA 
backbone. (See C. Wolberger et al., Cell 
67:517-528, 1991.) 


HELIX-TURN-HELIX PROTEINS 


Originally identified in bacterial transcription regulators, this motif has 
since been found in many hundreds of DNA-binding proteins from both 
eukaryotes and prokaryotes. It is constructed from two α helices (blue 
and red) connected by a short extended chain of amino acids, which 
constitutes the “turn.” The two helices are held at a fixed angle, primarily 
through interactions between the two helices. The more C-terminal helix 
(in red) is called the recognition helix because it fits into the major 
groove of DNA; its amino acid side chains, which differ from protein to 
protein, play an important part in recognizing the specific DNA sequence 
to which the protein binds. All of the proteins shown here bind DNA as 
dimers in which the two copies of the recognition helix (in red) are 
separated by exactly one turn of the DNA helix (3.4 nm); thus both 
recognition helices of the dimer can fit into the major groove of DNA. 


HOMEODOMAIN PROTEINS 


Not long after the first transcription regulators were discovered in 
bacteria, genetic analyses of the fruit fly Drosophila led to the 
characterization of an important class of genes, the homeotic selector 
genes, that play a critical part in orchestrating fly development (discussed 
in Chapter 21). It was later shown that these genes coded for 
transcription regulators that bound DNA through a structural motif 
named the homeodomain. Two different views of the same structure are 
shown. (A) The homeodomain is folded into three α helices, which are 
packed tightly together by hydrophobic interactions. The part containing 
helices 2 and 3 closely resembles the helix-turn-helix motif. (B) The 
recognition helix (helix 3, red) forms important contacts with the major 
groove of DNA. The asparagine (Asn) of helix 3, for example, contacts an 
adenine, as shown in Figure 7-8. A flexible arm attached to helix 1 forms 
contacts with nucleotide pairs in the minor groove. 


LEUCINE ZIPPER PROTEINS 


The leucine zipper motif is named because of the way the two α 
helices, one from each monomer, are joined together to form a short 
coiled-coil. These proteins bind DNA as dimers where the two long α 
helices are held together by interactions between hydrophobic 
amino acid side chains (often on leucines) that extend from one side 
of each helix. Just beyond the dimerization interface, the two α 
helices separate from each other to form a Y-shaped structure, which 
allows their side chains to contact the major groove of DNA. The dimer 
thus grips the double helix like a clothespin on a clothesline. 





















β SHEET DNA RECOGNITION PROTEINS 


In the other DNA-binding motifs displayed in this panel, α helices are the primary mechanism used to recognize specific DNA 
sequences. In one large group of transcription regulators, however, a two-stranded β sheet, with amino acid side chains 
extending from the sheet toward the DNA, reads the information on the surface of the major groove. As in the case of a 
recognition α helix, this β-sheet motif can be used to recognize many different DNA sequences; the exact DNA sequence 
recognized depends on the sequence of amino acids that make up the β sheet. Shown is a transcription regulator that binds 
two molecules of S-adenosyl methionine (red). On the left is a dimer of the protein; on the right is a simplified diagram 
showing just the two-stranded β sheet bound to the major groove of DNA. 


ZINC FINGER PROTEINS 


This group of DNA-binding motifs includes one 
or more zinc atoms as structural components. 
All such zinc-coordinated DNA-binding motifs 
are called zinc fingers, referring to their 
appearance in early schematic drawings (left). 
They fall into several distinct structural groups, 
only one of which we consider here. It has a 
simple structure, in which the zinc atom holds 
an α helix and a β sheet together (middle). This 
type of zinc finger is often found in clusters 
with the α helix of each finger contacting the 
major groove of the DNA, forming a nearly 
continuous stretch of α helices along that 
groove. In this way, a strong and specific 
DNA-protein interaction is built up through a 
repeating basic structural unit. Three such 
fingers are shown on the right. 


HELIX-LOOP-HELIX PROTEINS 


Related to the leucine zipper, the helix-loop-helix 
motif consists of a short α helix connected by a loop 
(red) to a second, longer α helix. The flexibility of the 
loop allows one helix to fold back and pack against 
the other thereby forming the dimerization surface. 
As shown, this two-helix structure binds both to DNA 
and to the two-helix structure of a second protein to 
create either a homodimer or a heterodimer. Two α 
helices that extend from the dimerization interface 
make specific contacts with the major groove of DNA. 

recognized by the protein has increased from approximately 6 nucleotide pairs 
to 12 nucleotide pairs, there are many fewer random occurrences of matching 
sequences. 
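
The arithmetic behind this argument is easy to check. The short Python sketch below estimates how many times one exact sequence of a given length would be expected to occur by chance in the two genomes mentioned above. It assumes, as a deliberate simplification, that the four nucleotides are equally frequent and independent and that only one strand is scanned; the function name is our own.

```python
def expected_random_sites(site_length, genome_length):
    """Expected chance occurrences of one exact sequence of `site_length`
    nucleotides in a random genome of `genome_length` nucleotide pairs
    (equal base frequencies, one strand; both are simplifying assumptions)."""
    return genome_length / 4 ** site_length


E_COLI_GENOME = 4.6e6    # nucleotide pairs, as quoted in the text
MAMMALIAN_GENOME = 3e9   # nucleotide pairs, as quoted in the text

for site_length in (6, 12):
    print(
        site_length,
        4 ** site_length,
        expected_random_sites(site_length, E_COLI_GENOME),
        expected_random_sites(site_length, MAMMALIAN_GENOME),
    )
```

Under these assumptions, doubling the recognized sequence from 6 to 12 nucleotide pairs reduces the expected number of chance matches in the bacterial genome from more than a thousand to fewer than one, although a mammalian genome would still be expected to contain many chance matches.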

Heterodimers are often formed from two different transcription regulators. 
Transcription regulators may form heterodimers with more than one partner pro- 
tein; in this way, the same transcription regulator can be “reused” to create several 
distinct DNA-binding specificities (see Figure 7-9C). 


Transcription Regulators Bind Cooperatively to DNA 


In the simplest case, the collection of noncovalent bonds that holds the above 
dimers or heterodimers together is so extensive that these structures form obliga- 
torily, and never fall apart. In this case, the unit of binding is the dimer or heterod- 
imer, and the binding curve for the transcription regulator (the fraction of DNA 
bound as a function of protein concentration) has a simple saturating (hyperbolic) shape 
(Figure 7-10A). 

Figure 7-9 Transcription regulators and cis-regulatory sequences. (A) Depiction 
of the cis-regulatory sequence for Nanog, a homeodomain family member that is 
a key regulator in embryonic stem cells. This “logo” form (see Figure 6-12) shows 
that the protein can recognize a collection of closely related DNA sequences and 
gives the preferred nucleotide pair at each position. Cis-regulatory sequences 
are “read” as double-stranded DNA, but only one strand typically is shown in a 
logo. (B) Representation of the cis-regulatory sequence as a colored box. 
(C) Many transcription regulators form dimers (homodimers) and heterodimers. In 
the example shown, three different DNA-binding specificities are formed from two 
transcription regulators. 

Figure 7-10 Occupancy of a cis-regulatory sequence by a transcription regulator. 
(A) Noncooperative binding by a stable heterodimer. (B) Cooperative binding by 
components of a heterodimer that are predominantly monomers in solution. The 
shape of the curve differs from that of (A) because the fraction of protein in a 
form competent to bind DNA (the heterodimer) increases with increasing protein 
concentration. 

In many cases, however, the dimers and heterodimers are held together very 
weakly; they exist predominantly as monomers in solution, and yet dimers are 
observed on the appropriate DNA sequence. Here, the proteins are said to bind to 
DNA cooperatively, and the curve describing their binding is sigmoidal in shape 
(Figure 7-10B). Cooperative binding means that, over a range of concentrations of 
the transcription regulator, binding is more of an all-or-none phenomenon than 
for noncooperative binding; that is, at most protein concentrations, the 
cis-regulatory sequence is either nearly empty or nearly fully occupied and 
rarely is somewhere in between. A discussion of the mathematics behind 
cooperative binding is given in Chapter 8 (see Figure 8-79A). 
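
The contrast between the two binding regimes can be illustrated numerically. The sketch below is a simplified model, not the treatment given in Chapter 8: it uses a Hill-type expression in which an exponent of 1 stands for a stable dimer that binds as a single unit, and an exponent of 2 stands for the idealized limit in which the protein is almost entirely monomeric in solution and only the dimer binds DNA. The function name, the parameter names, and the choice of concentration units are assumptions.

```python
def occupancy(protein, k_half, hill_n=1.0):
    """Fraction of cis-regulatory sites occupied at a given protein concentration.

    `k_half` is the concentration giving half-maximal occupancy. `hill_n` = 1
    gives a simple saturating curve; `hill_n` = 2 gives a sigmoidal curve
    (an idealized model of cooperative binding by weakly associated monomers).
    """
    return protein ** hill_n / (k_half ** hill_n + protein ** hill_n)


# Concentrations are expressed in units of k_half.
for concentration in (0.1, 0.3, 1.0, 3.0, 10.0):
    print(
        concentration,
        round(occupancy(concentration, 1.0, hill_n=1.0), 3),
        round(occupancy(concentration, 1.0, hill_n=2.0), 3),
    )
```

Both curves pass through half occupancy at the same concentration, but the cooperative curve lies below the noncooperative one at lower concentrations and above it at higher concentrations, which is what makes the response closer to all-or-none.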


Nucleosome Structure Promotes Cooperative Binding of 
Transcription Regulators 


As we have just seen, cooperative binding of transcription regulators to DNA often 
occurs because the monomers have only a weak affinity for each other. However, 
there is a second, indirect mechanism for cooperative binding, one that arises 
from the nucleosome structure of eukaryotic chromosomes. 

In general, transcription regulators bind to DNA in nucleosomes with lower 
affinity than they do to naked DNA. There are two reasons for this difference. 
First, the surface of the cis-regulatory sequence recognized by the transcription 
regulator may be facing inward on the nucleosome, toward the histone core, and 
therefore not be readily available to the regulatory protein. Second, even if the 
face of the cis-regulatory sequence is exposed on the outside of the nucleosome, 
many transcription regulators subtly alter the conformation of the DNA when 
they bind, and these changes are generally opposed by the tight wrapping of the 
DNA around the histone core. For example, many transcription regulators induce 
a bend or kink in the DNA when they bind. 

We saw in Chapter 4 that nucleosome remodeling can alter the structure of 
the nucleosome, allowing transcription regulators access to the DNA. Even with- 
out remodeling, however, transcription regulators can still gain limited access to 
DNA in a nucleosome. The DNA at the end of a nucleosome “breathes,” transiently 
exposing the DNA and allowing regulators to bind. This breathing happens at a 
much lower rate in the middle of the nucleosome; therefore, the positions where 
the DNA exits the nucleosome are much easier to occupy (Figure 7-11). 

These properties of the nucleosome promote cooperative DNA binding by 
transcription regulators. If a regulatory protein enters the DNA of a nucleosome 
and prevents the DNA from tightly rewrapping around the nucleosome core, it 
will increase the affinity of a second transcription regulator for a nearby cis-regu- 
latory sequence. If the two transcription regulators also interact with each other 
(as described above), the cooperative effect is even greater. In some cases, the 
combined action of the regulatory proteins can eventually displace the histone 
core of the nucleosome altogether. 

Figure 7-11 How nucleosomes affect the binding of transcription regulators. 

The cooperation among transcription regulators can become much greater 
when nucleosome remodeling complexes are involved. If one transcription reg- 
ulator binds its cis-regulatory sequence and attracts a chromatin remodeling 
complex, the localized action of the remodeling complex can allow a second 
transcription regulator to efficiently bind nearby. Moreover, we have discussed 
how transcription regulators can work together in pairs; in reality, larger num- 
bers often cooperate by repeated use of the same principles. A highly cooperative 
binding of transcription regulators to DNA probably explains why many sites in 
eukaryotic genomes that are bound by transcription regulators are “nucleosome 
free.” 


Summary 


Transcription regulators recognize short stretches of double-helical DNA of defined 
sequence called cis-regulatory sequences, and thereby determine which of the thou- 
sands of genes in a cell will be transcribed. Approximately 10% of the protein-cod- 
ing genes in most organisms produce transcription regulators, and they control 
many features of cells. Although each of these transcription regulators has unique 
features, most bind to DNA as homodimers or heterodimers and recognize DNA 
through one of a small number of structural motifs. Transcription regulators typi- 
cally work in groups and bind DNA cooperatively, a feature that has several under- 
lying mechanisms, some of which exploit the packaging of DNA in nucleosomes. 


TRANSCRIPTION REGULATORS SWITCH GENES ON 
AND OFF 


Having seen how transcription regulators bind to cis-regulatory sequences embed- 
ded in the genome, we can now discuss how, once bound, these proteins influence 
the transcription of genes. The situation in bacteria is simpler than in eukaryotes 
(for one thing, chromatin structure is not an issue), and we therefore discuss it first. 
Following this, we turn to the more complex situation in eukaryotes. 


The Tryptophan Repressor Switches Genes Off 


The genome of the bacterium E. coli consists of a single, circular DNA molecule 
of about 4.6 × 10⁶ nucleotide pairs. This DNA encodes approximately 4300 
proteins, although only a fraction of these are made at any one time. Bacteria regu- 
late the expression of many of their genes according to the food sources that are 
available in the environment. For example, in E. coli, five genes code for enzymes 
that manufacture the amino acid tryptophan. These genes are arranged in a clus- 
ter on the chromosome and are transcribed from a single promoter as one long 
mRNA molecule; such coordinately transcribed clusters are called operons (Fig- 
ure 7-12). Although operons are common in bacteria, they are rare in eukaryotes, 
where genes are typically transcribed and regulated individually (see Figure 7-3). 

When tryptophan concentrations are low, the operon is transcribed; the result- 
ing mRNA is translated to produce a full set of biosynthetic enzymes, which work 
in tandem to synthesize tryptophan from much simpler molecules. When trypto- 
phan is abundant, however—for example, when the bacterium is in the gut of a 
mammal that has just eaten a protein-rich meal—the amino acid is imported into 
the cell and shuts down production of the enzymes, which are no longer needed. 

Figure 7-12 A cluster of bacterial
genes can be transcribed from a single
promoter. Each of these five genes
encodes a different enzyme, and all of
these enzymes are needed to synthesize
the amino acid tryptophan from simpler
molecules. The genes are transcribed as
a single mRNA molecule, a feature that
allows their expression to be coordinated.
Clusters of genes transcribed as a single 
mRNA molecule are common in bacteria. 
Each of these clusters is called an operon 
because its expression is controlled by a 
cis-regulatory sequence called the operator 
(green), situated within the promoter. (In 
this and subsequent figures, the yellow 
blocks in the promoter represent DNA 
sequences that bind RNA polymerase; see 
Figure 6-12.)

We now understand exactly how this repression of the tryptophan operon
comes about. Within the operon’s promoter is a cis-regulatory sequence that 
is recognized by a transcription regulator. When this regulator binds to this 
sequence, it blocks access of RNA polymerase to the promoter, thereby prevent- 
ing transcription of the operon (and thus production of the tryptophan-producing 
enzymes). The transcription regulator is known as the tryptophan repressor and its 
cis-regulatory sequence is called the tryptophan operator. These components are 
controlled in a simple way: the repressor can bind to DNA only if it has also bound 
several molecules of tryptophan (Figure 7-13). 

The tryptophan repressor is an allosteric protein, and the binding of trypto- 
phan causes a subtle change in its three-dimensional structure so that the pro- 
tein can bind to the operator sequence. Whenever the concentration of free tryp- 
tophan in the bacterium drops, tryptophan dissociates from the repressor, the 
repressor no longer binds to DNA, and the tryptophan operon is transcribed. The 
repressor is thus a simple device that switches production of a set of biosynthetic 
enzymes on and off according to the availability of the end product of the pathway 
that the enzymes catalyze. 

The tryptophan repressor protein itself is always present in the cell. The gene 
that encodes it is continuously transcribed at a low level, so that a small amount of 
the repressor protein is always being made. Thus the bacterium can respond very 
rapidly to a rise or fall in tryptophan concentration. 


Repressors Turn Genes Off and Activators Turn Them On 


The tryptophan repressor, as its name suggests, is a transcriptional repressor 
protein: in its active form, it switches genes off, or represses them. Some bacterial 
transcription regulators do the opposite: they switch genes on, or activate them. 
These transcriptional activator proteins work on promoters that—in contrast to the 
promoter for the tryptophan operon—are only marginally able to bind and posi- 
tion RNA polymerase on their own. However, these poorly functioning promoters 
can be made fully functional by activator proteins that bind to nearby cis-regula- 
tory sequences and contact the RNA polymerase to help it initiate transcription 
(Figure 7-14). 

Figure 7-13 Genes can be switched off
by repressor proteins. If the concentration
of tryptophan inside a bacterium is low
(left), RNA polymerase (blue) binds to the
promoter and transcribes the five genes
of the tryptophan operon. However, if the
concentration of tryptophan is high (right), 
the repressor protein (dark green) becomes 
active and binds to the operator (light 
green), where it blocks the binding of RNA 
polymerase to the promoter. Whenever the 
concentration of intracellular tryptophan 
drops, the repressor falls off the DNA, 
allowing the polymerase to again transcribe 
the operon. Although not shown in the 
figure, the repressor is a stable dimer. 


Figure 7-14 Genes can be switched
on by activator proteins. An activator
protein binds to its cis-regulatory sequence 
on the DNA and interacts with the RNA 
polymerase to help it initiate transcription. 
Without the activator, the promoter fails to 
initiate transcription efficiently. In bacteria, 
the binding of the activator to DNA is often 
controlled by the interaction of a metabolite 
or other small molecule (red triangle) with 
the activator protein. The Lac operon works 
in this manner, as we discuss shortly. 

DNA-bound activator proteins can increase the rate of transcription initiation
as much as 1000-fold, a value consistent with a relatively weak and nonspecific
interaction between the transcription regulator and RNA polymerase. For example,
a 1000-fold change in the affinity of RNA polymerase for its promoter corresponds
to a change in ΔG of ~18 kJ/mole, which could be accounted for by just a
few weak, noncovalent bonds. Thus, many activator proteins work simply by pro- 
viding a few favorable interactions that help to attract RNA polymerase to the pro- 
moter. To provide this assistance, however, the activator protein must be bound to 
its cis-regulatory sequence, and this sequence must be positioned, with respect to 
the promoter, so that the favorable interactions can occur. 
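
The arithmetic behind the ~18 kJ/mole figure can be checked with a short calculation. The sketch below assumes the standard relation between a ratio of affinities and a free-energy difference, ΔG = RT ln(ratio), and a temperature of 37°C (310 K). The temperature and the function name are our own assumptions for illustration; the text itself states only the fold change and the approximate result.

```python
import math

R = 8.314  # gas constant in J/(mol*K)


def delta_g_for_fold_change(fold, temperature_k=310.0):
    """Free-energy difference (kJ/mol) matching a fold change in affinity.

    Uses delta_G = R * T * ln(fold). The default temperature (310 K) is an
    assumption for this sketch, not a value taken from the text.
    """
    return R * temperature_k * math.log(fold) / 1000.0


delta_g = delta_g_for_fold_change(1000)
print(f"{delta_g:.1f} kJ/mol")  # about 17.8 kJ/mol, close to the ~18 quoted above
```

Because the dependence is logarithmic, even a large fold change in affinity corresponds to a modest free-energy difference, which is why a few weak noncovalent bonds can suffice.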

Like the tryptophan repressor, activator proteins often have to interact with 
a second molecule to be able to bind DNA. For example, the bacterial activator 
protein CAP has to bind cyclic AMP (cAMP) before it can bind to DNA. Genes 
activated by CAP are switched on in response to an increase in intracellular 
cAMP concentration, which rises when glucose, the bacterium’s preferred carbon 
source, is no longer available; as a result, CAP drives the production of enzymes 
that allow the bacterium to digest other sugars. 


An Activator and a Repressor Control the Lac Operon 


In many instances, the activity of a single promoter is controlled by several differ- 
ent transcription regulators. The Lac operon in E. coli, for example, is controlled 
by both the Lac repressor and the CAP activator that we just discussed. The Lac 
operon encodes proteins required to import and digest the disaccharide lactose.
In the absence of glucose, the bacterium makes cAMP, which activates CAP
to switch on genes that allow the cell to utilize alternative sources of carbon— 
including lactose. It would be wasteful, however, for CAP to induce expression 
of the Lac operon if lactose itself were not present. Thus the Lac repressor shuts 
off the operon in the absence of lactose. This arrangement enables the control 
region of the Lac operon to integrate two different signals, so that the operon is 
highly expressed only when two conditions are met: glucose must be absent and 
lactose must be present (Figure 7-15). This genetic circuit thus behaves much like 

Figure 7-15 The Lac operon is controlled
by two transcription regulators, the
Lac repressor and CAP. LacZ, the first
gene of the operon, encodes the enzyme
β-galactosidase, which breaks down lactose
to galactose and glucose. When lactose is 
absent, the Lac repressor binds to a cis- 
regulatory sequence, called the Lac operator, 
and shuts off expression of the operon 
(Movie 7.4). Addition of lactose increases 
the intracellular concentration of a related 
compound, allolactose; allolactose binds to 
the Lac repressor, causing it to undergo a 
conformational change that releases its grip 
on the operator DNA (not shown). When 
glucose is absent, cyclic AMP (red triangle) is 
produced by the cell, and CAP binds to DNA. 

a switch that carries out a logic operation in a computer. When lactose is present
AND glucose is absent, the cell executes the appropriate program—in this case, 
transcription of the genes that permit the uptake and utilization of lactose. 
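
The logic described here can be written out as a small truth table. The following is only a toy Boolean sketch: it treats glucose and lactose as simply present or absent and ignores all quantitative detail, and the function and variable names are hypothetical ones chosen for illustration.

```python
def lac_operon_highly_expressed(glucose_present: bool, lactose_present: bool) -> bool:
    """Toy Boolean model of the Lac operon control region (illustrative only)."""
    # Glucose absent -> the cell makes cAMP -> CAP binds DNA and can activate.
    cap_bound = not glucose_present
    # Lactose absent -> the Lac repressor stays on the operator and shuts off the operon.
    repressor_bound = not lactose_present
    # High expression needs the activator bound AND the repressor released.
    return cap_bound and not repressor_bound


for glucose in (True, False):
    for lactose in (True, False):
        state = "ON" if lac_operon_highly_expressed(glucose, lactose) else "off"
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> {state}")
```

Only one of the four combinations, glucose absent and lactose present, gives high expression in this sketch, which is the behavior of an AND gate acting on the two signals.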

All transcription regulators, whether they are repressors or activators, must be 
bound to DNA to exert their effects. In this way, each regulatory protein acts selec- 
tively, controlling only those genes that bear a cis-regulatory sequence recognized 
by it. The logic of the Lac operon first attracted the attention of biologists more 
than 50 years ago. The way it works was uncovered by a combination of genetics 
and biochemistry, providing some of the first insights into how transcription is 
controlled in any organism. 


DNA Looping Can Occur During Bacterial Gene Regulation 


We have seen that transcription activators help RNA polymerase initiate tran- 
scription and repressors hinder it. However, the two types of proteins are very 
similar to one another. For example, to occupy their cis-regulatory sequences, 
both the tryptophan repressor and the CAP activator protein must bind a small 
molecule; moreover, they both recognize their cis-regulatory sequences using the 
same structural motif (the helix-turn-helix shown in Panel 7-1). Indeed, some 
proteins (for example, the CAP protein) can act as both a repressor and an acti- 
vator, depending on the exact placement of their cis-regulatory sequence relative 
to the promoter: for some genes, the CAP cis-regulatory sequence overlaps the 
promoter, and CAP binding thereby prevents the assembly of RNA polymerase at 
the promoter. 

Most bacteria have small, compact genomes, and the cis-regulatory sequences 
that control the transcription of a gene are typically located very near to the start 
point of transcription. But there are some exceptions to this generalization— 
cis-regulatory sequences can be located hundreds and even thousands of nucle- 
otide pairs from the bacterial genes they control (Figure 7-16). In these cases, the 
intervening DNA is looped out, allowing a protein bound at a distant site along 
the DNA to contact RNA polymerase. Here, the DNA acts as a tether, enormously 
increasing the probability that the proteins will collide, compared with the situa- 
tion where one protein is bound to DNA and the other is free in solution. We will 
see shortly that, although it is the exception in bacteria, DNA looping occurs in the 
regulation of nearly every eukaryotic gene. 

A possible explanation for this difference is based on evolutionary consid- 
erations. It has been proposed that the compact, simple genetic switches found 
in bacteria evolved in response to large population sizes where competition for 
growth put selective pressure on bacteria to maintain small genome sizes. In 
contrast, there appears to have been little selective pressure to “streamline” the 
genomes of multicellular organisms. 

Figure 7-16 Transcriptional activation
at a distance. (A) The NtrC protein is
a bacterial transcription regulator that
activates transcription by directly contacting 
RNA polymerase. (B) The interaction of NtrC 
and RNA polymerase, with the intervening 
DNA looped out, can be seen in the electron 
microscope. (B, courtesy of Harrison Echols 
and Sydney Kustu.) 


Complex Switches Control Gene Transcription in Eukaryotes


When compared to the situation in bacteria, transcription regulation in eukaryotes 
involves many more proteins and much longer stretches of DNA. It often seems 
bewilderingly complex. Yet many of the same principles apply. As in bacteria, the 
time and place that each gene is to be transcribed is specified by its cis-regulatory 
sequences, which are “read” by the transcription regulators that bind to them. 
Once bound to DNA, positive transcription regulators (activators) help RNA poly- 
merase begin transcribing genes, and negative regulators (repressors) block this 
from happening. In bacteria, as we have seen, most of the interactions between 
DNA-bound transcription regulators and RNA polymerases (whether they acti- 
vate or repress transcription) are direct. In contrast, these interactions are almost 
always indirect in eukaryotes: many intermediate proteins, including the his- 
tones, act between the DNA-bound transcription regulator and RNA polymerase. 
Moreover, in multicellular organisms, it is common for dozens of transcription 
regulators to control a single gene, with cis-regulatory sequences spread over tens 
of thousands of nucleotide pairs. DNA looping allows the DNA-bound regulatory 
proteins to interact with each other and ultimately with RNA polymerase at the 
promoter. Finally, because nearly all of the DNA in eukaryotic organisms is com- 
pacted by nucleosomes and higher-order structures, transcription initiation in 
eukaryotes must overcome this inherent block. 

In the next sections, we discuss these features of transcription initiation in 
eukaryotes, emphasizing how they provide extra levels of control not found in 
bacteria. 


A Eukaryotic Gene Control Region Consists of a Promoter Plus 
Many cis-Regulatory Sequences 


In eukaryotes, RNA polymerase II transcribes all the protein-coding genes and 
many noncoding RNA genes, as we saw in Chapter 6. This polymerase requires 
five general transcription factors (27 subunits in toto; see Table 6-3, p. 311), in 
contrast to bacterial RNA polymerase, which needs only a single general transcription
factor (the σ subunit). As we have seen, the stepwise assembly of the
general transcription factors at a eukaryotic promoter provides, in principle, mul- 
tiple steps at which the cell can speed up or slow down the rate of transcription 
initiation in response to transcription regulators. 

Because the many cis-regulatory sequences that control the expression of a 
typical gene are often spread over long stretches of DNA, we use the term gene 
control region to describe the whole expanse of DNA involved in regulating and 
initiating transcription of a eukaryotic gene. This includes the promoter, where the
general transcription factors and the polymerase assemble, plus all of the cis-reg- 
ulatory sequences to which transcription regulators bind to control the rate of 
the assembly processes at the promoter (Figure 7-17). In animals and plants, it 
is not unusual to find the regulatory sequences of a gene dotted over stretches of 
DNA as large as 100,000 nucleotide pairs. Some of this DNA is transcribed (but 
not translated), and we discuss these long noncoding RNAs (lncRNAs) later in
this chapter. For now, we can regard much of this DNA as “spacer” sequences that 
transcription regulators do not directly recognize. It is important to keep in mind 
that, like other regions of eukaryotic chromosomes, most of the DNA in gene con- 
trol regions is packaged into nucleosomes and higher-order forms of chromatin, 
thereby compacting its overall length and altering its properties. 

In this chapter, we shall loosely use the term gene to refer to a segment of DNA 
that is transcribed into a functional RNA molecule, one that either codes for a pro- 
tein or has a different role in the cell (see Table 6-1, p. 305). However, the classical 
view of a gene includes the gene control region as well, since mutations in it can 
produce an altered phenotype. Alternative RNA splicing further complicates the 
definition of a gene—a point we shall return to later. 

In contrast to the small number of general transcription factors, which are 
abundant proteins that assemble on the promoters of all genes transcribed by 
RNA polymerase II, there are thousands of different transcription regulators
devoted to turning individual genes on and off. In eukaryotes, operons—sets of 
genes transcribed as a unit—are rare, and, instead, each gene is regulated individ- 
ually. Not surprisingly, the regulation of each gene is different in detail from that 
of every other gene, and it is difficult to formulate simple rules for gene regula- 
tion that apply in every case. We can, however, make some generalizations about 
how transcription regulators, once bound to gene control regions on DNA, set in 
motion the series of events that lead to gene activation or repression. 


Eukaryotic Transcription Regulators Work in Groups 


In bacteria, we saw that proteins such as the tryptophan repressor, the Lac repres- 
sor, and the CAP protein bind to DNA on their own and directly affect RNA poly- 
merase at the promoter. Eukaryotic transcription regulators, in contrast, usually 
assemble in groups at their cis-regulatory sequences. Often two or more regula- 
tors bind cooperatively, as discussed earlier in the chapter. In addition, a broad 
class of multisubunit proteins termed coactivators and co-repressors assemble 
on DNA with them. Typically, these coactivators and co-repressors do not rec- 
ognize specific DNA sequences themselves; they are brought to those sequences 
by the transcription regulators. Often the protein-protein interactions between 
transcription regulators and between regulators and coactivators are too weak for 
them to assemble in solution; however, the appropriate combination of cis-regu- 
latory sequences can “crystallize” the assembly of these complexes on DNA (Fig- 
ure 7-18). 

As their names imply, coactivators are typically involved in activating tran- 
scription and co-repressors in repressing it. In the following sections, we will see 
that coactivators and co-repressors can act in a variety of different ways to influ- 
ence transcription after they have been localized on the genome by transcription 
regulators. 

As shown in Figure 7-18, an individual transcription regulator can often par- 
ticipate in more than one type of regulatory complex. A protein might function, 


Figure 7-17 The gene control region 
for a typical eukaryotic gene. The 
promoter is the DNA sequence where 
the general transcription factors and the 
polymerase assemble (see Figure 6-15). 
The cis-regulatory sequences are binding 
sites for transcription regulators, whose 
presence on the DNA affects the rate of 
transcription initiation. These sequences 
can be located adjacent to the promoter, 
far upstream of it, or even within introns 
or entirely downstream of the gene. The 
broken stretches of DNA signify that the 
length of DNA between the cis-regulatory 
sequences and the start of transcription 
varies, sometimes reaching tens of 
thousands of nucleotide pairs in length. The 
TATA box is a DNA recognition sequence 
for the general transcription factor TFIID. 
As shown in the lower panel, DNA looping 
allows transcription regulators bound at 
any of these positions to interact with the 
proteins that assemble at the promoter. 
Many transcription regulators act through 
Mediator (described in Chapter 6), while 
some interact with the general transcription 
factors and RNA polymerase directly. 
Transcription regulators also act by 
recruiting proteins that alter the chromatin 
structure of the promoter (not shown, but 
discussed below). 

Whereas Mediator and the general 
transcription factors are the same for all 
RNA polymerase II-transcribed genes, the
transcription regulators and the locations of 
their binding sites relative to the promoter 
differ for each gene. 

for example, in one case as part of a complex that activates transcription and in
another case as part of a complex that represses transcription. Thus, individual 
eukaryotic transcription regulators function as regulatory parts that are used to 
build complexes whose function depends on the final assembly of all of the indi- 
vidual components. Each eukaryotic gene is therefore regulated by a “committee” 
of proteins, all of which must be present to express the gene at its proper level. 


Activator Proteins Promote the Assembly of RNA Polymerase at 
the Start Point of Transcription 


The cis-regulatory sequences to which eukaryotic transcription activator proteins 
bind were originally called enhancers because their presence “enhanced” the rate 
of transcription initiation. It came as a surprise when it was discovered that these 
sequences could be found tens of thousands of nucleotide pairs away from the 
promoter; as we have seen, DNA looping, which was not widely appreciated at the 
time, can now explain this initially puzzling observation. 

Once bound to DNA, how do assemblies of activator proteins increase the 
rate of transcription initiation? At most genes, several mechanisms work in concert. Their
function is both to attract and position RNA polymerase II at the promoter and to 
release it so that transcription can begin. 

Some activator proteins bind directly to one or more of the general transcrip- 
tion factors, accelerating their assembly on a promoter that has been brought in 
proximity—through DNA looping—to that activator. Most transcription activators, 
however, attract coactivators that then perform the biochemical tasks needed to 
initiate transcription. One of the most prevalent coactivators is the large Media- 
tor protein complex, composed of more than 30 subunits. About the same size 
as RNA polymerase itself, Mediator serves as a bridge between DNA-bound tran- 
scription activators, RNA polymerase, and the general transcription factors, facil- 
itating their assembly at the promoter (see Figure 7-17). 


Eukaryotic Transcription Activators Direct the Modification of Local 
Chromatin Structure 


The eukaryotic general transcription factors and RNA polymerase are unable, on 
their own, to assemble on a promoter that is packaged in nucleosomes. Thus, in 
addition to directing the assembly of the transcription machinery at the promoter, 
eukaryotic transcription activators promote transcription by triggering changes 
to the chromatin structure of the promoters, making the underlying DNA more 
accessible. 

The most important ways of locally altering chromatin are through covalent 
histone modifications, nucleosome remodeling, nucleosome removal, and his- 
tone replacement (discussed in Chapter 4). Eukaryotic transcription activators 
use all four of these mechanisms: thus they attract coactivators that include his- 
tone modification enzymes, ATP-dependent chromatin remodeling complexes, 
and histone chaperones, each of which can alter the chromatin structure of 


Figure 7-18 Eukaryotic transcription 
regulators assemble into complexes on 
DNA. (A) Seven transcription regulators 
are shown. The nature and function of
the complex they form depend on the
specific cis-regulatory sequences that 
seed their assembly. (B) Some assembled 
complexes activate gene transcription, 
while another represses transcription. 

Note that the light green and dark green 
proteins are shared by both activating and 
repressing complexes. Proteins that do 
not themselves bind DNA but assemble on 
other DNA-bound transcription regulators 
are termed coactivators or co-repressors. 
In some cases (lower right), RNA molecules 
are found in these assemblies. As 
described later in this chapter, these RNAs 
often act as scaffolds to hold a group of 
proteins together. 

promoters (Figure 7-19). These local alterations in chromatin structure provide
greater access to DNA, thereby facilitating the assembly of the general transcrip- 
tion factors at the promoter. In addition, some histone modifications specifically 
attract these proteins to the promoter. These mechanisms often work together 
during transcription initiation (Figure 7-20). Finally, as discussed earlier in this 
chapter, the local chromatin changes directed by one transcriptional regulator 
can allow the binding of additional regulators. By repeated use of this principle, 
large assemblies of proteins can form on control regions of genes to regulate their 
transcription. 

The alterations of chromatin structure that occur during transcription initia- 
tion can persist for different lengths of time. In some cases, as soon as the tran- 
scription regulator dissociates from DNA, the chromatin modifications are rapidly 
reversed, restoring the gene to its pre-activated state. This rapid reversal is espe- 
cially important for genes that the cell must quickly switch on and off in response 
to external signals. In other cases, the altered chromatin structure persists, even 
after the transcription regulator that directed its establishment has dissociated 
from DNA. In principle, this memory can extend into the next cell generation 
because, as discussed in Chapter 4, chromatin structure can be self-renewing 
(see Figure 4-44). The fact that different histone modifications persist for different 
times provides the cell with a mechanism that makes possible both longer- and 
shorter-term memory of gene expression patterns. 

A special type of chromatin modification occurs as RNA polymerase II tran- 
scribes through a gene. The histones just ahead of the polymerase can be acetyl- 
ated by enzymes carried by the polymerase, removed by histone chaperones, 
and deposited behind the moving polymerase. These histones are then rapidly 
deacetylated and methylated, also by complexes that are carried by the poly- 
merase, leaving behind nucleosomes that are especially resistant to transcrip- 
tion. This remarkable process seems to prevent spurious transcription reinitiation 

Figure 7-19 Eukaryotic transcription
activator proteins direct local alterations 
in chromatin structure. Nucleosome 
remodeling, nucleosome removal, histone 
replacement, and certain types of histone 
modifications favor transcription initiation 
(see Figure 4-39). These alterations 
increase the accessibility of DNA and 
facilitate the binding of RNA polymerase 
and the general transcription factors. 

Figure 7-20 Successive histone modifications during transcription
initiation. In this example, taken from the human interferon gene promoter, 
a transcription activator binds to DNA packaged into chromatin and attracts 
a histone acetyl transferase that acetylates lysine 9 of histone H3 and lysine 
8 of histone H4. Then a histone kinase, also attracted by the transcription 
activator, phosphorylates serine 10 of histone H3 but it can only do so after 
lysine 9 has been acetylated. This serine modification signals the histone 
acetyl transferase to acetylate position K14 of histone H3. Next, the general 
transcription factor TFIID and a chromatin remodeling complex bind to the 
chromatin to promote the subsequent steps of transcription initiation. TFIID 
and the remodeling complex both recognize acetylated histone tails through 
a bromodomain, a protein domain specialized to read this particular mark on 
histones; a bromodomain is carried in a subunit of each protein complex. 
The histone acetyl transferase, the histone kinase, and the chromatin 
remodeling complex are all coactivators. The order of events shown applies 
to a specific promoter; at other genes, the steps may occur in a different 
order or individual steps may be omitted altogether. (Adapted from 
T. Agalioti, G. Chen and D. Thanos, Cell 111:381-392, 2002. With 
permission from Elsevier.) 


behind a moving polymerase, which, in essence, must clear a path through chro- 
matin as it transcribes. Later in this chapter, when we discuss RNA interference, 
the potential dangers to the cell of such inappropriate transcription will become 
especially obvious. The modification of nucleosomes behind a moving RNA poly- 
merase also plays an important role in RNA splicing (see p. 323). 


Transcription Activators Can Promote Transcription by Releasing 
RNA Polymerase from Promoters 


In some cases, transcription initiation requires that a DNA-bound transcription 
activator releases RNA polymerase from the promoter so as to allow it to begin 
transcribing the gene. In other cases, the RNA polymerase halts after transcribing 
about 50 nucleotides of RNA, and further elongation requires a transcription acti- 
vator bound behind it (Figure 7-21). These paused polymerases are common in 
humans, where a significant fraction of genes that are not being transcribed have 
a paused polymerase located just downstream from the promoter. 

The release of RNA polymerase can occur in several ways. In some cases, the 
activator brings in a chromatin remodeling complex that removes a nucleosome 
block to the elongating RNA polymerase. In other cases, the activator communi- 
cates with RNA polymerase (typically through a coactivator), signaling it to move 
ahead. Finally, as we saw in Chapter 6, RNA polymerase requires elongation fac- 
tors to effectively transcribe through chromatin. In some cases, the key step in 
gene activation is the loading of these factors onto RNA polymerase, which can be 
directed by DNA-bound transcription activators. Once loaded, these factors allow 
the polymerase to move through blocks imposed by chromatin structure and 
begin transcribing the gene in earnest. Having RNA polymerase already poised 
on a promoter in the beginning stages of transcription bypasses the step of assem- 
bling many components at the promoter, which is often slow. This mechanism 
can therefore allow cells to begin transcribing a gene as a rapid response to an 
extracellular signal. 


Transcription Activators Work Synergistically 


We have seen that complexes of transcription activators and coactivators assem- 
ble cooperatively on DNA. We have also seen that these assemblies can promote 
different steps in transcription initiation. In general, where several factors work 
together to enhance a reaction rate, the joint effect is not merely the sum of the 
enhancements that each factor alone contributes, but the product. If, for exam- 
ple, factor A lowers the free-energy barrier for a reaction by a certain amount and 
thereby speeds up the reaction 100-fold, and factor B, by acting on another aspect 
of the reaction, does likewise, then A and B acting in parallel will lower the barrier 
by twice that amount and speed up the reaction 10,000-fold. Even if A and B work
simply by attracting the same protein, the affinity of that protein for the reac- 
tion site increases multiplicatively. Thus, transcription activators often exhibit 
transcriptional synergy, where several DNA-bound activator proteins working 
together produce a transcription rate that is much higher than the sum of their 
transcription rates working alone (Figure 7-22). 
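
The reasoning that barrier reductions add while rate enhancements multiply can be checked numerically. The sketch below assumes the usual exponential dependence of rate on the free-energy barrier, rate ∝ exp(−ΔG/RT), and a temperature of 310 K. The temperature and the helper names are our own assumptions for illustration, and the temperature cancels out of the final result.

```python
import math

R = 8.314   # gas constant in J/(mol*K)
T = 310.0   # assumed temperature in kelvin (37 degrees C); cancels out below


def barrier_drop(fold_speedup):
    """Reduction in the free-energy barrier (J/mol) that gives this speedup."""
    return R * T * math.log(fold_speedup)


def speedup(total_barrier_drop):
    """Fold speedup produced by a given total reduction in the barrier."""
    return math.exp(total_barrier_drop / (R * T))


drop_a = barrier_drop(100)   # factor A alone: 100-fold
drop_b = barrier_drop(100)   # factor B alone: 100-fold

combined = speedup(drop_a + drop_b)   # barrier reductions add
additive_guess = 100 + 100            # what a simple sum of effects would give

print(round(combined), additive_guess)  # 10000 versus 200
```

The 50-fold gap between the two numbers in this example is the kind of greater-than-additive effect described in the text as transcriptional synergy.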

An important point is that a transcription activator protein must be bound 
to DNA to influence transcription of its target gene. And the rate of transcription 
of a gene ultimately depends upon the spectrum of regulatory proteins bound 
upstream and downstream of its transcription start site, along with the coactivator
proteins they bring to DNA. 


Eukaryotic Transcription Repressors Can Inhibit Transcription in 
Several Ways 


Although the “default” state of eukaryotic DNA packaged into nucleosomes is 
resistant to transcription, eukaryotes nonetheless use transcription regulators to 

Figure 7-21 Transcription activators
can act at different steps. In addition
to (A) promoting binding of additional
transcription regulators and (B) assembling 
RNA polymerase at promoters, 
transcription activators are often needed 
(C) to release already assembled RNA 
polymerases from promoters or (D) to 
release RNA polymerase molecules that 
become stalled after transcribing about 50 
nucleotides of RNA. The activities shown in 
Figure 7-19 can affect each of these four 
steps. 


Figure 7-22 Transcriptional synergy.
This experiment compares the rate
of transcription produced by three
experimentally constructed regulatory 
regions in a eukaryotic cell and reveals 
transcriptional synergy, a greater than 
additive effect of multiple activators working 
together. For simplicity, coactivators have 
been omitted from the diagram. 

Such transcriptional synergy is not only 
observed between different transcription 
activators from the same organism; it is 
also seen between activator proteins from 
different eukaryotic species when they are 
experimentally introduced into the same 
cell. This last observation reflects the high 
degree of conservation of the machinery 
responsible for eukaryotic transcription 
initiation. 
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Figure 7-23 Six ways in which eukaryotic repressor proteins can operate. (A) Activator proteins and repressor proteins 
compete for binding to the same regulatory DNA sequence. (B) Both proteins bind DNA, but the repressor prevents the 
activator from carrying out its functions. (C) The repressor blocks assembly of the general transcription factors. (D) The repressor 
recruits a chromatin remodeling complex, which returns the nucleosomal state of the promoter region to its pre-transcriptional 
form. (E) The repressor attracts a histone deacetylase to the promoter. As we have seen, histone acetylation can stimulate 
transcription initiation (see Figure 7-20), and the repressor simply reverses this modification. (F) The repressor attracts a histone 
methyl transferase, which modifies certain positions on histones by attaching methyl groups; the methylated histones, in turn, 
are bound by proteins that maintain the chromatin in a transcriptionally silent form. 


repress the transcription of genes. These transcription repressors can both depress 
the rate of transcription below the default value and rapidly shut off genes that 
were previously activated. We saw in Chapter 4 that large regions of the genome 
can be shut down by the packaging of DNA into especially resistant forms of chro- 
matin. However, eukaryotic genes are rarely organized along the genome accord- 
ing to function, and this strategy is not generally applicable for shutting off a set 
of genes that work together. Instead, most eukaryotic repressors work on a gene- 
by-gene basis. Unlike bacterial repressors, eukaryotic repressors do not directly 
compete with the RNA polymerase for access to the DNA; rather, they use a variety 
of other mechanisms, some of which are illustrated in Figure 7-23. Although all 
of these mechanisms ultimately block transcription by RNA polymerase, eukary- 
otic transcription repressors typically act by bringing co-repressors to DNA. Like 
transcription activation, transcription repression can act through more than one 
mechanism at a given target gene, thereby ensuring especially efficient repres- 
sion. 

Gene repression is especially important to animals and plants whose growth 
depends on elaborate and complex developmental programs. Misexpression of a 
single gene at a critical time can have disastrous consequences for the individual. 
For this reason, many of the genes encoding the most important developmental 
regulatory proteins are kept tightly repressed when they are not needed. 


Figure 7-24 Schematic diagram summarizing the properties of insulators and barrier sequences. (A) Insulators 
directionally block the action of cis-regulatory sequences, whereas barrier sequences prevent the spread of heterochromatin. 
How barrier sequences likely function is depicted in Figure 4-41. (B) Insulator-binding proteins (purple) hold chromatin in loops, 
thereby favoring “correct” cis-regulatory sequence-gene associations. Thus, gene B is properly regulated, and gene B’s 
cis-regulatory sequences are prevented from influencing the transcription of gene A. 


Insulator DNA Sequences Prevent Eukaryotic Transcription 
Regulators from Influencing Distant Genes 


We have seen that all genes have control regions, which dictate at which times, 
under what conditions, and in what tissues the gene will be expressed. We have 
also seen that eukaryotic transcription regulators can act across very long stretches 
of DNA, with the intervening DNA looped out. How, then, are control regions of 
different genes kept from interfering with one another? For example, what keeps 
a transcription regulator bound on the control region of one gene from looping in 
the wrong direction and inappropriately influencing the transcription of an adja- 
cent gene? 

To avoid such cross-talk, several types of DNA elements compartmentalize 
the genome into discrete regulatory domains. In Chapter 4, we discussed barrier 
sequences that prevent the spread of heterochromatin into genes that need to be 
expressed. A second type of DNA element, called an insulator, prevents cis-regu- 
latory sequences from running amok and activating inappropriate genes (Figure 
7-24). Insulators function by forming loops of chromatin, an effect mediated by 
specialized proteins that bind them (see Figures 4-48 and 7-24B). The loops hold 
a gene and its control region in rough proximity and help to prevent the control 
region from “spilling over” to adjacent genes. Importantly, these loops can be 
different in different cell types, depending on the particular proteins and chromatin 
structures that are present. 

The distribution of insulators and barrier sequences in a genome is thought to 
divide it into independent domains of gene regulation and chromatin structure 
(see pp. 207-208). Aspects of this organization can be visualized by staining whole 
chromosomes for the specialized proteins that bind these DNA elements (Figure 
7-25). 


Figure 7-25 Localization of a Drosophila insulator-binding protein on polytene chromosomes. 
A polytene chromosome (discussed in Chapter 4) was stained with propidium iodide (red) to 
show its banding patterns, with bands appearing bright red and interbands as dark gaps in the 
pattern (top). The positions on this polytene chromosome that are bound by a particular insulator 
protein are stained bright green using antibodies directed against the protein (bottom). This protein 
is preferentially localized to interband regions, reflecting its role in organizing chromosomes into 
structural, as well as functional, domains. For convenience, these two micrographs of the same 
polytene chromosome are arranged as mirror images. (Courtesy of Uli Laemmli, from K. Zhao et al., 
Cell 81:879-889, 1995. With permission from Elsevier.) 


Although chromosomes are organized into orderly domains that discourage 
control regions from acting indiscriminately, there are special circumstances 
where a control region located on one chromosome has been found to activate 
a gene located on a different chromosome. Although there is much we do not 
understand about this mechanism, it indicates the extreme versatility of tran- 
scriptional regulation strategies. 


Summary 


Transcription regulators switch the transcription of individual genes on and off in 
cells. In prokaryotes, these proteins typically bind to specific DNA sequences close 
to the RNA polymerase start site and, depending on the nature of the regulatory 
protein and the precise location of its binding site relative to the start site, either 
activate or repress transcription of the gene. The flexibility of the DNA helix, how- 
ever, also allows proteins bound at distant sites to affect the RNA polymerase at 
the promoter by the looping out of the intervening DNA. The regulation of higher 
eukaryotic genes is much more complex, commensurate with a larger genome size 
and the large variety of cell types that are formed. A single eukaryotic gene is typ- 
ically controlled by many transcription regulators bound to sequences that can 
be tens or even hundreds of thousands of nucleotide pairs from the promoter that 
directs transcription of the gene. Eukaryotic activators and repressors act by a wide 
variety of mechanisms—generally altering chromatin structure and controlling the 
assembly of the general transcription factors and RNA polymerase at the promoter. 
They do this by attracting coactivators and co-repressors, protein complexes that 
perform the necessary biochemical reactions. The time and place that each gene 
is transcribed, as well as its rates of transcription under different conditions, are 
determined by the particular spectrum of transcription regulators that bind to the 
regulatory region of the gene. 


MOLECULAR GENETIC MECHANISMS THAT CREATE 
AND MAINTAIN SPECIALIZED CELL TYPES 


Although all cells must be able to switch genes on and off in response to changes 
in their environments, the cells of multicellular organisms have evolved this 
capacity to an extreme degree. In particular, once a cell in a multicellular organism 
becomes committed to differentiate into a specific cell type, the cell maintains 
this choice through many subsequent cell generations, which means that 
it remembers the changes in gene expression involved in the choice. This phe- 
nomenon of cell memory is a prerequisite for the creation of organized tissues and 
for the maintenance of stably differentiated cell types. In contrast, other changes 
in gene expression in eukaryotes, as well as most such changes in bacteria, are 
only transient. The tryptophan repressor, for example, switches off the tryptophan 
genes in bacteria only in the presence of tryptophan; as soon as tryptophan is 
removed from the medium, the genes are switched back on, and the descendants 
of the cell will have no memory that their ancestors had been exposed to trypto- 
phan. 

In this section, we shall examine not only cell memory mechanisms, but 
also how gene regulatory devices can be combined to create the “logic circuits” 
through which cells integrate signals and remember events in their past. We begin 
by considering one such complex gene control region in detail. 


Complex Genetic Switches That Regulate Drosophila 
Development Are Built Up from Smaller Molecules 


We have seen that transcription regulators can be positioned at multiple sites 
along long stretches of DNA and that these proteins can bring into play coacti- 
vators and co-repressors. Here, we discuss how the numerous transcription reg- 
ulators that are bound to the control region of a gene can cause the gene to be 
transcribed at the proper place and time. 

Consider the Drosophila Even-skipped (Eve) gene, whose expression plays an 
important part in the development of the Drosophila embryo. If this gene is inac- 
tivated by mutation, many parts of the embryo fail to form, and the embryo dies 
early in development. As discussed in Chapter 21, at the stage of development 
when Eve begins to be expressed, the embryo is a single giant cell containing mul- 
tiple nuclei in a common cytoplasm. This cytoplasm contains a mixture of tran- 
scription regulators that are distributed unevenly along the length of the embryo, 
thus providing positional information that distinguishes one part of the embryo 
from another (Figure 7-26). Although the nuclei are initially identical, they rap- 
idly begin to express different genes because they are exposed to different tran- 
scription regulators. For example, the nuclei near the anterior end of the devel- 
oping embryo are exposed to a set of transcription regulators that is distinct from 
the set that influences nuclei at the middle or at the posterior end of the embryo. 

The regulatory DNA sequences that control the Eve gene have evolved to 
“read” the concentrations of transcription regulators at each position along the 
length of the embryo, and they cause the Eve gene to be expressed in seven pre- 
cisely positioned stripes, each initially five to six nuclei wide (Figure 7-27). How is 
this remarkable feat of information processing carried out? Although there is still 
much to learn, several general principles have emerged from studies of Eve and 
other genes that are similarly regulated. 

The regulatory region of the Eve gene is very large (approximately 20,000 
nucleotide pairs). It is formed from a series of relatively simple regulatory mod- 
ules, each of which contains multiple cis-regulatory sequences and is responsible 
for specifying a particular stripe of Eve expression along the embryo. This modular 
organization of the Eve gene control region was revealed by experiments in which 
a particular regulatory module (say, that specifying stripe 2) is removed from its 
normal setting upstream of the Eve gene, placed in front of a reporter gene, and 
reintroduced into the Drosophila genome. When developing embryos derived 
from flies carrying this genetic construct are examined, the reporter gene is found 
to be expressed in precisely the position of stripe 2 (Figure 7-28). Similar exper- 
iments reveal the existence of other regulatory modules, each of which specifies 
other stripes. 


Figure 7-27 The seven stripes of the protein encoded by the Even- 
skipped (Eve) gene in a developing Drosophila embryo. Two and one-half 
hours after fertilization, the egg was fixed and stained with antibodies that 
recognize the Eve protein (green) and antibodies that recognize the Giant 
protein (red). Where Eve and Giant proteins are both present, the staining 
appears yellow. At this stage in development, the egg contains approximately 
4000 nuclei. The Eve and Giant proteins are both located in the nuclei, and 
the Eve stripes are about four nuclei wide. The pattern for the Giant protein is 
also shown in Figure 7-26. (Courtesy of Michael Levine.) 


Figure 7-26 The nonuniform distribution 
of transcription regulators in an early 
Drosophila embryo. At this stage, the 
embryo is a syncytium; that is, multiple 
nuclei are contained in a common 
cytoplasm. Although not shown in 
these drawings, all of these proteins are 
concentrated in the nuclei. How such 
differences are established is discussed in 
Chapter 21. 


Figure 7-28 Experiment demonstrating the modular construction of the Eve gene regulatory region. (A) A 480-nucleotide-pair 
section of the Eve regulatory region was removed and (B) inserted upstream of a test promoter that directs the synthesis of 
the enzyme β-galactosidase (the product of the E. coli LacZ gene; see Figure 7-15). (C, D) When this artificial construct was 
reintroduced into the genome of Drosophila embryos, the embryos (D) expressed β-galactosidase (detectable by histochemical 
staining) precisely in the position of the second of the seven Eve stripes (C). β-Galactosidase is simple to detect and thus 
provides a convenient way to monitor the expression specified by a gene control region. As used here, β-galactosidase is said 
to serve as a reporter, since it “reports” the activity of a gene control region. (C and D, courtesy of Stephen Small and Michael 
Levine.) 


The Drosophila Eve Gene Is Regulated by Combinatorial Controls 


A detailed study of the stripe 2 regulatory module has provided insights into how 
it reads and interprets positional information. The module contains recognition 
sequences for two transcription regulators (Bicoid and Hunchback) that activate 
Eve transcription and for two transcription regulators (Krüppel and Giant) 
that repress it (Figure 7-29). The relative concentrations of these four proteins 
determine whether the protein complexes that form at the stripe 2 module acti- 
vate transcription of the Eve gene. Figure 7-30 shows the distributions of the four 
transcription regulators across the region of a Drosophila embryo where stripe 2 
forms. It is thought that either of the two repressor proteins, when bound to the 
DNA, will turn off the stripe 2 module, whereas both Bicoid and Hunchback must 
bind for this module’s maximal activation. This simple regulatory scheme suffices 
to turn on the stripe 2 module (and therefore the expression of the Eve gene) only 
in those nuclei located where the levels of both Bicoid and Hunchback are high 
and both Krüppel and Giant are absent—a combination that occurs in only one 
region of the early embryo. It is not known exactly how these four transcription 
regulators interact with coactivators and co-repressors to specify the final level of 
transcription across the stripe, but the outcome very likely relies on competition 
between activators and repressors that act by the mechanisms outlined in Figures 
7-17, 7-19, and 7-23. 
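
The regulatory logic described above can be written as a short Boolean sketch. It is a deliberate simplification: the real module responds to graded concentrations and competition for overlapping sites, whereas this example treats each regulator as simply "bound" or "not bound". The threshold value, the position scale, and the three example nuclei are invented for illustration and are not measured data.

```python
from dataclasses import dataclass

@dataclass
class Nucleus:
    position: float    # hypothetical scale: 0.0 = anterior end, 1.0 = posterior end
    bicoid: float      # relative concentration of each regulator (invented units)
    hunchback: float
    giant: float
    kruppel: float

BOUND = 0.5  # hypothetical concentration above which a regulator occupies its sites

def stripe2_module_on(n: Nucleus) -> bool:
    """Module is on only if both activators are bound and neither repressor is."""
    activators_bound = n.bicoid >= BOUND and n.hunchback >= BOUND
    repressor_bound = n.giant >= BOUND or n.kruppel >= BOUND
    return activators_bound and not repressor_bound

# Three invented nuclei: one where Giant is high, one where only the
# activators are high, and one where Krüppel is high.
nuclei = [
    Nucleus(position=0.25, bicoid=0.9, hunchback=0.9, giant=0.8, kruppel=0.0),
    Nucleus(position=0.35, bicoid=0.7, hunchback=0.8, giant=0.1, kruppel=0.1),
    Nucleus(position=0.45, bicoid=0.5, hunchback=0.6, giant=0.0, kruppel=0.9),
]

for n in nuclei:
    print(n.position, stripe2_module_on(n))
```

Only the middle nucleus satisfies the combination, which mirrors how a single module can pick out one narrow band of nuclei from smoothly varying inputs.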

The stripe 2 element is autonomous, inasmuch as it specifies stripe 2 when 
isolated from its normal context (see Figure 7-28). The other stripe regulatory 
modules are thought to be constructed similarly, reading positional information 
provided by other combinations of transcription regulators. The entire Eve gene 
control region binds more than 20 different transcription regulators. Seven 
combinations of regulators, one combination for each stripe, specify Eve 
expression, while many other combinations (all those found in the interstripe regions of 
the embryo) keep the stripe elements silent. A large and complex control region is 
thereby built from a series of smaller modules, each of which consists of a unique 
arrangement of short cis-regulatory sequences recognized by specific 
transcription regulators. 


Figure 7-29 The Eve stripe 2 unit. The 
segment of the Eve gene control region 
identified in Figure 7-28 contains 
cis-regulatory sequences for four transcription 
regulators. It is known from genetic 
experiments that these four regulatory 
proteins are responsible for the proper 
expression of Eve in stripe 2. Flies that 
are deficient in the two gene activators 
Bicoid and Hunchback, for example, fail 
to efficiently express Eve in stripe 2. In 
flies deficient in either of the two gene 
repressors, Giant and Krüppel, stripe 2 
expands and covers an abnormally broad 
region of the embryo. As indicated, in some 
cases the binding sites for the transcription 
regulators overlap, and the proteins can 
compete for binding to the DNA. For 
example, binding of Krüppel and binding of 
Bicoid to the site at the far right is mutually 
exclusive. 

The Eve gene itself encodes a transcription regulator, which, after its pattern of 
expression is set up in seven stripes, controls the expression of other Drosophila 
genes. As development proceeds, the embryo is thus subdivided into finer and 
finer regions that eventually give rise to the different body parts of the adult fly, as 
discussed in Chapter 21. 

Eve exemplifies the complex control regions found in plants and animals. As 
this example shows, control regions can respond to many different inputs, inte- 
grate this information, and produce a complex spatial and temporal output as 
development proceeds. However, exactly how all these mechanisms work together 
to produce the final output is understood only in broad outline (Figure 7-31). 


Transcription Regulators Are Brought Into Play by Extracellular 
Signals 


The above example from Drosophila clearly illustrates the power of combinatorial 
control, but this case is unusual in that the nuclei are exposed directly to posi- 
tional cues in the form of concentrations of transcription regulators. In embryos 
of most other organisms and in all adults, individual nuclei are in separate cells, 
and extracellular information (including positional cues) must be passed across 
the plasma membrane so as to generate signals in the cytosol that cause different 
transcription regulators to become active in different cell types. Some of the dif- 
ferent mechanisms that are known to be used to activate transcription regulators 
are diagrammed in Figure 7-32, and in Chapter 15, we discuss how extracellular 
signals trigger these changes. 


Figure 7-30 Distribution of the 
transcription regulators responsible 
for ensuring that Eve is expressed in 
stripe 2. The distributions of these proteins 
were visualized by staining a developing 
Drosophila embryo with antibodies directed 
against each of the four proteins. The 
expression of Eve in stripe 2 occurs only 
at the position where the two activators 
(Bicoid and Hunchback) are present and 
the two repressors (Giant and Krüppel) are 
absent. In fly embryos that lack Krüppel, 
for example, stripe 2 expands posteriorly. 
Likewise, stripe 2 expands posteriorly if the 
DNA-binding sites for Krüppel in the stripe 
2 module are inactivated by mutation (see 
also Figures 7-26 and 7-27). 


Figure 7-31 The integration of multiple 
inputs at a promoter. Multiple sets of 
transcription regulators, coactivators, 
and co-repressors can work together 
to influence transcription initiation at a 
promoter, as they do in the Eve stripe 
2 module illustrated in Figure 7-29. It is 
not yet understood in detail how the cell 
achieves integration of multiple inputs, but 
it is likely that the final transcriptional activity 
of the gene results from a competition 
between activators and repressors that act 
by the mechanisms summarized in Figures 
7-17, 7-19, and 7-23. 


Combinatorial Gene Control Creates Many Different Cell Types 


We have seen that transcription regulators can act in combination to control the 
expression of an individual gene. It is also generally true that each transcription 
regulator in an organism contributes to the control of many genes. This point is 
illustrated schematically in Figure 7-33, which shows how combinatorial gene 
control makes it possible to generate a great deal of biological complexity even 
with relatively few transcription regulators. 
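
A rough sense of the numbers involved can be had from a short enumeration. The sketch below assumes, purely for illustration, that each regulator is either present or absent in a cell and that every combination is possible; real cells use only a subset of combinations, so the counts are upper bounds. The regulator names are placeholders.

```python
from itertools import product

def regulator_combinations(regulators):
    """Yield every present/absent combination as a frozenset of the regulators present."""
    for states in product([False, True], repeat=len(regulators)):
        yield frozenset(r for r, on in zip(regulators, states) if on)

names = ["R1", "R2", "R3"]  # hypothetical regulators
combos = list(regulator_combinations(names))
print(len(combos))  # 8 distinct combinations from 3 regulators

# The number of possible combinations doubles with each added regulator.
for n in (3, 10, 20):
    print(n, 2 ** n)
```

Twenty regulators treated this way already allow more than a million distinct combinations, which is the sense in which a small set of regulators can specify a very large number of regulatory states.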

Due to combinatorial control, a given transcription regulator does not neces- 
sarily have a single, simply definable function as commander of a particular bat- 
tery of genes or specifier of a particular cell type. Rather, transcription regulators 
can be likened to the words of a language: they are used with different meanings 
in a variety of contexts and rarely alone; it is the well-chosen combination that 
conveys the information that specifies a gene regulatory event. 

Combinatorial gene control causes the effect of adding a new transcription 
regulator to a cell to depend on that cell’s past history, since it is this history that 
determines which transcription regulators are already present. Thus, during 
development, a cell can accumulate a series of transcription regulators that need 
not initially alter gene expression. The addition of the final members of the req- 
uisite combination of transcription regulators will complete the regulatory mes- 
sage, and can lead to large changes in gene expression. 

The importance of combinations of transcription regulators for the specifica- 
tion of cell types is most easily demonstrated by their ability—when expressed 
artificially—to convert one type of cell to another. Thus, the artificial expression 
of three neuron-specific transcription regulators in liver cells can convert the liver 
cells into functional nerve cells (Figure 7-34). In some cases, expression of even a 
single transcription regulator is sufficient to convert one cell type to another. For 
example, when the gene encoding the transcription regulator MyoD is artificially 
introduced into fibroblasts cultured from skin connective tissue, the fibroblasts 
form muscle-like cells. As discussed in Chapter 22, fibroblasts, which are derived 
from the same broad class of embryonic cells as muscle cells, have already 
accumulated many of the other transcription regulators required for the 
combinatorial control of the muscle-specific genes, and the addition of MyoD 
completes the unique combination required to direct the cells to become muscle. 
An even more striking example is seen by artificially expressing, early in 
development, a single Drosophila transcription regulator (Eyeless) in groups of cells 
that would normally go on to form leg parts. Here, this abnormal gene expression 
change causes eye-like structures to develop in the legs (Figure 7-35). 


Figure 7-32 Some ways in which the 
activity of transcription regulators is 
controlled inside eukaryotic cells. (A) The 
protein is synthesized only when needed 
and is rapidly degraded by proteolysis so 
that it does not accumulate. (B) Activation 
by ligand binding. (C) Activation by 
covalent modification. Phosphorylation is 
shown here, but many other modifications 
are possible (see Table 3-3, p. 165). 
(D) Formation of a complex between a 
DNA-binding protein and a separate protein 
with a transcription-activating domain. 
(E) Unmasking of an activation domain by 
the phosphorylation of an inhibitor protein. 
(F) Stimulation of nuclear entry by removal 
of an inhibitory protein that otherwise 
keeps the regulatory protein from entering 
the nucleus. (G) Release of a transcription 
regulator from a membrane bilayer by 
regulated proteolysis. 


Figure 7-33 The importance of 
combinatorial gene control for 
development. Combinations of a few 
transcription regulators can generate many 
cell types during development. In this 
simple, idealized scheme, a “decision” to 
make one of a pair of different transcription 
regulators (shown as numbered circles) 
is made after each cell division. Sensing 
its relative position in the embryo, the 
daughter cell toward the left side of the 
embryo is always induced to synthesize the 
even-numbered protein of each pair, while 
the daughter cell toward the right side of 
the embryo is induced to synthesize the 
odd-numbered protein. The production of 
each transcription regulator is assumed to 
be self-perpetuating once it has become 
initiated (see Figure 7-39). In this way, 
through cell memory, the final combinatorial 
specification is built up step by step. In this 
purely hypothetical example, five different 
transcription regulators have created eight 
final cell types (G-N). 


Figure 7-34 A small set of transcription regulators can convert one differentiated cell type 
into another. In this experiment, (A) liver cells grown in culture were converted into (B) neuronal 
cells via the artificial expression of three nerve-specific transcription regulators. Both types of cells 
express an artificial red fluorescent protein, which is used to visualize them. This conversion involves 
the activation of many nerve-specific genes as well as the repression of many liver-specific genes. 
(From S. Marro et al., Cell Stem Cell 9:374-382, 2011. With permission from Elsevier.) 


Specialized Cell Types Can Be Experimentally Reprogrammed to 
Become Pluripotent Stem Cells 


Manipulation of transcription regulators can also coax various differentiated 
cells to de-differentiate into pluripotent stem cells that are capable of giving rise 
to the different cell types in the body, much like the embryonic stem (ES) cells 
discussed in Chapter 22. When three specific transcription regulators are artifi- 
cially expressed in cultured mouse fibroblasts, a number of cells become induced 
pluripotent stem cells (iPS cells)—cells that look and behave like the pluripo- 
tent ES cells that are derived from embryos (Figure 7-36). This approach has been 
adapted to produce iPS cells from a variety of specialized cell types, including 
cells taken from humans. Such human iPS cells can then be directed to generate 
a population of differentiated cells for use in the study or treatment of disease, as 
we discuss in Chapter 22. 

Although it was once thought that cell differentiation was irreversible, it is now 
clear that by manipulating combinations of master transcription regulators, cell 
types and differentiation pathways can be readily altered. 


Combinations of Master Transcription Regulators Specify Cell 
Types by Controlling the Expression of Many Genes 


As we saw in the introduction of this chapter, different cell types of multicellular 
organisms differ enormously in the proteins and RNAs they express. For example, 
only muscle cells express special types of actin and myosin that form the contractile 
apparatus, while nerve cells must make and assemble all the proteins needed to 
form dendrites and synapses. We have seen that these patterns of 
cell-type-specific expression are orchestrated by a combination of master transcription 
regulators. In many cases, these proteins bind directly to cis-regulatory sequences of 
the genes particular to that cell type. Thus, MyoD binds directly to cis-regulatory 
sequences located in the control regions of the muscle-specific genes. In other 
cases, the master regulators control the expression of “downstream” transcription 
regulators which, in turn, bind to the control regions of other cell-type-specific 
genes and control their synthesis. 


Figure 7-35 Expression of the 
Drosophila Eyeless gene in precursor 
cells of the leg triggers the development 
of an eye on the leg. (A) Simplified 
diagrams showing the result when a fruit fly 
larva contains either the normally expressed 
Eyeless gene (left) or an Eyeless gene that 
is additionally expressed artificially in cells 
that normally give rise to leg tissue (right). 
(B) Photograph of an abnormal leg that 
contains a misplaced eye (see also Figure 
21-2). The transcription regulator was 
named Eyeless because its inactivation in 
otherwise normal flies causes the loss of 
eyes. (B, courtesy of Walter Gehring.) 


Figure 7-36 A combination of 
transcription regulators can induce a 
differentiated cell to de-differentiate into 
a pluripotent cell. The artificial expression 
of a set of three genes, each of which 
encodes a transcription regulator, can 
reprogram a fibroblast into a pluripotent 
cell with embryonic stem (ES)-cell-like 
properties. Like ES cells, such induced 
pluripotent stem (iPS) cells can proliferate 
indefinitely in culture and can be stimulated 
by appropriate extracellular signal 
molecules to differentiate into almost any 
cell type found in the body. Transcription 
regulators such as Oct4, Sox2, and Klf4 are 
often called master transcription regulators 
because their expression is sufficient to 
trigger a change in cell identity. 

The specification of a particular cell type typically involves changes in the 
expression of several thousand genes. Genes whose protein products are required 
in the cell type are expressed at high levels, while those not needed are typically 
down-regulated. As might be imagined, the pattern of binding between the mas- 
ter regulators and all of the regulated genes can be extremely elaborate (Figure 
7-37). When we consider that many of these regulated genes have control regions 
that span tens of thousands of nucleotide pairs, commensurate with the Eve 
example discussed above, we can begin to appreciate the enormous complexity 
of cell-type specification. 

An outstanding question in biology is how the information in a genome is used 
to specify a multicellular organism. Although we have the general outline of the 
answer, we are far from understanding how a single cell type is completely speci- 
fied, let alone a whole organism. 


Specialized Cells Must Rapidly Turn Sets of Genes On and Off 


Although they generally maintain their identities, specialized cells must con- 
stantly respond to changes in their environment. Among the most important 
changes are signals from other cells that coordinate the behavior of the whole 
organism. Many of these signals induce transient changes in gene transcription, 
and we discuss the nature of these signals in detail in Chapter 15. Here, we con- 
sider how specialized cell types rapidly and decisively switch groups of genes on 
and off in response to their environment. Even though control of gene expression 
is combinatorial, the effect of a single transcription regulator can still be decisive 
in switching any particular gene on or off, simply by completing the combination 
needed to maximally activate or repress that gene. This situation is analogous to 
dialing in the final number of a combination lock: the lock will spring open with 
only this simple addition if all of the other numbers have been previously entered. 
Moreover, the same number can complete the combination for many different 
locks. Likewise, the addition of a particular protein can turn on many different 
genes. 

Figure 7-37 A portion of the 
transcription network specifying 
embryonic stem cells. (A) The three 
master transcription regulators in Figure 
7-36 are shown as large circles. Genes 
whose cis-regulatory sequences are bound 
by each regulator in embryonic stem cells 
are indicated by a small dot (representing 
the gene) connected by a thin line 
(representing the binding reaction). Note 
that many of the target genes are bound 
by more than one of the regulators. 
(B) The master regulators control their 
own expression. As shown here, the three 
transcription regulators bind to their own 
control regions (indicated by feedback 
loops), as well as those of the other 
master regulators (indicated by straight 
arrows). (Courtesy of Trevor Sorrells, based 
on data from J. Kim et al., Cell 132:1049- 
1061, 2008.) 

An example is the rapid control of gene expression by the human glucocorti- 
coid receptor protein. To bind to its cis-regulatory sequences in the genome, this 
transcription regulator must first form a complex with a molecule of a glucocorti- 
coid steroid hormone, such as cortisol (see Figure 15-64). The body releases this 
hormone during times of starvation and intense physical activity, and among its 
other activities, it stimulates liver cells to increase the production of glucose from 
amino acids and other small molecules. To respond in this way, liver cells increase 
the expression of many different genes that code for metabolic enzymes, such as 
tyrosine aminotransferase, as we discussed earlier in this chapter (see Figure 7-3). 
Although these genes all have different and complex control regions, their maxi- 
mal expression depends on the binding of the hormone-glucocorticoid receptor 
complex to its cis-regulatory sequence, which is present in the control region of 
each gene. When the body has recovered and the hormone is no longer present, 
the expression of each of these genes drops to its normal level in the liver. In this 
way, a single transcription regulator can rapidly control the expression of many 
different genes (Figure 7-38). 

The effects of the glucocorticoid receptor are not confined to cells of the liver. 
In other cell types, activation of this transcription regulator by hormone also 
causes changes in the expression levels of many genes; the genes affected, how- 
ever, are usually different from those affected in liver cells. As we have seen, each 
cell type has an individualized set of transcription regulators, and because of 
combinatorial control, these critically influence the action of the glucocorticoid 
receptor. Because the receptor is able to assemble with many different sets of cell- 
type-specific transcription regulators, switching it on with hormone produces a 
different spectrum of effects in each cell type. 
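
The combination-lock logic described above can be sketched as a simple set test. This is a toy illustration, not a model of any real control region: the gene names, the regulator names other than "GR" (standing for the hormone-bound glucocorticoid receptor), and the rule that a gene is maximally expressed only when every required regulator is present are all assumptions made for the example.

```python
# Toy model of combinatorial control. Gene and regulator names other than
# "GR" (the hormone-bound glucocorticoid receptor) are hypothetical.

# Regulators that must all be bound for maximal expression of each gene.
REQUIRED = {
    "gene_1": {"GR", "reg_a", "reg_b"},
    "gene_2": {"GR", "reg_c"},
    "gene_3": {"GR", "reg_a", "reg_d"},
    "gene_4": {"GR", "reg_e"},
}


def maximally_expressed(bound_regulators):
    """Return the genes whose full combination of regulators is present."""
    bound = set(bound_regulators)
    return sorted(gene for gene, needed in REQUIRED.items() if needed <= bound)


liver_like = {"reg_a", "reg_b", "reg_c", "reg_d"}  # hypothetical cell type 1
other_cell = {"reg_e"}                              # hypothetical cell type 2

print(maximally_expressed(liver_like))            # hormone absent
print(maximally_expressed(liver_like | {"GR"}))   # hormone present
print(maximally_expressed(other_cell | {"GR"}))   # same receptor, other cell type
```

In this sketch the receptor alone switches a whole set of genes on, and the set that responds depends on which other regulators the cell type already contains.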


Differentiated Cells Maintain Their Identity 


Once a cell has become differentiated into a particular cell type, it will generally 
remain differentiated, and all its progeny cells will remain that same cell type. 
Some highly specialized cells, including skeletal muscle cells and neurons, never 
divide again once they have differentiated—that is, they are terminally differen- 
tiated (as discussed in Chapter 17). But many other differentiated cells—such as 
fibroblasts, smooth muscle cells, and liver cells—will divide many times in the life 
of an individual. When they do, these specialized cell types give rise only to cells 
like themselves: smooth muscle cells do not give rise to liver cells, nor liver cells 
to fibroblasts. 

Figure 7-38 A single transcription 
regulator can coordinate the expression 
of many different genes. The action of 
the glucocorticoid receptor is illustrated 
schematically. On the left is a series 
of genes, each of which has various 
transcription regulators bound to its 
regulatory region. However, these bound 
proteins are not sufficient on their own 
to fully activate transcription. On the 
right is shown the effect of adding an 
additional transcription regulator—the 
glucocorticoid receptor in a complex with 
glucocorticoid hormone—that has a cis- 
regulatory sequence in the control region 
of each gene. The glucocorticoid receptor 
completes the combination of transcription 
regulators required for maximal initiation 
of transcription, and the genes are now 
switched on as a set. When the hormone 
is no longer present, the glucocorticoid 
receptor dissociates from DNA and the 
genes return to their pre-stimulated levels. 

For a proliferating cell to maintain its identity—a property called cell mem- 
ory—the patterns of gene expression responsible for that identity must be remem- 
bered and passed on to its daughter cells through subsequent cell divisions. Thus, 
in the model we discussed in Figure 7-33, the production of each transcription 
regulator, once begun, has to be continued in the daughter cells of each cell divi- 
sion. How is such perpetuation accomplished? 

Cells have several ways of ensuring that their daughters “remember” what 
kind of cells they are. One of the simplest and most important is through a pos- 
itive feedback loop, where a master cell-type transcription regulator activates 
transcription of its own gene, in addition to that of other cell-type-specific genes. 
Each time a cell divides, the regulator is distributed to both daughter cells, where 
it continues to stimulate the positive feedback loop, making more of itself with each 
division. Positive feedback is crucial for establishing “self-sustaining” circuits of 
gene expression that allow a cell to commit to a particular fate—and then to trans- 
mit that information to its progeny (Figure 7-39). 
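
A minimal numerical sketch can make the memory property of such a loop concrete. It assumes, purely for illustration, that regulator A stimulates its own synthesis through a sigmoidal (Hill-type) function and is lost at a constant rate through degradation and dilution at division. The rate law and every parameter value are hypothetical and are not taken from the text.

```python
def final_level_of_A(pulse=None, t_end=20.0, dt=0.01):
    """Integrate dA/dt = signal + feedback(A) - loss*A with Euler steps.

    pulse is None (no signal) or a (start, end) time window during which a
    transient external signal drives synthesis of A.
    """
    beta, K, n, loss = 1.0, 0.5, 4, 1.0   # hypothetical parameter values
    a = 0.0
    for i in range(int(t_end / dt)):
        t = i * dt
        signal = 1.0 if pulse is not None and pulse[0] <= t < pulse[1] else 0.0
        feedback = beta * a**n / (K**n + a**n)   # A activates its own gene
        a += dt * (signal + feedback - loss * a)
    return a


never_signaled = final_level_of_A(pulse=None)
after_transient = final_level_of_A(pulse=(1.0, 3.0))
print(never_signaled, after_transient)
```

With these assumed values, the level of A stays at zero when no signal is given, whereas a signal that ends at t = 3 leaves A at a high, self-sustaining level long afterward.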

As was previously shown in Figure 7-37B, the master regulators needed to 
maintain the pluripotency of iPS cells bind to cis-regulatory sequences in their 
own control regions, providing examples of this type of positive feedback loop. In 
addition, most of these pluripotent cell regulators also activate transcription of 
other master regulators, resulting in a complex series of indirect feedback loops. 
For example, if A activates B, and B activates A, this forms a positive feedback 
loop where A activates its own expression, albeit indirectly. The series of direct 
and indirect feedback loops observed in the iPS circuit is typical of other special- 
ized cell circuits. Such a network structure strengthens cell memory, increasing 
the probability that a particular pattern of gene expression is transmitted through 
successive generations. For example, if the level of A drops below the critical 
threshold to stimulate its own synthesis, regulator B can rescue it. By succes- 
sive application of this mechanism, a complex series of positive feedback loops 
among multiple transcription regulators can stably maintain a differentiated state 
through many cell divisions. 

Figure 7-39 A positive feedback loop can create cell memory. Protein A is a master transcription 
regulator that activates the transcription of its own gene—as well as other cell-type-specific genes (not 
shown). All of the descendants of the original cell will therefore “remember” that the progenitor cell had 
experienced a transient signal that initiated the production of protein A. 

Positive feedback loops formed by transcription regulators are probably the 
most prevalent way of ensuring that daughter cells remember what kind of cells 
they are meant to be, and they are found in all species on Earth. For example, 
many bacteria and single-cell eukaryotes form different types of cells, and posi- 
tive feedback loops lie at the heart of mechanisms that maintain their cell types 
through many rounds of cell division. Plants and animals also make extensive use 
of transcription feedback loops; as we shall discuss later in the chapter, they have 
additional, more specialized mechanisms for making cell memory even stronger. 
But first, we briefly consider how combinations of transcription regulators and 
cis-regulatory sequences can be combined to create useful logic devices for the 
cell. 


Transcription Circuits Allow the Cell to Carry Out Logic Operations 


Simple gene regulatory switches can be combined to create all sorts of control 
devices, just as simple electronic switching elements in a computer can be linked 
to perform different types of operations. An analysis of gene regulatory circuits 
reveals that certain simple types of arrangements (called network motifs) are 
found over and over again in cells from widely different species. For example, pos- 
itive and negative feedback loops are common in all cells (Figure 7-40). Whereas 
the former provides a simple memory device, the latter is often used to keep the 
expression of a gene close to a standard level despite the variations in biochem- 
ical conditions inside a cell. Suppose, for example, that a transcription repressor 
protein binds to the regulatory region of its own gene and exerts a strong negative 
feedback, such that transcription falls to a very low rate when the concentration of 
the repressor protein is above some critical value (determined by its affinity for its 
DNA binding site). The concentration of the protein can then be held close to the 
critical value, since any circumstance that causes a fall below that value can lead 
to a steep increase in synthesis, and any that causes a rise above that value will 
lead to synthesis being switched off. Such adjustments will, however, take time, 
so that an abrupt change of conditions will cause a disturbance of gene expres- 
sion that is strong but transient. If there is a delay in the feedback loop, the result 
may be spontaneous oscillations in the expression of the gene (see Figure 15-18). 
The different types of behavior produced by a feedback loop will depend on the 
details of the system; for example, how tightly the transcription regulator binds to 
its cis-regulatory sequence, its rate of synthesis, and its rate of decay. We discuss 
these issues in quantitative terms and in more detail in Chapter 8. 
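
The buffering effect of negative feedback can be illustrated with a small simulation. The sketch below assumes a steep Hill-type repression function and arbitrary parameter values (none of them from the text) and compares a self-repressing gene with an unregulated one when the maximal synthesis rate is doubled.

```python
def steady_level(max_synthesis, feedback=True, t_end=20.0, dt=0.01):
    """Integrate dR/dt = max_synthesis * repression(R) - decay * R (Euler)."""
    K, n, decay = 1.0, 8, 1.0   # hypothetical: critical value, steepness, loss rate
    r = 0.0
    for _ in range(int(t_end / dt)):
        # With feedback, synthesis falls steeply once R exceeds the critical value K.
        repression = 1.0 / (1.0 + (r / K) ** n) if feedback else 1.0
        r += dt * (max_synthesis * repression - decay * r)
    return r


ratio_with_feedback = steady_level(20.0) / steady_level(10.0)
ratio_unregulated = steady_level(20.0, feedback=False) / steady_level(10.0, feedback=False)
print(ratio_with_feedback, ratio_unregulated)
```

Under these assumptions, doubling the synthesis capacity doubles the level of the unregulated protein but shifts the self-repressed protein by only a few percent, so its concentration stays close to the critical value.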

With two or more transcription regulators, the possible range of circuit behav- 
iors becomes more complex. Some bacterial viruses contain a common type of 
two-gene circuit that can flip-flop between expression of one gene and expression 
of the other. Another common circuit arrangement is called a feed-forward loop; 
such a loop can serve as a filter, responding to input signals that are prolonged 
but disregarding those that are brief (Figure 7-41). These various network motifs 
resemble miniature logic devices, and they can process information in surpris- 
ingly sophisticated ways. 
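
The filtering behavior of a feed-forward loop can be sketched with a few lines of simulation. The model below is a deliberately simplified assumption: A is active exactly while the input is present, B accumulates while A is active, and Z is transcribed only when A is active and B exceeds a threshold. The threshold and rate constants are hypothetical.

```python
def peak_output_of_Z(pulse_duration, t_end=10.0, dt=0.001):
    """Return the highest level of Z reached for an input pulse starting at t = 0."""
    threshold = 0.5            # hypothetical level of B needed to help activate Z
    b = z = peak_z = 0.0
    for i in range(int(t_end / dt)):
        t = i * dt
        a_active = t < pulse_duration          # A is active only while input is present
        z_on = a_active and b > threshold      # Z requires both A and B
        b += dt * ((1.0 if a_active else 0.0) - b)
        z += dt * ((1.0 if z_on else 0.0) - z)
        peak_z = max(peak_z, z)
    return peak_z


brief = peak_output_of_Z(pulse_duration=0.5)
prolonged = peak_output_of_Z(pulse_duration=5.0)
print(brief, prolonged)
```

In this sketch the brief input ends before B reaches its threshold, so Z is never made, whereas the prolonged input drives Z close to its maximum.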

The simple types of devices just illustrated are found to be interwoven in 
eukaryotic cells, creating exceedingly complex circuits (Figure 7-42). Each cell in 
a developing multicellular organism is equipped with similarly complex control 
machinery, and it must, in effect, use its intricate system of interlocking transcrip- 
tion switches to compute how it should behave at each time point in response to 
the many different past and present inputs received. We are only beginning to 
understand how to study such complex intracellular control networks. Indeed, 
without new approaches, coupled with quantitative information that is far more 
precise and complete than we now possess, it will be impossible to predict the 
behavior of a system such as that shown in Figure 7-42. As explained in Chapter 8, 
a circuit diagram by itself is not enough. 

[Figure 7-40 panels: positive feedback loop; negative feedback loop; flip-flop device (indirect positive feedback loop); feed-forward loop] 

Figure 7-40 Common types of network motifs in transcription circuits. A and B represent 
transcription regulators; arrows indicate positive transcription control, while lines with bars depict 
negative transcription control. In the feed-forward loop, A and B represent transcription regulators 
that both activate the transcription of target gene Z (see also Figure 8-86). 

[Figure 7-41 panels (B) and (C): INPUT and OUTPUT plotted against time] 

Figure 7-41 How a feed-forward loop 
can measure the duration of a signal. 
(A) In this theoretical example, transcription 
regulators A and B are both required for 
transcription of Z, and A becomes active 
only when an input signal is present. 
(B) If the input signal to A is brief, A 
does not stay active long enough for B 
to accumulate, and the Z gene is not 
transcribed. (C) If the signal to A persists, 
B accumulates, A remains active, and Z is 
transcribed. This arrangement allows the 
cell to ignore rapid fluctuations of the input 
signal and respond only to persistent levels. 
This strategy could be used, for example, 
to distinguish between random noise and a 
true signal. 

The behavior shown here was 
computed for one particular set of 
parameter values describing the 
quantitative properties of A, B, and the 
product of Z, along with their syntheses. 
With different values of these parameters, 
feed-forward loops can in principle perform 
other types of “calculations.” Many feed- 
forward loops have been discovered 
in cells, and theoretical analysis helps 
researchers to discern—and subsequently 
test—the different ways in which they may 
function (see Figure 8-86). (Adapted from 
S.S. Shen-Orr et al., Nat. Genet. 31:64-68, 
2002. With permission from Macmillan 
Publishers Ltd.) 


Figure 7-42 The exceedingly complex 
gene circuit that specifies a portion of 
the developing sea urchin embryo. Each 
colored small box represents a different 
gene. Those in yellow code for transcription 
regulators and those in green and blue 
code for proteins that give cells of the 
mesoderm and endoderm, respectively, 
their specialized characteristics. Genes 
depicted in gray are largely active in the 
mother and provide the egg with cues 
needed for proper development. As in 
Figure 7-40, arrows depict instances in 
which a transcription regulator activates 
the transcription of another gene. Lines 
ending in bars indicate examples of gene 
repression. (From I.S. Peter and 
E.H. Davidson, Nature 474:635-639, 2011. 
With permission from Macmillan 
Publishers Ltd.) 

Summary 


The many types of cells in animals and plants are created largely through mecha- 
nisms that cause different sets of genes to be transcribed in different cells. The tran- 
scription of any particular gene is generally controlled by a combination of tran- 
scription regulators. Each type of cell in a higher eukaryotic organism contains a 
specific set of transcription regulators that ensures the expression of only those genes 
appropriate to that type of cell. A given transcription regulator may be active in a 
variety of circumstances and is typically involved in the regulation of many differ- 
ent genes. 

Since specialized animal cells can maintain their unique character through 
many cell-division cycles, and even when grown in culture, the gene regulatory 
mechanisms involved in creating them must be stable once established and herita- 
ble when the cell divides. These features reflect the cell’s memory of its developmen- 
tal history. Direct or indirect positive feedback loops, which enable transcription 
regulators to perpetuate their own synthesis, provide the simplest mechanism for 
cell memory. Transcription circuits also provide the cell with the means to carry out 
other types of logic operations. Simple transcription circuits combined into large 
regulatory networks drive highly sophisticated programs of embryonic develop- 
ment that will require new approaches to decipher. 


MECHANISMS THAT REINFORCE CELL MEMORY IN 
PLANTS AND ANIMALS 


Thus far in this chapter, we have emphasized the regulation of gene transcription 
by proteins that associate either directly or indirectly with DNA. However, DNA 
itself can be covalently modified, and certain types of chromatin states appear 
to be inherited. In this section, we shall see how these phenomena also provide 
opportunities for the regulation of gene expression. At the end of this section, we 
discuss how, in mice and humans, an entire chromosome can be transcription- 
ally inactivated using such mechanisms, and how this state can be maintained 
through many cell divisions. 


Patterns of DNA Methylation Can Be Inherited When Vertebrate 
Cells Divide 


In vertebrate cells, the methylation of cytosine provides a mechanism through 
which gene expression patterns can be passed on to progeny cells. The methyl- 
ated form of cytosine, 5-methyl cytosine (5-methyl C), has the same relation to 
cytosine that thymine has to uracil, and the modification likewise has no effect on 
base-pairing (Figure 7-43). DNA methylation in vertebrate DNA occurs on cyto- 
sine (C) nucleotides largely in the sequence CG, which is base-paired to exactly 
the same sequence (in opposite orientation) on the other strand of the DNA helix. 
Consequently, a simple mechanism permits the existing pattern of DNA meth- 
ylation to be inherited directly by the daughter DNA strands. An enzyme called 
maintenance methyl transferase acts preferentially on those CG sequences that 
are base-paired with a CG sequence that is already methylated. As a result, the 
pattern of DNA methylation on the parental DNA strand serves as a template for 
the methylation of the daughter DNA strand, causing this pattern to be inherited 
directly following DNA replication (Figure 7-44). 
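
The templating logic of maintenance methylation can be sketched as follows. The data representation is an assumption made for illustration: each CG site in a duplex is a pair of booleans, one per strand, recording whether the cytosine on that strand is methylated. The function names are hypothetical.

```python
def replicate(duplex):
    """Semiconservative replication: each daughter duplex keeps one parental
    strand (with its methyl marks) paired with a new, unmethylated strand.
    Sites in the daughters are ordered (parental_strand, new_strand)."""
    daughter_1 = [(s1, False) for s1, _ in duplex]
    daughter_2 = [(s2, False) for _, s2 in duplex]
    return daughter_1, daughter_2


def maintenance_methylate(duplex):
    """Methylate the new strand only at sites where the paired CG on the
    parental strand is already methylated."""
    return [(parental, parental or new) for parental, new in duplex]


parent = [(True, True), (False, False), (True, True), (False, False)]

# With the maintenance enzyme, both daughters regain the parental pattern.
with_maintenance = [maintenance_methylate(d) for d in replicate(parent)]

# Without it, the marks are diluted passively over successive replications.
round_1 = replicate(parent)        # hemimethylated duplexes
round_2 = replicate(round_1[0])    # one granddaughter duplex has lost every mark

print(with_maintenance)
print(round_2)
```

The second half of the sketch corresponds to the passive loss of methyl groups that occurs when maintenance methyl transferase activity is suppressed.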

Although DNA methylation patterns can be maintained in differentiated 
cells by the mechanism shown in Figure 7-44, methylation patterns are dynamic 
during mammalian development. Shortly after fertilization, there is a genome- 
wide wave of demethylation, when the vast majority of methyl groups are lost 
from the DNA. This demethylation may occur either by suppression of mainte- 
nance DNA methyl transferase activity, resulting in the passive loss of methyl 
groups during each round of DNA replication, or by demethylating enzymes (dis- 
cussed below). Later in development, new methylation patterns are established 
by several de novo DNA methyl transferases that are directed to DNA by sequence- 
specific DNA-binding proteins. Once the new patterns of methylation are estab- 
lished, they can be propagated through rounds of DNA replication by the mainte- 
nance methyl transferases. 

Figure 7-43 Formation of 5-methyl 
cytosine occurs by methylation of a 
cytosine base in the DNA double helix. 
In vertebrates, this event is largely confined 
to selected cytosine (C) nucleotides located 
in the sequence CG. CG sequences are 
sometimes denoted as CpG sequences, 
where the p indicates a phosphate linkage 
to distinguish it from a CG base pair. In 
this chapter, we will continue to use the 
simpler nomenclature CG to indicate this 
dinucleotide. 

Figure 7-44 How DNA methylation patterns are faithfully inherited. In vertebrate DNA, a large fraction of the cytosine 
nucleotides in the sequence CG is methylated (see Figure 7-43). Because of the existence of a methyl-directed methylating 
enzyme (the maintenance methyl transferase), once a pattern of DNA methylation is established, that pattern of methylation is 
inherited in the progeny DNA, as shown. 

DNA methylation has several uses in the vertebrate cell. A very important 
role is to work in conjunction with other gene expression control mechanisms 
to establish a particularly efficient form of gene repression. This combination of 
mechanisms ensures that unneeded eukaryotic genes can be repressed to very 
high degrees. For example, the rate at which a vertebrate gene is transcribed can 
vary 10⁶-fold between one tissue and another. The unexpressed vertebrate genes 
are much less “leaky” in terms of transcription than bacterial genes, in which the 
largest known differences in transcription rates between expressed and unex- 
pressed gene states are about 1000-fold. 

DNA methylation helps to repress transcription in several ways. The methyl 
groups on methylated cytosines lie in the major groove of DNA and interfere 
directly with the binding of proteins (transcription regulators as well as the gen- 
eral transcription factors) required for transcription initiation. In addition, the 
cell contains a repertoire of proteins that bind specifically to methylated DNA. The 
best characterized of these associate with histone modifying enzymes, leading to 
a repressive chromatin state where chromatin structure and DNA methylation act 
synergistically (Figure 7-45). One reflection of the importance of DNA methyla- 
tion to humans is the widespread involvement of “incorrect” DNA methylation 
patterns in cancer progression (discussed in Chapter 20). 


CG-Rich Islands Are Associated with Many Genes in Mammals 


Because of the way in which DNA repair enzymes work, methylated C nucleotides 
in the vertebrate genome tend to be eliminated in the course of evolution. Acci- 
dental deamination of an unmethylated C gives rise to U (see Figure 5-38), which 
is not normally present in DNA and thus is recognized easily by the DNA repair 
enzyme uracil DNA glycosylase, excised, and then replaced with a C (as discussed 
in Chapter 5). But accidental deamination of a 5-methyl C cannot be repaired in 
this way, for the deamination product is a T and so is indistinguishable from the 
other, nonmutant T nucleotides in the DNA. Although a special repair system 
exists to remove these mutant T nucleotides, many of the deaminations escape 
detection, so that those C nucleotides in the genome that are methylated tend to 
mutate to T over evolutionary time. 

Figure 7-45 Multiple mechanisms 
contribute to stable gene repression. 
In this schematic example, histone reader 
and writer proteins (discussed in Chapter 
4), under the direction of transcription 
regulators, establish a repressive form 
of chromatin. A de novo DNA methylase 
is attracted by the histone reader and 
methylates nearby cytosines in DNA, which 
are, in turn, bound by DNA methyl-binding 
proteins. During DNA replication, some 
of the modified (blue dot) histones will be 
inherited by one daughter chromosome, 
some by the other, and in each daughter 
they can induce reconstruction of the 
same pattern of chromatin modifications 
(discussed in Chapter 4). At the same 
time, the mechanism shown in Figure 7-44 
will cause both daughter chromosomes 
to inherit the same methylation pattern. 
In these cases where DNA methylation 
stimulates the activity of the histone writer, 
the two inheritance mechanisms will be 
mutually reinforcing. This scheme can 
account for the inheritance by daughter 
cells of both the histone and the DNA 
modifications. It can also explain the 
tendency of some chromatin modifications 
to spread along a chromosome (see 
Figure 4-44). 
[Figure 7-45 labels: histone modifying enzyme ("writer"); code "reader" protein] 

During the course of evolution, more than three out of every four CGs have 
been lost in this way, leaving vertebrates with a remarkable deficiency of this 
dinucleotide. The CG sequences that remain are very unevenly distributed in the 
genome; they are present at 10 times their average density in selected regions, 
called CG islands, which average 1000 nucleotide pairs in length. The human 
genome contains roughly 20,000 CG islands and they usually include promot- 
ers of genes. For example, 60% of human protein-coding genes have promot- 
ers embedded in CG islands and these include virtually all the promoters of the 
so-called housekeeping genes—those genes that code for the many proteins that 
are essential for cell viability and are therefore expressed in nearly all cells (Figure 
7-46). Over evolutionary timescales, the CG islands were spared the accelerated 
mutation rate of bulk CG sequences because they remained unmethylated in the 
germ line (Figure 7-47). 
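
One simple way to quantify the depletion and clustering described above is to compare the number of CG dinucleotides observed in a stretch of DNA with the number expected from its C and G content alone. This observed-to-expected ratio is a common convention and is not taken from the text; the function name and both example sequences are hypothetical.

```python
def cg_observed_over_expected(sequence):
    """Ratio of observed CG dinucleotides to the number expected if C and G
    were arranged independently: (#C * #G) / sequence length."""
    seq = sequence.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    expected = c * g / len(seq)
    return seq.count("CG") / expected


island_like = "ACGCGTCGACGCGT"      # hypothetical CG-rich stretch
depleted = "ACAGTCTGACTGCAGT"       # hypothetical stretch with C and G but no CG

print(cg_observed_over_expected(island_like))
print(cg_observed_over_expected(depleted))
```

A ratio well below 1 indicates that CG dinucleotides are underrepresented relative to base composition, as in bulk vertebrate DNA, whereas CG islands give much higher values.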

CG islands also remain unmethylated in most somatic tissues whether or 
not the associated gene is expressed. The unmethylated state is maintained 
by sequence-specific DNA-binding proteins, many of whose cis-regulatory 
sequences contain a CG. By binding to these sequences, which are spread across 
CG islands, they protect the DNA from methyl transferases. These proteins also 
recruit DNA demethylases, which convert 5-methyl C to hydroxy-methyl C, which 
is later replaced by C either through DNA repair (see Figure 5-41A) or, passively, 
through multiple rounds of DNA replication. Unmethylated CG islands have sev- 
eral properties that make them particularly suitable for promoters. For example, 
some of the same proteins that bind to CG islands and protect them from methyl- 
ation recruit histone modifying enzymes that make the islands particularly “pro- 
moter friendly.” As a result, RNA polymerase is often found bound to promoters 
within CG islands, even when the associated gene is not being actively transcribed. 
At unmethylated CG islands, the balance between polymerase and nucleosome 
assembly is thus tipped toward the former. Additional steps are needed to “push” 
the bound polymerase into transcribing the adjacent gene, and these are directed 
by transcription regulators that bind to cis-regulatory sequences of DNA (often 
well upstream from the CG islands). These regulators serve to release the poly- 
merase with the appropriate elongation factors (see Figure 7-21C and D). 


Genomic Imprinting Is Based on DNA Methylation 


Mammalian cells are diploid, containing one set of genes inherited from the father 
and one set from the mother. The expression of a small minority of genes depends 
on whether they have been inherited from the mother or the father: when the 
paternally inherited gene copy is active, the maternally inherited gene copy is 
silent, or vice versa. This phenomenon is called genomic imprinting. 

Roughly 300 genes are imprinted in humans. Because only one copy of an 
imprinted gene is expressed, imprinting can “unmask” mutations that would 
normally be covered by the other, functional copy. For example, Angelman syn- 
drome, a disorder of the nervous system in humans that causes reduced mental 
ability and severe speech impairment, results from a gene deletion on one chro- 
mosomal homolog and the silencing, by imprinting, of the intact gene on the 
other homolog. 

The insulin-like growth factor-2 (Igf2) gene in the mouse provides a well-stud- 
ied example of imprinting. Mice that do not express Igf2 at all are born half the 
size of normal mice. However, only the paternal copy of Igf2 is transcribed, and 
only this gene copy matters for the phenotype. As a result, mice with a mutated 
paternally derived Igf2 gene are stunted, while mice with a mutated maternally 
derived Igf2 gene are normal. 

Figure 7-47 A mechanism to explain both the marked overall deficiency 
of CG sequences and their clustering into CG islands in vertebrate 
genomes. White lines mark the location of CG dinucleotides in the DNA 
sequences, while red circles indicate the presence of a methyl group on the 
CG dinucleotide. CG sequences that lie in regulatory sequences of genes 
that are transcribed in germ cells are unmethylated and therefore tend to be 
retained in evolution. Methylated CG sequences, on the other hand, tend to 
be lost through deamination of 5-methyl C to T, unless the CG sequence is 
critical for survival. 
[Figure 7-47 labels: VERTEBRATE ANCESTOR DNA; in germ line; many millions of years of evolution; CG island] 






[Figure 7-48 labels: female mouse; male mouse; imprinted allele of gene A; chromosome inherited from father; expressed allele of gene A; mRNA; somatic cell; somatic cell in offspring] 


Figure 7-48 Imprinting in the mouse. The top portion of the figure shows a pair of homologous chromosomes in the somatic 
cells of two adult mice, one male and one female. In this example, both mice have inherited the top homolog from their father 
and the bottom homolog from their mother, and the paternal copy of a gene subject to imprinting (indicated in orange) is 
methylated, preventing its expression. The maternally derived copy of the same gene (yellow) is expressed. The remainder of 
the figure shows the outcome of a cross between these two mice. During germ-cell formation, but before meiosis, the imprints 
are erased and then, much later in germ-cell development, they are reimposed in a sex-specific pattern (middle portion of 
figure). In eggs produced from the female, neither allele of the A gene is methylated. In sperm from the male, both alleles of 
gene A are methylated. Shown at the bottom of the figure are two of the possible imprinting patterns inherited by the progeny 
mice; the mouse on the left has the same imprinting pattern as each of the parents, whereas the mouse on the right has the 
opposite pattern. If the two alleles of gene A are distinct, these different imprinting patterns can cause phenotypic differences 
in the progeny mice, even though they carry exactly the same DNA sequences of the two A gene alleles. Imprinting provides 
an important exception to classical genetic behavior, and several hundred mouse genes are thought to be affected in this way. 
However, the majority of mouse genes are not imprinted, and therefore the rules of Mendelian inheritance apply to most of the 
mouse genome. 


In the early embryo, genes subject to imprinting are marked by methylation 
according to whether they were derived from a sperm or an egg chromosome. 
In this way, DNA methylation is used as a mark to distinguish two copies of a 
gene that may be otherwise identical (Figure 7-48). Because imprinted genes 
are somehow protected from the wave of demethylation that takes place shortly 
after fertilization (see pp. 404-405), this mark enables somatic cells to “remem- 
ber” the parental origin of each of the two copies of the gene and to regulate 
their expression accordingly. In most cases, the methyl imprint silences nearby 
gene expression. In some cases, however, it can activate expression of a gene. In 
the case of Igf2, for example, methylation of an insulator element on the pater- 
nally derived chromosome blocks its function and allows distant cis-regulatory 
sequences to activate transcription of the Igf2 gene. On the maternally derived 
chromosome, the insulator is not methylated and the Igf2 gene is therefore not 
transcribed (Figure 7-49A). 

Other cases of imprinting involve long noncoding RNAs (lncRNAs), which are defined as
RNA molecules over 200 nucleotides in length that do not code for proteins. We
discuss lncRNAs broadly at the end of this chapter; here, we focus on the role of
a specific lncRNA in imprinting. In the case of the Kcnq1 gene, which codes for a
voltage-gated potassium channel needed for proper heart function, the lncRNA is
made from the paternal allele (which is unmethylated) but it is not released by the 
RNA polymerase, remaining instead at its site of synthesis on the DNA template. 
This RNA in turn recruits histone-modifying and DNA-methylating enzymes that 
direct the formation of repressive chromatin, which silences the associated
protein-coding gene on the paternally derived chromosome (Figure 7-49B). The mater-
nally derived gene, on the other hand, is immune to these effects because the spe-
cific methylation present from imprinting blocks the synthesis of the lncRNA but
allows transcription of the protein-coding gene. As with Igf2, the specificity of Kcnq1
imprinting arises from the inherited methylation patterns; the difference lies in
the way these patterns bring about differential expression of the imprinted gene. 
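
The contrast between the two mechanisms can be written as a small truth table. The sketch below is only a toy model of the logic described above: the function names are illustrative inventions, each allele is reduced to a single methylated/unmethylated flag, and all quantitative aspects of expression are ignored.

```python
# Toy truth-table model of the two imprinting mechanisms described in the text.
# Function and variable names are illustrative, not standard nomenclature.

def igf2_expressed(insulator_methylated: bool) -> bool:
    """Igf2: methylation inactivates the insulator, letting distant
    cis-regulatory sequences activate the gene."""
    insulator_active = not insulator_methylated
    return not insulator_active

def kcnq1_expressed(imprint_methylated: bool) -> bool:
    """Kcnq1: methylation blocks lncRNA synthesis; when the lncRNA is made,
    it directs formation of repressive chromatin that silences the gene."""
    lncrna_made = not imprint_methylated
    return not lncrna_made

# Methylation state of each parental copy, as stated in the text.
METHYLATED = {
    "Igf2": {"maternal": False, "paternal": True},
    "Kcnq1": {"maternal": True, "paternal": False},
}

def expression_table():
    """Return {(gene, parental origin): expressed?} for both genes."""
    table = {}
    for origin in ("maternal", "paternal"):
        table[("Igf2", origin)] = igf2_expressed(METHYLATED["Igf2"][origin])
        table[("Kcnq1", origin)] = kcnq1_expressed(METHYLATED["Kcnq1"][origin])
    return table

if __name__ == "__main__":
    for (gene, origin), on in expression_table().items():
        print(f"{gene:6s} {origin:9s} expressed={on}")
```

In both functions the methylated copy is the one that is expressed, but for different reasons: for Igf2 the mark disables an insulator, whereas for Kcnq1 it prevents synthesis of a silencing lncRNA.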

Why imprinting should exist at all is a mystery. In vertebrates, it is restricted to 
placental mammals, and many of the imprinted genes are involved in fetal devel- 
opment. One idea is that imprinting reflects a middle ground in the evolutionary 
struggle between males to produce larger offspring and females to limit offspring 
size. Whatever its purpose might be, imprinting provides startling evidence that 
features of DNA other than its sequence of nucleotides can be inherited. 


Chromosome-Wide Alterations in Chromatin Structure Can Be 
Inherited 


We have seen that DNA methylation and certain types of chromatin structure 
can be heritable, preserving patterns of gene expression across cell generations. 
Perhaps the most striking example of this effect occurs in mammals, in which an 
alteration in the chromatin structure of an entire chromosome can modulate the 
levels of expression of most genes on that chromosome. 


Figure 7-49 Mechanisms of imprinting.
(A) On chromosomes inherited from the
female, a protein called CTCF binds to
an insulator (see Figure 7-24), blocking
communication between cis-regulatory
sequences (green) and the Igf2 gene
(orange). Igf2 is therefore not expressed
from the maternally inherited chromosome.
Because of imprinting, the insulator on the
male-derived chromosome is methylated
(red circles); this inactivates the insulator
by blocking the binding of the CTCF
protein, and allows the cis-regulatory
sequences to activate transcription of the
Igf2 gene. In other examples of imprinting,
methylation simply blocks gene expression
by interfering with the binding of proteins
required for a gene’s transcription.
(B) Imprinting of the mouse Kcnq1 gene.
On the maternally derived chromosome,
synthesis of the lncRNA is blocked by
methylation of the DNA (red circles),
and the Kcnq1 gene is expressed. On
the paternally derived chromosome, the
lncRNA is synthesized, remains in place,
and by directing alterations in chromatin
structure blocks expression of the Kcnq1
gene. Although shown as directly binding
to lncRNA, the histone-modifying enzymes
are likely to be recruited indirectly, through
additional proteins.

Males and females differ in their sex chromosomes. Females have two X chro-
mosomes, whereas males have one X and one Y chromosome. As a result, female
cells contain twice as many copies of X-chromosome genes as do male cells. In 
mammals, the X and Y sex chromosomes differ radically in gene content: the X 
chromosome is large and contains more than a thousand genes, whereas the Y 
chromosome is small and contains fewer than 100 genes. Mammals have evolved a
dosage compensation mechanism to equalize the dosage of X-chromosome gene 
products between males and females. The correct ratio of X chromosome to auto- 
some (non-sex chromosome) gene products is carefully controlled, and mutations 
that interfere with this dosage compensation are generally lethal. 

Mammals achieve dosage compensation by the transcriptional inactivation of 
one of the two X chromosomes in female somatic cells, a process known as X-in- 
activation. As a result of X-inactivation, two X chromosomes can coexist within 
the same nucleus, exposed to the same diffusible transcription regulators, yet dif- 
fer entirely in their expression. 

Early in the development of a female embryo, when it consists of a few hun- 
dred cells, one of the two X chromosomes in each cell becomes highly condensed 
into a type of heterochromatin. The initial choice of which X chromosome to inac- 
tivate, the maternally inherited one (Xm) or the paternally inherited one (Xp), is 
random. Once either Xp or Xm has been inactivated, it remains silent through- 
out all subsequent cell divisions of that cell and its progeny, indicating that the 
inactive state is faithfully maintained through many cycles of DNA replication and 
mitosis. Because X-inactivation is random and takes place after several hundred 
cells have already formed in the embryo, every female is a mosaic of clonal groups 
of cells in which either Xp or Xm is silenced (Figure 7-50). These clonal groups are 

Figure 7-50 X-inactivation. The clonal
inheritance in female mammals of a
condensed, inactive X chromosome.
(Panel labels: only Xm active in this clone; only Xp active in this clone.)

Figure 7-51 Photoreceptor cells in the retina of a female mouse showing
patterns of X-chromosome inactivation. Using genetic engineering 
techniques (described in Chapter 8), the germ line of a mouse was modified 
so that one copy of the X chromosome (if active) makes a green fluorescent 
protein and the other a red fluorescent protein. Both proteins concentrate in 
the nucleus and, in the field of cells shown here, it is clear that only one of 
the two X chromosomes is active in each cell. (From H. Wu et al., Neuron 
81:103-119, 2014. With permission from Elsevier.) 


distributed in small clusters in the adult animal because sister cells tend to remain 
close together during later stages of development (Figure 7-51). For example, 
X-chromosome inactivation causes the orange and black “tortoiseshell” coat 
coloration of some female cats. In these cats, one X chromosome carries a gene 
that produces orange hair color, and the other X chromosome carries an allele 
of the same gene that results in black hair color; it is the random X-inactivation 
that produces patches of cells of two distinctive colors. In contrast, male cats of 
this genetic stock are either solid orange or solid black, depending on which X 
chromosome they inherit from their mothers. Although X-chromosome inactiva- 
tion is maintained over thousands of cell divisions, it is reversed during germ-cell 
formation, so that all haploid oocytes contain an active X chromosome and can 
express X-linked gene products. 
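
The two ingredients of this mosaicism, a random choice made independently in each early embryonic cell followed by faithful clonal inheritance, can be illustrated with a short simulation. This is a deliberately simplified sketch: the number of founder cells, the number of divisions, and the function names are arbitrary assumptions, and cell death, migration, and unequal clone sizes are ignored.

```python
import random

def simulate_x_inactivation(n_founders=200, divisions=5, seed=0):
    """Return a list of (founder_id, active_x) pairs, one per descendant cell.

    n_founders and divisions are arbitrary illustrative values.
    """
    rng = random.Random(seed)
    # Each founder independently silences Xm or Xp; we record the ACTIVE copy.
    cells = [(i, rng.choice(["Xm", "Xp"])) for i in range(n_founders)]
    for _ in range(divisions):
        # Both daughters inherit the parental choice unchanged.
        cells = [cell for cell in cells for _daughter in range(2)]
    return cells

def clones_are_uniform(cells):
    """True if all descendants of each founder share the same active X."""
    seen = {}
    for founder_id, active in cells:
        if seen.setdefault(founder_id, active) != active:
            return False
    return True

if __name__ == "__main__":
    cells = simulate_x_inactivation()
    frac_xm = sum(1 for _, active in cells if active == "Xm") / len(cells)
    print(len(cells), "cells; fraction with Xm active:", round(frac_xm, 2))
    print("every clone uniform:", clones_are_uniform(cells))
```

Because the choice is made once per founder and then copied, the organism as a whole is a mixture of the two states while each clone is uniform, which is the pattern seen in the tortoiseshell coat.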

How is an entire chromosome transcriptionally inactivated? X-chromosome 
inactivation is initiated and spreads from a single site near the middle of the X 
chromosome, the X-inactivation center (XIC). Within the XIC is a transcribed 
20,000-nucleotide lncRNA (called Xist), which is expressed solely from the inac-
tive X chromosome. Xist RNA spreads from the XIC over the entire chromosome
and directs gene silencing. Although we do not know exactly how this is accom- 
plished, it likely involves recruitment of histone-modifying enzymes and other 
proteins to form a repressive form of chromatin analogous to that of Figure 7-45. 
Curiously, about 10% of the genes on the X chromosome (including Xist itself) 
escape this silencing and remain active. 

The spread of Xist RNA along the X chromosome does not proceed linearly 
along the DNA. Rather, starting at its site of synthesis, it is first handed off across 
the base of the DNA loops that make up the chromosome; these shortcuts explain 
how Xist can spread rapidly, by a “hand-over-hand” mechanism, along the X 
chromosome once the inactivation process begins (Figure 7-52). It also helps to 
explain why the inactivation does not spread to the other, active X chromosome. 

Imprinting and X-chromosome inactivation are examples of monoallelic gene 
expression, where in a diploid genome, only one of the two copies of a gene is 
expressed. In addition to the approximately 1000 genes on the X chromosome and 
the 300 or so genes that are imprinted, there are another 1000-2000 human genes 
that exhibit monoallelic expression. Like X-chromosome inactivation (but unlike 
imprinting), the choice of which copy of the gene is to be expressed and which is 
to be silenced often appears random. Yet once the choice is made, it can persist for 
many cell divisions. Because the choice is often made relatively late in develop- 
ment, cells of the same tissue in the same individual can express different copies 
of a given gene. In other words, somatic tissues are often mosaics, where different 
clones of cells have subtly different patterns of gene expression. The mechanisms 
responsible for this type of monoallelic expression are not known in detail, and 
its general purpose—if any—is poorly understood. Several different mechanisms 
may contribute to such epigenetic inheritance, as we explain next. 


Epigenetic Mechanisms Ensure That Stable Patterns of Gene 
Expression Can Be Transmitted to Daughter Cells 


As we have seen, once a cell in an organism differentiates into a particular cell
type, it generally remains specialized in that way; if it divides, its daughters inherit
the same specialized character. Perhaps the simplest way for a cell to remember 
its identity is through a positive feedback loop in which a key transcription regu-
lator activates, either directly or indirectly, the transcription of its own gene (see
Figure 7-39). Interlocking positive feedback loops of the type shown in Figure 
7-37 provide greater stability by buffering the circuit against fluctuations in the 
level of any one transcription regulator. Because transcription regulators are syn- 
thesized in the cytosol and diffuse throughout the nucleus, feedback loops based 
on this mechanism will affect both copies of a gene in a diploid cell. However, as 
discussed in this section, the expression pattern of a gene on one chromosome 
can differ from the copy of the same gene on the other chromosome (as in X-chro- 
mosome inactivation or in imprinting), and such differences can also be inherited 
through many cell divisions. 
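
How a positive feedback loop can store a memory of a transient signal can be seen in a minimal numerical sketch. The model below is an assumption made for illustration only: the regulator concentration x obeys dx/dt = signal(t) + x^n/(K^n + x^n) - x, with a cooperative (Hill-type) self-activation term, first-order loss, and arbitrary dimensionless parameters. None of these values are measurements.

```python
def hill(x, k=0.5, n=4):
    """Cooperative self-activation term (a Hill function); k and n are
    arbitrary illustrative parameters."""
    return x**n / (k**n + x**n)

def simulate(pulse_amplitude, t_end=30.0, dt=0.01):
    """Euler integration of dx/dt = signal(t) + hill(x) - x from x = 0.

    The external signal is present only for 1 <= t < 3 and is then removed.
    Returns the regulator level at t_end.
    """
    x = 0.0
    for step in range(int(t_end / dt)):
        t = step * dt
        signal = pulse_amplitude if 1.0 <= t < 3.0 else 0.0
        x += dt * (signal + hill(x) - x)
    return x

if __name__ == "__main__":
    print("no signal ever applied:   x =", round(simulate(0.0), 3))
    print("after a transient signal: x =", round(simulate(1.0), 3))
```

With these parameters the cell that never saw the signal stays in the off state, whereas the cell that received a brief pulse remains in the on state long after the signal is gone. Because the regulator is diffusible, this kind of memory acts on both copies of the gene, as noted above.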

The ability of a daughter cell to retain a memory of the gene expression pat- 
terns that were present in the parent cell is an example of epigenetic inheritance: 
a heritable alteration in a cell or organism’s phenotype that does not result from 
changes in the nucleotide sequence of DNA (discussed in Chapter 4). (Unfortu- 
nately, the term epigenetic is sometimes also used to refer to all covalent modifi- 
cations to histones and DNA, whether or not they are self-propagating; many of 
these modifications are erased each time a cell divides and do not generate cell 
memory.) 

In Figure 7-53, we contrast two self-propagating epigenetic mechanisms 
that work in cis, affecting only one chromosomal copy, with two self-propagating
mechanisms that work in trans, affecting both chromosomal copies of a gene. 
Cells can combine these mechanisms to ensure that patterns of gene expression 
are maintained and inherited accurately and reliably—over a period of up to a 
hundred years or more, in our own case. 

We can get some idea of the prevalence of epigenetic changes by comparing 
identical twins. Their genomes have the same sequence of nucleotides, and, obvi- 
ously, many features of identical twins—such as their appearance—are strongly 
determined by the genome sequences they inherit. When their gene expression, 
histone modification, and DNA methylation patterns are compared, however, 
many differences are observed. Because these differences are roughly correlated 
not only with age but also with the time that the twins have spent apart from each 
other, it has been proposed that some of these differences are heritable from cell 
to cell and are the result of environmental factors. Although these studies are in 
early stages, the idea that environmental events can be permanently registered 
as epigenetic changes in our cells is a fascinating one that presents an important 
challenge to the next generation of biological scientists. 

Figure 7-52 Mammalian X-chromosome
inactivation. X-chromosome inactivation
begins with the synthesis of Xist
(X-inactivation specific transcript) RNA from
the XIC (X-inactivation center) locus and
moves outward to the chromosome ends.
According to the model depicted here,
the long (~20,000 nucleotides) Xist RNA
has many low-affinity binding sites for the
structural components of chromosomes
and spreads by releasing its hold on one
portion of the chromosome while grasping
another. The continued synthesis of Xist
from the center of the chromosome drives
it to the ends. As shown, Xist RNA does
not move linearly along the chromosomal
DNA, but, instead, moves first across the
base of chromosome loops. It has been
proposed that the portions of chromosomal
DNA at the tips of long loops contain the
10% of genes that escape X-chromosome
inactivation.

Summary


Eukaryotic cells can use inherited forms of DNA methylation and inherited states 
of chromatin condensation as additional mechanisms for generating cell memory 
of gene expression patterns. An especially dramatic case that involves chromatin 
condensation is the inactivation of an entire X chromosome in female mammals. 
DNA methylation underlies the phenomenon in mammals of genomic imprinting, 
in which the expression of a gene depends on whether it was inherited from the 
mother or the father. 


POST-TRANSCRIPTIONAL CONTROLS


In principle, every step required for the process of gene expression can be con- 
trolled. Indeed, one can find examples of each type of regulation, and many genes 
are regulated by multiple mechanisms. As we have seen, controls on the initiation 
of gene transcription are a critical form of regulation for all genes. But other con- 
trols can act later in the pathway from DNA to protein to modulate the amount 
of gene product that is made—and in some cases, to determine the exact amino 
acid sequence of the protein product. These post-transcriptional controls, which 
operate after RNA polymerase has bound to the gene’s promoter and has begun 
RNA synthesis, are crucial for the regulation of many genes. 

In the following sections, we consider the varieties of post-transcriptional reg- 
ulation in temporal order, according to the sequence of events that an RNA mole- 
cule might experience after its transcription has begun (Figure 7-54). 

Figure 7-53 Four distinct mechanisms
that can produce an epigenetic
form of inheritance in an organism.
(A) Epigenetic mechanisms that act
in cis. As discussed in this chapter, a
maintenance methylase can propagate
specific patterns of cytosine methylation
(see Figure 7-44). As discussed in
Chapter 4, a histone-modifying enzyme
that replicates the same modification that
attracts it to chromatin can result in the
modification being self-propagating (see
Figure 4-44). (B) Epigenetic mechanisms
that act in trans. Positive feedback loops
formed by transcriptional regulators are
found in all species and are probably the
most common form of cell memory. As
discussed in Chapter 3, some proteins
can form self-propagating prions (Figure
3-33). If these proteins are involved in gene
expression, they can transmit patterns of
gene expression to daughter cells.

Figure 7-54 Post-transcriptional
controls of gene expression. The final
synthesis rate of a protein can, in principle,
be controlled at any of the steps listed in
capital letters. In addition, RNA splicing,
RNA editing, and translation recoding can
also alter the sequence of amino acids in
a protein, making it possible for the cell to
produce more than one protein variant from
the same gene. Only a few of the steps
depicted here are likely to be critical for the
regulation of any one particular protein.


Transcription Attenuation Causes the Premature Termination of
Some RNA Molecules 


It has long been known that the expression of some genes is inhibited by pre- 
mature termination of transcription, a phenomenon called transcription attenua- 
tion. In some of these cases, the nascent RNA chain adopts a structure that causes 
it to interact with the RNA polymerase in such a way as to abort its transcription. 
When the gene product is required, regulatory proteins bind to the nascent RNA 
chain and remove the attenuation, allowing the transcription of a complete RNA 
molecule. 

A well-studied example of transcription attenuation occurs during the life 
cycle of HIV, the human immunodeficiency virus that is the causative agent of 
acquired immune deficiency syndrome, or AIDS. Once the HIV genome has been 
integrated into the host genome, the viral DNA is transcribed by the cell’s RNA 
polymerase II (see Figure 5-62). However, this polymerase usually terminates 
transcription after synthesizing transcripts of several hundred nucleotides and 
therefore fails to efficiently transcribe the entire viral genome. When conditions 
for viral growth are optimal, a virus-encoded protein called Tat, which binds to 
a specific stem-loop structure in the nascent RNA that contains a “bulged base,” 
prevents this premature termination (see Figure 6-89). Once bound to this spe- 
cific RNA structure (called TAR), Tat assembles several host-cell proteins that 
allow the RNA polymerase to continue transcribing. The normal role of at least 
some of these proteins is to prevent pausing and premature termination by RNA 
polymerase when it transcribes normal cell genes. Thus, a normal cell mechanism
has apparently been hijacked by HIV to permit transcription of its genome to be
controlled by a single viral protein. 


Riboswitches Probably Represent Ancient Forms of Gene Control 


In Chapter 6, we discussed the idea that, before modern cells arose on Earth, RNA 
played the role of both DNA and proteins, both storing hereditary information 
and catalyzing chemical reactions (see pp. 362-366). The discovery of riboswitches 
shows that RNA can also form control devices. Riboswitches are short sequences 
of RNA that change their conformation on binding small molecules, such as 
metabolites. Each riboswitch recognizes a specific small molecule and the result- 
ing conformational change is used to regulate gene expression. Riboswitches are 
often located near the 5’ end of mRNAs, and they fold while the mRNA is being 
synthesized, blocking or permitting progress of the RNA polymerase according to 
whether the regulatory small molecule is bound (Figure 7-55). 

Riboswitches are particularly common in bacteria, in which they sense key 
small metabolites in the cell and adjust gene expression accordingly. Perhaps 
their most remarkable feature is the high specificity and affinity with which each 
recognizes only the appropriate small molecule; in many cases, every chemical 
feature of the small molecule is read by the RNA (Figure 7-55C). Moreover, the 
binding affinities observed are as tight as those typically observed between small 
molecules and proteins. 

Figure 7-55 A riboswitch that responds
to guanine. (A) In this example from
bacteria, the riboswitch controls expression
of the purine biosynthetic genes. When
guanine levels in cells are low, an
elongating RNA polymerase transcribes
the purine biosynthetic genes, and the
enzymes needed for guanine synthesis are
therefore expressed. (B) When guanine is
abundant, it binds the riboswitch, causing
it to undergo a conformational change that
forces the RNA polymerase to terminate
transcription (see Figure 6-11). (C) Guanine
(red) bound to the riboswitch. Only those
nucleotides that form the guanine-binding
pocket are shown. Many other riboswitches
exist, including those that recognize
S-adenosylmethionine, coenzyme B12,
flavin mononucleotide, adenine, lysine, and
glycine. (Adapted from M. Mandal and
R.R. Breaker, Nat. Rev. Mol. Cell Biol.
5:451-463, 2004. With permission from
Macmillan Publishers Ltd; and C.K.
Vanderpool and S. Gottesman, Mol.
Microbiol. 54:1076-1089, 2004. With
permission from Blackwell Publishing.)

Figure 7-56 Five patterns of alternative RNA splicing. In each case, a
single type of RNA transcript is spliced in two alternative ways to produce
two distinct mRNAs (1 and 2). The dark blue boxes mark exon sequences
that are retained in both mRNAs. The light blue boxes mark possible exon
sequences that are included in only one of the mRNAs. The boxes are
joined by red lines to indicate where intron sequences (yellow) are removed.
(Adapted from H. Keren et al., Nat. Rev. Genet. 11:345-355, 2010. With
permission from Macmillan Publishers Ltd.)


Riboswitches are perhaps the most economical examples of gene control 
devices, inasmuch as they bypass the need for regulatory proteins altogether. In 
the example shown in Figure 7-55, the riboswitch controls transcription elonga- 
tion, but they can also regulate other steps in gene expression, as we shall see later 
in this chapter. Clearly, highly sophisticated gene control devices can be made 
from short sequences of RNA, a fact that supports the hypothesis of an early “RNA 
world.” 
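
The switching behavior of the guanine riboswitch in Figure 7-55 can be sketched with the simplest possible binding model. The code below is an illustration under stated assumptions, not a description of the real kinetics: it treats binding as a single-site equilibrium, assumes that every guanine-bound riboswitch terminates transcription, and uses an arbitrary dissociation constant (kd) in arbitrary concentration units.

```python
def fraction_bound(guanine, kd):
    """Single-site equilibrium binding: fraction of riboswitches with
    guanine bound. kd is a hypothetical dissociation constant."""
    return guanine / (kd + guanine)

def readthrough_fraction(guanine, kd=1.0):
    """Fraction of transcripts that continue into the purine biosynthetic
    genes, assuming a bound riboswitch always terminates transcription."""
    return 1.0 - fraction_bound(guanine, kd)

if __name__ == "__main__":
    # Guanine concentrations are expressed relative to the assumed kd.
    for guanine in (0.0, 0.1, 1.0, 10.0, 100.0):
        print(f"guanine = {guanine:6.1f}  read-through = "
              f"{readthrough_fraction(guanine):.2f}")
```

Even in this crude form the model captures the regulatory logic: when guanine is scarce nearly all polymerases read through and the biosynthetic enzymes are made, and when guanine is abundant transcription is largely terminated, with no regulatory protein involved.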


Alternative RNA Splicing Can Produce Different Forms of a Protein 
from the Same Gene 


As discussed in Chapter 6 (see Figure 6-26), RNA splicing shortens the transcripts 
of many eukaryotic genes by removing the intron sequences from the mRNA pre- 
cursor. We also saw that a cell can splice an RNA transcript differently and thereby 
make different polypeptide chains from the same gene—a process called alterna- 
tive RNA splicing (Figure 7-56). A substantial proportion of animal genes (esti- 
mated at 90% in humans) produce multiple proteins in this way. 

When different splicing possibilities exist at several positions in the transcript, 
a single gene can produce dozens of different proteins. In one extreme case, a
single Drosophila gene may produce as many as 38,000 different proteins
through alternative splicing (Figure 7-57), although only a fraction of these
forms have thus far been experimentally observed. Considering that the Drosoph- 
ila genome has approximately 14,000 identified genes, it is clear that the protein 
complexity of an organism can greatly exceed the number of its genes. This exam- 
ple also illustrates the perils in equating gene number with an organism’s com- 
plexity. For example, alternative splicing is rare in single-celled budding yeasts 

Figure 7-57 Alternative splicing of RNA transcripts of the Drosophila Dscam gene. DSCAM proteins have several different functions. In cells
of the fly immune system, they mediate the phagocytosis of bacterial pathogens. In cells of the nervous system, DSCAM proteins are needed for
proper wiring of neurons. The final mRNA contains 24 exons, four of which (denoted A, B, C, and D) are present in the Dscam gene as arrays of
alternative exons. Each RNA contains 1 of 12 alternatives for exon A (red), 1 of 48 alternatives for exon B (green), 1 of 33 alternatives for exon C
(blue), and 1 of 2 alternatives for exon D (yellow). This figure shows only one of the many possible splicing patterns (indicated by the red line and
by the mature mRNA below it). Each variant DSCAM protein would fold into roughly the same structure (predominantly a series of extracellular
immunoglobulin-like domains linked to a membrane-spanning region; see Figure 24-48), but the amino acid sequence of the domains varies
according to the splicing pattern. The diversity of DSCAM variants contributes to the plasticity of the immune system as well as the formation of
complex neural circuits; we take up the specific role of the DSCAM variants in more detail when we describe the development of the nervous system
in Chapter 21. (Adapted from D.L. Black, Cell 103:367-370, 2000. With permission from Elsevier.)

but very common in flies. Budding yeast has ~6200 genes, only about 300 of which
are subject to splicing, and nearly all of these have only a single intron. To say that 
flies have only 2-3 times as many genes as yeasts greatly underestimates the dif- 
ference in complexity of these two genomes. 
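
The figure of about 38,000 follows directly from the numbers of alternative exons given in the legend to Figure 7-57, assuming that one alternative from each of the four clusters can be chosen independently of the others. The short calculation below reproduces it and also the gene-number comparison between flies and yeast; the variable names are ours.

```python
from math import prod

# Alternatives for each variable exon cluster (from the Figure 7-57 legend).
dscam_alternatives = {"A": 12, "B": 48, "C": 33, "D": 2}

# Independent choice of one exon per cluster: multiply the alternatives.
dscam_isoforms = prod(dscam_alternatives.values())

# Approximate gene counts quoted in the text.
drosophila_genes = 14_000
yeast_genes = 6_200

print("possible Dscam mRNAs:", dscam_isoforms)  # 12 * 48 * 33 * 2 = 38016
print("fly/yeast gene ratio:", round(drosophila_genes / yeast_genes, 1))  # 2.3
print("Dscam isoforms exceed the fly gene count:",
      dscam_isoforms > drosophila_genes)
```

A single gene can thus, in principle, specify more distinct proteins than there are genes in the entire fly genome.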

In some cases, alternative RNA splicing occurs because there is an intron 
sequence ambiguity: the standard spliceosome mechanism for removing intron 
sequences (discussed in Chapter 6) is unable to distinguish clearly between two 
or more alternative pairings of 5’ and 3’ splice sites, so that different choices are 
made by chance on different individual transcripts. Where such constitutive alter- 
native splicing occurs, several versions of the protein encoded by the gene are 
made in all cells in which the gene is expressed. 

In many cases, however, alternative RNA splicing is regulated. In the simplest 
examples, regulated splicing is used to switch from the production of a nonfunc- 
tional protein to the production of a functional one (or the other way around). 
The transposase that catalyzes the transposition of the Drosophila P element, 
for example, is produced in a functional form in germ cells and a nonfunctional 
form in somatic cells of the fly, allowing the P element to spread throughout the 
genome of the fly without causing damage in somatic cells (see Figure 5-61). The 
difference in transposon activity has been traced to the presence of an intron 
sequence in the transposase RNA that is removed only in germ cells. 

In addition to enabling switching from the production of a functional protein 
to the production of a nonfunctional one (or vice versa), the regulation of RNA 
splicing can generate different versions of a protein in different cell types, accord- 
ing to the needs of the cell. Tropomyosin, for example, is produced in specialized 
forms in different types of cells (see Figure 6-26). Cell-type-specific forms of many 
other proteins are produced in the same way. 

RNA splicing can be regulated either negatively, by a regulatory molecule that 
prevents the splicing machinery from gaining access to a particular splice site 
on the RNA, or positively, by a regulatory molecule that helps direct the splicing 
machinery to an otherwise overlooked splice site (Figure 7-58). 

Because of the plasticity of RNA splicing, the blocking of a “strong” splicing 
site will often expose a “weak” site and result in a different pattern of splicing. 
Thus, the splicing of a pre-mRNA molecule can be thought of as a delicate balance 
between competing splice sites—a balance that can easily be tipped by effects on 
splicing of regulatory proteins. 


The Definition of a Gene Has Been Modified Since the Discovery 
of Alternative RNA Splicing 


The discovery that eukaryotic genes usually contain introns and that their coding 
sequences can be assembled in more than one way raised new questions about 
the definition of a gene. A gene was first clearly defined in molecular terms in 
the early 1940s from work on the biochemical genetics of the fungus Neurospora. 

Figure 7-58 Negative and positive
control of alternative RNA splicing.
(A) In negative control, a repressor protein
binds to a specific sequence in the pre-
mRNA transcript and blocks access of
the splicing machinery to a splice junction.
This often results in the use of a secondary
splice site, thereby producing an altered
pattern of splicing (see Figure 7-56).
(B) In positive control, the splicing
machinery is unable to remove a particular
intron sequence efficiently without
assistance from an activator protein.
Because RNA is flexible, the nucleotide
sequences that bind these activators can
be located many nucleotide pairs from
the splice junctions they control, and they
are often called splicing enhancers, by
analogy with the transcriptional enhancers
mentioned earlier in this chapter.

Until then, a gene had been defined operationally as a region of the genome
that segregates as a single unit during meiosis and gives rise to a definable phe- 
notypic trait, such as a red or a white eye in Drosophila or a round or wrinkled 
seed in peas. The work on Neurospora showed that most genes correspond to a 
region of the genome that directs the synthesis of a single enzyme. This led to the 
hypothesis that one gene encodes one polypeptide chain. The hypothesis proved 
fruitful for subsequent research; as more was learned about the mechanism of 
gene expression in the 1960s, a gene became identified as that stretch of DNA that 
was transcribed into the RNA coding for a single polypeptide chain (or a single 
structural RNA such as a tRNA or an rRNA molecule). The discovery of split genes 
and introns in the late 1970s could be readily accommodated by the original defi- 
nition of a gene, provided that a single polypeptide chain was specified by the 
RNA transcribed from any one DNA sequence. But it is now clear that many DNA 
sequences in higher eukaryotic cells can produce a set of distinct (but related) 
proteins by means of alternative RNA splicing. How, then, is a gene to be defined? 

In those relatively rare cases in which a single transcription unit produces two 
very different eukaryotic proteins, the two proteins are considered to be produced 
by distinct genes that overlap on the chromosome. It seems unnecessarily com- 
plex, however, to consider most of the protein variants produced by alternative 
RNA splicing as being derived from overlapping genes. A more sensible alternative 
is to modify the original definition to count a DNA sequence that is transcribed as 
a single unit and encodes one set of closely related polypeptide chains (protein 
isoforms) as a single protein-coding gene. This definition also accommodates 
those DNA sequences that encode protein variants produced by post-transcrip- 
tional processes other than RNA splicing, such as transcript cleavage and RNA 
editing (discussed below). 


A Change in the Site of RNA Transcript Cleavage and Poly-A 
Addition Can Change the C-terminus of a Protein 


We saw in Chapter 6 that the 3’ end of a eukaryotic mRNA molecule is not formed 
by the termination of RNA synthesis by the RNA polymerase, as it is in bacteria. 
Instead, it results from an RNA cleavage reaction that is catalyzed by additional 
proteins while the transcript is elongating (see Figure 6-34). A cell can control 
the site of this cleavage so as to change the C-terminus of the resultant protein. In 
the simplest cases, one protein variant is simply a truncated version of the other; 
in many other cases, however, the alternative cleavage and polyadenylation sites 
lie within intron sequences and the pattern of splicing is thereby altered. This 
process can produce two closely related proteins differing only in the amino acid 
sequences at their C-terminal ends. Close analysis of RNAs produced from the 
human genome in a variety of cell types (see Figure 7-3) indicates that as many as 
50% of human protein-coding genes produce mRNA species that differ at their 
site of polyadenylation. 

A well-studied example of regulated polyadenylation is the switch from the 
synthesis of membrane-bound to secreted antibody molecules that occurs during 
the development of B lymphocytes (see Figure 24-22). Early in the life history of 
a B lymphocyte, the antibody it produces is anchored in the plasma membrane, 
where it serves as a receptor for antigen. Antigen stimulation causes B lympho- 
cytes to multiply and to begin secreting their antibody. The secreted form of the 
antibody is identical to the membrane-bound form except at the extreme C-ter- 
minus. In this part of the protein, the membrane-bound form has a long string 
of hydrophobic amino acids that traverses the lipid bilayer of the membrane, 
whereas the secreted form has a much shorter string of hydrophilic amino acids. 
The switch from membrane-bound to secreted antibody is generated through a 
change in the site of RNA cleavage and polyadenylation, as shown in Figure 7-59. 

The change is caused by an increase in the concentration of a subunit of a 
protein (CstF) that promotes RNA cleavage (see Figure 6-34). The first cleavage/ 
poly-A addition site that a transcribing RNA polymerase encounters is subopti- 
mal and is usually skipped in unstimulated B lymphocytes, leading to production 
of the longer RNA transcript. When activated to produce antibodies, the B lymphocyte 
increases its CstF concentration; as a result, cleavage now occurs at the 
suboptimal site, and the shorter transcript is produced. In this way, a change in 
concentration of a general RNA-processing factor has a dramatic effect on the 
expression of a particular gene. 


RNA Editing Can Change the Meaning of the RNA Message 


The molecular mechanisms used by cells are a continual source of surprises. An 
example is the process of RNA editing, which alters the nucleotide sequences of 
RNA transcripts once they are synthesized and thereby changes the coded mes- 
sage they carry. We saw in Chapter 6 that tRNA and rRNA molecules are chemi- 
cally modified after they are synthesized: here we focus on changes to mRNAs. 

In animals, two principal types of mRNA editing occur: the deamination of 
adenine to produce inosine (A-to-I editing) and, less frequently, the deamination 
of cytosine to produce uracil (C-to-U editing), as shown in Figure 5-43. Because 
these chemical modifications alter the pairing properties of the bases (I pairs with 
C, and U pairs with A), they can have profound effects on the meaning of the RNA. 
If the edit occurs in a coding region, it can change the amino acid sequence of the 
protein or produce a truncated protein by creating a premature stop codon. Edits 
that occur outside coding sequences can affect the pattern of pre-mRNA splicing, 
the transport of mRNA from the nucleus to the cytosol, the efficiency with which 
the RNA is translated, or the base-pairing between microRNAs (miRNAs) and 
their mRNA targets, a form of regulation that will be discussed later in the chapter. 
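
As a rough illustration of how a single deamination can change the coded message, the following Python sketch applies one edit to a codon and then decodes it, reading inosine as guanine because I pairs with C. The codons (CAG and CAA), the function names, and the deliberately partial codon table are assumptions chosen for illustration; they are not sequences taken from the examples in the text.

```python
# Minimal sketch: effect of an A-to-I or C-to-U edit on a single codon.
# The codon table is partial on purpose and covers only the codons used here.
CODON_TABLE = {
    "CAG": "Gln",
    "CGG": "Arg",
    "CAA": "Gln",
    "UAA": "Stop",
}


def edit_base(codon: str, position: int, edit: str) -> str:
    """Apply one edit ('A-to-I' or 'C-to-U') at a 0-based position in a codon."""
    expected, product = {"A-to-I": ("A", "I"), "C-to-U": ("C", "U")}[edit]
    if codon[position] != expected:
        raise ValueError(
            f"{edit} needs {expected} at position {position}, found {codon[position]}"
        )
    return codon[:position] + product + codon[position + 1:]


def decode(codon: str) -> str:
    """Read a codon, treating inosine (I) as guanine (G) because I pairs with C."""
    return CODON_TABLE[codon.replace("I", "G")]


# Hypothetical codons, chosen only to show the two outcomes described above.
a_to_i = edit_base("CAG", 1, "A-to-I")  # an edit that changes the amino acid
c_to_u = edit_base("CAA", 0, "C-to-U")  # an edit that creates a stop codon
print(decode("CAG"), "->", decode(a_to_i))
print(decode("CAA"), "->", decode(c_to_u))
```

In this sketch the A-to-I edit turns a glutamine codon into one read as arginine, and the C-to-U edit turns a glutamine codon into a stop codon, which are the two kinds of coding-region outcome described above.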

The process of A-to-I editing is particularly prevalent in humans, where it 
occurs in approximately 1000 genes. Enzymes called ADARs (adenosine deam- 
inases acting on RNA) perform this type of editing; these enzymes recognize a 
double-stranded RNA structure that is formed through base-pairing between the 
site to be edited and a complementary sequence located elsewhere on the same 
RNA molecule, typically in an intron (Figure 7-60). The structure of the dou- 
ble-stranded RNA specifies whether the mRNA is to be edited, and if so, where 
the edit should be made. An especially important example of A-to-I editing takes 
place in the mRNA that codes for a transmitter-gated ion channel in the brain. A 
single edit changes a glutamine to an arginine; the affected amino acid lies on 
the inner wall of the channel, and the editing change alters the Ca²⁺ permeability 
of the channel. Mutant mice that cannot make this edit are prone to epileptic 
seizures and die during or shortly after weaning, showing that editing of the ion 
channel RNA is normally crucial for proper brain development. 

C-to-U editing, which is carried out by a different set of enzymes, is also crucial 
in mammals. For example, in certain cells of the gut, the mRNA for apolipopro- 
tein B undergoes a C-to-U edit that creates a premature stop codon and therefore 


Figure 7-59 Regulation of the site of 
RNA cleavage and poly-A addition 
determines whether an antibody 
molecule is secreted or remains 
membrane-bound. In unstimulated B 
lymphocytes (left), a long RNA transcript is 
produced, and the intron sequence (yellow) 
near its 3’ end is removed by RNA splicing 
to provide an mRNA molecule that codes 
for a membrane-bound antibody molecule. 
Only a portion of the antibody gene is 
shown in the figure; the actual gene and 
its mRNA would extend further to the 
left of the diagram. After antigen stimulation 
(right), the RNA transcript is cleaved 
and polyadenylated upstream from the 
intron’s 3’ splice site. As a result, some 
of the intron sequence remains as a 
coding sequence in the short transcript 
and specifies the hydrophilic C-terminal 
portion of the secreted antibody molecule 
(brown). (Adapted from D. Di Giammartino 
et al., Mol. Cell 43:853-866, 2011. With 
permission from Elsevier.) 
Figure 7-60 Mechanism of A-to-I RNA 
editing in mammals. Typically, a sequence 
complementary to the position of the edit 
is present in an intron, and the resulting 
double-stranded RNA structure attracts an 
A-to-I editing enzyme (ADAR). In the case 
illustrated, the edit is made in an exon; 
in most cases, however, this occurs in 
noncoding portions of the mRNA. Editing 
by ADAR takes place in the nucleus, before 
the pre-mRNA has been fully processed. 
Mice and humans have two ADAR genes: 
ADR1 is expressed in many tissues and is 
required in the liver for proper red blood 
cell development; ADR2 is expressed only 
in the brain, where it is required for proper 
brain development. 
produces a shorter form of the protein. In cells of the liver, the editing enzyme is 
not expressed, and the full-length apolipoprotein B is produced. The two protein 
isoforms have different properties, and each plays a role in lipid metabolism that 
is specific to the organ that produces it (Figure 7-61). 

Why RNA editing exists at all is a mystery. One idea is that it arose in evolution 
to correct “mistakes” in the genome. Another is that it arose as a somewhat slap- 
dash way for the cell to produce subtly different proteins from the same gene. A 
third possibility is that RNA editing originally evolved as a defense mechanism 
against retroviruses and retrotransposons and was later adapted by the cell to 
change the meanings of certain mRNAs. Indeed, RNA editing still plays important 
roles in cell defense. Some retroviruses, including HIV, are extensively edited after 
they infect cells. This hyperediting creates many harmful mutations in the viral 
RNA genome and also causes viral mRNAs to be retained in the nucleus, where 
they are eventually degraded. Although some modern retroviruses protect them- 
selves against this defense mechanism, RNA editing presumably helps to hold 
many viruses in check. 


RNA Transport from the Nucleus Can Be Regulated 


It has been estimated that in mammals only about one-twentieth of the total mass 
of RNA synthesized ever leaves the nucleus. We saw in Chapter 6 that most mam- 
malian RNA molecules undergo extensive processing and that the “leftover” RNA 
fragments (excised introns and RNA sequences 3’ to the cleavage/poly-A site) are 
degraded in the nucleus. Incompletely processed and otherwise damaged RNAs 
are also eventually degraded as part of the quality control system for RNA produc- 
tion. 

As described in Chapter 6, the export of RNA molecules from the nucleus is 
delayed until processing has been completed. However, mechanisms that delib- 
erately override this control point can be used to regulate gene expression. This 
strategy forms the basis for one of the best-understood examples of regulated 
nuclear transport of mRNA, which occurs in the human AIDS virus, HIV. 

As we saw in Chapter 5, HIV, once inside the cell, directs the formation of a 
double-stranded DNA copy of its genome, which is then inserted into the genome 
of the host (see Figure 5-62). Once inserted, the viral DNA can be transcribed as 
one long RNA molecule by the host cell’s RNA polymerase II. This transcript is 
then spliced in many different ways to produce over 30 different species of mRNA, 
which in turn are translated into a variety of different proteins (Figure 7-62). In 
order to make progeny virus, entire, unspliced viral transcripts must be exported 
from the nucleus to the cytosol, where they are packaged into viral capsids and 
serve as the viral genome. This large transcript, as well as alternatively spliced 
HIV mRNAs that the virus needs to move to the cytoplasm for protein synthesis, 
still carries complete introns. The host cell’s normal block to the nuclear export of 
unspliced RNAs therefore presents a special problem for HIV. 


Figure 7-61 C-to-U RNA editing 
produces a truncated form of 
apolipoprotein B. 
The block is overcome in an ingenious way. The virus encodes a protein (called 
Rev) that binds to a specific RNA sequence (called the Rev responsive element, 
RRE) located within a viral intron. The Rev protein interacts with a nuclear export 
receptor (Crm1), which directs the movement of viral RNAs through nuclear pores 
into the cytosol despite the presence of intron sequences. We discuss in detail the 
way in which export receptors function in Chapter 12. 

The regulation of nuclear export by Rev has several important consequences 
for HIV growth and pathogenesis. In addition to ensuring the nuclear export 
of specific unspliced RNAs, it divides the viral infection into an early phase (in 
which Rev is translated from a fully spliced RNA and all of the intron-containing 
viral RNAs are retained in the nucleus and degraded) and a late phase (in which 
unspliced RNAs are exported due to Rev function). This timing helps the virus 
replicate by providing the gene products in roughly the order in which they are 
needed (Figure 7-63). Regulation by Rev and by Tat, the HIV protein that counter- 
acts premature transcription termination (see p. 414), allows the virus to achieve 
latency, a condition in which the HIV genome has become integrated into the 
host-cell genome but the production of viral proteins has temporarily ceased. 
Figure 7-62 The compact genome of 
HIV, the human AIDS virus. The positions 
of the nine HIV genes are shown in green. 
The red double line indicates a DNA copy 
of the viral genome that has become 
integrated into the host DNA (gray). Note 
that the coding regions of many genes 
overlap, and that those of Tat and Rev are 
split by introns. The blue line at the bottom 
of the figure represents the pre-mRNA 
transcript of the viral DNA and shows 
the locations of all the possible splice 
sites (arrows). There are many alternative 
ways of splicing the viral transcript; for 
example, the Env mRNAs retain the intron 
that has been spliced out of the Tat and 
Rev mRNAs. The Rev response element 
(RRE) is indicated by a blue ball and stick. 
It is a 234-nucleotide-long stretch of RNA 
that folds into a defined structure; Rev 
recognizes a particular hairpin within this 
larger structure. 

The Gag gene codes for a protein that 
is cleaved into several smaller proteins 
that form the viral capsid. The Pol gene 
codes for a protein that is cleaved to 
produce reverse transcriptase (which 
transcribes RNA into DNA), as well as the 
integrase involved in integrating the viral 
genome (as double-stranded DNA) into 
the host genome. The Env gene codes for 
the envelope proteins (see Figure 5-62). 
Tat, Rev, Vif, Vpr, Vpu, and Nef are small 
proteins with a variety of functions. For 
example, Rev regulates nuclear export 
(see Figure 7-63) and Tat regulates the 
elongation of transcription across the 
integrated viral genome (see p. 414). 


Figure 7-63 Regulation of nuclear export 
by the HIV Rev protein. (A) Early in HIV 
infection, only the fully spliced RNAs (which 
contain the coding sequences for Rev, Tat, 
and Nef) are exported from the nucleus and 
translated. (B) Once sufficient Rev protein 
has accumulated and been transported 
into the nucleus, unspliced viral RNAs can 
be exported from the nucleus. Many of 
these RNAs are translated into protein, and 
the full-length transcripts are packaged into 
new viral particles. 
If, after its initial entry into a host cell, conditions become unfavorable for viral 
transcription and replication, Rev and Tat are made at levels too low to promote 
transcription and export of unspliced RNA. This situation stalls the viral growth 
cycle until conditions improve, whereupon Rev and Tat levels increase, and the 
virus enters the replication cycle. 


Some mRNAs Are Localized to Specific Regions of the Cytosol 


Once a newly made eukaryotic mRNA molecule has passed through a nuclear 
pore and entered the cytosol, it is typically met by ribosomes, which translate 
it into a polypeptide chain (see Figure 6-8). Once the first round of translation 
“passes” the nonsense-mediated decay test (see Figure 6-76), the mRNA is usu- 
ally translated in earnest. If the mRNA encodes a protein that is destined to be 
secreted or expressed on the cell surface, a signal sequence at the protein’s N-ter- 
minus will direct it to the endoplasmic reticulum (ER). In this case, as discussed 
in Chapter 12, components of the cell’s protein-sorting apparatus recognize the 
signal sequence as soon as it emerges from the ribosome and direct the entire 
complex of ribosome, mRNA, and nascent protein to the membrane of the ER, 
where the remainder of the polypeptide chain is synthesized. In other cases, free 
ribosomes in the cytosol synthesize the entire protein, and signals in the com- 
pleted polypeptide chain may then direct the protein to other sites in the cell. 

Many mRNAs are themselves directed to specific intracellular locations before 
their efficient translation begins, allowing the cell to position its mRNAs close 
to the sites where the encoded protein is needed. RNA localization has been 
observed in many organisms, including unicellular fungi, plants, and animals, 
and it is likely to be a common mechanism that cells use to concentrate high-level 
production of proteins at specific sites. This strategy also provides the cell with 
other advantages. For example, it allows the establishment of asymmetries in the 
cytosol of the cell, a key step in many stages of development. Localized mRNA, 
coupled with translational control, also allows the cell to regulate gene expression 
independently in different regions. This feature is particularly important in large, 
highly polarized cells such as neurons, where it plays a central role in synaptic 
function. 

Several mechanisms for mRNA localization have been discovered (Figure 
7-64), all of which require specific signals in the mRNA itself. These signals are 
usually concentrated in the 3’ untranslated region (UTR), the region of RNA that 
Figure 7-64 Mechanisms for the 
localization of mRNAs. The mRNA to 
be localized leaves the nucleus through 
nuclear pores (top). Some localized mRNAs 
(left diagram) travel to their destination by 
associating with cytoskeletal motors, which 
use the energy of ATP hydrolysis to move 
the mRNAs unidirectionally along filaments 
in the cytoskeleton (red) (see Chapter 16). 
At their destination, the mRNAs are held 
in place by anchor proteins (black). Other 
mRNAs randomly diffuse through the 
cytosol and are simply trapped by anchor 
proteins at their sites of localization 
(center diagram). Some mRNAs (right 
diagram) are degraded in the cytosol unless 
they have bound, through random diffusion, 
a localized protein complex that anchors 
and protects the mRNA from degradation 
(black). Each mechanism requires 
specific signals on the mRNA, which are 
typically located in the 3’ UTR. Additional 
components can block the translation of the 
mRNA until it is properly localized. (Adapted 
from H.D. Lipshitz and C.A. Smibert, Curr. 
Opin. Genet. Dev. 10:476-488, 2000. With 
permission from Elsevier.) 
Figure 7-65 An experiment demonstrating the importance of the 3’ 
UTR in localizing mRNAs to specific regions of the cytoplasm. For 
this experiment, two different fluorescently labeled RNAs were prepared by 
transcribing DNA in vitro in the presence of fluorescently labeled derivatives of 
UTP. One RNA (labeled with a red fluorochrome) contains the coding region 
for the Drosophila Hairy protein and includes the adjacent 3’ UTR (see Figure 
6-21). The other RNA (labeled green) contains the Hairy coding region with 
the 3’ UTR deleted. The two RNAs were mixed and injected into a Drosophila 
embryo at a stage of development when multiple nuclei reside in a common 
cytoplasm (see Figure 7-26). When the fluorescent RNAs were visualized 
10 minutes later, the full-length hairy RNA (red) was localized to the apical 
side of nuclei (blue) but the transcript missing the 3’ UTR (green) failed to 
localize. Hairy is one of many transcriptional regulators that specify positional 
information in the developing Drosophila embryo (discussed in Chapter 21), 
and the localization of its mRNA (shown in this experiment to depend on its 
3’ UTR) is critical for proper fly development. (Courtesy of Simon Bullock and 
David Ish-Horowicz.) 


extends from the stop codon that terminates protein synthesis to the start of the 
poly-A tail (Figure 7-65). The mRNA localization is usually coupled with trans- 
lational controls to ensure that the mRNA remains quiescent until it has been 
moved into place. 

The Drosophila egg exhibits an especially striking example of mRNA local- 
ization. The mRNA encoding the Bicoid transcription regulator is localized by 
attachment to the cytoskeleton at the anterior tip of the developing egg. When 
fertilization triggers the translation of this mRNA, it generates a gradient of the 
Bicoid protein that plays a crucial part in directing the development of the ante- 
rior part of the embryo (see Figure 7-26). Many mRNAs in somatic cells are also 
localized in a similar way. The mRNA that encodes actin, for example, is localized 
to the actin-filament-rich cell cortex in mammalian fibroblasts by means of a 3’ 
UTR signal. 

We saw in Chapter 6 that mRNA molecules exit from the nucleus bearing 
numerous markings in the form of RNA modifications (the 5’ cap and the 3’ 
poly-A tail) and bound proteins (exon-junction complexes, for example) that sig- 
nify the successful completion of the different pre-mRNA processing steps. As just 
described, the 3’ UTR of an mRNA can be thought of as a “zip code” that directs 
mRNAs to different places in the cell. Below, we will also see that mRNAs carry 
information specifying their average lifetime in the cytosol and the efficiency with 
which they are translated into protein. In a broad sense, the untranslated regions 
of eukaryotic mRNAs resemble the transcriptional control regions of genes: their 
nucleotide sequences contain information specifying the way the RNA is to be 
used, and proteins interpret this information by binding specifically to these 
sequences. Thus, over and above the specification of the amino acid sequences of 
proteins, mRNA molecules are rich with information. 


The 5’ and 3’ Untranslated Regions of mRNAs Control Their 
Translation 


Once an mRNA has been synthesized, one of the most common ways of regulating 
the levels of its protein product is to control the step that initiates translation. Even 
though the details of translation initiation differ between eukaryotes and bacteria 
(as we saw in Chapter 6), they each use some of the same basic regulatory strat- 
egies. 

In bacterial mRNAs, a conserved stretch of nucleotides, the Shine-Dalgarno 
sequence, is always found a few nucleotides upstream of the initiating AUG codon. 
In bacteria, translational control mechanisms are carried out by proteins or by 
RNA molecules, and they generally involve either exposing or blocking the Shine- 
Dalgarno sequence (Figure 7-66). 

Eukaryotic mRNAs do not contain such a sequence. Instead, as discussed in 
Chapter 6, the selection of an AUG codon as a translation start site is largely deter- 
mined by its proximity to the cap at the 5’ end of the mRNA molecule, which is the 
site at which the small ribosomal subunit binds to the mRNA and begins scanning 
for an initiating AUG codon. In eukaryotes, translational repressors can bind to 
the 5’ end of the mRNA and thereby inhibit translation initiation. Other repressors 
recognize nucleotide sequences in the 3’ UTR of specific mRNAs and decrease 
translation initiation by interfering with the communication between the 5’ cap 
and 3’ poly-A tail, a step required for efficient translation (see Figure 6-70). A 
particularly important type of translational control in eukaryotes relies on small 
RNAs (termed microRNAs or miRNAs) that bind to mRNAs and reduce protein 
output, as described later in this chapter. 


The Phosphorylation of an Initiation Factor Regulates Protein 
Synthesis Globally 


Eukaryotic cells decrease their overall rate of protein synthesis in response to a 
variety of situations, including deprivation of growth factors or nutrients, infec- 
tion by viruses, and sudden increases in temperature. Much of this decrease is 
caused by the phosphorylation of the translation initiation factor eIF2 by specific 
protein kinases that respond to the changes in conditions. 

The normal function of eIF2 was outlined in Chapter 6. It forms a complex 
with GTP and mediates the binding of the methionyl initiator tRNA to the small 
ribosomal subunit, which then binds to the 5’ end of the mRNA and begins scanning 
along the mRNA. When an AUG codon is recognized, the eIF2 protein hydrolyzes 
the bound GTP to GDP, causing a conformational change in the protein and 
releasing it from the small ribosomal subunit. The large ribosomal subunit then 
joins the small one to form a complete ribosome that begins protein synthesis. 
Figure 7-66 Mechanisms of translational control. Although these examples are from bacteria, many of the same principles operate in eukaryotes. 
(A) Sequence-specific RNA-binding proteins repress translation of specific mRNAs by blocking access of the ribosome to the Shine-Dalgarno 
sequence (orange). For example, some ribosomal proteins repress translation of their own RNA. This mechanism allows the cell to maintain correctly 
balanced quantities of the various components needed to form ribosomes. (B) An RNA “thermosensor” permits efficient translation initiation only at 
elevated temperatures at which the stem-loop structure has been melted. An example occurs in the human pathogen Listeria monocytogenes, in 
which the translation of its virulence genes increases at 37°C, the temperature of the host. (C) Binding of a small molecule to a riboswitch causes 
a major rearrangement of the RNA forming a different set of stem-loop structures. In the bound structure, the Shine-Dalgarno sequence (orange) 
is sequestered and translation initiation is thereby blocked. In many bacteria, S-adenosylmethionine acts in this manner to block production of 
the enzymes that synthesize it. (D) An “antisense” RNA produced elsewhere from the genome base-pairs with a specific mRNA and blocks its 
translation. Many bacteria regulate expression of iron-storage proteins in this way. 
Because eIF2 binds very tightly to GDP, a guanine nucleotide exchange factor 
(see p. 157), designated eIF2B, is required to cause GDP release so that a new 
GTP molecule can bind and eIF2 can be reused (Figure 7-67A). The reuse of eIF2 
is inhibited when it is phosphorylated: the phosphorylated eIF2 binds to eIF2B 
unusually tightly, inactivating eIF2B. There is more eIF2 than eIF2B in cells, and 
even a fraction of phosphorylated eIF2 can trap nearly all of the eIF2B. This pre- 
vents the reuse of the nonphosphorylated eIF2 and greatly slows protein synthesis 
(Figure 7-67B). 
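
The arithmetic behind this trapping effect can be sketched in a few lines of Python. The sketch assumes that each phosphorylated eIF2 ties up one eIF2B essentially irreversibly; the function name and the amounts (100 units of eIF2, 20 units of eIF2B) are hypothetical values chosen only to make the point, not measurements.

```python
def free_eif2b(total_eif2: float, total_eif2b: float,
               fraction_phosphorylated: float) -> float:
    """Return the eIF2B left free, assuming each phosphorylated eIF2 traps
    one eIF2B (a simplifying 1:1, effectively irreversible binding model)."""
    if not 0.0 <= fraction_phosphorylated <= 1.0:
        raise ValueError("fraction_phosphorylated must be between 0 and 1")
    trapped = min(total_eif2b, total_eif2 * fraction_phosphorylated)
    return total_eif2b - trapped


# Hypothetical amounts in arbitrary units: five times more eIF2 than eIF2B.
TOTAL_EIF2 = 100.0
TOTAL_EIF2B = 20.0

for fraction in (0.0, 0.1, 0.2, 0.5):
    remaining = free_eif2b(TOTAL_EIF2, TOTAL_EIF2B, fraction)
    print(f"{fraction:.0%} of eIF2 phosphorylated -> "
          f"{remaining:.0f} units of eIF2B free")
```

Under these assumed numbers, phosphorylating only one-fifth of the eIF2 leaves no free eIF2B, so the remaining four-fifths of the eIF2, although unphosphorylated, cannot be recycled.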

Regulation of the level of active eIF2 is especially important in mammalian 
cells; eIF2 is part of the mechanism that allows cells to enter a nonproliferating, 
resting state (called G0) in which the rate of total protein synthesis is reduced to 
about one-fifth the rate in proliferating cells. 


Initiation at AUG Codons Upstream of the Translation Start Can 
Regulate Eukaryotic Translation Initiation 


We saw in Chapter 6 that eukaryotic translation typically begins at the first AUG 
downstream of the 5’ end of the mRNA, which is the first AUG encountered by a 
scanning small ribosomal subunit. But the nucleotides immediately surrounding 
the AUG also influence the efficiency of translation initiation. If the recognition 
site is poor enough, scanning ribosomal subunits will sometimes ignore the first 
AUG codon in the mRNA and skip to the second or third AUG codon instead. This 
phenomenon, known as “leaky scanning,” is a strategy frequently used to produce 
two or more closely related proteins, differing only in their N-termini, from the 
same mRNA. A particularly important use of this mechanism is the production 
of the same protein with and without a signal sequence attached at its N-termi- 
nus. This allows the protein to be directed to two different locations in the cell (for 
example, to both mitochondria and the cytosol). Cells can regulate the relative 
abundance of the protein isoforms produced by leaky scanning; for example, a 
cell-type-specific increase in the abundance of the initiation factor eIF4F favors 
the use of the AUG closest to the 5’ end of the mRNA. 

Another type of control found in eukaryotes uses one or more short open read- 
ing frames—short stretches of DNA that begin with a start codon (ATG) and end 
with a stop codon, with no stop codons in between—that lie between the 5’ end of 
the mRNA and the beginning of the gene. Often, the amino acid sequences coded 
by these upstream open reading frames (uORFs) are not important; rather, the 
uORFs serve a purely regulatory function. An uORF present on an mRNA mol- 
ecule will generally decrease translation of the downstream gene by trapping a 
Figure 7-67 The eIF2 cycle. (A) The 
recycling of used eIF2 by a guanine 
nucleotide exchange factor (eIF2B). (B) eIF2 
phosphorylation controls protein synthesis 
rates by tying up eIF2B. 
scanning ribosome initiation complex and causing the ribosome to translate the 
uORF and dissociate from the mRNA before it reaches the bona fide protein-cod- 
ing sequence. 
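
To make the definition of an uORF concrete, the Python sketch below scans the region between the 5’ end of an mRNA and the start of its main coding sequence for an AUG that is followed, in the same reading frame, by a stop codon. It works on the RNA alphabet (AUG rather than ATG); the function name, the index convention, and the example sequence are hypothetical and serve only as an illustration.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_uorfs(mrna: str, main_start: int) -> list[tuple[int, int]]:
    """Return (start, end) index pairs of uORFs lying wholly within the 5' leader.

    mrna       -- RNA sequence written 5'-to-3' using A, C, G, U
    main_start -- 0-based index of the AUG that begins the main coding sequence
    An uORF here is an AUG followed in frame by a stop codon, with both lying
    upstream of main_start; 'end' is the index just past the stop codon.
    """
    uorfs = []
    for start in range(0, main_start - 2):
        if mrna[start:start + 3] != "AUG":
            continue
        # Walk codon by codon from the AUG until the first in-frame stop.
        for pos in range(start + 3, main_start - 2, 3):
            if mrna[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3))
                break
    return uorfs


# Hypothetical mRNA: a 5' leader carrying one short uORF, then the main ORF.
leader = "GGAUGCCCUAAGGCACC"
coding = "AUGGCUUAA"
mrna = leader + coding
uorfs = find_uorfs(mrna, main_start=len(leader))
print(uorfs)
print([mrna[start:end] for start, end in uorfs])
```

The sketch only locates candidate uORFs; whether a scanning ribosome actually initiates at one depends on factors it does not model, such as the nucleotides surrounding the AUG and the availability of initiation factors.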

When the activity of a general translation factor (such as the eIF2 discussed 
above) is reduced, one might expect that the translation of all mRNAs would be 
reduced equally. Contrary to this expectation, however, the phosphorylation of 
eIF2 can have selective effects, even enhancing the translation of specific mRNAs 
that contain uORFs. This can enable cells, for example, to adapt to starvation for 
specific nutrients by shutting down the synthesis of all proteins except those that 
are required for synthesis of the missing nutrients. The details of this mechanism 
have been worked out for a specific yeast mRNA that encodes a protein called 
Gcn4, a transcription regulator that activates many genes that encode proteins 
that are important for amino acid synthesis. 

The Gcn4 mRNA contains several short uORFs, and when amino acids are 
abundant, ribosomes translate the uORFs and generally dissociate before they 
reach the Gcn4 coding region. A global decrease in eIF2 activity brought about 
by amino acid starvation makes it more likely that a scanning small ribosomal 
subunit will move across the uORFs (without translating them) before it acquires 
a molecule of eIF2 (see Figure 6-70). Such a ribosomal subunit is then free to ini- 
tiate translation on the actual Gcn4 sequences. The increased level of this tran- 
scription regulator increases production of the amino acid biosynthetic enzymes. 


Internal Ribosome Entry Sites Provide Opportunities for 
Translational Control 


Although approximately 90% of eukaryotic mRNAs are translated beginning with 
the first AUG downstream from the 5’ cap, certain AUGs, as we saw in the pre- 
vious section, can be skipped over during the scanning process. In this section, 
we discuss yet another way that cells can initiate translation at positions distant 
from the 5’ end of the mRNA, using a specialized type of RNA sequence called an 
internal ribosome entry site (IRES). In some cases, two distinct protein-coding 
sequences are carried in tandem on the same eukaryotic mRNA; translation of 
the first occurs by the usual scanning mechanism, and translation of the second 
occurs through an IRES. IRESs are typically several hundred nucleotides in length 
and fold into specific structures that bind many, but not all, of the same proteins 
that are used to initiate normal 5’ cap-dependent translation (Figure 7-68). In 
fact, different IRESs require different subsets of initiation factors. However, all of 
them bypass the need for a 5’ cap structure and the translation initiation factor 
that recognizes it, eIF4E. 
Figure 7-68 Two mechanisms of 
translation initiation. (A) The normal, 
cap-dependent mechanism requires a set of 
initiation factors whose assembly on the 
mRNA is stimulated by the presence of a 5’ 
cap and a poly-A tail (see also Figure 6-70). 
(B) The IRES-dependent mechanism, seen 
mainly in viruses, requires only a subset of 
the normal translation initiating factors, and 
these assemble directly on the folded IRES. 
(Adapted from A. Sachs, Cell 101:243-245, 
2000. With permission from Elsevier.) 




Some viruses use IRESs as part of a strategy to get their own mRNA molecules 
translated while blocking normal 5’ cap-dependent translation of host mRNAs. 
On infection, these viruses produce a protease (encoded in the viral genome) that 
cleaves the host-cell translation factor eIF4G, rendering it unable to bind to eIF4E, 
the cap-binding complex. This shuts down most of the host cell’s translation and 
effectively diverts the translation machinery to the IRES sequences present on the 
viral mRNAs. (The truncated eIF4G remains competent to initiate translation at 
these internal sites.) 

The many ways in which viruses manipulate their host’s protein-synthesis 
machinery for their own advantage continue to surprise cell biologists. Studying 
this “arms race” between humans and pathogens has led to many fundamental 
insights into the workings of the cell, and we revisit this topic in more detail in 
Chapter 23. 


Changes in mRNA Stability Can Regulate Gene Expression 


Most mRNAs in a bacterial cell are very unstable, having half-lives of less than a 
couple of minutes. Exonucleases, which degrade in the 3’-to-5’ direction, are usually
responsible for the rapid destruction of these mRNAs. Because its mRNAs are
both rapidly synthesized and rapidly degraded, a bacterium can adapt quickly to 
environmental changes. 

As a general rule, the mRNAs in eukaryotic cells are more stable. Some, such 
as that encoding β-globin, have half-lives of more than 10 hours, but most have 
considerably shorter half-lives, typically less than 30 minutes. The mRNAs that 
code for proteins such as growth factors and transcription regulators, whose pro- 
duction rates need to change rapidly in cells, have especially short half-lives. 
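
To give a feel for what these half-lives imply, the short sketch below computes how much of an mRNA population survives for one hour. It assumes simple first-order (exponential) decay, which is an illustrative simplification not stated in the text; the function name is hypothetical, and the half-lives are the approximate figures quoted above.

```python
def fraction_remaining(elapsed_min: float, half_life_min: float) -> float:
    """Fraction of an mRNA population left after elapsed_min minutes.

    Assumes first-order decay (an illustrative simplification):
    the population halves once per half-life.
    """
    return 0.5 ** (elapsed_min / half_life_min)


# Approximate half-lives taken from the text, in minutes.
half_lives = {
    "bacterial mRNA (about 2 min)": 2.0,
    "typical eukaryotic mRNA (about 30 min)": 30.0,
    "beta-globin mRNA (about 10 h)": 600.0,
}

for name, t_half in half_lives.items():
    left = fraction_remaining(60.0, t_half)
    print(f"{name}: fraction remaining after 1 hour = {left:.3g}")
```

Under this assumption, a short half-life means that the mRNA level tracks changes in transcription rate quickly, which is the point made above for growth factors and transcription regulators.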

We saw in Chapter 6 that the cell has several mechanisms that rapidly destroy 
incorrectly processed RNAs; here, we consider the fate of the typical “normal” 
eukaryotic mRNA. Two general mechanisms exist for eventually destroying each 
mRNA that is made by the cell. Both begin with the gradual shortening of the 
poly-A tail by an exonuclease, a process that starts as soon as the mRNA reaches 
the cytosol. In a broad sense, this poly-A shortening acts as a timer that counts 
down the lifetime of each mRNA. Once the poly-A tail is reduced to a critical 
length (about 25 nucleotides in humans), the two pathways diverge. In one, the 5’ 
cap is removed (a process called decapping) and the “exposed” mRNA is rapidly 
degraded from its 5’ end. In the other, the mRNA continues to be degraded from 
the 3’ end, through the poly-A tail, into the coding sequences (Figure 7-69). 

Nearly all mRNAs are subject to both types of decay, and the specific sequences 
of each mRNA determine how fast each step occurs and therefore how long 
each mRNA will persist in the cell and be able to produce protein. The 3’ UTR 
sequences are especially important in controlling mRNA lifetimes, and they often 
carry binding sites for specific proteins that increase or decrease the rate of poly-A 
shortening, decapping, or 3'-to-5' degradation. The half-life of an mRNA is also 
affected by how efficiently it is translated. Poly-A shortening and decapping com- 
pete directly with the machinery that translates the mRNA; therefore, any factors 
that affect the translation efficiency of an mRNA will tend to have the opposite 
effect on its degradation (Figure 7-70). 

Although poly-A shortening controls the half-life of most eukaryotic mRNAs, 
some mRNAs can be degraded by a specialized mechanism that bypasses this 
step altogether. In these cases, specific nucleases cleave the mRNA internally, 
Figure 7-69 Two mechanisms of eukaryotic mRNA decay. A critical threshold of
poly-A tail length induces rapid 3’-to-5’ degradation, which may be triggered by
the loss of the poly-A-binding proteins. As shown in Figure 7-70, a deadenylase
associates with both the 3’ poly-A tail and the 5’ cap, and this connection may
be involved in signaling decapping after poly-A shortening. Although 5’-to-3’
and 3’-to-5’ degradation are shown here on separate RNA molecules, these two
processes can occur together on the same molecule. (Adapted from C.A. Beelman
and R. Parker, Cell 81:179-183, 1995. With permission from Elsevier.)
effectively decapping one end and removing the poly-A tail from the other so 
that both halves are rapidly degraded. The mRNAs that are destroyed in this way 
carry specific nucleotide sequences, often in the 3’ UTRs, that serve as recogni- 
tion sequences for these endonucleases. This strategy makes it especially simple 
to tightly regulate the stability of these mRNAs by blocking or exposing the endo- 
nuclease site in response to extracellular signals. For example, the addition of iron 
to cells decreases the stability of the mRNA that encodes the receptor protein that 
binds the iron-transporting protein transferrin, causing less of this receptor to 
be made. This effect is mediated by the iron-sensitive RNA-binding protein aco- 
nitase. Aconitase can bind to the 3’ UTR of the transferrin receptor mRNA and 
increase receptor production by blocking endonucleolytic cleavage of the mRNA. 
On the addition of iron, aconitase is released from the mRNA, exposing the cleav- 
age site and thereby decreasing the stability of the mRNA (Figure 7-71). 


Regulation of mRNA Stability Involves P-bodies and Stress 
Granules 

We saw in Chapters 3 and 6 that large aggregates of proteins and nucleic acids 
that work together are often held in proximity by loose, low-affinity connections 
(see Figure 3-36). In this way, they function as “organelles” even though they are 
not surrounded by membranes. Many of the events discussed in the previous 
Figure 7-70 The competition between mRNA translation and mRNA decay. The same
two features of an mRNA molecule—its 5’ cap and the 3’ poly-A tail—are used in
both translation initiation and deadenylation-dependent mRNA decay (see Figure
7-69). The deadenylase that shortens the poly-A tail in the 3’-to-5’ direction
associates with the 5’ cap. As described in Chapter 6 (see Figure 6-70), the
translation initiation machinery also associates with both the 5’ cap and the
poly-A tail. (Adapted from M. Gao et al., Mol. Cell 5:479-488, 2000. With
permission from Elsevier.)


Figure 7-71 Two post-transcriptional controls mediated by iron. (A) During iron
starvation, the binding of aconitase to the 5’ UTR of the ferritin mRNA blocks
translation initiation; its binding to the 3’ UTR of the transferrin receptor
mRNA blocks an endonuclease cleavage site and thereby stabilizes the mRNA.
(B) In response to an increase in iron concentration in the cytosol, a cell
increases its synthesis of ferritin in order to bind the extra iron and
decreases its synthesis of transferrin receptors in order to import less iron
across the plasma membrane. Both responses are mediated by the same
iron-responsive regulatory protein, aconitase, which recognizes common features
in a stem-loop structure in the mRNAs encoding ferritin and the transferrin
receptor. Aconitase dissociates from the mRNA when it binds iron. But because
the transferrin receptor and ferritin are regulated by different types of
mechanisms, their levels respond oppositely to iron concentrations even though
they are regulated by the same iron-responsive regulatory protein. (Adapted
from M.W. Hentze et al., Science 238:1570-1578, 1987 and J.L. Casey et al.,
Science 240:924-928, 1988. With permission from AAAS.)
section—including decapping and RNA degradation—take place in aggregates 
known as Processing- or P-bodies, which are present in the cytosol (Figure 7-72). 

Although many mRNAs are eventually degraded in P-bodies, some remain 
intact and are later returned to the pool of translating mRNAs. To be “rescued” 
in this way, mRNAs move from P-bodies to another type of aggregate known as a 
stress granule, which contains translation initiation factors, poly-A-binding pro- 
tein, and small ribosomal subunits. Translation itself does not occur in stress 
granules, but mRNAs can become “translation-ready” as the proteins bound to 
them in P-bodies are replaced with those in stress granules. The movement of 
mRNAs between active translation, P-bodies, and stress granules can be seen as 
an mRNA cycle (Figure 7-73) where the competition between translation and 
mRNA degradation is carefully controlled. Thus, when translation initiation is 
blocked (by starvation, drugs, or genetic manipulation), stress granules enlarge 
as more and more nontranslated mRNAs are moved directly into them for storage. 
Clearly, once a cell has made the large investment in producing a properly pro- 
cessed mRNA molecule, it carefully controls its subsequent fate. 


Summary 


Many steps in the pathway from RNA to protein are regulated by cells in order to 
control gene expression. Most genes are regulated at multiple levels, in addition 
to being controlled at the initiation stage of transcription. The regulatory mecha- 
nisms include (1) attenuation of the RNA transcript by its premature termination, 
(2) alternative RNA splice-site selection, (3) control of 3'-end formation by cleavage 
and poly-A addition, (4) RNA editing, (5) control of transport from the nucleus to 
the cytosol, (6) localization of mRNAs to particular parts of the cell, (7) control of 
translation initiation, and (8) regulated mRNA degradation. Most of these control 
processes require the recognition of specific sequences or structures in the RNA mol- 
ecule being regulated, a task performed by either regulatory proteins or regulatory 
RNA molecules. 
Figure 7-72 Visualization of P-bodies. 
Human cells were stained with antibodies 
to a component of the mRNA decapping 
enzyme Dcp1a (left panels) and to the 
Argonaute protein (middle panels). As 
described later in this chapter, Argonaute 
is a key component of RNA interference 
pathways. The merged image (right panels) 
shows that the two proteins co-localize to 
P-bodies in the cytoplasm. (Adapted from 
J. Liu et al., Nat. Cell Biol. 7:719-728, 
2005. With permission from Macmillan 
Publishers Ltd.) 


Figure 7-73 Possible fates of an mRNA 
molecule. An mRNA molecule released 
from the nucleus can be actively translated 
(center), stored in stress granules (right), or 
degraded in P-bodies (left). As the needs 
of the cell change, mRNAs can be shuffled 
from one pool to the next, as indicated by 
the arrows. 
REGULATION OF GENE EXPRESSION BY 
NONCODING RNAs 


In the previous chapter, we introduced the central dogma, according to which the 
flow of genetic information proceeds from DNA through RNA to protein (Figure 
6-1). But we have seen throughout this book that RNA molecules perform many 
critical tasks in the cell besides serving as intermediate carriers of genetic infor- 
mation. Among these noncoding RNAs are the rRNA and tRNA molecules, which 
are responsible for reading the genetic code and synthesizing proteins. The RNA 
molecule in telomerase serves as a template for the replication of chromosome 
ends, snoRNAs modify ribosomal RNA, and snRNAs carry out the major events of 
RNA splicing. And we saw in the previous section that Xist RNA has an important 
role in inactivating one copy of the X chromosome in females. 

A series of recent discoveries has revealed that noncoding RNAs are even more 
prevalent than previously imagined. We now know that such RNAs play wide- 
spread roles in regulating gene expression and in protecting the genome from 
viruses and transposable elements. These newly discovered RNAs are the subject 
of this section. 


Small Noncoding RNA Transcripts Regulate Many Animal and 
Plant Genes Through RNA Interference 


We begin our discussion with a group of short RNAs that carry out RNA inter- 
ference or RNAi. Here, short single-stranded RNAs (20-30 nucleotides) serve as 
guide RNAs that selectively recognize and bind—through base-pairing—other 
RNAs in the cell. When the target is a mature mRNA, the small noncoding RNAs 
can inhibit its translation or even catalyze its destruction. If the target RNA mole- 
cule is in the process of being transcribed, the small noncoding RNA can bind to 
it and direct the formation of certain types of repressive chromatin on its attached 
DNA template (Figure 7-74). Three classes of small noncoding RNAs work in this 
way—microRNAs (miRNAs), small interfering RNAs (siRNAs), and piwi-interact- 
ing RNAs (piRNAs)—and we discuss them in turn in the next sections. Although 
they differ in the way the short pieces of single-stranded RNA are generated, all 
three types of short RNAs locate their targets through RNA-RNA base-pairing, and 
they generally cause reductions in gene expression. 


miRNAs Regulate mRNA Translation and Stability 


Over 1000 different microRNAs (miRNAs) are produced from the human genome, 
and these appear to regulate at least one-third of all human protein-coding genes. 
Once made, miRNAs base-pair with specific mRNAs and fine-tune their transla- 
tion and stability. The miRNA precursors are synthesized by RNA polymerase II 
and are capped and polyadenylated. They then undergo a special type of process- 
ing, after which the miRNA (typically 23 nucleotides in length) is assembled with 
a set of proteins to form an RNA-induced silencing complex or RISC. Once formed, 
the RISC seeks out its target mRNAs by searching for complementary nucleotide 
Figure 7-74 RNA interference in 
eukaryotes. Single-stranded interfering 
RNAs are generated from double-stranded 
RNA. They locate target RNAs through 
base-pairing and, at this point, several fates 
are possible, as shown. As described in 
the text, there are several types of RNA 
interference; the way the double-stranded 
RNA is produced and processed and the 
ultimate fate of the target RNA depends on 
the particular system. 
sequences (Figure 7-75). This search is greatly facilitated by the Argonaute protein,
a component of RISC, which holds the 5’ region of the miRNA so that it is 
optimally positioned for base-pairing to another RNA molecule (Figure 7-76). In 
animals, the extent of base-pairing is typically at least seven nucleotide pairs, and 
this pairing most often occurs in the 3’ UTR of the target mRNA. 
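
The target search described above can be sketched as a simple string search. The example below is a minimal illustration, not a real target-prediction method: it assumes the seven-nucleotide "seed" occupies positions 2 through 8 of the miRNA (a common convention that the text does not specify), it requires perfect Watson-Crick pairing, and both sequences are invented for the example. The function names are hypothetical.

```python
# Watson-Crick partners for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that pairs antiparallel with rna, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(mirna: str, utr: str, seed_start: int = 1, seed_len: int = 7) -> list[int]:
    """Find every position in utr that can pair with the miRNA seed.

    seed_start=1 and seed_len=7 select miRNA positions 2-8 (0-based slice);
    this choice is an assumption made for illustration.
    """
    seed = mirna[seed_start:seed_start + seed_len]
    site = reverse_complement(seed)  # what the mRNA must contain
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)
    return hits


# Invented sequences, written 5' to 3'; they do not correspond to real genes.
mirna = "AUGGCACUUCGAUCCGUAAGCU"
utr = "CCAUAGUGCCAUUUGGAAGUGCCAGC"

print("seed:", mirna[1:8])
print("matching sites in the 3' UTR start at:", seed_sites(mirna, utr))
```

Because only a short seed has to match, many unrelated mRNAs can carry a site for the same miRNA, which is consistent with the observation that a single miRNA can regulate a large set of mRNAs.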

Once an mRNA has been bound by an miRNA, several outcomes are possible. 
If the base-pairing is extensive (which is unusual in humans but common in many 
plants), the mRNA is cleaved (sliced) by the Argonaute protein, effectively remov- 
ing the mRNA’s poly-A tail and exposing it to exonucleases (see Figure 7-69). Fol- 
lowing cleavage of the mRNA, the RISC with its associated miRNA is released, and 
it can seek out additional mRNAs (see Figure 7-75). Thus, a single miRNA can act 
catalytically to destroy many complementary mRNAs. These miRNAs can thus be 
thought of as guide sequences that bring destructive nucleases into contact with 
specific mRNAs. 

If the base-pairing between the miRNA and the mRNA is less extensive (as 
observed for most human miRNAs), Argonaute does not slice the mRNA; rather, 
translation of the mRNA is repressed and the mRNA is shuttled to P-bodies (see 
Figure 7-73) where, sequestered from ribosomes, it eventually undergoes poly-A 
tail shortening, decapping, and degradation. 

Several features make miRNAs especially useful regulators of gene expression. 
First, a single miRNA can regulate a whole set of different mRNAs, so long as the 
mRNAs carry a common short sequence in their UTRs. This situation is common 
in humans, where a single miRNA can control hundreds of different mRNAs. Sec- 
ond, regulation by miRNAs can be combinatorial. When the base-pairing between 
the miRNA and mRNA fails to trigger cleavage, additional miRNAs binding to the 
same mRNA lead to further reductions in its translation. As discussed earlier for 
transcription regulators, combinatorial control greatly expands the possibilities 
available to the cell by linking gene expression to a combination of different reg- 
ulators rather than a single regulator. Third, an miRNA occupies relatively little 
Figure 7-75 miRNA processing and mechanism of action. The precursor miRNA,
through complementarity between one part of its sequence and another, forms a
double-stranded structure. This RNA is cropped while still in the nucleus and
then exported to the cytosol, where it is further cleaved by the Dicer enzyme
to form the miRNA proper. Argonaute, in conjunction with other components of
RISC, initially associates with both strands of the miRNA and then cleaves and
discards one of them. The other strand guides RISC to specific mRNAs through
base-pairing. If the RNA-RNA match is extensive, as is commonly seen in plants,
Argonaute cleaves the target mRNA, causing its rapid degradation. In mammals,
the miRNA-mRNA match often does not extend beyond a short seven-nucleotide
“seed” region near the 5’ end of the miRNA. This less extensive base-pairing
leads to inhibition of translation, mRNA destabilization, and transfer of the
mRNA to P-bodies, where it is eventually degraded.
space in the genome when compared with a protein. Indeed, their small size is 
one reason that miRNAs were discovered only recently. Although we are only 
beginning to appreciate the full impact of miRNAs, it is clear that they represent 
an important part of the cell’s equipment for regulating the expression of genes. 
We discuss specific examples of miRNAs that have key roles in development in 
Chapter 21. 


RNA Interference Is Also Used as a Cell Defense Mechanism 


Many of the proteins that participate in the miRNA regulatory mechanisms just 
described also serve a second function as a defense mechanism: they orches- 
trate the degradation of foreign RNA molecules, specifically those that occur in 
double-stranded form. Many transposable elements and viruses produce dou- 
ble-stranded RNA, at least transiently, in their life cycles, and RNA interference 
helps to keep these potentially dangerous invaders in check. As we shall see, this 
form of RNAi also provides scientists with a powerful experimental technique to 
turn off the expression of individual genes. 

The presence of double-stranded RNA in the cell triggers RNAi by attract- 
ing a protein complex containing Dicer, the same nuclease that processes miR- 
NAs (see Figure 7-75). This protein cleaves the double-stranded RNA into small 
fragments (approximately 23 nucleotide pairs) called small interfering RNAs 
(siRNAs). These double-stranded siRNAs are then bound by Argonaute and other 
components of RISC. As we saw above for miRNAs, one strand of the duplex RNA 
is then cleaved by Argonaute and discarded. The single-stranded siRNA molecule 
that remains directs RISC back to complementary RNA molecules produced by 
the virus or transposable element. Because the match is usually exact, Argonaute 
cleaves these molecules, leading to their rapid destruction. 

Each time RISC cleaves a new RNA molecule, the RISC is released; thus, as 
we saw for miRNAs, a single RNA molecule can act catalytically to destroy many 
complementary RNAs. Some organisms employ an additional mechanism that 
amplifies the RNAi response even further. In these organisms, RNA-dependent 
RNA polymerases use siRNAs as primers to produce additional copies of
double-stranded RNAs, which are then cleaved into siRNAs. This amplification ensures 
that, once initiated, RNA interference can continue even after all the initiating 
double-stranded RNA has been degraded or diluted out. For example, it permits 
progeny cells to continue carrying out the specific RNA interference that was pro- 
voked in the parent cells. 

In some organisms, the RNA interference activity can be spread by the transfer 
of RNA fragments from cell to cell. This is particularly important in plants (whose 
cells are linked by fine connecting channels, as discussed in Chapter 19), because 
it allows an entire plant to become resistant to an RNA virus after only a few of 
its cells have been infected. In a broad sense, the RNAi response resembles cer- 
tain aspects of the animal immune system; in both, an invading organism elicits a 
customized response, and—through amplification of the “attack” molecules—the 
host becomes systemically protected. 
Figure 7-76 Human Argonaute protein carrying an miRNA. The protein is folded
into four structural domains, each indicated by a different color. The miRNA is
held in an extended form that is optimal for forming RNA-RNA base pairs. The
active site of Argonaute that “slices” a target RNA, when it is extensively
base-paired with the miRNA, is indicated in red. Many Argonaute proteins (three
out of the four human proteins, for example) lack the catalytic site and
therefore bind target RNAs without slicing them. (Adapted from C.D. Kuhn and
L. Joshua-Tor, Trends Biochem. Sci. 38:263-271, 2013. With permission from
Cell Press.)
We have seen that although miRNAs and siRNAs are generated in slightly different
ways, they rely on the same proteins and seek out their targets in a fundamentally
similar manner. Because siRNAs are found in a wide range of species, they 
are believed to be the most ancient form of RNA interference, with miRNAs being 
a later refinement. These siRNA-mediated defense mechanisms are crucial for 
plants, worms, and insects. In mammals, a protein-based system (described in 
Chapter 24) has largely taken over the task of fighting off viruses. 


RNA Interference Can Direct Heterochromatin Formation 


The siRNA interference pathway just described does not necessarily stop with 
the destruction of target RNA molecules. In some cases, the RNA interference 
machinery can also selectively shut off synthesis of the target RNAs. For this to 
occur, the short siRNAs produced by the Dicer protein are assembled with a group 
of proteins (including Argonaute) to form the RITS (RNA-induced transcriptional 
silencing) complex. Using single-stranded siRNA as a guide sequence, this com- 
plex binds complementary RNA transcripts as they emerge from a transcribing 
RNA polymerase II (Figure 7-77). Positioned on the genome in this way, the RITS 
complex attracts proteins that covalently modify nearby histones and eventually 
direct the formation of heterochromatin to prevent further transcription initia- 
tion. In some cases, an RNA-dependent RNA polymerase and a Dicer enzyme are 
also recruited by the RITS complex to continually generate additional siRNAs in 
situ. This positive feedback loop ensures continued repression of the target gene 
even after the initiating siRNA molecules have disappeared. 

RNAi-directed heterochromatin formation is an important cell defense mech- 
anism that limits the spread of transposable elements in genomes by maintain- 
ing their DNA sequences in a transcriptionally silent form. However, this same 
mechanism is also used in some normal processes in the cell. For example, in 
many organisms, the RNA interference machinery maintains the heterochroma- 
tin formed around centromeres. Centromeric DNA sequences are transcribed 
in both directions, producing complementary RNA transcripts that can base- 
pair to form double-stranded RNA. This double-stranded RNA triggers the RNA 
Figure 7-77 RNA interference directed by siRNAs. In many organisms,
double-stranded RNA can trigger both the destruction of complementary mRNAs
(left) and transcriptional silencing (right). The change in chromatin structure
induced by the bound RITS (RNA-induced transcriptional silencing) complex
resembles that in Figure 7-45.
interference pathway and stimulates formation of the heterochromatin that surrounds
centromeres, which is necessary for the centromeres to segregate chromosomes
accurately during mitosis. 


piRNAs Protect the Germ Line from Transposable Elements 


A third system of RNA interference relies on piRNAs (piwi-interacting RNAs, 
named for Piwi, a class of proteins related to Argonaute). piRNAs are made specif- 
ically in the germ line, where they block the movement of transposable elements. 
Found in many organisms, including humans, genes coding for piRNAs consist 
largely of sequence fragments of transposable elements. These clusters of frag- 
ments are transcribed and broken up into short, single-stranded piRNAs. The pro- 
cessing differs from that for miRNAs and siRNAs (for one thing, the Dicer enzyme 
is not involved), and the resulting piRNAs are slightly longer than miRNAs and 
siRNAs; moreover, they are complexed with Piwi rather than Argonaute proteins. 
Once formed, the piRNAs seek out RNA targets by base-pairing and, much like 
siRNAs, transcriptionally silence intact transposon genes and destroy any RNA 
(including mRNAs) produced by them. 

Many mysteries surround piRNAs. Over a million piRNA species are coded in 
the genomes of many mammals and expressed in the testes, yet only a small frac- 
tion seem to be directed against the transposons present in those genomes. Are 
the piRNAs remnants of past invaders? Do they cover so much “sequence space” 
that they are broadly protective for any foreign DNA? Another curious feature 
of piRNAs is that many of them (particularly if base-pairing does not have to be 
perfect) should, in principle, attack the normal mRNAs made by the organism, 
yet they do not. It has been proposed that these large numbers of piRNAs may 
form a system to distinguish “self” RNAs from “foreign” RNAs and attack only the 
latter. If this is the case, there must be a special way for the cell to spare its own 
RNAs. One idea is that RNAs produced in the previous generation of an organism 
are somehow registered and set aside from piRNA attack in subsequent genera- 
tions. Whether or not this mechanism truly exists, and, if so, how it might work, 
are questions that demonstrate our incomplete understanding of the full implica- 
tions of RNA interference. 


RNA Interference Has Become a Powerful Experimental Tool 


Although it likely arose as a defense mechanism against viruses and transposable 
elements, RNA interference, as we have seen, has become thoroughly integrated 
into many aspects of normal cell biology, ranging from the control of gene expres- 
sion to the structure of chromosomes. It has also been developed by scientists 
into a powerful experimental tool that allows almost any gene to be inactivated by 
evoking an RNAi response to it. This technique, which can be readily carried out 
in cultured cells and, in many cases, whole animals and plants, has made possi- 
ble new genetic approaches in cell and molecular biology. We shall discuss it in 
detail in the following chapter where we cover modern genetic methods used to 
study cells (see pp. 499-501). RNAi also has potential in treating human disease. 
Since many human disorders result from the misexpression of genes, the ability to 
turn these genes off by experimentally introducing complementary siRNA mole- 
cules holds great medical promise. Although the mechanism of RNA interference 
was discovered a few decades ago, we are still being surprised by its mechanistic 
details and by its broad biological implications. 


Bacteria Use Small Noncoding RNAs to Protect Themselves from 
Viruses 


Bacteria make up the vast majority of the Earth’s biomass and, not surprisingly, 
viruses that infect bacteria greatly outnumber plant and animal viruses. These 
viruses generally have DNA genomes. A recent discovery revealed that many 
species of bacteria (and almost all species of archaebacteria) use a repository of 
small noncoding RNA molecules to seek out and destroy the DNA of the invading 
viruses. Many features of this defense mechanism, known as the CRISPR system, 
resemble those we saw above for miRNAs and siRNAs, but there are two import- 
ant differences. First, when bacteria and archaea are first infected by a virus, they 
have a mechanism that causes short fragments of that viral DNA to become integrated
into their genomes. These serve as “vaccinations,” in the sense that they 
become the templates for producing small noncoding RNAs known as crRNAs 
(CRISPR RNAs) that will thereafter destroy the virus should it reinfect the descen- 
dants of the original cell. This aspect of the CRISPR system is similar in principle 
to adaptive immunity in mammals, in that the cell carries a record of past expo- 
sures that is used to protect against future exposures. The second distinguishing 
feature of the CRISPR system is that these crRNAs then become associated with 
special proteins that allow them to seek out and destroy double-stranded DNA 
molecules, rather than single-stranded RNA molecules. 

Although many details of CRISPR-mediated immunity remain to be discov- 
ered, we can outline the general process in three steps (Figure 7-78). In the first, 
viral DNA sequences are integrated into special regions of the bacterial genome 
known as CRISPR (clustered regularly interspaced short palindromic repeat) 
loci, named for the peculiar structure that first drew the attention of scientists. 
In its simplest form, a CRISPR locus consists of several hundred repeats of a host 
DNA sequence interspersed with a large collection of sequences (typically 25-70 
nucleotide pairs each) that has been derived from prior exposures to viruses and 
other foreign DNA. The newest viral sequence is always integrated at the 5’ end of 
the CRISPR locus, the end that is transcribed first. Each locus, therefore, carries 
a temporal record of prior infections. Many bacterial and archaeal species carry 
several large CRISPR loci in their genomes and are thus immune to a wide range 
of viruses. 

In the second step, the CRISPR locus is transcribed to produce a long RNA molecule,
which is then processed into the much shorter (approximately 30 nucleotides)
crRNAs. In the third step, crRNAs complexed with Cas (CRISPR-associated) 
proteins seek out complementary viral DNA sequences and direct their destruc- 
tion by nucleases. Although structurally dissimilar, Cas proteins are analogous to 
the Argonaute and Piwi proteins discussed above: they hold small single-stranded 
RNAs in an extended configuration that is optimized, in this case, for seeking and 
forming complementary base pairs with DNA. 
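
The three steps can be mimicked with a toy data structure. The sketch below is a deliberately simplified model under several stated assumptions: sequences are treated as single DNA strands and matched as plain strings rather than by base-pairing, the repeat and viral sequences are invented, and the two-letter motif `GG` is a placeholder for the short flanking sequence (PAM) described in Figure 7-78, not the motif of any particular species. The class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CrisprLocus:
    """Toy model of a CRISPR locus: host repeats separating stored viral fragments."""

    repeat: str
    spacers: list[str] = field(default_factory=list)  # index 0 = 5' end (newest)

    def acquire(self, viral_dna: str, start: int, length: int = 30) -> None:
        """Step 1: store a fragment of viral DNA at the 5' end of the locus."""
        self.spacers.insert(0, viral_dna[start:start + length])

    def array(self) -> str:
        """Layout of the locus: repeat, spacer, repeat, spacer, ..., repeat."""
        return self.repeat + "".join(s + self.repeat for s in self.spacers)

    def is_targeted(self, dna: str, pam: str = "GG") -> bool:
        """Steps 2-3: a stored sequence attacks dna only if the match is
        immediately followed by the flanking motif (placeholder PAM)."""
        return any(s + pam in dna for s in self.spacers)


# Invented sequences for illustration only.
protospacer_1 = "ACGTTGCATCGGATACCTAGTTCAGGCTAA"
protospacer_2 = "GGCTTAACCGTATGCAAGTCCGATTGACCA"
virus_1 = "TTAC" + protospacer_1 + "GG" + "CATT"
virus_2 = "AA" + protospacer_2 + "GG" + "TT"

locus = CrisprLocus(repeat="ATCTACAAAAGTAGAAATTC")
locus.acquire(virus_1, start=4)  # first infection
locus.acquire(virus_2, start=2)  # later infection, stored at the 5' end

print("newest spacer first:", locus.spacers[0] == protospacer_2)
print("virus 1 attacked on reinfection:", locus.is_targeted(virus_1))
print("host locus attacked:", locus.is_targeted(locus.array()))
```

In this model the order of the stored fragments records the order of infections, and the host locus escapes attack because each stored fragment is followed by a repeat rather than by the flanking motif.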

We still have much to learn about CRISPR-based immunity in bacteria and 
archaebacteria. The mechanism through which viral sequences are first identified 
and integrated into the host genome is poorly understood, as is the way that the 
crRNAs find their complementary sequences in double-stranded DNA. Moreover, 
in different species of bacteria and archaebacteria, crRNAs are processed in dif- 
ferent ways, and in some cases, the crRNAs can attack viral RNAs as well as DNAs. 

We shall see in the following chapter that bacterial CRISPR systems have 
already been artificially “moved” into plants and animals, where they have 
become very powerful experimental tools for manipulating genomes. 

Figure 7-78 CRISPR-mediated immunity in bacteria and archaebacteria. After
infection by a virus (left panel), a small bit of DNA from the viral genome is
inserted into the CRISPR locus. For this to happen, a small fraction of infected
cells must survive the initial viral infection. The surviving cells, or more
generally their descendants, transcribe the CRISPR locus and process the
transcript into crRNAs (middle panel). Upon reinfection with a virus that the
population has already been “vaccinated” against, the incoming viral DNA is
destroyed by a complementary crRNA (right panel). For a CRISPR system to be
effective, the crRNAs must not destroy the CRISPR locus itself, even though the
crRNAs are complementary in sequence to it. In many species, in order for crRNAs
to attack an invading DNA molecule, there must be additional short nucleotide
sequences that are carried by the target molecule. Because these sequences,
known as PAMs (protospacer adjacent motifs), lie outside the crRNA sequences,
the host CRISPR locus is spared (see Figure 8-55).
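
The PAM requirement described in Figure 7-78 can be made concrete with a short
sketch. The Python below is an illustration only, not a model of any particular
CRISPR system: it assumes a three-nucleotide PAM of the form NGG lying
immediately 3′ of the target sequence, it treats the crRNA as its DNA-equivalent
sequence on one strand, and the spacer and repeat sequences are invented.

```python
# Illustrative sketch of PAM-dependent target recognition.
# Assumptions (not from the text): the PAM pattern is "NGG", it sits directly
# 3' of the matched sequence, and all sequences below are made up.

def matches_pam(site, pam="NGG"):
    """Return True if `site` fits the PAM pattern; 'N' matches any base."""
    return len(site) == len(pam) and all(
        p == "N" or p == s for p, s in zip(pam, site)
    )

def find_targets(dna, spacer, pam="NGG"):
    """Return start positions where the spacer sequence is followed by a PAM."""
    hits = []
    n = len(spacer)
    for i in range(len(dna) - n - len(pam) + 1):
        flank = dna[i + n:i + n + len(pam)]
        if dna[i:i + n] == spacer and matches_pam(flank, pam):
            hits.append(i)
    return hits

spacer = "ATGCGTACCTGA"                     # hypothetical viral-derived spacer
repeat = "GTTTTAGAGCTA"                     # hypothetical CRISPR repeat
viral_dna = "CCTA" + spacer + "TGG" + "ACGT"  # target sequence followed by a PAM
crispr_locus = repeat + spacer + repeat       # same sequence, but no PAM

print(find_targets(viral_dna, spacer))      # [4]
print(find_targets(crispr_locus, spacer))   # []
```

In this toy example the viral DNA is recognized because the matching sequence
is followed by a PAM, whereas the identical sequence in the host CRISPR locus
is flanked by a repeat and is therefore ignored.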

Long Noncoding RNAs Have Diverse Functions in the Cell


In this and the preceding chapters, we have seen that noncoding RNA molecules
have many functions in the cell. Yet, as is the case with proteins, there remain
many noncoding RNAs whose function is still unknown. Many RNAs of unknown
function belong to a group known as long noncoding RNA (lncRNA). These are
arbitrarily defined as RNAs longer than 200 nucleotides that do not code for
protein. As methods have improved for determining the nucleotide sequences of all
the RNA molecules produced by a cell line or tissue, the sheer number of lncRNAs
(an estimated 8000 for the human genome, for example) came as a surprise to
scientists. Most lncRNAs are transcribed by RNA polymerase II and have 5’ caps and
poly-A tails, and, in many cases, they are spliced. It has been difficult to annotate
lncRNAs because low levels of RNA are now known to be made from 75% of the
human genome. Most of these RNAs are thought to result from the background
“noise” of transcription and RNA processing. According to this idea, such
nonfunctional RNAs provide no fitness advantage or disadvantage to the organism
and are a tolerated by-product of the complex patterns of gene expression that
need to be produced in multicellular organisms. For these reasons, it is difficult to
estimate the number of lncRNAs that are likely to have a function in the cell and to
distinguish them from the background transcription.

We have already encountered a few lncRNAs, including the RNA in telomerase
(see Figure 5-33), Xist RNA (see Figure 7-52), and an RNA involved in imprinting
(see Figure 7-49). Other lncRNAs have been implicated in controlling the
enzymatic activity of proteins, inactivating transcription regulators, affecting
splicing patterns, and blocking translation of certain mRNAs.

In terms of biological function, lncRNA should be considered a catch-all phrase
encompassing a great diversity of functions. Nevertheless, there are two unifying
features of lncRNAs that can account for their many roles in the cell. The first is
that lncRNAs can function as scaffold RNA molecules, holding together groups of
proteins to coordinate their functions (Figure 7-79A). We have already seen an
example in telomerase, where the RNA molecule holds together and organizes
protein components. These RNA-based scaffolds are analogous to the protein
scaffolds we discussed in Chapter 3 (see Figure 3-78) and Chapter 6 (see Figure 6-47).
RNA molecules are well suited to act as scaffolds: small bits of RNA sequence,
often those portions that form stem-loop structures, can serve as binding sites
for proteins, and these can be strung together with random sequences of RNA
in between. This property may be one reason that lncRNAs show relatively little
primary-sequence conservation across species.

The second key feature of lncRNAs is their ability to serve as guide sequences,
binding to specific RNA or DNA target molecules through base-pairing. By doing
so, they bring proteins that are bound to them into close proximity with the DNA
and RNA sequences (Figure 7-79B). This behavior is similar to that of snoRNAs
(see Figure 6-41), crRNAs (see Figure 7-78), and miRNAs (see Figure 7-75), all of
which act in this way to guide protein enzymes to specific nucleic acid sequences.

Figure 7-79 Roles of long noncoding RNA (lncRNA). (A) lncRNAs can serve as
scaffolds, bringing together proteins that function in the same process. As
described in Chapter 6, RNAs can fold into specific three-dimensional structures
that are often recognized by proteins. (B) In addition to serving as scaffolds,
lncRNAs can, through formation of complementary base pairs, localize proteins to
specific sequences on RNA or DNA molecules. (C) In some cases, lncRNAs act only
in cis, for example, when the RNA is held in place by RNA polymerase (top). Other
lncRNAs, however, diffuse from their sites of synthesis and therefore act in trans.

In some cases, lncRNAs work simply by base-pairing, without bringing in
enzymes or other proteins. For example, a number of lncRNA genes are embedded
in protein-coding genes, but they are transcribed in the “wrong direction.”
These antisense RNAs can form complementary base pairs with the mRNA
(transcribed in the “correct” direction) and block its translation into protein
(see Figure 7-66D). Other antisense lncRNAs base-pair with pre-mRNAs as they are
synthesized and change the pattern of RNA splicing by masking splice-site sequences.
Still others act as “sponges,” base-pairing with miRNAs and thereby reducing their
effects.
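
The relationship between an antisense RNA and the mRNA it masks is simply that
of reverse complementarity, which the following Python sketch illustrates. The
mRNA sequence is hypothetical, only perfect Watson-Crick pairs are considered,
and real RNA-RNA pairing also tolerates G-U wobble pairs and mismatches that
this sketch ignores.

```python
# Illustrative sketch: where can an antisense RNA base-pair with an mRNA?
# Assumptions: hypothetical sequences, perfect Watson-Crick pairing only.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def pairing_sites(target, guide):
    """Return positions in `target` where `guide` can pair along its full length."""
    site = reverse_complement(guide)
    return [
        i for i in range(len(target) - len(site) + 1)
        if target[i:i + len(site)] == site
    ]

mrna = "GGAUGCCAUUGAACGUUAA"                 # hypothetical mRNA, AUG at position 2
antisense = reverse_complement(mrna[2:11])   # transcribed in the "wrong direction"

print(antisense)                       # CAAUGGCAU
print(pairing_sites(mrna, antisense))  # [2]
print(pairing_sites(mrna, "CCCCCCCCC"))  # []
```

Here the antisense RNA pairs with the region that includes the start codon, the
kind of interaction that could block translation, whereas an unrelated RNA finds
no pairing site.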

Finally, we note that some lncRNAs can act only in cis; that is, they affect only
the chromosome from which they are transcribed. This readily occurs when the
transcribed RNA has not yet been released from RNA polymerases (Figure 7-79C).
Many lncRNAs, however, diffuse from their site of synthesis and act in trans.
Although the best understood lncRNAs work in the nucleus, many are found in the
cytosol. The functions, if any, of the great majority of these cytosolic lncRNAs
remain undiscovered.


Summary 


RNA molecules have many uses in the cell besides carrying the information needed
to specify the order of amino acids during protein synthesis. Although we have
encountered noncoding RNAs in other chapters (tRNAs, rRNAs, snoRNAs, for
example), the sheer number of noncoding RNAs produced by cells has surprised
scientists. One well understood use of noncoding RNAs occurs in RNA interference,
where guide RNAs (miRNAs, siRNAs, piRNAs) base-pair with mRNAs. RNA interference
can cause mRNAs to be either destroyed or translationally repressed. It can also
cause specific genes to be packaged into heterochromatin, suppressing their
transcription. In bacteria and archaebacteria, RNA interference is used as an adaptive
immune response to destroy viruses that infect them. A large family of long
noncoding RNAs (lncRNAs) has recently been discovered. Although the function of most of
these RNAs is unknown, some serve as RNA scaffolds to bring specific proteins and
RNA molecules together to speed up needed reactions.


PROBLEMS 


WHAT WE DON’T KNOW 


• How is the final rate of transcription
of a gene specified by the hundreds of
proteins that assemble on its control
regions? Will we ever be able to predict
this rate from inspection of the DNA
sequences of control regions?


• How does the collection of cis-regulatory
sequences embedded in a
genome orchestrate the developmental
program of a multicellular organism?


• How much of the human genome
sequence is functional, and why is the
remainder retained?


• Which of the thousands of unstudied
noncoding RNAs have functions in the
cell, and what are these functions?


• Were introns present in early cells
(and subsequently lost in some
organisms), or did they arise at later
times?


Which statements are true? Explain why or why not. 


7-1 In terms of the way it interacts with DNA, the 
helix-loop-helix motif is more closely related to the leu- 
cine zipper motif than it is to the helix-turn-helix motif. 


7-2 Once cells have differentiated to their final spe- 
cialized forms, they never again alter expression of their 
genes. 


7-3 CG islands are thought to have arisen during evo- 
lution because they were associated with portions of the 
genome that remained unmethylated in the germ line. 


7-4 In most differentiated tissues, daughter cells retain
a memory of gene expression patterns that were present


Discuss the following problems.


7-5 A small portion of a two-dimensional display
of proteins from human brain is shown in Figure Q7-1.
These proteins were separated on the basis of size in one
dimension and electrical charge (isoelectric point) in the
other. Not all protein spots on such displays are products
of different genes; some represent modified forms of a
protein that migrate to different positions. Pick out a couple
of sets of spots that could represent proteins that differ by
the number of phosphates they carry. Explain the basis for
your selection.

Figure Q7-1 Two-dimensional separation of proteins from the human
brain (Problem 7-5). The proteins were displayed using two-dimensional
gel electrophoresis. Only a small portion of the larger protein spectrum
is shown. (Courtesy of Tim Myers and Leigh Anderson, Large Scale
Biology Corporation.)


7-6 Comparisons of the patterns of mRNA levels 
across different human cell types show that the level of 
expression of almost every active gene is different. The 
patterns of mRNA abundance are so characteristic of cell 
type that they can be used to determine the tissue of origin 
of cancer cells, even though the cells may have metasta- 
sized to different parts of the body. By definition, however, 
cancer cells are different from their noncancerous precur- 
sor cells. How do you suppose then that patterns of mRNA 
expression might be used to determine the tissue source of 
a human cancer? 


7-7 What are the two fundamental components of a 
genetic switch? 


7-8 The nucleus of a eukaryotic cell is much larger 
than a bacterium, and it contains much more DNA. As a 
consequence, a transcription regulator in a eukaryotic cell 
must be able to select its specific binding site from among 
many more unrelated sequences than does a transcription 
regulator in a bacterium. Does this present any special 
problems for eukaryotic gene regulation? 

Consider the following situation. Assume that the 
eukaryotic nucleus and the bacterial cell each have a sin- 
gle copy of the same DNA binding site. In addition, assume 
that the nucleus is 500 times the volume of the bacterium, 
and has 500 times as much DNA. If the concentration of the
transcription regulator that binds the site were the same 
in the nucleus and in the bacterium, would the regulator 
occupy its binding site equally as well in the eukaryotic 
nucleus as it does in the bacterium? Explain your answer. 


7-9 Some transcription regulators bind to DNA and 
cause the double helix to bend at a sharp angle. Such 
“bending proteins” can affect the initiation of transcription
without directly contacting any other protein. Can you
devise a plausible explanation for how such proteins might 
work to modulate transcription? Draw a diagram that illus- 
trates your explanation. 


7-10 How is it that protein-protein interactions that 
are too weak to cause proteins to assemble in solution 
can nevertheless allow the same proteins to assemble into 
complexes on DNA? 


7-11 Imagine the two situations shown in Figure Q7-2. 
In cell 1, a transient signal induces the synthesis of pro- 
tein A, which is a transcription activator that turns on 
many genes including its own. In cell 2, a transient signal 
induces the synthesis of protein R, which is a transcription 
repressor that turns off many genes including its own. In 
which, if either, of these situations will the descendants of 
the original cell “remember” that the progenitor cell had 
experienced the transient signal? Explain your reasoning. 

Figure Q7-2 Gene regulatory circuits and cell memory (Problem 7-11).
(A) Induction of synthesis of transcription activator A by a transient 
signal. (B) Induction of synthesis of transcription repressor R by a 
transient signal. 


7-12 Examine the two pedigrees shown in Figure Q7-3. 
One results from deletion of a maternally imprinted auto- 
somal gene. The other pedigree results from deletion of a 
paternally imprinted autosomal gene. In both pedigrees, 
affected individuals (red symbols) are heterozygous for 
the deletion. These individuals are affected because one 
copy of the chromosome carries an imprinted, inactive 
gene, while the other carries a deletion of the gene. Dotted 
yellow symbols indicate individuals that carry the deleted 
locus, but do not display the mutant phenotype. Which 
pedigree is based on paternal imprinting and which on 
maternal imprinting? Explain your answer. 

Figure Q7-3 Pedigrees reflecting maternal and paternal imprinting
(Problem 7-12). In one pedigree, the gene is paternally imprinted; in
the other, it is maternally imprinted. In generations 3 and 4, only one
of the two parents in the indicated matings is shown; the other parent
is a normal individual from outside this pedigree. Affected individuals
are represented by red circles for females and red squares for males.
Dotted yellow symbols indicate individuals that carry the deletion but
do not display the phenotype.


7-13 If you insert a β-galactosidase gene lacking its own
transcription control region into a cluster of piRNA genes
in Drosophila, you find that β-galactosidase expression
from a normal copy elsewhere in the genome is strongly
inhibited in the fly’s germ cells. If the inactive β-galactosidase
gene is inserted outside the piRNA gene cluster, the
normal gene is properly expressed. What do you suppose 
is the basis for this observation? How would you test your 
hypothesis? 


Analyzing Cells, Molecules, and Systems


Progress in science is often driven by advances in technology. The entire field 
of cell biology, for example, came into being when optical craftsmen learned to 
grind small lenses of sufficiently high quality to observe cells and their substruc- 
tures. Innovations in lens grinding, rather than any conceptual or philosophical 
advance, allowed Hooke and van Leeuwenhoek to discover a previously unseen 
cellular world, where tiny creatures tumble and twirl in a small droplet of water 
(Figure 8-1). 

The twenty-first century is a particularly exciting time for biology. New meth- 
ods for analyzing cells, proteins, DNA, and RNA are fueling an information explo- 
sion and allowing scientists to study cells and their macromolecules in previ- 
ously unimagined ways. We now have access to the sequences of many billions of 
nucleotides, providing the complete molecular blueprints for hundreds of organ- 
isms—from microbes and mustard weeds to worms, flies, mice, dogs, chimpan- 
zees, and humans. And powerful new techniques are helping us to decipher that 
information, allowing us not only to compile huge, detailed catalogs of genes and 
proteins but also to begin to unravel how these components work together to form 
functional cells and organisms. The long-range goal is nothing short of obtaining 
a complete understanding of what takes place inside a cell as it responds to its 
environment and interacts with its neighbors. 

In this chapter, we present some of the principal methods used to study cells 
and their molecular components. We consider how to separate cells of different 
types from tissues, how to grow cells outside the body, and how to disrupt cells 
and isolate their organelles and constituent macromolecules in pure form. We 
also present the techniques used to determine protein structure, function, and 
interactions, and we discuss the breakthroughs in DNA technology that continue 
to revolutionize our understanding of cell function. We end the chapter with 
an overview of some of the mathematical approaches that are helping us deal 
with the enormous complexity of cells. By considering cells as dynamic systems 
with many moving parts, mathematical approaches can reveal hidden insights 
into how the many components of cells work together to produce the special 
qualities of life. 


ISOLATING CELLS AND GROWING THEM IN CULTURE


Although the organelles and large molecules in a cell can be visualized with 
microscopes, understanding how these components function requires a detailed 
biochemical analysis. Most biochemical procedures require that large numbers of 
cells be physically disrupted to gain access to their components. If the sample is 
a piece of tissue, composed of different types of cells, heterogeneous cell popula- 
tions will be mixed together. To obtain as much information as possible about the 
cells in a tissue, biologists have developed ways of dissociating cells from tissues 
and separating them according to type. These manipulations result in a relatively 
homogeneous population of cells that can then be analyzed—either directly or 
after their number has been greatly increased by allowing the cells to proliferate 
in culture. 


Cells Can Be Isolated from Tissues 


Intact tissues provide the most realistic source of material, as they represent the 
actual cells found within the body. The first step in isolating individual cells is to 
disrupt the extracellular matrix and cell-cell junctions that hold the cells together. 
For this purpose, a tissue sample is typically treated with proteolytic enzymes 
(such as trypsin and collagenase) to digest proteins in the extracellular matrix and 
with agents (such as ethylenediaminetetraacetic acid, or EDTA) that bind, or
chelate, the Ca²⁺ on which cell-cell adhesion depends. The tissue can then be teased
apart into single cells by gentle agitation. 

For some biochemical preparations, the protein of interest can be obtained in 
sufficient quantity without having to separate the tissue or organ into cell types. 
Examples include the preparation of histones from calf thymus, actin from rab- 
bit muscle, or tubulin from cow brain. In other cases, obtaining the desired pro- 
tein requires enrichment for a specific cell type of interest. Several approaches 
are used to separate the different cell types from a mixed cell suspension. One 
of the most sophisticated cell-separation techniques uses an antibody coupled 
to a fluorescent dye to label specific cells. An antibody is chosen that specifi- 
cally binds to the surface of only one cell type in the tissue. The labeled cells can 
then be separated from the unlabeled ones in a fluorescence-activated cell sorter. 
In this remarkable machine, individual cells traveling single file in a fine stream 
pass through a laser beam, and the fluorescence of each cell is rapidly measured. 
A vibrating nozzle generates tiny droplets, most containing either one cell or no 
cells. The droplets containing a single cell are automatically given a positive or a 
negative charge at the moment of formation, depending on whether the cell they 
contain is fluorescent; they are then deflected by a strong electric field into an 
appropriate container. Occasional clumps of cells, detected by their increased 
light scattering, are left uncharged and are discarded into a waste container. Such 
machines can accurately select 1 fluorescent cell from a pool of 1000 unlabeled 
cells and sort several thousand cells each second (Figure 8-2). 
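
The sorting logic just described lends itself to a small numerical sketch. The
Python below assumes that cells are distributed among droplets at random
(Poisson statistics), that the choice of which polarity marks a fluorescent cell
is arbitrary, and that every wanted cell is recovered; the particular cell
density, sorting rate, and target numbers are illustrative values, not
specifications of any real instrument.

```python
import math

def droplet_occupancy(mean_cells_per_droplet):
    """Poisson probabilities that a droplet holds 0, exactly 1, or 2+ cells."""
    lam = mean_cells_per_droplet
    p_empty = math.exp(-lam)
    p_single = lam * math.exp(-lam)
    return p_empty, p_single, 1.0 - p_empty - p_single

def droplet_charge(n_cells, fluorescent):
    """Charge only single-cell droplets; empty droplets and clumps go to waste.

    Which polarity marks a fluorescent cell is an arbitrary choice here.
    """
    if n_cells != 1:
        return "uncharged"
    return "positive" if fluorescent else "negative"

def sort_time_seconds(cells_wanted, target_fraction, cells_per_second):
    """Time needed to collect `cells_wanted` cells, assuming none are lost."""
    return cells_wanted / (target_fraction * cells_per_second)

# Hypothetical settings: 0.1 cells per droplet on average, 3000 cells sorted
# per second, and a target cell type making up 1 in 1000 cells.
p_empty, p_single, p_multi = droplet_occupancy(0.1)
print(round(p_empty, 3), round(p_single, 3), round(p_multi, 3))
print(droplet_charge(1, True), droplet_charge(2, True))
print(sort_time_seconds(10_000, 1 / 1000, 3000) / 60, "minutes")
```

With a dilute suspension of this kind, roughly 90% of droplets are empty and
fewer than 1% carry more than one cell, which is why nearly every charged
droplet contains a single cell. The time estimate also shows why rare cell types
can take a long time to collect even at several thousand cells per second.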


Cells Can Be Grown in Culture 


Although molecules can be extracted from whole tissues, this is often not the 
most convenient or useful source of material. The complexity of intact tissues and 
organs is an inherent disadvantage when trying to purify particular molecules. 
Cells grown in culture provide a more homogeneous population of cells from 
which to extract material, and they are also much more convenient to work with 
in the laboratory. Given appropriate surroundings, most plant and animal cells 
can live, multiply, and even express differentiated properties in a culture dish. The 
cells can be watched continuously under the microscope or analyzed biochemi- 
cally, and the effects of adding or removing specific molecules, such as hormones 
or growth factors, can be systematically explored. 

Experiments performed on cultured cells are sometimes said to be carried out
in vitro (literally, “in glass”) to contrast them with experiments using intact
organisms, which are said to be carried out in vivo (literally, “in the living organism”).
These terms can be confusing, however, because they are often used in a very
different sense by biochemists. In the biochemistry lab, in vitro refers to reactions
carried out in a test tube in the absence of living cells, whereas in vivo refers to
any reaction taking place inside a living cell, even if that cell is growing in culture.

Figure 8-1 Microscopic life. A sample of “diverse animalcules” seen by van
Leeuwenhoek using his simple microscope. (A) Bacteria seen in material he
excavated from between his teeth. Those in fig. B he described as “swimming
first forward and then backwards” (1692). (B) The eukaryotic green alga
Volvox (1700). (Courtesy of the John Innes Foundation.)

Tissue culture began in 1907 with an experiment designed to settle a con- 
troversy in neurobiology. The hypothesis under examination was known as the 
neuronal doctrine, which states that each nerve fiber is the outgrowth of a single 
nerve cell and not the product of the fusion of many cells. To test this contention, 
small pieces of spinal cord were placed on clotted tissue fluid in a warm, moist 
chamber and observed at regular intervals under the microscope. After a day or 
so, individual nerve cells could be seen extending long, thin filaments (axons) into 
the clot. Thus, the neuronal doctrine received strong support, and the foundation 
was laid for the cell-culture revolution. 

These original experiments on nerve fibers used cultures of small tissue frag- 
ments called explants. Today, cultures are more commonly made from suspen- 
sions of cells dissociated from tissues. Unlike bacteria, most tissue cells are not 
adapted to living suspended in fluid and require a solid surface on which to grow 
and divide. For cell cultures, this support is usually provided by the surface of a 
plastic culture dish. Cells vary in their requirements, however, and many do not 
proliferate or differentiate unless the culture dish is coated with materials that 
cells like to adhere to, such as polylysine or extracellular matrix components. 

Cultures prepared directly from the tissues of an organism are called primary 
cultures. These can be made with or without an initial fractionation step to sepa- 
rate different cell types. In most cases, cells in primary cultures can be removed 
from the culture dish and recultured repeatedly in so-called secondary cultures; 
in this way, they can be repeatedly subcultured (passaged) for weeks or months. 
Such cells often display many of the differentiated properties appropriate to their
origin (Figure 8-3): fibroblasts continue to secrete collagen; cells derived from
embryonic skeletal muscle fuse to form muscle fibers that contract spontaneously
in the culture dish; nerve cells extend axons that are electrically excitable and
make synapses with other nerve cells; and epithelial cells form extensive sheets
with many of the properties of an intact epithelium. Because these properties are
maintained in culture, they are accessible to study in ways that are often not
possible in intact tissues.

Figure 8-2 A fluorescence-activated cell sorter. A cell passing through the
laser beam is monitored for fluorescence. Droplets containing single cells are
given a negative or positive charge, depending on whether the cell is fluorescent
or not. The droplets are then deflected by an electric field into collection tubes
according to their charge. Note that the cell concentration must be adjusted so
that most droplets contain no cells and flow to a waste container together with
any cell clumps.

Figure 8-3 Light micrographs of cells in culture. (A) Mouse fibroblasts.
(B) Chick myoblasts fusing to form multinucleate muscle cells. (C) Purified rat
retinal ganglion nerve cells. (D) Tobacco cells in liquid culture. (A, courtesy of
Daniel Zicha; B, courtesy of Rosalind Zalin; C, from A. Meyer-Franke et al.,
Neuron 15:805-819, 1995. With permission from Elsevier; D, courtesy of
Gethin Roberts.)

Cell culture is not limited to animal cells. When a piece of plant tissue is cul- 
tured in a sterile medium containing nutrients and appropriate growth regula- 
tors, many of the cells are stimulated to proliferate indefinitely in a disorganized 
manner, producing a mass of relatively undifferentiated cells called a callus. If 
the nutrients and growth regulators are carefully manipulated, one can induce 
the formation of a shoot and then root apical meristems within the callus, and, in 
many species, regenerate a whole new plant. Similar to animal cells, callus cul- 
tures can be mechanically dissociated into single cells, which will grow and divide 
as a suspension culture (see Figure 8-3D). 


Eukaryotic Cell Lines Are a Widely Used Source of Homogeneous 
Cells 


The cell cultures obtained by disrupting tissues tend to suffer from a problem— 
eventually the cells die. Most vertebrate cells stop dividing after a finite number of 
cell divisions in culture, a process called replicative cell senescence (discussed in 
Chapter 17). Normal human fibroblasts, for example, typically divide only 25-40 
times in culture before they stop. In these cells, the limited proliferation capacity 
reflects a progressive shortening and uncapping of the cell’s telomeres, the repet- 
itive DNA sequences and associated proteins that cap the ends of each chromo- 
some (discussed in Chapter 5). Human somatic cells in the body have turned off 
production of the enzyme, called telomerase, that normally maintains the telo- 
meres, which is why their telomeres shorten with each cell division. Human fibro- 
blasts can often be coaxed to proliferate indefinitely by providing them with the 
gene that encodes the catalytic subunit of telomerase; in this case, they can be 
propagated as an “immortalized” cell line. 
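
The link between telomere shortening and a finite number of divisions can be
illustrated with a toy calculation. In the Python sketch below, the starting
telomere length, the amount lost per division, and the critical length at which
division stops are all hypothetical round numbers chosen only so that the
outcome falls within the 25-40 divisions quoted above; the model also ignores
the cell-to-cell variation seen in real cultures.

```python
def divisions_until_senescence(telomere_bp, loss_per_division_bp, critical_bp,
                               telomerase_active=False, max_divisions=1000):
    """Count divisions before telomeres reach a critical length.

    All lengths are hypothetical. If telomerase is active, telomere length is
    held constant and the cell divides until `max_divisions`, which stands in
    for "indefinitely" in this sketch.
    """
    divisions = 0
    while divisions < max_divisions and telomere_bp > critical_bp:
        if not telomerase_active:
            telomere_bp -= loss_per_division_bp  # shortening at each division
        divisions += 1
    return divisions

# Hypothetical fibroblast: 8000 bp of telomere, 100 bp lost per division,
# division stops once telomeres are no longer than 5000 bp.
print(divisions_until_senescence(8000, 100, 5000))                  # 30
print(divisions_until_senescence(8000, 100, 5000,
                                 telomerase_active=True,
                                 max_divisions=500))                # 500
```

Supplying telomerase in this sketch removes the limit entirely, mirroring the
way that introducing the gene for the catalytic subunit of telomerase can yield
an immortalized cell line.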

Some human cells, however, cannot be immortalized by this trick. Although 
their telomeres remain long, they still stop dividing after a limited number of divi- 
sions because culture conditions cause excessive mitogenic stimulation, which 
activates a poorly understood protective mechanism (discussed in Chapter 17)
that stops cell division—a process sometimes called “culture shock.” To immor- 
talize these cells, one has to do more than introduce telomerase. One must also 
inactivate the protective mechanisms, which can be done by introducing certain 
cancer-promoting oncogenes (discussed in Chapter 20). Unlike human cells, 
most rodent cells do not turn off production of telomerase, so their
telomeres do not shorten with each cell division. Therefore, if culture shock can
be avoided, some rodent cell types will divide indefinitely in culture. In addition, 
rodent cells often undergo spontaneous genetic changes in culture that inactivate 
their protective mechanisms, thereby producing immortalized cell lines. 

Cell lines can often be most easily generated from cancer cells, but these
cultures, referred to as transformed cell lines, differ from those prepared from
normal cells in several ways. Transformed cell lines often grow without attaching to a
surface, for example, and they can proliferate to a much higher density in a culture 
dish. Similar properties can be induced experimentally in normal cells by trans- 
forming them with a tumor-inducing virus or chemical. The resulting transformed 
cell lines can usually cause tumors if injected into a susceptible animal. 

Transformed and nontransformed cell lines are extremely useful in cell 
research as sources of very large numbers of cells of a uniform type, especially 
since they can be stored in liquid nitrogen at -196°C for an indefinite period and 
retain their viability when thawed. It is important to keep in mind, however, that 
cell lines nearly always differ in important ways from their normal progenitors in 
the tissues from which they were derived. 

Some widely used cell lines are listed in Table 8-1. Different lines have
different advantages; for example, the PtK epithelial cell lines derived from the rat
kangaroo, unlike many other cell types, remain flat during mitosis, allowing the
mitotic apparatus to be readily observed in action.


TABLE 8-1


*Many of these cell lines were derived from tumors. All of them are capable of indefinite
replication in culture and express at least some of the special characteristics of their
cells of origin.


Hybridoma Cell Lines Are Factories That Produce Monoclonal 
Antibodies 


As we see throughout this book, antibodies are particularly useful tools for cell 
biology. Their great specificity allows precise visualization of selected proteins 
among the many thousands that each cell typically produces. Antibodies are 
often produced by inoculating animals with the protein of interest and subse- 
quently isolating the antibodies specific to that protein from the serum of the 
animal. However, only limited quantities of antibodies can be obtained from a 
single inoculated animal, and the antibodies produced will be a heterogeneous 
mixture of antibodies that recognize a variety of different antigenic sites on a mac- 
romolecule that differs from animal to animal. Moreover, antibodies specific for 
the antigen will constitute only a fraction of the antibodies found in the serum. An 
alternative technology, which allows the production of an unlimited quantity of 
identical antibodies and greatly increases the specificity and convenience of anti- 
body-based methods, is the production of monoclonal antibodies by hybridoma 
cell lines. 

This technology, developed in 1975, revolutionized the production of antibod- 
ies for use as tools in cell biology, as well as for the diagnosis and treatment of cer- 
tain diseases, including rheumatoid arthritis and cancer. The procedure requires 
hybrid cell technology (Figure 8-4), and it involves propagating a clone of cells 
from a single antibody-secreting B lymphocyte to obtain a homogeneous prepa- 
ration of antibodies in large quantities. B lymphocytes normally have a limited 
life-span in culture, but individual antibody-producing B lymphocytes from an 
immunized mouse, when fused with cells derived from a transformed B lympho- 
cyte cell line, can give rise to hybrids that have both the ability to make a particular 
antibody and the ability to multiply indefinitely in culture. These hybridomas are 
propagated as individual clones, each of which provides a permanent and stable 
source of a single type of monoclonal antibody. Each type of monoclonal anti- 
body recognizes a single type of antigenic site—for example, a particular cluster 
of five or six amino acid side chains on the surface of a protein. Their uniform 
specificity makes monoclonal antibodies much more useful than conventional 
antisera for many purposes. 

An important advantage of the hybridoma technique is that monoclonal anti- 
bodies can be made against molecules that constitute only a minor component 
of a complex mixture. In an ordinary antiserum made against such a mixture, the 
proportion of antibody molecules that recognize the minor component would be 
too small to be useful. But if the B lymphocytes that produce the various compo- 
nents of this antiserum are made into hybridomas, it becomes possible to screen 
individual hybridoma clones from the large mixture to select one that produces 
the desired type of monoclonal antibody and to propagate the selected hybridoma
indefinitely so as to produce that antibody in unlimited quantities. In principle,
therefore, a monoclonal antibody can be made against any protein in a biological
sample. Once an antibody has been made, it can be used to localize the protein
in cells and tissues, to follow its movement, and to purify the protein to study its
structure and function.

Figure 8-4 The production of hybrid cells. It is possible to fuse one cell
with another to form a heterokaryon, a combined cell with two separate nuclei.
Typically, a suspension of cells is treated with certain inactivated viruses or with
polyethylene glycol, each of which alters the plasma membranes of cells in a way
that induces them to fuse. Eventually, a heterokaryon proceeds to mitosis
and produces a hybrid cell in which the two separate nuclear envelopes have
been disassembled, allowing all the chromosomes to be brought together in a
single large nucleus. Such hybrid cells can give rise to immortal hybrid cell lines.
If one of the parent cells was from a tumor cell line, the hybrid cell is called a
hybridoma.


Summary 


Tissues can be dissociated into their component cells, from which individual cell 
types can be purified and used for biochemical analysis or for the establishment of 
cell cultures. Many animal and plant cells survive and proliferate in a culture dish if 
they are provided with a suitable culture medium containing nutrients and appro- 
priate signal molecules. Although many animal cells stop dividing after a finite 
number of cell divisions, cells that have been immortalized through spontaneous 
mutations or genetic manipulation can be maintained indefinitely as cell lines. 
Hybridoma cells are widely employed to produce unlimited quantities of uniform 
monoclonal antibodies, which are used to detect and purify cell proteins, as well as 
to diagnose and treat diseases. 


PURIFYING PROTEINS 


The challenge of isolating a single type of protein from the thousands of other 
proteins present in a cell is a formidable one, but must be overcome in order to 
study protein function in vitro. As we shall see later in this chapter, recombinant 
DNA technology can enormously simplify this task by “tricking” cells into produc- 
ing large quantities of a given protein, thereby making its purification much eas- 
ier. Whether the source of the protein is an engineered cell or a natural tissue, a 
purification procedure usually starts with subcellular fractionation to reduce the 
complexity of the material, and is then followed by purification steps of increasing 
specificity. 


Cells Can Be Separated into Their Component Fractions 


To purify a protein, it must first be extracted from inside the cell. Cells can be 
broken up in various ways: they can be subjected to osmotic shock or ultrasonic 
vibration, forced through a small orifice, or ground up in a blender. These proce- 
dures break many of the membranes of the cell (including the plasma membrane 
and endoplasmic reticulum) into fragments that immediately reseal to form small 
closed vesicles. If carefully carried out, however, the disruption procedures leave 
organelles such as nuclei, mitochondria, the Golgi apparatus, lysosomes, and 
peroxisomes largely intact. The suspension of cells is thereby reduced to a thick 
slurry (called a homogenate or extract) that contains a variety of membrane-en- 
closed organelles, each with a distinctive size, charge, and density. Provided that 
the homogenization medium has been carefully chosen (by trial and error for 
each organelle), the various components—including the vesicles derived from 
the endoplasmic reticulum, called microsomes—retain most of their original bio- 
chemical properties. 

The different components of the homogenate must then be separated. Such 
cell fractionations became possible only after the commercial development in 
the early 1940s of an instrument known as the preparative ultracentrifuge, which 
rotates extracts of broken cells at high speeds (Figure 8-5). This treatment sep- 
arates cell components by size and density: in general, the largest objects expe- 
rience the largest centrifugal force and move the most rapidly. At relatively low 
speed, large components such as nuclei sediment to form a pellet at the bottom 
of the centrifuge tube; at slightly higher speed, a pellet of mitochondria is depos- 
ited; and at even higher speeds and with longer periods of centrifugation, first the 
small closed vesicles and then the ribosomes can be collected (Figure 8-6). All 
of these fractions are impure, but many of the contaminants can be removed by 
resuspending the pellet and repeating the centrifugation procedure several times. 






[Figure 8-5 diagram labels: (A) armored chamber, sedimenting material, refrigeration, vacuum, motor; (B) sedimenting material, hinge, refrigeration, vacuum, motor]


Figure 8-5 The preparative ultracentrifuge. (A) The sample is contained in tubes that are 
inserted into a ring of angled cylindrical holes in a metal rotor. Rapid rotation of the rotor generates 
enormous centrifugal forces, which cause particles in the sample to sediment against the bottom 
sides of the sample tubes, as shown here. The vacuum reduces friction, preventing heating of the 
rotor and allowing the refrigeration system to maintain the sample at 4°C. (B) Some fractionation 
methods require a different type of rotor called a swinging-bucket rotor. In this case, the sample 
tubes are placed in metal tubes on hinges that allow the tubes to swing outward when the rotor 
spins. Sample tubes are therefore horizontal during spinning, and samples are sedimented toward 
the bottom, not the sides, of the tube, providing better separation of differently sized components 
(see Figures 8-6 and 8-7). 


Centrifugation is the first step in most fractionations, but it separates only com- 
ponents that differ greatly in size. A finer degree of separation can be achieved by 
layering the homogenate in a thin band on top of a salt solution that fills a centri- 
fuge tube. When centrifuged, the various components in the mixture move as a 
series of distinct bands through the solution, each at a different rate, in a process 
called velocity sedimentation (Figure 8-7A). For the procedure to work effectively, 
the bands must be protected from convective mixing, which would normally 
occur whenever a denser solution (for example, one containing organelles) finds 
itself on top of a lighter one (the salt solution). This is achieved by augmenting 
the solution in the tube with a shallow gradient of sucrose prepared by a special 
mixing device. The resulting density gradient—with the dense end at the bottom 
of the tube—keeps each region of the solution denser than any solution above it, 
and it thereby prevents convective mixing from distorting the separation. 

When sedimented through sucrose gradients, different cell components sepa- 
rate into distinct bands that can be collected individually. The relative rate at which 
each component sediments depends primarily on its size and shape—normally 
being described in terms of its sedimentation coefficient, or S value. Present-day 
ultracentrifuges rotate at speeds of up to 80,000 rpm and produce forces as high as 
500,000 times gravity. These enormous forces drive even small macromolecules, 
such as tRNA molecules and simple enzymes, to sediment at an appreciable rate 
and allow them to be separated from one another by size. 
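
The relation between rotor speed and centrifugal force quoted above can be checked with a short calculation. The following is a minimal sketch rather than a description of any particular instrument: it uses the standard expression for centripetal acceleration (angular velocity squared times radius), and the 7 cm rotor radius is an assumed, illustrative value that does not come from the text.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2

# Assumed distance from the rotation axis to the sample (illustrative only).
ASSUMED_RADIUS_CM = 7.0


def relative_centrifugal_force(rpm: float, radius_cm: float) -> float:
    """Return the centrifugal acceleration as a multiple of gravity."""
    omega = 2.0 * math.pi * rpm / 60.0        # angular velocity in rad/s
    accel = omega ** 2 * (radius_cm / 100.0)  # acceleration in m/s^2
    return accel / STANDARD_GRAVITY


def rpm_for_force(times_gravity: float, radius_cm: float) -> float:
    """Return the rotor speed (rpm) needed to reach a given multiple of gravity."""
    omega = math.sqrt(times_gravity * STANDARD_GRAVITY / (radius_cm / 100.0))
    return omega * 60.0 / (2.0 * math.pi)


if __name__ == "__main__":
    force = relative_centrifugal_force(80_000, ASSUMED_RADIUS_CM)
    print(f"80,000 rpm at {ASSUMED_RADIUS_CM} cm: about {force:,.0f} times gravity")
    # Rotor speeds that would give the typical forces listed for Figure 8-6.
    for g in (1_000, 20_000, 80_000, 150_000):
        print(f"{g:>7,} times gravity: about {rpm_for_force(g, ASSUMED_RADIUS_CM):,.0f} rpm")
```

With the assumed radius, 80,000 rpm corresponds to roughly 500,000 times gravity, consistent with the figures given above; a different rotor radius would shift the result proportionally.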

Figure 8-6 Cell fractionation by centrifugation. Repeated centrifugation
at progressively higher speeds will fractionate homogenates of cells into their
components. In general, the smaller the subcellular component, the greater
the centrifugal force required to sediment it. Typical values for the various
centrifugation steps referred to in the figure are:

low speed: 1000 times gravity for 10 minutes
medium speed: 20,000 times gravity for 20 minutes
high speed: 80,000 times gravity for 1 hour
very high speed: 150,000 times gravity for 3 hours

The ultracentrifuge is also used to separate cell components on the basis of
their buoyant density, independently of their size and shape. In this case, the
sample is sedimented through a steep density gradient that contains a very high
concentration of sucrose or cesium chloride. Each cell component begins to move
down the gradient as in Figure 8-7A, but it eventually reaches a position where the
density of the solution is equal to its own density. At this point, the component
floats and can move no farther. A series of distinct bands is thereby produced in
the centrifuge tube, with the bands closest to the bottom of the tube containing
the components of highest buoyant density (Figure 8-7B). This method, called
equilibrium sedimentation, is so sensitive that it can separate macromolecules
that have incorporated heavy isotopes, such as ¹³C or ¹⁵N, from the same
macromolecules that contain the lighter, common isotopes (¹²C or ¹⁴N). In fact, the
cesium-chloride method was developed in 1957 to separate the labeled from the
unlabeled DNA produced after exposure of a growing population of bacteria to
nucleotide precursors containing ¹⁵N; this classic experiment provided direct
evidence for the semiconservative replication of DNA (see Figure 5-5).
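
The idea that each component stops where the gradient matches its own density can be illustrated with a small calculation. This is only a sketch under a simplifying assumption that the text does not make: the density of the gradient is taken to rise linearly from the top of the tube to the bottom. The function name and all numerical values are hypothetical and chosen purely for illustration.

```python
from typing import Optional


def band_position_cm(particle_density: float,
                     top_density: float,
                     bottom_density: float,
                     tube_length_cm: float) -> Optional[float]:
    """Distance from the top of the tube at which a component comes to rest.

    Assumes a linear density gradient (an illustrative simplification).
    Returns None if the component's density lies outside the gradient,
    in which case it would float at the top or pellet at the bottom.
    """
    if bottom_density <= top_density:
        raise ValueError("the gradient must be denser at the bottom than at the top")
    if not top_density <= particle_density <= bottom_density:
        return None
    fraction = (particle_density - top_density) / (bottom_density - top_density)
    return tube_length_cm * fraction


if __name__ == "__main__":
    # Hypothetical gradient and component densities in g/cm^3.
    for density in (1.70, 1.72):
        position = band_position_cm(density, 1.65, 1.75, tube_length_cm=5.0)
        print(f"density {density:.2f}: band about {position:.2f} cm from the top")
```

In this sketch a small difference in buoyant density translates into a visible separation between bands, and the denser component bands closer to the bottom of the tube, as described above.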


Cell Extracts Provide Accessible Systems to Study Cell Functions 


Studies of organelles and other large subcellular components isolated in the
ultracentrifuge have contributed enormously to our understanding of the functions of
different cell components. Experiments on mitochondria and chloroplasts purified
by centrifugation, for example, demonstrated the central function of these
organelles in converting energy into forms that the cell can use. Similarly, resealed
vesicles formed from fragments of rough and smooth endoplasmic reticulum
(microsomes) have been separated from each other and analyzed as functional
models of these compartments of the intact cell.

Figure 8-7 Comparison of velocity sedimentation and equilibrium sedimentation.
(A) In velocity sedimentation, subcellular components sediment at different speeds
according to their size and shape when layered over a solution containing sucrose.
To stabilize the sedimenting bands against convective mixing caused by small
differences in temperature or solute concentration, the tube contains a continuous
shallow gradient of sucrose, which increases in concentration toward the bottom
of the tube (typically from 5% to 20% sucrose). After centrifugation, the different
components can be collected individually, most simply by puncturing the plastic
centrifuge tube with a needle and collecting drops from the bottom, as illustrated here.
(B) In equilibrium sedimentation, subcellular components move up or down when
centrifuged in a gradient until they reach a position where their density matches
their surroundings. Although a sucrose gradient is shown here, denser gradients,
which are especially useful for protein and nucleic acid separation, can be formed
from cesium chloride. The final bands, at equilibrium, can be collected as in (A).

Similarly, highly concentrated cell extracts, especially extracts of Xenopus lae- 
vis (African clawed frog) oocytes, have played a critical role in the study of such 
complex and highly organized processes as the cell-division cycle, the separa- 
tion of chromosomes on the mitotic spindle, and the vesicular-transport steps 
involved in the movement of proteins from the endoplasmic reticulum through 
the Golgi apparatus to the plasma membrane. 

Cell extracts also provide, in principle, the starting material for the complete 
separation of all of the individual macromolecular components of the cell. We 
now consider how this separation is achieved, focusing on proteins. 


Proteins Can Be Separated by Chromatography 


Proteins are most often fractionated by column chromatography, in which a 
mixture of proteins in solution is passed through a column containing a porous 
solid matrix. Different proteins are retarded to different extents by their interac- 
tion with the matrix, and they can be collected separately as they flow out of the 
bottom of the column (Figure 8-8). Depending on the choice of matrix, proteins 
can be separated according to their charge (ion-exchange chromatography), their 
hydrophobicity (hydrophobic chromatography), their size (gel-filtration chroma- 
tography), or their ability to bind to particular small molecules or to other macro- 
molecules (affinity chromatography). 

Many types of matrices are available. Ion-exchange columns (Figure 8-9A)
are packed with small beads that carry either a positive or a negative charge, so
that proteins are fractionated according to the arrangement of charges on their
surface. Hydrophobic columns are packed with beads from which hydrophobic
side chains protrude, selectively retarding proteins with exposed hydrophobic
regions. Gel-filtration columns (Figure 8-9B), which separate proteins according
to their size, are packed with tiny porous beads: molecules that are small enough
to enter the pores linger inside successive beads as they pass down the column,
while larger molecules remain in the solution flowing between the beads and
therefore move more rapidly, emerging from the column first. Besides providing
a means of separating molecules, gel-filtration chromatography is a convenient
way to estimate their size.

Figure 8-8 The separation of molecules by column chromatography. The sample,
a solution containing a mixture of different molecules, is applied to the top of a
cylindrical glass or plastic column filled with a permeable solid matrix, such as
cellulose. A large amount of solvent is then passed slowly through the column and
collected in separate tubes as it emerges from the bottom. Because various
components of the sample travel at different rates through the column, they are
fractionated into different tubes.

Affinity chromatography (Figure 8-9C) takes advantage of the biologically 
important binding interactions that occur on protein surfaces. If a substrate mol- 
ecule is covalently coupled to an inert matrix such as a polysaccharide bead, the 
enzyme that operates on that substrate will often be specifically retained by the 
matrix and can then be eluted (washed out) in nearly pure form. Likewise, short 
DNA oligonucleotides of a specifically designed sequence can be immobilized in 
this way and used to purify DNA-binding proteins that normally recognize this 
sequence of nucleotides in chromosomes. Alternatively, specific antibodies can 
be coupled to a matrix to purify protein molecules recognized by the antibodies. 
Because of the great specificity of all such affinity columns, 1000- to 10,000-fold 
purifications can sometimes be achieved in a single pass. 

If one starts with a complex mixture of proteins, a single passage through an 
ion-exchange or a gel-filtration column does not produce very highly purified 
fractions, since these methods individually increase the proportion of a given pro- 
tein in the mixture no more than twentyfold. Because most individual proteins 
represent less than 1/1000 of the total cell protein, it is usually necessary to use 
several different types of columns in succession to attain sufficient purity, with 
affinity chromatography being the most efficient (Figure 8-10). 
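
The arithmetic behind the need for several columns can be made explicit. The sketch below assumes, purely for illustration, that the enrichment factors of successive steps multiply and that every step achieves the maximum enrichment quoted above; real purifications rarely behave so ideally. The function name is hypothetical.

```python
def steps_needed(target_fold: float, fold_per_step: float) -> int:
    """Smallest number of identical steps whose combined enrichment reaches target_fold.

    Assumes enrichment factors multiply from step to step (an idealization).
    """
    if fold_per_step <= 1:
        raise ValueError("each step must enrich the protein by more than onefold")
    steps = 0
    achieved = 1.0
    while achieved < target_fold:
        achieved *= fold_per_step
        steps += 1
    return steps


if __name__ == "__main__":
    # A protein making up 1/1000 of total cell protein needs about 1000-fold enrichment.
    print(steps_needed(1000, 20))    # twentyfold steps: 20 * 20 = 400 falls short, so 3
    print(steps_needed(1000, 1000))  # a 1000-fold affinity step: 1
```

Under these assumptions, at least three twentyfold steps are required where a single 1000-fold affinity step would suffice, which is why affinity chromatography is described as the most efficient.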

Inhomogeneities in the matrices (such as cellulose), which cause an uneven 
flow of solvent through the column, limit the resolution of conventional column 
chromatography. Special chromatography resins (usually silica-based) composed
of tiny spheres (3-10 µm in diameter) can be packed with a special apparatus
to form a uniform column bed. Such high-performance liquid chromatography
(HPLC) columns attain a high degree of resolution. In HPLC, the solutes
equilibrate very rapidly with the interior of the tiny spheres, and so solutes with 
different affinities for the matrix are efficiently separated from one another even 
at very fast flow rates. HPLC is therefore the method of choice for separating many 
proteins and small molecules. 


Immunoprecipitation Is a Rapid Affinity Purification Method 


Immunoprecipitation is a useful variation on the theme of affinity chromatogra- 
phy. Specific antibodies that recognize the protein to be purified are attached to 
small agarose beads. Rather than being packed into a column, as in affinity chro- 
matography, a small quantity of the antibody-coated beads is simply added to a 
protein extract in a test tube and mixed in suspension for a short period of time— 
thereby allowing the antibodies to bind the desired protein. The beads are then 
collected by low-speed centrifugation, and the unbound proteins in the super- 
natant are discarded. This method is commonly used to purify small amounts of 
enzymes from cell extracts for analysis of enzymatic activity or for studies of asso- 
ciated proteins. 


Figure 8-9 Three types of matrices used for chromatography. (A) In ion-exchange
chromatography, the insoluble matrix carries ionic charges that retard the movement
of molecules of opposite charge. Matrices used for separating proteins include
diethylaminoethylcellulose (DEAE-cellulose), which is positively charged, and
carboxymethylcellulose (CM-cellulose) and phosphocellulose, which are negatively
charged. Analogous matrices based on agarose or other polymers are also frequently
used. The strength of the association between the dissolved molecules and the
ion-exchange matrix depends on both the ionic strength and the pH of the solution
that is passing down the column, which may therefore be varied systematically (as in
Figure 8-10) to achieve an effective separation. (B) In gel-filtration chromatography,
the small beads that form the matrix are inert but porous. Molecules that are small
enough to penetrate into the matrix beads are thereby delayed and travel more slowly
through the column than larger molecules that cannot penetrate. Beads of cross-linked
polysaccharide (dextran, agarose, or acrylamide) are available commercially in a wide
range of pore sizes, making them suitable for the fractionation of molecules of
various molecular mass, from less than 500 daltons to more than 5 × 10⁶ daltons.
(C) Affinity chromatography uses an insoluble matrix that is covalently linked
to a specific ligand, such as an antibody molecule or an enzyme substrate, that will
bind a specific protein. Enzyme molecules that bind to immobilized substrates
on such columns can be eluted with a concentrated solution of the free form of
the substrate molecule, while molecules that bind to immobilized antibodies can be
eluted by dissociating the antibody-antigen complex with concentrated salt solutions
or solutions of high or low pH. High degrees of purification can be achieved in a
single pass through an affinity column.


Genetically Engineered Tags Provide an Easy Way to Purify
Proteins 


Using the recombinant DNA methods discussed in subsequent sections, any gene 
can be modified to produce its protein with a special recognition tag attached to 
it, so as to make subsequent purification of the protein simple and rapid. Often 
the recognition tag is itself an antigenic determinant, or epitope, which can be 
recognized by a highly specific antibody. The antibody can then be used to purify 
the protein by affinity chromatography or immunoprecipitation (Figure 8-11). 
Other types of tags are specifically designed for protein purification. For exam- 
ple, a repeated sequence of the amino acid histidine binds to certain metal ions, 
including nickel and copper. If genetic engineering techniques are used to attach 
a short string of histidines to one end of a protein, the slightly modified protein 
can be retained selectively on an affinity column containing immobilized nickel 
ions. Metal affinity chromatography can thereby be used to purify the modified 
protein from a complex molecular mixture. 


Figure 8-10 Protein purification by chromatography. Typical results obtained
when three different chromatographic steps are used in succession to purify a protein.
In this example, a homogenate of cells was first fractionated by allowing it to percolate
through an ion-exchange resin packed into a column (A). The column was washed to
remove all unbound contaminants, and the bound proteins were then eluted by pouring
a solution containing a gradually increasing concentration of salt onto the top of the
column. Proteins with the lowest affinity for the ion-exchange resin passed directly
through the column and were collected in the earliest fractions eluted from the bottom
of the column. The remaining proteins were eluted in sequence according to
their affinity for the resin, with those proteins binding most tightly to the resin
requiring the highest concentration of salt to remove them. The protein of interest
was eluted in several fractions and was detected by its enzymatic activity. The
fractions with activity were pooled and then applied to a gel-filtration column (B).
The elution position of the still-impure protein was again determined by its enzymatic
activity, and the active fractions were pooled and purified to homogeneity on an
affinity column (C) that contained an immobilized substrate of the enzyme.

Figure 8-11 Epitope tagging for the purification of proteins. Using standard
genetic engineering techniques, a short peptide tag can be added to a protein of
interest. If the tag is itself an antigenic determinant, or epitope, it can be targeted
by an appropriate antibody, which can be used to purify the protein by
immunoprecipitation or affinity chromatography.




In other cases, an entire protein is used as the recognition tag. When cells 
are engineered to synthesize the small enzyme glutathione S-transferase (GST) 
attached to a protein of interest, the resulting fusion protein can be purified from 
the other contents of the cell with an affinity column containing glutathione, a 
substrate molecule that binds specifically and tightly to GST. 

As a further refinement of purification methods using recognition tags, an 
amino acid sequence that forms a cleavage site for a highly specific proteolytic 
enzyme can be engineered between the protein of choice and the recognition tag. 
Because the amino acid sequences at the cleavage site are very rarely found by 
chance in proteins, the tag can later be cleaved off without destroying the purified 
protein. 

This type of specific cleavage is used in an especially powerful purification 
methodology known as tandem affinity purification tagging (TAP-tagging). Here, 
one end of a protein is engineered to contain two recognition tags that are sepa- 
rated by a protease cleavage site. The tag on the very end of the construct is cho- 
sen to bind irreversibly to an affinity column, allowing the column to be washed 
extensively to remove all contaminating proteins. Protease cleavage then releases 
the protein, which is then further purified using the second tag. Because this two- 
step strategy provides an especially high degree of protein purification with rela- 
tively little effort, it is used extensively in cell biology. Thus, for example, a set of 
approximately 6000 yeast strains, each with a different gene fused to DNA that 
encodes a TAP-tag, has been constructed to allow any yeast protein to be rapidly 
purified. 


Purified Cell-free Systems Are Required for the Precise Dissection 
of Molecular Functions 


Purified cell-free systems provide a means of studying biological processes free 
from all of the complex side reactions that occur in a living cell. To make this pos- 
sible, cell homogenates are fractionated with the aim of purifying each of the indi- 
vidual macromolecules that are needed to catalyze a biological process of interest. 
For example, the experiments to decipher the mechanisms of protein synthesis 
began with a cell homogenate that could translate RNA molecules to produce 
proteins. Fractionation of this homogenate, step by step, produced in turn the 
ribosomes, tRNAs, and various enzymes that together constitute the protein-syn- 
thetic machinery. Once individual pure components were available, each could 
be added or withheld separately to define its exact role in the overall process. 

A major goal for cell biologists is the reconstitution of every biological process 
in a purified cell-free system. Only in this way can we define all of the components 
needed for the process and control their concentrations, which is required to work 
out their precise mechanism of action. Although much remains to be done, a 
great deal of what we know today about the molecular biology of the cell has been 
discovered by studies in such cell-free systems. They have been used, for example, 
to decipher the molecular details of DNA replication, DNA transcription, RNA
splicing, protein translation, muscle contraction, particle transport along
microtubules, and many other processes that occur in cells.


Summary 


Populations of cells can be analyzed biochemically by disrupting them and fraction- 
ating their contents, allowing functional cell-free systems to be developed. Highly 
purified cell-free systems are needed for determining the molecular details of com- 
plex cell processes, and the development of such systems requires extensive purifi- 
cation of all the proteins and other components involved. The proteins in soluble 
cell extracts can be purified by column chromatography; depending on the type of 
column matrix, biologically active proteins can be separated on the basis of their 
molecular weight, hydrophobicity, charge characteristics, or affinity for other mol- 
ecules. In a typical purification, the sample is passed through several different col- 
umns in turn, with the enriched fractions obtained from one column being applied 
to the next. Recombinant DNA techniques (described later) allow special recogni- 
tion tags to be attached to proteins, thereby greatly simplifying their purification. 


ANALYZING PROTEINS


Proteins perform most cellular processes: they catalyze metabolic reactions, use 
nucleotide hydrolysis to do mechanical work, and serve as the major structural 
elements of the cell. The great variety of protein structures and functions has stim- 
ulated the development of a multitude of techniques to study them. 


Proteins Can Be Separated by SDS Polyacrylamide-Gel 
Electrophoresis 


Proteins usually possess a net positive or negative charge, depending on the mix- 
ture of charged amino acids they contain. An electric field applied to a solution 
containing a protein molecule causes the protein to migrate at a rate that depends 
on its net charge and on its size and shape. The most popular application of this 
property is SDS polyacrylamide-gel electrophoresis (SDS-PAGE). It uses a highly 
cross-linked gel of polyacrylamide as the inert matrix through which the proteins 
migrate. The gel is prepared by polymerization of monomers; the pore size of the 
gel can be adjusted so that it is small enough to retard the migration of the pro- 
tein molecules of interest. The proteins are dissolved in a solution that includes a 
powerful negatively charged detergent, sodium dodecyl sulfate, or SDS (Figure 
8-12). Because this detergent binds to hydrophobic regions of the protein mol- 
ecules, causing them to unfold into extended polypeptide chains, the individual 
protein molecules are released from their associations with other proteins or lipid 
molecules and rendered freely soluble in the detergent solution. In addition, a 
reducing agent such as β-mercaptoethanol (see Figure 8-12) is usually added to
break any S-S linkages in the proteins, so that all of the constituent polypeptides 
in multisubunit proteins can be analyzed separately. 

What happens when a mixture of SDS-solubilized proteins is run through 
a slab of polyacrylamide gel? Each protein molecule binds large numbers of 
the negatively charged detergent molecules, which mask the protein’s intrin- 
sic charge and cause it to migrate toward the positive electrode when a voltage 
is applied. Proteins of the same size tend to move through the gel with similar 
speeds because (1) their native structure is completely unfolded by the SDS, so 
that their shapes are the same, and (2) they bind the same amount of SDS and 
therefore have the same amount of negative charge. Larger proteins, with more 
charge, are subjected to larger electrical forces but also to a larger drag. In free 
solution, the two effects would cancel out, but, in the mesh of the polyacrylamide 
gel, which acts as a molecular sieve, large proteins are retarded much more than 
small ones. As a result, a complex mixture of proteins is fractionated into a series 
of discrete protein bands arranged in order of molecular weight (Figure 8-13). 
The major proteins are readily detected by staining the proteins in the gel with 
a dye such as Coomassie blue. Even minor proteins are seen in gels treated with 
a silver stain, so that as little as 10 ng of protein can be detected in a band. For 
some purposes, specific proteins can also be labeled with a radioactive isotope 
tag; exposure of the gel to film results in an autoradiograph on which the labeled 
proteins are visible (see Figure 8-16). 

SDS-PAGE is widely used because it can separate all types of proteins, includ- 
ing those that are normally insoluble in water—such as the many proteins in 
membranes. And because the method separates polypeptides by size, it provides 
information about the molecular weight and the subunit composition of proteins. 
Figure 8-14 presents a photograph of a gel that has been used to analyze each of 
the successive stages in the purification of a protein. 
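
In practice, the molecular weight of an unknown polypeptide is commonly estimated by comparing its migration with that of marker proteins of known size run on the same gel. The sketch below illustrates one widely used approach, a straight-line fit of the logarithm of marker mass against relative mobility. The marker values are hypothetical and invented for illustration, the function names are not from any library, and the linear relation is an approximation that holds only over a limited size range for a given gel.

```python
import math
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Least-squares straight line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mass_kda(mobility: float, markers: List[Tuple[float, float]]) -> float:
    """Estimate mass (in kilodaltons) from relative mobility.

    markers is a list of (relative mobility, mass in kDa) pairs for the
    marker proteins; log10(mass) is assumed to fall linearly with mobility.
    """
    slope, intercept = fit_line(
        [m for m, _ in markers],
        [math.log10(mass) for _, mass in markers],
    )
    return 10 ** (slope * mobility + intercept)


# Hypothetical marker proteins: (distance migrated / dye-front distance, mass in kDa).
HYPOTHETICAL_MARKERS = [(0.10, 200.0), (0.30, 100.0), (0.50, 50.0), (0.70, 25.0)]

if __name__ == "__main__":
    mass = estimate_mass_kda(0.40, HYPOTHETICAL_MARKERS)
    print(f"estimated mass: about {mass:.0f} kDa")
```

Such an estimate is approximate at best; heavily modified proteins can migrate anomalously, so the apparent molecular weight should be treated with caution.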


Two-Dimensional Gel Electrophoresis Provides Greater Protein
Separation


Because different proteins can have similar sizes, shapes, masses, and overall
charges, most separation techniques such as SDS polyacrylamide-gel electrophoresis
or ion-exchange chromatography cannot typically separate all the proteins
in a cell or even in an organelle. In contrast, two-dimensional gel electrophoresis,
which combines two different separation procedures, can resolve up to 2000
proteins in the form of a two-dimensional protein map.

Figure 8-12 The detergent sodium dodecyl sulfate (SDS) and the reducing
agent β-mercaptoethanol. These two chemicals are used to solubilize proteins for
SDS polyacrylamide-gel electrophoresis. The SDS is shown here in its ionized form.

Figure 8-13 SDS polyacrylamide-gel electrophoresis (SDS-PAGE). (A) An
electrophoresis apparatus. (B) Individual polypeptide chains form a complex
with negatively charged molecules of sodium dodecyl sulfate (SDS) and
therefore migrate as a negatively charged SDS-protein complex through a
porous gel of polyacrylamide. Because the speed of migration under these
conditions is greater the smaller the polypeptide, this technique can be used
to determine the approximate molecular weight of a polypeptide chain as
well as the subunit composition of a protein. If the protein contains a large
amount of carbohydrate, however, it will move anomalously on the gel and its
apparent molecular weight estimated by SDS-PAGE will be misleading. Other
modifications, such as phosphorylation, can also cause small changes in a
protein’s migration in the gel.

In the first step, the proteins are separated by their intrinsic charges. The sam- 
ple is dissolved in a small volume of a solution containing a nonionic (uncharged) 
detergent, together with β-mercaptoethanol and the denaturing reagent urea. 
This solution solubilizes, denatures, and dissociates all the polypeptide chains 
but leaves their intrinsic charge unchanged. The polypeptide chains are then 
separated in a pH gradient by a procedure called isoelectric focusing, which takes 
advantage of the variation in the net charge on a protein molecule with the pH of 
its surrounding solution. Every protein has a characteristic isoelectric point, the 
pH at which the protein has no net charge and therefore does not migrate in an 
electric field. In isoelectric focusing, proteins are separated electrophoretically in 
a narrow tube of polyacrylamide gel in which a gradient of pH is established by 
a mixture of special buffers. Each protein moves to a position in the gradient that 


Figure 8-14 Analysis of protein samples by SDS polyacrylamide-gel 
electrophoresis. The photograph shows a Coomassie-stained gel that 
has been used to detect the proteins present at successive stages in the 
purification of an enzyme. The leftmost lane (lane 1) contains the complex 
mixture of proteins in the starting cell extract, and each succeeding lane 
analyzes the proteins obtained after a chromatographic fractionation of the 
protein sample analyzed in the previous lane (see Figure 8-10). The same 
total amount of protein (10 µg) was loaded onto the gel at the top of each 
lane. Individual proteins normally appear as sharp, dye-stained bands; a band 
broadens, however, when it contains a large amount of protein. (From 
T. Formosa and B.M. Alberts, J. Biol. Chem. 261:6107-6118, 1986.) 

corresponds to its isoelectric point and remains there (Figure 8-15). This is the 
first dimension of two-dimensional polyacrylamide-gel electrophoresis. 
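
The idea of an isoelectric point can be made concrete with a short calculation. The sketch below estimates the net charge of a polypeptide at a given pH from the Henderson-Hasselbalch relation and then searches for the pH at which that charge is zero. The pKa values are rough textbook approximations chosen for illustration (real side-chain pKa values depend on the local protein environment), and the peptide sequences and function names are hypothetical.

```python
# Approximate pKa values (assumed; they vary with environment).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}             # basic side chains
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}    # acidic side chains
N_TERM_PKA, C_TERM_PKA = 9.0, 2.0

def net_charge(seq, ph):
    """Average net charge of a polypeptide (one-letter codes) at a given pH."""
    charge = 1 / (1 + 10 ** (ph - N_TERM_PKA))       # protonated amino terminus
    charge -= 1 / (1 + 10 ** (C_TERM_PKA - ph))      # deprotonated carboxyl terminus
    for aa in seq:
        if aa in POSITIVE_PKA:
            charge += 1 / (1 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1 / (1 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge

def isoelectric_point(seq, lo=0.0, hi=14.0, tol=1e-4):
    """Bisect for the pH of zero net charge (charge falls steadily as pH rises)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(seq, mid) > 0:
            lo = mid      # still positive: the isoelectric point lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2

for name, seq in [("acidic peptide", "GDDEEDAG"), ("basic peptide", "GKKRKRAG")]:
    print(f"{name}: pI is roughly {isoelectric_point(seq):.2f}")
```

In a pH gradient, each of these peptides would stop migrating near its computed pI, which is why the two would focus at opposite ends of the tube gel.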

In the second step, the narrow tube gel containing the separated proteins is 
again subjected to electrophoresis but in a direction that is at a right angle to the 
direction used in the first step. This time SDS is added, and the proteins separate 
according to their size, as in one-dimensional SDS-PAGE: the original tube gel is 
soaked in SDS and then placed along the top edge of an SDS polyacrylamide-gel 
slab, through which each polypeptide chain migrates to form a discrete spot. This 
is the second dimension of two-dimensional polyacrylamide-gel electrophore- 
sis. The only proteins left unresolved are those that have both identical sizes and 
identical isoelectric points, a relatively rare situation. Even trace amounts of each 
polypeptide chain can be detected on the gel by various staining procedures—or 
by autoradiography if the protein sample was initially labeled with a radioisotope 
(Figure 8-16). The technique has such great resolving power that it can distin- 
guish between two proteins that differ in only a single charged amino acid, or a 
single negatively charged phosphorylation site. 


Specific Proteins Can Be Detected by Blotting with Antibodies 


A specific protein can be identified after its fractionation on a polyacrylamide 
gel by exposing all the proteins present on the gel to a specific antibody that has 
been labeled with a radioactive isotope or a fluorescent dye. This procedure is 
normally carried out after transferring all of the separated proteins present in the 

Figure 8-15 Separation of protein molecules by isoelectric focusing. 
At low pH (high H+ concentration), the carboxylic acid groups of proteins tend to 
be uncharged (-COOH) and their nitrogen-containing basic groups fully charged (for 
example, -NH3+), giving most proteins a net positive charge. At high pH, the 
carboxylic acid groups are negatively charged (-COO-) and the basic groups 
tend to be uncharged (for example, -NH2), giving most proteins a net negative 
charge. At its isoelectric pH, a protein has no net charge since the positive and 
negative charges balance. Thus, when a tube containing a fixed pH gradient is 
subjected to a strong electric field in the appropriate direction, each protein 
species migrates until it forms a sharp band at its isoelectric pH, as shown. 


Figure 8-16 Two-dimensional 
polyacrylamide-gel electrophoresis. 

All the proteins in an E. coli bacterial cell are 
separated in this gel, in which each spot 
corresponds to a different polypeptide chain. 
The proteins were first separated on the 
basis of their isoelectric points by isoelectric 
focusing in the horizontal dimension. They 
were then further fractionated according to 
their molecular mass by electrophoresis from 
top to bottom in the presence of SDS. Note 
that different proteins are present in very 
different amounts. The bacteria were fed with 
a mixture of radioisotope-labeled amino acids 
so that all of their proteins were radioactive 
and could be detected by autoradiography. 
(Courtesy of Patrick O’Farrell.) 


gel onto a sheet of nitrocellulose paper or nylon membrane. Placing the membrane 
over the gel and driving the proteins out of the gel with a strong electric 
current transfers the protein onto the membrane. The membrane is then soaked 
in a solution of labeled antibody to reveal the protein of interest. This method of 
detecting proteins is called Western blotting, or immunoblotting (Figure 8-17). 
Sensitive Western-blotting methods can detect very small amounts of a specific 
protein (1 nanogram or less) in a total cell extract or some other heterogeneous 
protein mixture. The method can be very useful when assessing the amounts of a 
specific protein in the cell or when measuring changes in those amounts under 
various conditions. 

Figure 8-17 Western blotting. All the proteins from dividing tobacco cells in 
culture were first separated by two-dimensional polyacrylamide-gel 
electrophoresis. In (A), the positions of the proteins are revealed by a sensitive 
protein stain. In (B), the separated proteins on an identical gel were then 
transferred to a sheet of nitrocellulose and exposed to an antibody that recognizes 
only those proteins that are phosphorylated on threonine residues during mitosis. 
The positions of the few proteins that are recognized by this antibody are revealed 
by an enzyme-linked second antibody. (From J.A. Traas et al., Plant J. 2:723-732, 
1992. With permission from Blackwell Publishing.) 


Hydrodynamic Measurements Reveal the Size and Shape of a 
Protein Complex 


Most proteins in a cell act as part of larger complexes, and knowledge of the size 
and shape of these complexes often leads to insights regarding their function. This 
information can be obtained in several important ways. Sometimes, a complex 
can be directly visualized using electron microscopy, as described in Chapter 9. 
A complementary approach relies on the hydrodynamic properties of a complex; 
that is, its behavior as it moves through a liquid medium. Usually, two separate 
measurements are made. One measure is the velocity of a complex as it moves 
under the influence of a centrifugal field produced by an ultracentrifuge (see Fig- 
ure 8-7A). The sedimentation coefficient (or S value) obtained depends on both 
the size and the shape of the complex and does not, by itself, convey especially 
useful information. However, once a second hydrodynamic measurement is per- 
formed—by charting the migration of a complex through a gel-filtration chroma- 
tography column (see Figure 8-9B)—both the approximate shape of a complex 
and its molecular weight can be calculated. 
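
How the two hydrodynamic measurements combine can be sketched with a standard relation that is not given in the text: M = 6πηN·a·s / (1 − v̄ρ), where s is the sedimentation coefficient, a is the Stokes radius obtained from gel filtration, η and ρ are the viscosity and density of the solvent, and v̄ is the partial specific volume of the protein. The input values below are hypothetical, and the default v̄ of 0.73 cm³/g is an assumed typical value for proteins.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole

def mass_from_hydrodynamics(s_svedberg, stokes_radius_nm,
                            viscosity=1.002e-3,        # Pa*s, water at 20 C
                            partial_spec_vol=0.73e-3,  # m^3/kg, assumed typical protein
                            density=998.2):            # kg/m^3, water at 20 C
    """Molecular mass in daltons from an S value and a Stokes radius."""
    s = s_svedberg * 1e-13            # 1 svedberg = 1e-13 seconds
    a = stokes_radius_nm * 1e-9       # metres
    buoyancy = 1 - partial_spec_vol * density
    kg_per_mol = 6 * math.pi * viscosity * AVOGADRO * a * s / buoyancy
    return kg_per_mol * 1000          # g/mol, numerically equal to daltons

def frictional_ratio(mass_da, stokes_radius_nm, partial_spec_vol=0.73e-3):
    """Stokes radius divided by the radius of a compact sphere of the same mass.

    Values near 1.2 to 1.3 suggest a roughly globular complex; larger values
    suggest an elongated or irregular shape.
    """
    volume = (mass_da / 1000 / AVOGADRO) * partial_spec_vol   # m^3 per molecule
    sphere_radius = (3 * volume / (4 * math.pi)) ** (1 / 3)
    return stokes_radius_nm * 1e-9 / sphere_radius

# Hypothetical complex: 4.3 S, Stokes radius 3.5 nm.
mass = mass_from_hydrodynamics(4.3, 3.5)
print(f"mass is roughly {mass:,.0f} daltons, "
      f"frictional ratio {frictional_ratio(mass, 3.5):.2f}")
```

Neither number alone separates size from shape; it is the combination of the two measurements that yields both.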

Molecular weight can also be determined more directly by using an analytical 
ultracentrifuge, a complex device that allows protein absorbance measurements 
to be made on a sample while it is subjected to centrifugal forces. In this approach, 
the sample is centrifuged until it reaches equilibrium, where the centrifugal force 
on a protein complex exactly balances its tendency to diffuse away. Because this 
balancing point is dependent on a complex’s molecular weight but not on its par- 
ticular shape, the molecular weight can be directly calculated. 
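
The equilibrium calculation can be illustrated as follows. At sedimentation equilibrium, the concentration of a single ideal species rises exponentially with the square of the radial position, so that the slope of ln(concentration) against r² equals M(1 − v̄ρ)ω²/(2RT). The sketch below generates an idealized concentration profile for a hypothetical 50,000-dalton protein and then recovers the mass from that slope. The rotor speed, radii, and the assumed v̄ are illustrative values only.

```python
import math

R_GAS = 8.314462618  # J / (mol K)

def mass_from_equilibrium(radii_cm, concentrations, rpm, temp_k=293.15,
                          partial_spec_vol=0.73e-3,   # m^3/kg, assumed
                          density=998.2):             # kg/m^3, water at 20 C
    """Molecular mass (daltons) from the slope of ln(c) against r squared."""
    omega = rpm * 2 * math.pi / 60                    # angular velocity, rad/s
    xs = [(r / 100) ** 2 for r in radii_cm]           # r^2 in m^2
    ys = [math.log(c) for c in concentrations]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    buoyancy = 1 - partial_spec_vol * density
    return 2 * R_GAS * temp_k * slope / (buoyancy * omega ** 2) * 1000

# Idealized (noise-free) profile for a hypothetical 50,000-dalton protein.
true_mass = 50_000
rpm = 15_000
radii_cm = [6.0, 6.1, 6.2, 6.3, 6.4]
omega = rpm * 2 * math.pi / 60
k = (true_mass / 1000) * (1 - 0.73e-3 * 998.2) * omega ** 2 / (2 * R_GAS * 293.15)
concentrations = [0.1 * math.exp(k * ((r / 100) ** 2 - 0.06 ** 2)) for r in radii_cm]

recovered = mass_from_equilibrium(radii_cm, concentrations, rpm)
print(f"recovered mass: {recovered:,.0f} daltons")
```

No shape-dependent term appears in the calculation, which is the point made above: the equilibrium profile depends on mass but not on frictional properties.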


Mass Spectrometry Provides a Highly Sensitive Method for 
Identifying Unknown Proteins 


A frequent problem in cell biology and biochemistry is the identification of a pro- 
tein or collection of proteins that has been obtained by one of the purification pro- 
cedures discussed in the preceding pages. Because the genome sequences of most 
experimental organisms are now known, catalogs of all the proteins produced in 
those organisms are available. The task of identifying an unknown protein (or col- 
lection of unknown proteins) thus reduces to matching some of the amino acid 

Figure 8-18 The mass spectrometer. (A) Mass spectrometers used in biology contain an ion source that generates gaseous 
peptides or other molecules under conditions that render most molecules positively charged. The two major types of ion source 
are MALDI and electrospray, as described in the text. Ions are accelerated into a mass analyzer, which separates the ions 
on the basis of their mass and charge by one of three major methods: 1. Time-of-flight (TOF) analyzers determine the mass- 
to-charge ratio of each ion in the mixture from the rate at which it travels from the ion source to the detector. 2. Quadrupole 
mass filters contain a long chamber lined by four electrodes that produce oscillating electric fields that govern the trajectory of 
ions; by varying the properties of the electric field over a wide range, a spectrum of ions with specific mass-to-charge ratios is 
allowed to pass through the chamber to the detector, while other ions are discarded. 3. Ion traps contain doughnut-shaped 
electrodes producing a three-dimensional electric field that traps all ions in a circular chamber; the properties of the electric 
field can be varied over a wide range to eject a spectrum of specific ions to a detector. (B) Tandem mass spectrometry typically 
involves two mass analyzers separated by a collision chamber containing an inert, high-energy gas. The electric field of the 
first mass analyzer is adjusted to select a specific peptide ion, called a precursor ion, which is then directed to the collision 
chamber. Collision of the peptide with gas molecules causes random peptide fragmentation, primarily at peptide bonds, 
resulting in a highly complex mixture of fragments containing one or more amino acids from throughout the original peptide. The 
second mass analyzer is then used to measure the masses of the fragments (called product or daughter ions). With computer 
assistance, the pattern of fragments can be used to deduce the amino acid sequence of the original peptide. 


sequences present in the unknown sample with known cataloged genes. This task 
is now performed almost exclusively by using mass spectrometry in conjunction 
with computer searches of databases. 

Charged particles have very precise dynamics when subjected to electrical and 
magnetic fields in a vacuum. Mass spectrometry exploits this principle to separate 
ions according to their mass-to-charge (m/z) ratio. It is an enormously sensitive 
technique. It requires very little material and is capable of determining the precise 
mass of intact proteins and of peptides derived from them by enzymatic or chem- 
ical cleavage. Masses can be obtained with great accuracy, often with an error of 
less than one part in a million. 

Mass spectrometry is performed using complex instruments with three major 
components (Figure 8-18A). The first is the ion source, which transforms tiny 
amounts of a peptide sample into a gas containing individual charged peptide 
molecules. These ions are accelerated by an electric field into the second compo- 
nent, the mass analyzer, where electric or magnetic fields are used to separate the 
ions on the basis of their mass-to-charge ratios. Finally, the separated ions collide 
with a detector, which generates a mass spectrum containing a series of peaks rep- 
resenting the masses of the molecules in the sample. 

There are many different types of mass spectrometer, varying mainly in the 
nature of their ion sources and mass analyzers. One of the most common ion 
sources depends on a technique called matrix-assisted laser desorption ionization 
(MALDI). In this approach, the proteins in the sample are first cleaved into short 
peptides by a protease such as trypsin. These peptides are mixed with an organic 
acid and then dried onto a metal or ceramic slide. A brief laser burst is directed 
toward the sample, producing a gaseous puff of ionized peptides, each carrying 
one or more positive charges. In many cases, the MALDI ion source is coupled to 
a mass analyzer called a time-of-flight (TOF) analyzer, which is a long chamber 
through which the ionized peptides are accelerated by an electric field toward a 
detector. Their mass and charge determine the time it takes them to reach the 
detector: large peptides move more slowly, and more highly charged molecules 
move more quickly. By analyzing those ionized peptides that bear a single charge, 
the precise masses of peptides present in the original sample can be determined. 
This information is then used to search genomic databases, in which the masses 
of all proteins and of all their predicted peptide fragments have been tabulated 
from the genomic sequences of the organism. An unambiguous match to a par- 
ticular open reading frame can often be made by knowing the mass of only a few 
peptides derived from a given protein. 
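
The database search described above can be sketched in a few lines. The example digests each candidate protein in silico with a simplified trypsin rule (cleavage after lysine or arginine unless the next residue is proline), computes the mass-to-charge ratio of each singly protonated peptide from standard monoisotopic residue masses, and counts how many observed peaks match within a tolerance. The two-protein "database", the sequences, and the function names are hypothetical; real search software uses far more elaborate scoring.

```python
import re

# Standard monoisotopic residue masses (daltons), rounded to five decimals.
MONO = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
        "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
        "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
        "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
        "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931}
WATER, PROTON = 18.01056, 1.00728

def tryptic_peptides(protein):
    """Split after K or R, except when the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]

def singly_charged_mz(peptide):
    """m/z of the peptide carrying one extra proton (charge +1)."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON

def count_matches(observed, protein, tol_ppm=20):
    """Number of observed peaks explained by the protein's tryptic peptides."""
    theoretical = [singly_charged_mz(p) for p in tryptic_peptides(protein)]
    return sum(any(abs(o - t) / t * 1e6 <= tol_ppm for t in theoretical)
               for o in observed)

# Hypothetical two-entry database and two "observed" peaks.
database = {"protein_A": "MKTAYIAKQRQISFVK", "protein_B": "MSDNGKPLVERAGGWK"}
observed = [singly_charged_mz("TAYIAK"), singly_charged_mz("QISFVK")]

best = max(database, key=lambda name: count_matches(observed, database[name]))
print(best, {name: count_matches(observed, seq) for name, seq in database.items()})
```

Even in this toy case, two accurately measured peptide masses suffice to single out one entry, which mirrors the observation that only a few peptides are often needed for an unambiguous match.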

By employing two mass analyzers in tandem (an arrangement known as MS/ 
MS; Figure 8-18B), it is possible to directly determine the amino acid sequences of 
individual peptides in a complex mixture. The MALDI-TOF instrument described 
above is not ideal for this method. Instead, MS/MS typically involves an electro- 
spray ion source, which produces a continuous thin stream of peptides that are 
ionized and accelerated into the first mass analyzer. The mass analyzer is typically 
either a quadrupole or ion trap, which employs large electrodes to produce 
oscillating electric fields inside the chamber containing the ions. These instru- 
ments act as mass filters: the electric field is adjusted over a broad range to select 
a single peptide ion and discard all the others in the peptide mixture. In tandem 
mass spectrometry, this single ion is then exposed to an inert, high-energy gas, 
which collides with the peptide, resulting in fragmentation, primarily at peptide 
bonds. The second mass analyzer then determines the masses of the peptide frag- 
ments, which can be used by computational methods to determine the amino 
acid sequence of the original peptide and thereby identify the protein from which 
it came. 
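
The way a fragment spectrum encodes sequence can be illustrated with the two most commonly considered fragment series, conventionally called b ions (fragments containing the amino terminus) and y ions (fragments containing the carboxyl terminus). These names are standard in the field but are not used in the text above. Consecutive members of a series differ by the mass of exactly one residue, so the sequence can be read from the mass differences. The peptide, the reduced mass table, and the function names below are hypothetical illustrations.

```python
# Monoisotopic residue masses for the residues used in this example only.
MONO = {"S": 87.03203, "A": 71.03711, "M": 131.04049, "P": 97.05276,
        "L": 113.08406, "E": 129.04259, "R": 156.10111}
WATER, PROTON = 18.01056, 1.00728

def b_ions(peptide):
    """Singly charged N-terminal fragments b1 .. b(n-1)."""
    total, ladder = PROTON, []
    for aa in peptide[:-1]:
        total += MONO[aa]
        ladder.append(round(total, 4))
    return ladder

def y_ions(peptide):
    """Singly charged C-terminal fragments y1 .. y(n-1)."""
    total, ladder = WATER + PROTON, []
    for aa in reversed(peptide[1:]):
        total += MONO[aa]
        ladder.append(round(total, 4))
    return ladder

def read_ladder(ladder, tol=0.01):
    """Translate mass differences between consecutive ions into residues."""
    residues = []
    for lighter, heavier in zip(ladder, ladder[1:]):
        diff = heavier - lighter
        hits = [aa for aa, mass in MONO.items() if abs(mass - diff) <= tol]
        residues.append(hits[0] if hits else "?")
    return "".join(residues)

peptide = "SAMPLER"  # hypothetical tryptic peptide
print("b series reads:", read_ladder(b_ions(peptide)))
print("y series reads:", read_ladder(y_ions(peptide)))
```

The b series is read from the amino terminus and the y series from the carboxyl terminus, so the two readings run in opposite directions. Leucine and isoleucine have identical masses and cannot be distinguished by this simple approach.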

Tandem mass spectrometry is also useful for detecting and precisely mapping 
post-translational modifications of proteins, such as phosphorylations or acetyl- 
ations. Because these modifications impart a characteristic mass increase to an 
amino acid, they are easily detected during the analysis of peptide fragments in 
the second mass analyzer, and the precise site of the modification can often be 
deduced from the spectrum of peptide fragments. 
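
A small sketch shows how a characteristic mass increase identifies and locates a modification. Phosphorylation adds about 79.966 daltons and acetylation about 42.011 daltons to a residue (standard monoisotopic values). The fragment masses below are computed for a hypothetical peptide beginning Ser-Ala-Ser-Leu, with the phosphate placed on the third residue; the function names are illustrative.

```python
# Monoisotopic mass added by two common modifications (daltons).
MOD_SHIFTS = {"phosphorylation": 79.96633, "acetylation": 42.01057}

def classify_shift(observed_mass, unmodified_mass, tol=0.01):
    """Name the modification whose mass shift explains the difference, if any."""
    delta = observed_mass - unmodified_mass
    for name, shift in MOD_SHIFTS.items():
        if abs(delta - shift) <= tol:
            return name
    return None

def locate_site(unmodified_ions, observed_ions, shift, tol=0.01):
    """1-based position of the first N-terminal fragment carrying the extra mass."""
    for position, (expected, seen) in enumerate(
            zip(unmodified_ions, observed_ions), start=1):
        if abs(seen - expected - shift) <= tol:
            return position
    return None

# N-terminal fragment masses for the hypothetical peptide Ser-Ala-Ser-Leu...
unmodified = [88.03931, 159.07642, 246.10845, 359.19251]
# The same fragments when the third residue carries a phosphate group.
observed = [88.03931, 159.07642, 326.07478, 439.15884]

kind = classify_shift(observed[-1], unmodified[-1])
site = locate_site(unmodified, observed, MOD_SHIFTS[kind])
print(f"{kind} at residue {site}")
```

Fragments shorter than the modified position keep their unmodified masses, while every fragment that includes the modified residue is shifted, which is what pins the modification to a single site.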

A powerful, “two-dimensional” mass spectrometry technique can be used to 
determine all of the proteins present in an organelle or another complex mixture 
of proteins. First, the mixture of proteins present is digested with trypsin to pro- 
duce short peptides. Next, these peptides are separated by automated high-per- 
formance liquid chromatography (LC). Every peptide fraction from the chromato- 
graphic column is injected directly into an electrospray ion source on a tandem 
mass spectrometer (MS/MS), providing the amino acid sequence and post-trans- 
lational modifications for every peptide in the mixture. This method, often called 
LC-MS/MS, is used to identify hundreds or thousands of proteins in complex pro- 
tein mixtures from specific organelles or from whole cells. It can also be used to 
map all of the phosphorylation sites in the cell, or all of the proteins targeted by 
other post-translational modifications such as acetylation or ubiquitylation. 


Sets of Interacting Proteins Can Be Identified by Biochemical 
Methods 


Because most proteins in the cell function as part of complexes with other pro- 
teins, an important way to begin to characterize the biological role of an unknown 
protein is to identify all of the other proteins to which it specifically binds. 

A key method for identifying proteins that bind to one another tightly is co- 
immunoprecipitation. A specific target protein is immunoprecipitated from a cell 
lysate using specific antibodies coupled to beads, as described earlier. If the target 
protein is associated tightly enough with another protein when it is captured by 
the antibody, the partner precipitates as well and can be identified by mass spec- 
trometry. This method is useful for identifying proteins that are part of a complex 
inside cells, including those that interact only transiently—for example, when 
extracellular signal molecules stimulate cells (discussed in Chapter 15). 

In addition to capturing protein complexes on columns or in test tubes, 
researchers are developing high-density protein arrays to investigate protein 
interactions. These arrays, which contain thousands of different proteins or anti- 
bodies spotted onto glass slides or immobilized in tiny wells, allow one to exam- 
ine the biochemical activities and binding profiles of a large number of proteins 
at once. For example, if one incubates a fluorescently labeled protein with arrays 
containing thousands of immobilized proteins, the spots that remain fluorescent 
after extensive washing each contain a protein that specifically binds the labeled 
protein. 


Optical Methods Can Monitor Protein Interactions 


Once two proteins—or a protein and a small molecule—are known to associate, 
it becomes important to characterize their interaction in more detail. Proteins 
can associate with each other more or less permanently (like the subunits of RNA 
polymerase or the proteasome), or engage in transient encounters that may last 
only a few milliseconds (like a protein kinase and its substrate). To understand 
how a protein functions inside a cell, we need to determine how tightly it binds to 
other proteins, how rapidly it dissociates from them, and how covalent modifica- 
tions, small molecules, or other proteins influence these interactions. 

As we discussed in Chapter 3 (see Figure 3-44), the extent to which two pro- 
teins interact is determined by the rates at which they associate and dissociate. 
These rates depend, respectively, on the association rate constant (kon) and 
dissociation rate constant (koff). The kinetic rate constant koff is a particularly 
useful number because it provides valuable information about how long two proteins 
remain bound to one another. The ratio of the two kinetic constants (kon/koff) 
yields another very useful number called the equilibrium constant (K, also known 
as Keq or Ka), the inverse of which is the more commonly used dissociation 
constant Kd. The equilibrium constant is useful as a general indicator of the affinity 
of the interaction, and it can be used to estimate the amount of bound complex at 
different concentrations of the two protein partners—thereby providing insights 
into the importance of the interaction at the protein concentrations found inside 
the cell. 
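
The relationships among these constants can be worked through numerically. The sketch below assumes a simple one-to-one interaction and hypothetical rate constants; the half-life expression ln 2 / koff describes the decay of an existing population of complexes when rebinding is ignored, and the fraction-bound expression assumes that the concentration of the free partner is known.

```python
import math

def binding_summary(k_on, k_off):
    """k_on in M^-1 s^-1 and k_off in s^-1 for a one-to-one interaction."""
    k_eq = k_on / k_off                # equilibrium (association) constant, M^-1
    k_d = k_off / k_on                 # dissociation constant, M (inverse of k_eq)
    half_life = math.log(2) / k_off    # seconds for half the complexes to dissociate
    return k_eq, k_d, half_life

def fraction_bound(free_partner_conc, k_d):
    """Fraction of a protein in complex at a given free partner concentration (M)."""
    return free_partner_conc / (free_partner_conc + k_d)

# Hypothetical interaction: fast association, slow dissociation.
k_eq, k_d, half_life = binding_summary(k_on=1e6, k_off=1e-2)
print(f"K = {k_eq:.1e} per M, Kd = {k_d:.1e} M, half-life = {half_life:.0f} s")
for conc in (1e-9, 1e-8, 1e-7):
    print(f"partner at {conc:.0e} M: {fraction_bound(conc, k_d):.0%} bound")
```

With these assumed values the dissociation constant is 10 nM, so the protein is half occupied when its partner is free at 10 nM and mostly free when the partner is tenfold more dilute.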

A wide range of methods can be used to determine binding constants for a 
two-protein complex. In a simple equilibrium binding experiment, two proteins 
are mixed at a range of concentrations, allowed to reach equilibrium, and the 
amount of bound complex is measured; half of the protein complex will be bound 
at a concentration that is equal to Kd. Equilibrium experiments often involve the 
use of radioactive or fluorescent tags on one of the protein partners, coupled with 
biochemical or optical methods for measuring the amount of bound protein. In 
a more complex kinetic binding experiment, the kinetic rate constants are deter- 
mined using rapid methods that allow real-time measurement of the formation 
of a bound complex over time (to determine kon) or the dissociation of a bound 
complex over time (to determine koff). 

Optical techniques provide particularly rapid, convenient, and accurate bind- 
ing measurements, and in some cases the proteins do not even need to be labeled. 
Certain amino acids (tryptophan, for example) exhibit weak fluorescence that can 
be detected with sensitive fluorimeters. In many cases, the fluorescence intensity, 
or the emission spectrum of fluorescent amino acids located in a protein-pro- 
tein interface, will change when two proteins associate. When this change can 
be detected by fluorimetry, it provides a simple and sensitive measure of protein 
binding that is useful in both equilibrium and kinetic binding experiments. A 
related but more widely useful optical binding technique is based on fluorescence 
anisotropy, a change in the polarized light that is emitted by a fluorescently tagged 
protein in the bound and free states (Figure 8-19). 

Figure 8-19 Measurement of binding with fluorescence anisotropy. This method depends on a fluorescently tagged 
protein that is illuminated with polarized light at the appropriate wavelength for excitation; a fluorimeter is used to measure the 
intensity and polarization of the emitted light. If the fluorescent protein is fixed in position and therefore does not rotate during 
the brief period between excitation and emission, then the emitted light will be polarized at the same angle as the excitation 
light. This directional effect is called fluorescence anisotropy. Protein molecules in solution rotate or tumble rapidly, however, 
so that there is a decrease in the amount of anisotropic fluorescence. Larger molecules tumble at a slower rate and therefore 
have higher fluorescence anisotropy. (A) To measure the binding between a small molecule and a large receptor protein, the 
smaller molecule is labeled with a fluorophore. In the absence of its binding partner, the molecule tumbles rapidly, resulting in 
low fluorescence anisotropy (top). When the small molecule binds to its larger partner, however, it tumbles less rapidly, resulting 
in an increase in fluorescence anisotropy (bottom). (B) In the equilibrium binding experiment shown here, a small, fluorescent 
peptide ligand was present at a low concentration, and the amount of fluorescence anisotropy (in millipolarization units, mP) was 
measured after incubation with various concentrations of a larger protein receptor for the ligand. From the hyperbolic curve that 
fits the data, it can be seen that 50% binding occurred at about 10 µM, which is equal to the dissociation constant Kd for the 
binding interaction. 
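
Extracting Kd from a titration like the one in Figure 8-19B amounts to fitting a hyperbola. The sketch below assumes that the labeled ligand is dilute enough for the free receptor concentration to equal the total, so that the signal is r_free + (r_bound − r_free)·[R]/(Kd + [R]). It scans candidate Kd values on a logarithmic grid and, for each one, solves for the two plateau values by linear least squares. The data points are invented for illustration and are not taken from the figure; a real analysis would normally use a dedicated nonlinear fitting routine.

```python
def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x; also returns squared error."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, sse

def fit_kd(concs, signals, kd_min=0.1, kd_max=1000.0, steps=2000):
    """Return (Kd, signal of free ligand, signal of bound ligand)."""
    best = None
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)     # log-spaced candidates
        occupancy = [c / (kd + c) for c in concs]          # fraction of ligand bound
        free_signal, span, sse = fit_line(occupancy, signals)
        if best is None or sse < best[0]:
            best = (sse, kd, free_signal, free_signal + span)
    return best[1:]

# Invented titration: receptor concentration (µM) and anisotropy (mP).
receptor_um = [0.5, 1, 2, 5, 10, 20, 50, 100, 200]
anisotropy_mp = [59.52, 68.18, 83.33, 116.67, 150.0, 183.33, 216.67, 231.82, 240.48]

kd, r_free, r_bound = fit_kd(receptor_um, anisotropy_mp)
print(f"Kd is about {kd:.1f} µM (free {r_free:.0f} mP, bound {r_bound:.0f} mP)")
```

The fitted Kd corresponds to the receptor concentration at which the signal lies halfway between the free and bound plateaus, which is how the value is read off the curve in the figure.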


Another optical method for probing protein interactions uses green fluorescent 
protein (discussed in detail below) and its derivatives of different colors. In this 
application, two proteins of interest are each labeled with a different fluorescent 
protein, such that the emission spectrum of one fluorescent protein overlaps the 
absorption spectrum of the second. If the two proteins come very close to each 
other (within about 1-5 nm), the energy of the absorbed light is transferred from 
one fluorescent protein to the other. The energy transfer, called fluorescence reso- 
nance energy transfer (FRET), is determined by illuminating the first fluorescent 
protein and measuring emission from the second (see Figure 9-26). When com- 
bined with fluorescence microscopy, this method can be used to characterize 
protein-protein interactions at specific locations inside living cells (discussed in 
Chapter 9). 
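
The steep distance dependence that makes FRET useful as a proximity sensor follows from the Förster relation, E = 1 / (1 + (r/R0)^6), where R0 is the separation at which transfer is 50% efficient. The relation is standard but is not stated in the text above, and the R0 of 5 nm used below is an assumed, typical order of magnitude for fluorescent protein pairs rather than a value from this chapter.

```python
def fret_efficiency(distance_nm, forster_radius_nm=5.0):
    """Fraction of donor excitations transferred to the acceptor."""
    return 1 / (1 + (distance_nm / forster_radius_nm) ** 6)

for distance in (2, 4, 5, 6, 8, 10):
    print(f"{distance:>2} nm: efficiency {fret_efficiency(distance):.3f}")
```

Because the efficiency depends on the sixth power of the separation, the signal falls from nearly complete transfer to almost none over a span of only a few nanometers, so a detectable signal implies that the two tagged proteins are in very close proximity.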


Protein Function Can Be Selectively Disrupted With Small 
Molecules 


Small chemical inhibitors of specific proteins have contributed a great deal to the 
development of cell biology. For example, the microtubule inhibitor colchicine 
is routinely used to test whether microtubules are required for a given biological 
process; it also led to the first purification of tubulin several decades ago. In the 
past, these small molecules were usually natural products; that is, they were syn- 
thesized by living creatures. Although natural products have been extraordinarily 
useful in science and medicine (see, for example, Table 6-4, p. 352), they act on 
a limited number of biological processes. However, the recent development of 
methods to synthesize hundreds of thousands of small molecules and to carry out 
large-scale automated screens holds the promise of identifying chemical inhibi- 
tors for virtually any biological process. In such approaches, large collections of 
small chemical compounds are simultaneously tested, either on living cells or in 
cell-free assays. Once an inhibitor is identified, it can be used as a probe to iden- 
tify, through affinity chromatography or other means, the protein to which the 
inhibitor binds. This general strategy, sometimes called chemical biology, has 
successfully identified inhibitors of many proteins that carry out key processes 
in cell biology. An inhibitor of a kinesin protein that functions in mitosis, for 

Figure 8-20 Small-molecule inhibitors for manipulating living cells. 
(A) Chemical structure of monastrol, a kinesin inhibitor identified in a large- 
scale screen for small molecules that disrupt mitosis. (B) Normal mitotic 
spindle seen in an untreated cell. The microtubules are stained green and 
chromosomes blue. (C) Monopolar spindle that forms in cells treated with 
monastrol, which inhibits a kinesin protein required for separation of the 
spindle poles in early mitosis. (B and C, from T.U. Mayer et al., Science 
286:971-974, 1999. With permission from AAAS.) 


example, was identified by this method (Figure 8-20). Chemical inhibitors give 
the cell biologist great control over the timing of inhibition, as drugs can be rap- 
idly added to or removed from cells, allowing protein function to be switched on 
or off quickly. 


Protein Structure Can Be Determined Using X-Ray Diffraction 


The main technique that has been used to discover the three-dimensional struc- 
ture of molecules, including proteins, at atomic resolution is x-ray crystallogra- 
phy. X-rays, like light, are a form of electromagnetic radiation, but they have a 
much shorter wavelength, typically around 0.1 nm (the diameter of a hydrogen 
atom). If a narrow beam of parallel x-rays is directed at a sample of a pure protein, 
most of the x-rays pass straight through it. A small fraction, however, are scattered 
by the atoms in the sample. If the sample is a well-ordered crystal, the scattered 
waves reinforce one another at certain points and appear as diffraction spots 
when recorded by a suitable detector (Figure 8-21). 

The position and intensity of each spot in the x-ray diffraction pattern con- 
tain information about the locations of the atoms in the crystal that gave rise to it. 
Deducing the three-dimensional structure of a large molecule from the diffraction 
pattern of its crystal is a complex task and was not achieved for a protein molecule 
until 1960. But in recent years x-ray diffraction analysis has become increasingly 
automated, and now the slowest step is likely to be the generation of suitable 
protein crystals. This step requires large amounts of very pure protein and often 
involves years of trial and error to discover the proper crystallization conditions; 
the pace has greatly accelerated with the use of recombinant DNA techniques to 
produce pure proteins and robotic techniques to test large numbers of crystalli- 
zation conditions. 

Analysis of the resulting diffraction pattern produces a complex three-dimen- 
sional electron-density map. Interpreting this map—translating its contours into 
a three-dimensional structure—is a complicated procedure that requires knowl- 
edge of the amino acid sequence of the protein. Largely by trial and error, the 
sequence and the electron-density map are correlated by computer to give the 
best possible fit. The reliability of the final atomic model depends on the reso- 
lution of the original crystallographic data: 0.5 nm resolution might produce a 
low-resolution map of the polypeptide backbone, whereas a resolution of 0.15 nm 
allows all of the non-hydrogen atoms in the molecule to be reliably positioned. 
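
The connection between resolution and the diffraction pattern can be illustrated with Bragg's law, nλ = 2d sin θ, a standard relation that is not stated explicitly in the text. It implies that finer spacings d in the crystal scatter x-rays to larger angles, so higher resolution requires measuring spots farther from the center of the pattern. The sketch below uses the 0.1 nm wavelength quoted earlier; the function name is illustrative.

```python
import math

def bragg_angle_deg(spacing_nm, wavelength_nm=0.1, order=1):
    """Angle theta (degrees) at which planes of the given spacing diffract."""
    ratio = order * wavelength_nm / (2 * spacing_nm)
    if ratio > 1:
        raise ValueError("spacing too small to diffract at this wavelength")
    return math.degrees(math.asin(ratio))

for resolution in (0.5, 0.3, 0.15):
    theta = bragg_angle_deg(resolution)
    print(f"{resolution} nm resolution: spots deflected by about {2 * theta:.1f} degrees")
```

The beam is deflected by twice the Bragg angle, so data to 0.15 nm resolution require recording spots at much wider angles than data to 0.5 nm, where the intensities are typically weaker and well-ordered crystals are essential.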

A complete atomic model is often too complex to appreciate directly, but sim- 
plified versions that show a protein’s essential structural features can be readily 
derived from it (see Panel 3-2, pp. 142-143). The three-dimensional structures of 
tens of thousands of different proteins have been determined by x-ray crystallography 
or by NMR spectroscopy (see page 461), enough to allow the grouping of 
common structures into families (Movie 8.1). These structures or protein folds 
often seem to be more conserved in evolution than are the amino acid sequences 
that form them (see Figure 3-13). 

X-ray crystallographic techniques can also be applied to the study of macro- 
molecular complexes. The method was used, for example, to determine the struc- 
ture of the ribosome, a large and complex machine made of several RNAs and 
more than 50 proteins (see Figure 6-62). The determination required the use of a 
synchrotron, a radiation source that generates x-rays with the intensity needed to 
analyze the crystals of such large macromolecular complexes. 

Figure 8-21 X-ray crystallography. (A) A narrow beam of x-rays is directed 
at a well-ordered crystal (B). Shown here is a protein crystal of ribulose 
bisphosphate carboxylase, an enzyme with a central role in CO2 fixation 
during photosynthesis. The atoms in the crystal scatter some of the beam, 
and the scattered waves reinforce one another at certain points and appear 
as a pattern of diffraction spots (C). This diffraction pattern, together with 
the amino acid sequence of the protein, can be used to produce an atomic 
model (D). The complete atomic model is hard to interpret, but this simplified 
version, derived from the x-ray diffraction data, shows the protein’s structural 
features clearly (α helices, green; β strands, red). The components pictured in 
A to D are not shown to scale. (B, courtesy of C. Branden; C, courtesy of J. 
Hajdu and I. Andersson; D, adapted from original provided by B. Furugren.) 


NMR Can Be Used to Determine Protein Structure in Solution 


Nuclear magnetic resonance (NMR) spectroscopy has been widely used for 
many years to analyze the structure of small molecules, small proteins, or protein 
domains. Unlike x-ray crystallography, NMR does not depend on having a crystalline 
sample. It simply requires a small volume of concentrated protein solution 
that is placed in a strong magnetic field; indeed, it is the main technique that 
yields detailed evidence about the three-dimensional structure of molecules in 
solution. 

Certain atomic nuclei, particularly hydrogen nuclei, have a magnetic moment 
or spin: that is, they have an intrinsic magnetization, like a bar magnet. The spin 
aligns along the strong magnetic field, but it can be changed to a misaligned, 
excited state in response to applied radiofrequency (RF) pulses of electromag- 
netic radiation. When the excited hydrogen nuclei return to their aligned state, 
they emit RF radiation, which can be measured and displayed as a spectrum. The 
nature of the emitted radiation depends on the environment of each hydrogen 
nucleus, and if one nucleus is excited, it influences the absorption and emission 
of radiation by other nuclei that lie close to it. It is consequently possible, by an 
ingenious elaboration of the basic NMR technique known as two-dimensional 
NMR, to distinguish the signals from hydrogen nuclei in different amino acid resi- 
dues, and to identify and measure the small shifts in these signals that occur when 
these hydrogen nuclei lie close enough together to interact. Because the size of 
such a shift reveals the distance between the interacting pair of hydrogen atoms, 
NMR can provide information about the distances between the parts of the pro- 
tein molecule. By combining this information with a knowledge of the amino acid 
sequence, it is possible in principle to compute the three-dimensional structure 
of the protein (Figure 8-22). 

For technical reasons, the structure of small proteins of about 20,000 daltons or 
less can be most readily determined by NMR spectroscopy. Resolution decreases 
as the size of a macromolecule increases. But recent technical advances have now 
pushed the limit to about 100,000 daltons, thereby making the majority of pro- 
teins accessible for structural analysis by NMR. 

Because NMR studies are performed in solution, this method also offers a con- 
venient means of monitoring changes in protein structure, for example during 
protein folding or when the protein binds to another molecule. NMR is also used 
widely to investigate molecules other than proteins and is valuable, for example, 
as a method to determine the three-dimensional structures of RNA molecules 
and the complex carbohydrate side chains of glycoproteins. 

A third major method for the determination of protein structure, and partic- 
ularly the structure of large protein complexes, is single-particle analysis by elec- 
tron microscopy. We discuss this approach in Chapter 9. 


Protein Sequence and Structure Provide Clues About Protein 
Function 


Having discussed methods for purifying and analyzing proteins, we now turn to 
a common situation in cell and molecular biology: an investigator has identified 
a gene important for a biological process but has no direct knowledge of the bio- 
chemical properties of its protein product. 

Thanks to the proliferation of protein and nucleic acid sequences that are cat- 
aloged in genome databases, the function of a gene—and its encoded protein— 
can often be predicted by simply comparing its sequence with those of previously 
characterized genes. Because amino acid sequence determines protein structure, 
and structure dictates biochemical function, proteins that share a similar amino 
acid sequence usually have the same structure and usually perform similar bio- 
chemical functions, even when they are found in distantly related organisms. In 
modern cell biology, the study of a newly discovered protein usually begins with 
a search for previously characterized proteins that are similar in their amino acid 
sequences. 

Searching a collection of known sequences for similar genes or proteins is 
typically done over the Internet, and it simply involves selecting a database and 
entering the desired sequence. A sequence-alignment program—the most popu- 
lar is BLAST—scans the database for similar sequences by sliding the submitted 


Figure 8-22 NMR spectroscopy. 

(A) An example of the data from an NMR 
machine. This two-dimensional NMR 
spectrum is derived from the C-terminal 
domain of the enzyme cellulase. The spots 
represent interactions between hydrogen 
atoms that are near neighbors in the 
protein and hence reflect the distance 

that separates them. Complex computing 
methods, in conjunction with the known 
amino acid sequence, enable possible 
compatible structures to be derived. 

(B) Ten structures of the enzyme, which all 
satisfy the distance constraints equally well, 
are shown superimposed on one another, 
giving a good indication of the probable 
three-dimensional structure. (Courtesy of 
P. Kraulis.) 
sequence along the archived sequences until a cluster of residues falls into full or 
partial alignment (Figure 8-23). Such comparisons can predict the functions of 
individual proteins, families of proteins, or even most of the protein complement 
of a newly sequenced organism. 
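
The sliding comparison described above can be illustrated with a deliberately 
simple sketch. It is not the BLAST algorithm, which relies on heuristics, 
substitution scores, and gaps; it only slides a query along one archived 
sequence without gaps and counts identical residues at each offset. The function 
name and the example strings are hypothetical, chosen purely for illustration. 

```python
def best_ungapped_alignment(query, subject):
    """Slide query along subject without gaps.

    Returns (offset, matches, overlap) for the offset with the most
    identical residues; offset is the index in subject that lines up
    with query[0] and may be negative if the query overhangs the start.
    """
    best = (0, 0, 0)
    for offset in range(-len(query) + 1, len(subject)):
        matches = 0
        overlap = 0
        for i, residue in enumerate(query):
            j = offset + i
            if 0 <= j < len(subject):
                overlap += 1
                if residue == subject[j]:
                    matches += 1
        # Keep the first offset that gives a strictly better match count
        if matches > best[1]:
            best = (offset, matches, overlap)
    return best


# Toy example: a short query against a longer, made-up archived sequence
offset, matches, overlap = best_ungapped_alignment("GTYGVV", "MEDYTKIEKIGEGTYGVVYKGRHK")
print(f"best offset {offset}: {matches} identical residues over {overlap} aligned positions")
```

Repeating such a comparison against every entry in a database, and ranking the 
results by score, gives the flavor of a sequence search, although practical 
programs are far faster and statistically more careful. 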

As was explained in Chapter 3, many proteins that adopt the same confor- 
mation and have related functions are too distantly related to be identified from 
a comparison of their amino acid sequences alone (see Figure 3-13). Thus, an 
ability to reliably predict the three-dimensional structure of a protein from its 
amino acid sequence would improve our ability to infer protein function from the 
sequence information in genomic databases. In recent years, major progress has 
been made in predicting the precise structure of a protein. These predictions are 
based, in part, on our knowledge of the thousands of protein structures that have 
already been determined by x-ray crystallography and NMR spectroscopy and, 
in part, on computations using our knowledge of the physical forces acting on 
the atoms. However, it remains a substantial and important challenge to predict 
the structures of proteins that are large or have multiple domains, or to predict 
structures at the very high levels of resolution needed to assist in computer-based 
drug discovery. 

While finding related sequences and structures for a new protein will provide 
many clues about its function, it is usually necessary to test these insights through 
direct experimentation. However, the clues generated from sequence compar- 
isons typically point the investigator in the correct experimental direction, and 
their use has therefore become one of the most important strategies in modern 
cell biology. 


Summary 


Many methods exist for identifying proteins and analyzing their biochemical prop- 
erties, structures, and interactions with other proteins. Small-molecule inhibitors 
allow the functions of proteins they act upon to be studied in living cells. Because 
proteins with similar structures often have similar functions, the biochemical activ- 
ity of a protein can often be predicted by searching databases for previously charac- 
terized proteins that are similar in their amino acid sequences. 


ANALYZING AND MANIPULATING DNA 


Until the early 1970s, DNA was the most difficult biological molecule for the bio- 
chemist to analyze. Enormously long and chemically monotonous, the string of 
nucleotides that forms the genetic material of an organism could be examined 
only indirectly, by protein sequencing or by genetic analysis. Today, the situation 
has changed entirely. From being the most difficult macromolecule of the cell to 
Figure 8-23 Results of a BLAST search. 
Sequence databases can be searched 

to find similar amino acid or nucleic acid 
sequences. Here, a search for proteins 
similar to the human cell-cycle regulatory 
protein Cdc2 (Query) locates maize Cdc2 
(Sbjct), which is 68% identical to human 
Cdc2 in its amino acid sequence. The 
alignment begins at residue 57 of the Query 
protein, suggesting that the human protein 
has an N-terminal region that is absent 
from the maize protein. The green blocks 
indicate differences in sequence, and the 
yellow bar summarizes the similarities: 
when the two amino acid sequences are 
identical, the residue is shown; similar 
amino acid substitutions are indicated 

by a plus sign (+). Only one small gap 

has been introduced—indicated by the 
red arrow at position 194 in the Query 
sequence—to align the two sequences 
maximally. The alignment score (Score), 
which is expressed in two different types 
of units, takes into account penalties for 
substitutions and gaps; the higher the 
alignment score, the better the match. The 
significance of the alignment is reflected in 
the Expectation (E) value, which specifies 
how often a match this good would be 
expected to occur by chance. The lower 
the E value, the more significant the 
match; the extremely low value here (e-111) 
indicates certain significance. E values 
much higher than 0.1 are unlikely to reflect 
true relatedness. For example, an E value 
of 0.1 means there is a 1 in 10 likelihood 
that such a match would arise solely by 
chance. 
analyze, DNA has become the easiest. It is now possible to determine the entire 
nucleotide sequence of a bacterial or fungal genome in a matter of hours, and the 
sequence of an individual human genome in less than a day. Once the nucleotide 
sequence of a genome is known, any individual gene can be easily isolated, and 
large quantities of the gene product (be it RNA or protein) can be made either by 
introducing the gene into bacteria or animal cells and coaxing these cells to over- 
express the foreign gene or by synthesizing the gene product in vitro. In this way, 
proteins and RNA molecules that might be present in only tiny amounts in living 
cells can be produced in large quantities for biochemical and structural analyses. 
And this approach can also be used to produce large quantities of human proteins 
(such as insulin, or interferon, or blood-clotting proteins) for use as human phar- 
maceuticals. As we will see later in this chapter, it is also possible for scientists to 
alter an isolated gene and transfer it back into the germ line of an animal or plant, 
so as to become a functional and heritable part of the organism’s genome. In this 
way, the biological roles of any gene can be assessed by observing—in the whole 
organism—the results of modifying it. 

The ability to manipulate DNA with precision in a test tube or an organism, 
known as recombinant DNA technology, has had a dramatic impact on all aspects 
of cell and molecular biology, allowing us to routinely study cells and their mac- 
romolecules in ways that were unimaginable even twenty years ago. Central to the 
technology are the following manipulations: 


1. Cleavage of DNA at specific sites by restriction nucleases, which greatly 
facilitates the isolation and manipulation of individual pieces of a genome. 


2. DNA ligation, which makes it possible to seamlessly join together DNA 
molecules from widely different sources. 


3. DNA cloning (through the use of either cloning vectors or the polymerase 
chain reaction) in which a portion of a genome (often an individual gene) 
is “purified” away from the remainder of the genome by repeatedly copying 
it to generate many billions of identical molecules. 


4. Nucleic acid hybridization, which makes it possible to identify any specific 
sequence of DNA or RNA with great accuracy and sensitivity based on its 
ability to selectively bind a complementary nucleic acid sequence. 


5. DNA synthesis, which makes it possible to chemically synthesize DNA 
molecules with any sequence of nucleotides, whether or not the sequence 
occurs in nature. 


6. Rapid determination of the sequence of nucleotides of any DNA or RNA 
molecule. 

In the following sections, we describe each of these basic techniques which, 
together, have revolutionized the study of cell and molecular biology. 


Restriction Nucleases Cut Large DNA Molecules into Specific 
Fragments 


Unlike a protein, a gene does not exist as a discrete entity in cells, but rather as a 
small region of a much longer DNA molecule. Although the DNA molecules in a 
cell can be randomly broken into small pieces by mechanical force, a fragment 
containing a single gene in a mammalian genome would still be only one among 
a hundred thousand or more DNA fragments, indistinguishable in their average 
size. How could such a gene be separated from all the others? Because all DNA 
molecules consist of an approximately equal mixture of the same four nucleo- 
tides, they cannot be readily separated, as proteins can, on the basis of their dif- 
ferent charges and biochemical properties. The solution to this problem began 
to emerge with the discovery of restriction nucleases. These enzymes, which are 
purified from bacteria, cut the DNA double helix at specific sites defined by the 
local nucleotide sequence, thereby cleaving a long, double-stranded DNA mole- 
cule into fragments of strictly defined sizes. 

Like many of the tools of recombinant DNA technology, restriction nucleases 
were discovered by researchers trying to understand an intriguing biological 
phenomenon. It had been observed that certain bacteria always degraded “for- 
eign” DNA that was introduced into them experimentally. A search for the mech- 
anism responsible revealed a then unanticipated class of bacterial nucleases that 
cleave DNA at specific nucleotide sequences. The bacterium’s own DNA is pro- 
tected from cleavage by methylation of these same sequences, so the nucleases 
keep foreign DNA from overrunning the cell without damaging its genome. Because these 
enzymes restrict the transfer of DNA into bacteria, they were called restriction 
nucleases. The pursuit of this seemingly arcane biological puzzle set off the devel- 
opment of technologies that have forever changed the way cell and molecular 
biologists study living things. 

Different bacterial species produce different restriction nucleases, each cut- 
ting at a different, specific nucleotide sequence (Figure 8-24). Because these 
target sequences are short—generally four to eight nucleotide pairs—many sites 
of cleavage will occur, purely by chance, in any long DNA molecule. The reason 
restriction nucleases are so useful in the laboratory is that each enzyme will 
always cut a particular DNA molecule at the same sites. Thus for a given sample of 
DNA (which contains many identical molecules), a particular restriction nuclease 
will reliably generate the same set of DNA fragments. 
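
The reproducibility of a restriction digest can be mimicked in a few lines of 
code. The sketch below treats a linear DNA molecule as a string (one strand, 
written 5' to 3') and cuts it at every occurrence of an enzyme's recognition 
sequence. The recognition sequences and cut positions are those of the three 
enzymes named in Figure 8-24; the function name and the example sequence are 
hypothetical. Because these sites read the same on both strands, searching one 
strand is enough to locate them. 

```python
import re

# Recognition sequence (5'->3') and cut position within the site on that strand
ENZYMES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC, staggered cut
    "HindIII": ("AAGCTT", 1),  # A^AGCTT, staggered cut
    "HaeIII": ("GGCC", 2),     # GG^CC, blunt cut
}


def digest(dna, enzyme):
    """Return the top-strand fragments of a linear DNA cut by one enzyme."""
    site, cut = ENZYMES[enzyme]
    # The lookahead pattern reports every site, including overlapping ones
    positions = [m.start() + cut for m in re.finditer(f"(?={site})", dna)]
    fragments = []
    start = 0
    for p in positions:
        fragments.append(dna[start:p])
        start = p
    fragments.append(dna[start:])
    return fragments


# Made-up sequence containing two EcoRI sites and one HaeIII site
dna = "TTGAATTCAAGGCCTTGAATTCGG"
print(digest(dna, "EcoRI"))
print(digest(dna, "HaeIII"))
```

Running the function twice on the same sequence with the same enzyme always 
yields the same fragments, which is the property that makes these enzymes so 
useful at the bench. 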

The size of the resulting fragments depends on the length of the target 
sequences of the restriction nucleases. As shown in Figure 8-24, the enzyme 
HaeIII cuts at a sequence of four nucleotide pairs; a sequence this long would 
be expected to occur purely by chance approximately once every 256 nucleotide 
pairs (1 in 4⁴). In comparison, a restriction nuclease with a target sequence that 
is eight nucleotides long would be expected to cleave DNA on average once every 
65,536 nucleotide pairs (1 in 4⁸). This difference in sequence selectivity makes it 
possible to cleave a long DNA molecule into the fragment sizes that are most suit- 
able for a given application. 
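
The arithmetic behind these estimates is easy to reproduce. The sketch below 
assumes, as the estimate in the text does, that the four nucleotides are equally 
frequent and that neighboring positions are independent; real genomes depart 
from both assumptions. The function names and the 1,000,000-pair example length 
are hypothetical. 

```python
def expected_spacing(site_length, alphabet_size=4):
    """Average number of nucleotide pairs between chance occurrences of a site."""
    return alphabet_size ** site_length


def expected_sites(dna_length, site_length):
    """Approximate number of sites expected by chance in a DNA of given length."""
    return dna_length / expected_spacing(site_length)


for n in (4, 6, 8):
    print(n, expected_spacing(n), round(expected_sites(1_000_000, n), 1))
```

An enzyme with a four-pair site would therefore be expected to cut a DNA of a 
million nucleotide pairs into thousands of pieces, whereas one with an 
eight-pair site would produce only about fifteen. 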


Gel Electrophoresis Separates DNA Molecules of Different Sizes 


The same types of gel-electrophoresis methods that have proved so useful in the 
analysis of proteins (see Figure 8-13) can be applied to DNA molecules. The pro- 
cedure is actually simpler than for proteins: because each nucleotide in a nucleic 
acid molecule already carries a single negative charge (on the phosphate group), 
there is no need to add the negatively charged detergent SDS that is required to 
make protein molecules move uniformly toward the positive electrode. Larger 
DNA fragments will migrate more slowly because their progress is impeded to a 
greater extent by the gel matrix. Over several hours, the DNA fragments become 
spread out across the gel according to size, forming a ladder of discrete bands, 
each composed of a collection of DNA molecules of identical length (Figure 8-25A 
and B). To separate DNA molecules longer than 500 nucleotide pairs, the gel is 
made of a diluted solution of agarose (a polysaccharide isolated from seaweed). 
For DNA fragments less than 500 nucleotides long, specially designed polyacryl- 
amide gels allow the separation of molecules that differ in length by as little as a 
single nucleotide (see Figure 8-25C). 
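
Fragment sizes are usually estimated by comparison with marker fragments of 
known length run on the same gel. The sketch below shows one way to do this, 
under an assumption that is not stated in the text: over a limited size range, 
the distance a fragment migrates is taken to be roughly linear in the logarithm 
of its length, so an unknown band is placed by interpolating between the two 
markers that bracket it. The marker sizes and distances are invented for the 
example. 

```python
import math


def estimate_size(distance, ladder):
    """Estimate fragment length from migration distance.

    ladder is a list of (size_in_nucleotide_pairs, distance_migrated)
    tuples for the marker bands. Interpolation is linear in log10(size),
    which is an approximation that holds only over a limited range.
    """
    # Order markers by distance migrated (larger fragments travel less far)
    points = sorted(ladder, key=lambda p: p[1])
    for (s1, d1), (s2, d2) in zip(points, points[1:]):
        if d1 <= distance <= d2:
            frac = (distance - d1) / (d2 - d1)
            log_size = math.log10(s1) + frac * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size
    raise ValueError("distance lies outside the range covered by the ladder")


# Hypothetical markers: (length in nucleotide pairs, distance in mm)
ladder = [(8000, 12.0), (2000, 30.0), (500, 48.0)]
print(round(estimate_size(21.0, ladder)))
```

Bands that run beyond the outermost markers cannot be sized this way, which is 
why the function raises an error rather than extrapolating. 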
Figure 8-24 Restriction nucleases cleave 
DNA at specific nucleotide sequences. 
Like the sequence-specific DNA-binding 
proteins we encountered in Chapter 7 (see 
Figure 7-8), restriction enzymes often work 
as dimers, and the DNA sequence they 
recognize and cleave is often symmetrical 
around a central point. Here, both strands 
of the DNA double helix are cut at specific 
points within the target sequence (orange). 
Some enzymes, such as HaeIII, cut straight 
across the double helix and leave two 
blunt-ended DNA molecules; with others, 
such as EcoRI and HindIII, the cuts on each 
strand are staggered. These staggered 
cuts generate “sticky ends”—short, single- 
stranded overhangs that help the cut DNA 
molecules join back together through 
complementary base-pairing. This rejoining 
of DNA molecules becomes important 

for DNA cloning, as we discuss below. 
Restriction nucleases are usually obtained 
from bacteria, and their names reflect their 
origins: for example, the enzyme EcoRI 
comes from Escherichia coli. There are 
currently hundreds of different restriction 
enzymes available; they can be ordered 
from companies that commercially produce 
them. 
A variation of agarose-gel electrophoresis, called pulsed-field gel electrophore- 
sis, makes it possible to separate extremely long DNA molecules, even those found 
in whole chromosomes. Ordinary gel electrophoresis fails to separate very large 
DNA molecules because the steady electric field stretches them out so that they 
travel end-first through the gel in snakelike configurations at a rate that is inde- 
pendent of their length. In pulsed-field gel electrophoresis, by contrast, the direc- 
tion of the electric field changes periodically, which forces the molecules to re- 
orient before continuing to move snakelike through the gel. This re-orientation 
takes much more time for larger molecules, so that longer molecules move more 
slowly than shorter ones. As a consequence, entire bacterial or yeast chromo- 
somes separate into discrete bands in pulsed-field gels and so can be sorted and 
identified on the basis of their size (Figure 8-25D). Although a typical mammalian 
chromosome of 10⁸ nucleotide pairs is still too long to be sorted even in this way, 
large segments of these chromosomes are readily separated and identified if the 
chromosomal DNA is first cut with a restriction nuclease selected to recognize 
sequences that occur only rarely. 

The DNA bands on agarose or polyacrylamide gels are invisible unless the 
DNA is labeled or stained in some way. A particularly sensitive method of staining 
DNA is to soak the gel in the dye ethidium bromide, which fluoresces under ultra- 
violet light when it is bound to DNA (see Figure 8-25B and D). Even more sensitive 
detection methods incorporate a radioisotope or chemical marker into the DNA 
molecules before electrophoresis, as we next describe. 
Figure 8-25 DNA molecules can 

be separated by size using gel 
electrophoresis. (A) Schematic illustration 
comparing the results of cutting the same DNA 
molecule (in this case, the genome of a virus 
that infects wasps) with two different restriction 
nucleases, EcoRI (middle) and HindIII (right). 
The fragments are then separated by gel 
electrophoresis using a gel matrix of agarose. 
Because larger fragments migrate more slowly 
than smaller ones, the lowermost bands on 
the gel contain the smallest DNA fragments. 
The sizes of the fragments can be estimated 
by comparing them to a set of DNA fragments 
of known sizes (left). (B) Photograph of an 
actual agarose gel showing DNA “bands” that 
have been stained with ethidium bromide. 

(C) A polyacrylamide gel with small pores 

was used to separate short DNA molecules 
that differ by only a single nucleotide. Shown 
here are the results of a dideoxy sequencing 
reaction, explained later in this chapter. From 
left to right, the bands in the four lanes were 
produced by adding G, A, T, and C chain- 
terminating nucleotides (see Panel 8-1). The 
DNA molecules were labeled with ³²P and the 
image shown was produced by laying a piece 
of photographic film over the gel and allowing 
the ³²P to expose the film, producing the dark 
bands observed when the film was developed. 
(D) The technique of pulsed-field agarose-gel 
electrophoresis was used to separate the 16 
different chromosomes of the yeast species 
Saccharomyces cerevisiae, which range in 
size from 220,000 to 2.5 million nucleotide 
pairs. The DNA was stained as in (B). DNA 
molecules as large as 10⁷ nucleotide pairs can 
be separated in this way. (B, from U. Albrecht 
et al., J. Gen. Virol. 75:3353-3363, 1994; 

C, courtesy of Leander Lauffer and Peter 
Walter; D, from D. Vollrath and R.W. Davis, 
Nucleic Acids Res. 15:7865-7876, 1987. With 
permission from Oxford University Press.) 
Purified DNA Molecules Can Be Specifically Labeled with 
Radioisotopes or Chemical Markers in vitro 


The DNA polymerases that synthesize and repair DNA (discussed in Chapter 5) 
have become important tools in experimentally manipulating DNA. Because they 
synthesize sequences complementary to an existing DNA molecule, they are often 
used in the test tube to create exact copies of existing DNA molecules. The copies 
can include specially modified nucleotides (Figure 8-26). To synthesize DNA in 
this way, the DNA polymerase is presented with a template and a pool of nucleo- 
tide precursors that contain the modification. As long as the polymerase can use 
these precursors, it automatically makes new, modified molecules that match the 
sequence of the template. Modified DNA molecules have many uses. DNA labeled 
with the radioisotope ³²P can be detected following gel electrophoresis by placing 
the gel next to a piece of photographic film (see Figure 8-25C). The ³²P atoms emit 
β particles which expose the film, producing a visible record of every band on the 
gel. Alternatively, the gel can be scanned by a detector that measures the β emis- 
sions directly. Other types of modified DNA, such as that labeled by digoxigenin 
(see Figure 8-26B), are useful for visualizing DNA molecules in whole cells, a topic 
we discuss later in this chapter. 


Genes Can Be Cloned Using Bacteria 


Any DNA fragment can be cloned. In molecular biology, the term DNA cloning 
is used in two senses. It literally refers to the act of making many identical cop- 
ies (typically billions) of a DNA molecule—the amplification of a particular DNA 
sequence. However, the term also describes the isolation of a particular stretch of 
DNA (often a particular gene) from the rest of the cell’s genome; the same term 
Figure 8-26 Methods for labeling DNA molecules in vitro. (A) A purified DNA polymerase enzyme can incorporate radiolabeled nucleotides as it 
synthesizes new DNA molecules. In this way, radiolabeled versions of any DNA sequence can be prepared in the laboratory. (B) The method in (A) 
is also used to produce nonradioactive DNA molecules that carry a specific chemical marker that can be detected with an appropriate antibody. 
The base on the nucleoside triphosphate shown is an analog of thymine, in which the methyl group on T has been replaced by a spacer arm linked 
to the plant steroid digoxigenin. An anti-digoxigenin antibody coupled to a visible marker such as a fluorescent dye is then used to visualize the 
DNA. Other chemical labels, such as biotin, can be attached to nucleotides and used in the same way. The only requirements are that the modified 
nucleotides properly base-pair and appear “normal” to the DNA polymerase. 

is used because this isolation is usually accomplished by making many identical 
copies of only the DNA of interest. We note that elsewhere in the book, cloning, 
particularly when used in the context of developmental biology, can also refer 
to the generation of many genetically identical cells starting from a single cell or 
even to the generation of genetically identical organisms (see, for example, Figure 
7-2). In all cases, cloning refers to the act of making many identical copies, and 
in this section, we use the term to refer to methods designed to generate many 
identical copies of a defined segment of nucleic acid. 

DNA cloning can be accomplished in several ways. One of the simplest 
involves inserting a particular fragment of DNA into the purified DNA genome of 
a self-replicating genetic element—usually a plasmid. The plasmid vectors most 
widely used for gene cloning are small, circular molecules of double-stranded 
DNA derived from plasmids that occur naturally in bacterial cells. They generally 
account for only a minor fraction of the total host bacterial cell DNA, but owing 
to their small size, they can easily be separated from the much larger chromo- 
somal DNA molecules, which precipitate as a pellet upon centrifugation. For use 
as cloning vectors, the purified plasmid DNA circles are first cut with a restric- 
tion nuclease to create linear DNA molecules. The DNA to be cloned is added to 
the cut plasmid and then covalently joined using the enzyme DNA ligase (Figure 
8-27 and Figure 8-28). As discussed in Chapter 5, this enzyme is used by the cell 
to stitch together the Okazaki fragments produced during DNA replication. The 
recombinant DNA circle is introduced back into bacterial cells that have been 
made transiently permeable to DNA. As the cells grow and divide, doubling in 
number every 30 minutes, the recombinant plasmids also replicate to produce an 
Figure 8-27 The insertion of a DNA 
fragment into a bacterial plasmid with 
the enzyme DNA ligase. The plasmid is 
cut open with a restriction nuclease (in this 
case, one that produces staggered ends) 
and is mixed with the DNA fragment to be 
cloned (which has been prepared with the 
same restriction nuclease). DNA ligase and 
ATP are added. The staggered ends base- 
pair, and DNA ligase seals the nicks in the 
DNA backbone, producing a complete 
recombinant DNA molecule. In the 
accompanying micrographs, the inserted 
DNA is colored red. (Micrographs courtesy 
of Huntington Potter and David Dressler.) 


Figure 8-28 DNA ligase can join 
together any two DNA fragments in 
vitro to produce recombinant DNA 
molecules. ATP provides the energy 
necessary to reseal the sugar-phosphate 
backbone of DNA (see Figure 5-12). 

(A) DNA ligase can readily join two 

DNA fragments produced by the same 
restriction nuclease, in this case EcoRI. 
Note that the staggered ends produced 

by this enzyme enable the ends of the two 
fragments to base-pair correctly with each 
other, greatly facilitating their rejoining. 

(B) DNA ligase can also be used to 

join DNA fragments produced by 

different restriction nucleases—for 
example, EcoRI and HaeIII. In this 

case, before the fragments undergo 
ligation, DNA polymerase plus a mixture 

of deoxyribonucleoside triphosphates 
(dNTPs) are used to fill in the staggered cut 
produced by EcoRI. Each DNA fragment 
shown in the figure is oriented so that its 5’ 
ends are at the left end of the upper strand 
and the right end of the lower strand, as 
indicated. 
enormous number of copies of DNA circles containing the foreign DNA (Figure 
8-29). Once the cells are lysed and the plasmid DNA isolated, the cloned DNA 
fragment can be readily recovered by cutting it out of the plasmid DNA with the 
same restriction nuclease that was used to insert it, and then separating it from 
the plasmid DNA by gel electrophoresis. Together, these steps allow the amplifi- 
cation and purification of any segment of DNA from the genome of any organism. 
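
The scale of this amplification follows directly from the doubling time. As a 
rough sketch, assuming uninterrupted exponential growth (real cultures 
eventually run out of nutrients and stop dividing), the number of cells, and 
hence of plasmid copies, can be computed as follows. The function names and the 
plasmids-per-cell figure are hypothetical. 

```python
def cells_after(minutes, doubling_time=30, start_cells=1):
    """Cells present after a period of unrestricted growth.

    Only completed doublings are counted.
    """
    return start_cells * 2 ** (minutes // doubling_time)


def plasmid_copies(minutes, copies_per_cell, doubling_time=30):
    """Plasmid copies in the culture, assuming a constant number per cell."""
    return cells_after(minutes, doubling_time) * copies_per_cell


# One transformed cell grown for 15 hours, with a made-up 20 plasmids per cell
print(cells_after(15 * 60))
print(plasmid_copies(15 * 60, copies_per_cell=20))
```

Thirty doublings, which take 15 hours at this rate, already turn a single 
bacterium into more than a billion cells, each carrying copies of the cloned 
fragment. 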

A particularly useful plasmid vector is based on the naturally occurring F 
plasmid of E. coli. Unlike smaller bacterial plasmids, the F plasmid—and its engi- 
neered derivative, the bacterial artificial chromosome (BAC)—is present in only 
one or two copies per E. coli cell. The fact that BACs are kept in such low numbers 
means that they can stably maintain very long DNA sequences, up to 1 million 
nucleotide pairs in length. With only a few BACs present per bacterium, it is less 
likely that the cloned DNA fragments will become scrambled by recombination 
with sequences carried on other copies of the plasmid. Because of their stability, 
ability to accept large DNA inserts, and ease of handling, BACs are now the pre- 
ferred vector for handling large fragments of foreign DNA. As we will see below, 
BACs were instrumental in determining the complete nucleotide sequence of the 
human genome. 


An Entire Genome Can Be Represented in a DNA Library 


Often it is useful to break up a genome into much smaller fragments and clone 
every fragment, separately, using a plasmid vector. This approach is useful 
because it allows scientists to work with easily managed, discrete pieces of a 
genome instead of whole, unwieldy chromosomes. 

This strategy involves cleaving genomic DNA into small pieces using a restric- 
tion nuclease (or, in some cases, by mechanically shearing the DNA) and ligat- 
ing the entire collection of DNA fragments into plasmid vectors, using conditions 
that favor the insertion of a single DNA fragment into each plasmid molecule. 
These recombinant plasmids are then introduced into E. coli at a concentration 
that ensures that no more than one plasmid molecule is taken up by each bac- 
terium. The collection of cloned plasmid molecules is known as a DNA library. 
Because the DNA fragments were derived directly from the chromosomal DNA of 
the organism of interest, the resulting collection—called a genomic library—will 
represent the entire genome of that organism (Figure 8-30), spread out over tens 
of thousands of individual bacterial colonies. 
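
How many clones such a library needs can be estimated with a standard 
probability argument that is not part of the text above: if each clone carries 
a random fragment covering a fraction f of the genome, the chance that a given 
sequence is missing from N clones is (1 - f)^N, so N = ln(1 - P) / ln(1 - f) 
clones are needed to include it with probability P. The sketch below applies 
this idealized formula; the genome and insert sizes are hypothetical. 

```python
import math


def clones_needed(genome_bp, insert_bp, probability=0.99):
    """Clones required so that a given sequence is present with the stated probability.

    Assumes inserts are random, equally sized samples of the genome.
    """
    f = insert_bp / genome_bp
    return math.ceil(math.log(1 - probability) / math.log(1 - f))


# Hypothetical example: a genome of 3 billion pairs cloned as 150,000-pair inserts
print(clones_needed(3_000_000_000, 150_000))
```

The estimate shows why larger inserts are attractive: increasing the insert 
size reduces, almost in proportion, the number of clones that must be stored 
and screened. 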

An alternative strategy, one that enriches for protein-coding genes, is to 
begin the cloning process by selecting only those DNA sequences that are tran- 
scribed into mRNA and thus correspond to protein-encoding genes. This is done 
by extracting the mRNA from cells and then making a DNA copy of each mRNA 


Figure 8-30 Human genomic libraries containing DNA fragments 

that represent the whole human genome can be constructed using 
restriction nucleases and DNA ligase. Such a genomic library consists 

of a set of bacteria, each carrying a different fragment of human DNA. For 
simplicity, only the colored DNA fragments are shown in the library; in reality, 
all of the different gray fragments will also be represented. 
Figure 8-29 A DNA fragment can be 
replicated inside a bacterial cell. To 
clone a particular fragment of DNA, it is first 
inserted into a plasmid vector, as shown 

in Figure 8-27. The resulting recombinant 
plasmid DNA is then introduced into a 
bacterium, where it is replicated many 
millions of times as the bacterium 
multiplies. For simplicity, the genome of the 
bacterial cell is not shown. 
molecule present—a so-called complementary DNA, or cDNA. The copying reac- 
tion is catalyzed by the reverse transcriptase enzyme of retroviruses, which syn- 
thesizes a complementary DNA chain on an RNA template. The single-stranded 
cDNA molecules synthesized by the reverse transcriptase are converted by DNA 
polymerase into double-stranded cDNA molecules, and these molecules are 
inserted into a plasmid or virus vector and cloned (Figure 8-31). Each clone 
obtained in this way is called a cDNA clone, and the entire collection of clones 
derived from one mRNA preparation constitutes a cDNA library. 
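
The two copying steps just described can be followed on a toy sequence. The sketch below is only an illustration under simplifying assumptions: the mRNA sequence is hypothetical, all strands are written 5' to 3', and priming, RNase H treatment, and cloning are ignored. The function names are invented for this example.

```python
# Toy model of cDNA synthesis; every sequence is written 5'->3'.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}

def first_strand_cdna(mrna: str) -> str:
    """DNA strand complementary to the mRNA (the reverse transcriptase step)."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))

def second_strand_cdna(first_strand: str) -> str:
    """DNA strand complementary to the first strand (the DNA polymerase step)."""
    return "".join(DNA_TO_DNA[base] for base in reversed(first_strand))

mrna = "AUGGCCUAAAAAAAA"  # hypothetical mRNA ending in a short poly-A tail
first = first_strand_cdna(mrna)
second = second_strand_cdna(first)

print(first)   # complementary to the mRNA
print(second)  # same sequence as the mRNA, with T in place of U
```

The second strand carries the same sequence as the original mRNA (with T replacing U), which is why a double-stranded cDNA clone records the coding information of the message.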

Figure 8-32 illustrates some important differences between genomic DNA 
clones and cDNA clones. Genomic clones represent a random sample of all of 
the DNA sequences in an organism—both coding and noncoding—and, with very 
rare exceptions, are the same regardless of the cell type used to prepare them. By 
contrast, cDNA clones contain only those regions of the genome that have been 
transcribed into mRNA. Because the cells of different tissues produce distinct sets 
of mRNA molecules, a distinct cDNA library is obtained for each type of cell used 
to prepare the library. 

Figure 8-31 The synthesis of cDNA. Total mRNA is extracted from a particular tissue, and the 
enzyme reverse transcriptase (see Figure 5-62) is used to produce DNA copies (cDNA) of the mRNA 
molecules. For simplicity, the copying of just one of these mRNAs into cDNA is illustrated. A short 
oligonucleotide complementary to the poly-A tail at the 3’ end of the mRNA (discussed in Chapter 
6) is first hybridized to the RNA to act as a primer for the reverse transcriptase, which then copies 
the RNA into a complementary DNA chain, thereby forming a DNA-RNA hybrid helix. Treating the 
DNA-RNA hybrid with a specialized nuclease (RNase H) that attacks only the RNA produces nicks 
and gaps in the RNA strand. DNA polymerase then copies the remaining single-stranded cDNA 
into double-stranded cDNA. Because DNA polymerase can synthesize through the bound RNA 
molecules, the RNA fragment that is base-paired to the 3’ end of the first DNA strand usually acts 
as the primer for the second strand synthesis, as shown. Any remaining RNA is eventually degraded 
during subsequent cloning steps. As a result, the nucleotide sequences at the extreme 5’ ends of 
the original mRNA molecules are often absent from cDNA libraries. 

Genomic libraries are especially useful in determining the nucleotide sequences 
of a whole genome. For example, to determine the nucleotide sequence of the 
human genome, it was broken up into roughly 100,000-nucleotide-pair pieces, 
each of which was inserted into a BAC plasmid and amplified in E. coli. The result- 
ing genomic library consisted of tens of thousands of bacterial colonies, each 
containing a different human DNA insert. The nucleotide sequence of each insert 
was determined separately and the sequence of the entire genome was stitched 
together from the pieces. 

The most important advantage of cDNA clones, over genomic clones, is that 
they contain the uninterrupted coding sequence of a gene. When the aim of the 
cloning, for example, is to produce the protein in large quantities by expressing 
the cloned gene in a bacterial or yeast cell, it is preferable to start with cDNA. 

Genomic and cDNA libraries are inexhaustible resources, which are widely 
shared among investigators. Today, many such libraries are also available from 
commercial sources. Because the identity of each insert in a library is often known 
(through sequencing the insert), it is often possible to order a particular region of 
a chromosome (or, in the case of cDNA, a complete, intron-less protein-coding 
gene) and have it delivered by mail. 

Cloning DNA by using bacteria revolutionized the study of genomes and is still 
in wide use today. There is, however, an even simpler way to clone DNA, one that 
can be carried out entirely in vitro. We discuss this approach, called the polymerase 
chain reaction, below. First, though, we need to review a fundamental, far-reaching 
property of DNA and RNA called hybridization. 

Figure 8-32 The differences between 
cDNA clones and genomic DNA clones 
derived from the same region of DNA. 
In this example, gene A is infrequently 
transcribed, whereas gene B is frequently 
transcribed, and both genes contain introns 
(orange). In the genomic DNA library, both 
the introns and the nontranscribed DNA 
(gray) are included in the clones, and most 
clones contain, at most, only part of the 
coding sequence of a gene (red). In the 
cDNA clones, the intron sequences (yellow) 
have been removed by RNA splicing during 
the formation of the mRNA (blue), and a 
continuous coding sequence is therefore 
present in each clone. Because gene B 
is transcribed more frequently than gene 
A in the cells from which the cDNA library 
was made, it is represented much more 
frequently than A in the cDNA library. In 
contrast, A and B are represented equally 
in the genomic DNA library. 

Hybridization Provides a Powerful, But Simple Way to Detect 
Specific Nucleotide Sequences 


Under normal conditions, the two strands of a DNA double helix are held together 
by hydrogen bonds between the complementary base pairs (see Figure 4-3). But 
these relatively weak, noncovalent bonds can be fairly easily broken. Such DNA 
denaturation will release the two strands from each other, but does not break the 
covalent bonds that link together the nucleotides within each strand. Perhaps 
the simplest way to achieve this separation involves heating the DNA to around 
90°C. When the conditions are reversed—by slowly lowering the temperature— 
the complementary strands will readily come back together to re-form a double 
helix. This hybridization, or DNA renaturation, is driven by the re-formation of 
the hydrogen bonds between complementary base pairs (Figure 8-33). We saw 
in Chapter 5 that DNA hybridization underlies the crucial process of homologous 
recombination (see Figure 5-47). 

This fundamental capacity of a single-stranded nucleic acid molecule, either 
DNA or RNA, to form a double helix with a single-stranded molecule of a com- 
plementary sequence provides a powerful and sensitive technique for detecting 
specific nucleotide sequences. Today, one simply designs a short, single-stranded 
DNA molecule (often called a DNA probe) that is complementary to the nucleo- 
tide sequence of interest. Because the nucleotide sequences of so many genomes 
are known—and are stored in publicly accessible databases—designing a probe 
to hybridize anywhere in a genome is straightforward. Probes are single-stranded, 
typically 30 nucleotides in length, and are usually synthesized chemically by a 
commercial service for pennies per nucleotide. A DNA sequence of 30 nucleotides 
will occur by chance only once every 1 × 10¹⁸ nucleotides (4³⁰); so, even in the 
human genome of 3 × 10⁹ nucleotide pairs, a DNA probe designed to match a particular 
30-nucleotide sequence will be highly unlikely to hybridize by chance 
anywhere else on the genome. This, of course, presumes that the sequence complementary 
to the probe does not occur multiple times in the genome, a condition 
that can be checked beforehand by scanning the genomic sequence in silico (using 
a computer) and designing probes that match only one spot. The hybridization 
conditions can be set so that even a single mismatch will prevent hybridization 
to “near-miss” sequences. The exquisite specificity of nucleic acid hybridization 
can be easily appreciated by the in situ (Latin for “in place”) hybridization exper- 
iment shown in Figure 8-34. As we will see throughout this chapter, nucleic acid 


Figure 8-34 In situ hybridization can be used to locate genes on 
isolated chromosomes. Here, six different DNA probes have been used 
to mark the locations of their complementary nucleotide sequences on 
human Chromosome 5, isolated from a mitotic cell in metaphase (see 
Figure 4-59 and Panel 17-1, pp. 980-981). The DNA probes have been 
labeled with different chemical groups (see Figure 8-26B) and are detected 
using fluorescent antibodies specific for those groups. The chromosomal 
DNA has been partially denatured to allow the probes to base-pair with 
their complementary sequences. Both the maternal and paternal copies of 
Chromosome 5 are shown, aligned side by side. Each probe produces two 
dots on each chromosome because chromosomes undergoing mitosis have 
already replicated their DNA; therefore, each chromosome contains two 
identical DNA helices. The technique employed here is nicknamed FISH, for 
fluorescence in situ hybridization. (Courtesy of David C. Ward.) 


Figure 8-33 A molecule of DNA can 
undergo denaturation and renaturation 
(hybridization). For two single-stranded 
molecules to hybridize, they must have 
complementary nucleotide sequences that 
allow base-pairing. In this example, the red 
and orange strands are complementary to 
each other, and the blue and green strands 
are complementary to each other. Although 
denaturation by heating is shown, DNA can 
also be renatured after being denatured by 
alkali treatment. 

hybridization has many uses in modern cell and molecular biology; one of the 
most powerful is in the cloning of DNA by the polymerase chain reaction, as we 
next discuss. 
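
The arithmetic behind probe specificity, and the in silico check for unique matches mentioned above, can be sketched in a few lines. This is an illustration rather than a real probe-design tool: the toy genome and probe are hypothetical, only perfect matches are counted, and the function names are invented here.

```python
PROBE_LENGTH = 30
GENOME_SIZE = 3e9  # nucleotide pairs in the human genome, as quoted in the text

possible_sequences = 4 ** PROBE_LENGTH  # about 1 x 10^18 distinct 30-mers
# Expected number of purely accidental perfect matches, counting both strands.
expected_chance_matches = 2 * GENOME_SIZE / possible_sequences

def reverse_complement(dna: str) -> str:
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def count_occurrences(text: str, pattern: str) -> int:
    """Count (possibly overlapping) occurrences of pattern in text."""
    count, start = 0, text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count

def hybridization_sites(top_strand: str, probe: str) -> int:
    """Number of sites, on either strand, that are perfectly complementary to the probe."""
    target = reverse_complement(probe)  # what the probe pairs with on the top strand
    if target == probe:                 # palindromic probe: avoid double counting
        return count_occurrences(top_strand, probe)
    # A copy of the probe in the top strand means its complement lies on the bottom strand.
    return count_occurrences(top_strand, target) + count_occurrences(top_strand, probe)

toy_genome = "ATGCGTACGTTAGCCGATAGGCTA"  # hypothetical top strand, 5'->3'
probe = "CTATCGGC"                       # hypothetical probe, complementary to GCCGATAG
sites = hybridization_sites(toy_genome, probe)
print(possible_sequences, expected_chance_matches, sites)
```

A probe would be accepted only if it finds exactly one site; the tiny expected number of chance matches shows why a 30-nucleotide probe is normally long enough.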


Genes Can Be Cloned in vitro Using PCR 


Genomic and cDNA libraries were once the only route to cloning genes and they 
are still used for cloning very large genes and for sequencing whole genomes. 
However, a powerful and versatile method for amplifying DNA, known as the 
polymerase chain reaction (PCR), provides a more rapid and straightforward 
approach to DNA cloning, particularly in organisms whose complete genome 
sequence is known. Today, since genome sequences are abundant, most cloning 
is carried out by PCR. 

Invented in the 1980s, PCR revolutionized the way that DNA and RNA are 
analyzed. The technique can amplify any nucleotide sequence selectively and is 
performed entirely in a test tube. Eliminating the need for bacteria makes PCR 
convenient and rapid: billions of copies of a nucleotide sequence can be generated in a 
matter of hours. Starting with an entire genome, PCR allows DNA from a specified 
region—selected by the experimenter—to be greatly amplified, effectively “puri- 
fying” this DNA away from the remainder of the genome, which remains unam- 
plified. Because of its power to greatly amplify nucleic acids, PCR is remarkably 
sensitive: the method can be used to detect the trace amounts of DNA in a drop of 
blood left at a crime scene or the few copies of a viral genome present in a patient’s blood 
sample. 

The success of PCR depends both on the selectivity of DNA hybridization 
and on the ability of DNA polymerase to copy a DNA template faithfully through 
repeated rounds of replication in vitro. As discussed in Chapter 5, this enzyme 
adds nucleotides to the 3’ end of a growing strand of DNA (see Figure 5-4). To copy 
DNA, the polymerase requires a primer—a short nucleotide sequence that pro- 
vides a 3’ end from which synthesis can begin. For PCR, the primers are designed 
by the experimenter, synthesized chemically, and, by hybridizing to genomic 
DNA, “tell” the polymerase which part of the genome to copy. As discussed in the 
previous section, DNA primers (in essence, the same type of molecules as DNA 
probes but without a radioactive or fluorescent label) can be designed to uniquely 
locate any position on a genome. 
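
How a primer pair "tells" the polymerase which region to copy can be illustrated with a minimal in silico PCR. The sketch assumes perfect primer matches and a single binding site for each primer; the template, the primers, and the function names are hypothetical.

```python
def reverse_complement(dna: str) -> str:
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def amplicon(template: str, forward_primer: str, reverse_primer: str):
    """Return the stretch of the top strand bracketed by (and including) both primers.

    template is the top strand, 5'->3'. The forward primer has the same sequence
    as the top strand (it anneals to the bottom strand); the reverse primer is
    complementary to the top strand, so its reverse complement appears in it.
    Returns None if either primer has no site.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]

template = "TTGACCATGCGTACGTTAGCCGATAGGCTAAC"  # hypothetical genomic segment
forward = "ATGCGTAC"                           # left boundary
reverse = "AGCCTATC"                           # right boundary (pairs with GATAGGCT)
product = amplicon(template, forward, reverse)
print(product, len(product))
```

Only the sequence between the two primer sites is returned, which mirrors the way PCR leaves the rest of the genome unamplified.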

PCR is an iterative process in which the cycle of amplification is repeated doz- 
ens of times. At the start of each cycle, the two strands of the double-stranded 
DNA template are separated and a different primer is annealed to each. These 
primers mark the right and left boundaries of the DNA to be amplified. DNA polymerase 
is then allowed to replicate each strand independently (Figure 8-35). In 

Figure 8-35 A pair of primers directs the synthesis of a desired segment of DNA in a test tube. Each cycle of PCR includes three steps: 
(1) The double-stranded DNA is heated briefly to separate the two strands. (2) The DNA is exposed to a large excess of a pair of specific primers, 
designed to bracket the region of DNA to be amplified, and the sample is cooled to allow the primers to hybridize to complementary sequences 
in the two DNA strands. (3) This mixture is incubated with DNA polymerase and the four deoxyribonucleoside triphosphates so that DNA can be 
synthesized, starting from the two primers. To amplify the DNA, the cycle is repeated many times by reheating the sample to separate the newly 
synthesized DNA strands (see Figure 8-36). 

The technique depends on the use of a special DNA polymerase isolated from a thermophilic bacterium; this polymerase is stable at much higher 
temperatures than eukaryotic DNA polymerases, so it is not denatured by the heat treatment shown in step 1. The enzyme therefore does not have 
to be added again after each cycle. 

Figure 8-36 PCR uses repeated rounds of strand separation, hybridization, and synthesis to amplify DNA. As the 
procedure outlined in Figure 8-35 is repeated, all the newly synthesized fragments serve as templates in their turn. Because the 
polymerase and the primers remain in the sample after the first cycle, PCR involves simply heating and then cooling the same 
sample, in the same test tube, again and again. Each cycle doubles the amount of DNA synthesized in the previous cycle, so 
that within a few cycles, the predominant DNA is identical to the sequence bracketed by and including the two primers in the 
original template. In the example illustrated here, three cycles of reaction produce 16 DNA chains, 8 of which (boxed in yellow) 
correspond exactly to one or the other strand of the original bracketed sequence. After four more cycles, 240 of the 256 DNA 
chains will correspond exactly to the original sequence, and after several more cycles, essentially all of the DNA strands will be 
this length. Typically, 20-30 cycles are carried out to effectively clone a region of DNA starting from genomic DNA; the rest of the 
genome remains unamplified, and its concentration is therefore negligible compared with that of the amplified region (Movie 8.2). 


subsequent cycles, all the newly synthesized DNA molecules produced by the 
polymerase serve as templates for the next round of replication (Figure 8-36). 
Through this iterative amplification process, many copies of the original sequence 
can be made—billions after about 20 to 30 cycles. 
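
The strand counts quoted in the legend of Figure 8-36 follow from simple bookkeeping, sketched below for an idealized reaction that starts from one double-stranded template and doubles perfectly in every cycle (an assumption; real reactions are less efficient). The function name and the three strand categories are labels chosen for this example.

```python
def pcr_strand_counts(cycles: int):
    """Count single strands after a number of ideal PCR cycles.

    Returns (original, long_products, exact), where
      original      = the two strands of the starting template,
      long_products = strands with one primer-defined end and one ragged end,
      exact         = strands bounded by both primers.
    """
    original, long_products, exact = 2, 0, 0
    for _ in range(cycles):
        new_long = original               # copying an original strand gives a long product
        new_exact = long_products + exact  # copying anything else gives an exact-length strand
        long_products += new_long
        exact += new_exact
    return original, long_products, exact

for n in (3, 7, 30):
    original, long_products, exact = pcr_strand_counts(n)
    total = original + long_products + exact
    print(n, total, exact)
```

After 3 cycles the sketch gives 16 strands, 8 of them exact, and after 7 cycles 240 of 256, matching the figure legend; after 30 ideal cycles the total is about two billion strands, nearly all of exact length.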

PCR is now the method of choice for cloning relatively short DNA fragments 
(say, under 10,000 nucleotide pairs). Each cycle takes only about five minutes, 
and automation of the whole procedure enables cell-free cloning of a DNA frag- 
ment in a few hours. The original template for PCR can be either DNA or RNA, so 
this method can be used to obtain either a genomic clone (complete with introns 
and exons) or a cDNA copy of an mRNA (Figure 8-37). 


PCR Is Also Used for Diagnostic and Forensic Applications 


The PCR method is extraordinarily sensitive; it can detect a single DNA mole- 
cule in a sample if at least part of the sequence of that molecule is known. Trace 
amounts of RNA can be analyzed in the same way by first transcribing them into 
DNA with reverse transcriptase. For these reasons, PCR is frequently employed for 
uses that go beyond simple cloning. For example, it can be used to detect invading 
pathogens at very early stages of infection. In this case, short sequences comple- 
mentary to a segment of the infectious agent’s genome are used as primers and 
following many cycles of amplification, even a few copies of an invading bacterial 
or viral genome in a patient’s sample can be detected (Figure 8-38). For many 
infections, PCR has replaced the use of antibodies against microbial molecules to 
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source—for example, whether a sample of beef actually came from a cow. 


Finally, PCR is now widely used in forensics. The method’s extreme sensitivity 
allows forensic investigators to isolate DNA from minute traces of human blood or 
other tissue to obtain a DNA fingerprint of the person who left the sample behind. 

Figure 8-38 PCR can be used to detect the presence of a viral genome in a sample of blood. 
Because of its ability to amplify enormously the signal from a single molecule of nucleic acid, PCR 
is an extraordinarily sensitive method for detecting trace amounts of virus in a sample of blood or 
tissue, without the need to purify the virus. For HIV, the virus that causes AIDS, the genome is a 
single-stranded molecule of RNA, as illustrated here. In addition to HIV, many other viruses that 
infect humans are now detected in this way. 




Figure 8-37 PCR can be used to 
obtain either genomic or cDNA clones. 
(A) To use PCR to clone a segment of 
chromosomal DNA, total genomic DNA 
is first purified from cells. PCR primers 
that flank the stretch of DNA to be cloned 
are added, and many cycles of PCR are 
completed (see Figure 8-36). Because 
only the DNA between (and including) the 
primers is amplified, PCR provides a way 
to obtain selectively any short stretch of 
chromosomal DNA in an effectively pure 
form. (B) To use PCR to obtain a cDNA 
clone of a gene, total mRNA is first purified 
from cells. The first primer is added to 
the population of mRNAs, and reverse 
transcriptase is used to make a DNA 
strand complementary to the specific RNA 
sequence of interest. The second primer 
is then added, and the DNA molecule is 
amplified through many cycles of PCR. 

Figure 8-39 PCR is used in forensic science to distinguish one individual from another. The DNA sequences analyzed are short tandem 
repeats (STRs) composed of sequences such as CACACA... or GTGTGT... STRs are found in various positions (loci) in the human genome. The 
number of repeats in each STR locus is highly variable in the population, ranging from 4 to 40 in different individuals. Because of the variability in 
these sequences, individuals will usually inherit a different number of repeats at each STR locus from their mother and from their father; two unrelated 
individuals, therefore, rarely contain the same pair of sequences at a given STR locus. (A) PCR using primers that recognize unique sequences on 
either side of one particular STR locus produces a pair of bands of amplified DNA from each individual, one band representing the maternal STR 
variant and the other representing the paternal STR variant. The length of the amplified DNA, and thus its position after gel electrophoresis, will 
depend on the exact number of repeats at the locus. (B) In the schematic example shown here, the same three STR loci are analyzed in samples 
from three suspects (individuals A, B, and C), producing six bands for each individual. Although different people can have several bands in common, 
the overall pattern is quite distinctive for each person. The band pattern can therefore serve as a DNA fingerprint to identify an individual nearly 
uniquely. The fourth lane (F) contains the products of the same PCR amplifications carried out on a hypothetical forensic DNA sample, which could 
have been obtained from a single hair or a tiny spot of blood left at a crime scene. 

The more loci that are examined, the more confident one can be about the results. When examining the variability at 5-10 different STR loci, the 
odds that two random individuals would share the same fingerprint by chance are approximately one in 10 billion. In the case shown here, individuals 
A and C can be eliminated from inquiries, while B is a clear suspect. A similar approach is used routinely in paternity testing. 

With the possible exception of identical twins, the genome of each human differs 
in DNA sequence from that of every other person on Earth. Using primer pairs 
targeted at genome sequences that are known to be highly variable in the human 
population, PCR makes it possible to generate a distinctive DNA fingerprint for 
any individual (Figure 8-39). Such forensic analyses can be used not only to help 
identify those who have done wrong, but also—equally important—to exonerate 
those who have been wrongfully accused. 
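
The comparison of STR patterns described in Figure 8-39 amounts to counting repeats at each locus and matching the resulting pairs of numbers. The sketch below is a simplified illustration: the loci, the repeat counts, and the function names are all hypothetical, and real forensic analysis measures fragment lengths on a gel or capillary rather than reading sequences directly.

```python
import re

def count_repeats(allele: str, unit: str = "CA") -> int:
    """Number of repeat units in the longest uninterrupted run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", allele)
    return max((len(run) // len(unit) for run in runs), default=0)

def profile_matches(sample: dict, suspect: dict) -> bool:
    """True if both profiles show the same pair of repeat counts at every locus."""
    return sample.keys() == suspect.keys() and all(
        sorted(sample[locus]) == sorted(suspect[locus]) for locus in sample
    )

# One amplified allele: flanking sequence, 12 CA repeats, flanking sequence.
allele = "GGT" + "CA" * 12 + "TTG"
print(count_repeats(allele))

# Hypothetical (maternal, paternal) repeat counts at three made-up loci.
suspects = {
    "A": {"locus1": (7, 12), "locus2": (9, 15), "locus3": (20, 31)},
    "B": {"locus1": (8, 12), "locus2": (10, 15), "locus3": (18, 25)},
    "C": {"locus1": (8, 14), "locus2": (10, 11), "locus3": (18, 30)},
}
forensic_sample = {"locus1": (12, 8), "locus2": (15, 10), "locus3": (25, 18)}

matches = [name for name, profile in suspects.items()
           if profile_matches(forensic_sample, profile)]
print(matches)
```

The order of the two alleles at a locus is ignored, because a gel cannot tell which band is maternal and which is paternal.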


Both DNA and RNA Can Be Rapidly Sequenced 


Most current methods of manipulating DNA, RNA, and proteins rely on prior 
knowledge of the nucleotide sequence of the genome of interest. But how were 
these sequences determined in the first place? And how are new DNA and RNA 
molecules sequenced today? In the late 1970s, researchers developed several 
strategies for determining, simply and quickly, the nucleotide sequence of any 
purified DNA fragment. The one that became the most widely used is called dide- 
oxy sequencing or Sanger sequencing (Panel 8-1). This method was used to 
determine the nucleotide sequence of many genomes, including those of E. coli, 
fruit flies, nematode worms, mice, and humans. Today, cheaper and faster methods 
are routinely used to sequence DNA, and even more efficient strategies are 
being developed (see Panel 8-1). The original “reference” sequence of the human 
genome, completed in 2003, cost over $1 billion and required many scientists from 
around the world working together for 13 years. The enormous progress made in 
the past decade makes it possible for a single person to complete the sequence of 
an individual human genome in less than a day. 

The methods summarized in Panel 8-1 for rapidly sequencing DNA can also 
be applied to RNA. Although methods are being developed to sequence RNA 
directly, it is most commonly carried out by converting the RNA to complemen- 
tary DNA (using reverse transcriptase) and using one of the methods described for 
DNA sequencing. It is important to keep in mind that although genomes remain 
the same from cell to cell and from tissue to tissue, the RNA produced from the 
genome can vary enormously. We will see later in this chapter that sequencing 
the entire repertoire of RNA from a cell or tissue (known as deep RNA sequenc- 
ing, or RNA-seq) is a powerful way to understand how the information present in 
the genome is used by different cells under different circumstances. In the next 
section, we shall see how RNA-seq has also become a valuable tool for annotating 
genomes. 


To Be Useful, Genome Sequences Must Be Annotated 


Long strings of nucleotides, at first glance, reveal nothing about how this genetic 
information directs the development of a living organism—or even what types 
of DNA, protein, and RNA molecules are produced by a genome. The process of 
genome annotation attempts to mark out all the genes (both protein-coding and 
noncoding) in a genome and ascribe a role to each. It also seeks to understand 
more subtle types of genome information, such as the cis-regulatory sequences 
that specify the time and place that a given gene is expressed and whether its 
mRNA undergoes alternative splicing to produce different protein isoforms. 
Clearly, this is a daunting task, and we are far short of completing it for any form of 
life, even the simplest bacterium. For many organisms, we know the approximate 
number of genes, and, for very simple organisms, we understand the functions of 
about half their genes. 

In this section, we discuss broadly how genes are identified in genome 
sequences and what clues we can discern about their roles from simply inspect- 
ing their sequences. Later in the chapter, we turn to the more difficult problem of 
experimentally determining gene function. 

How does one begin to make sense of a genome sequence? The first step is 
usually to translate in silico the entire genome into protein. There are six differ- 
ent reading frames for any piece of double-stranded DNA (three on each strand). 
We saw in Chapter 6 that a random sequence of nucleotides, read in frame, will 

PANEL 8-1: DNA Sequencing Methods 

MANUAL DIDEOXY SEQUENCING 


To determine the complete sequence of a 
single-stranded fragment of DNA (gray), the DNA 
is first hybridized with a short DNA primer (orange) 
that is labeled with a fluorescent dye or 
radioisotope. DNA polymerase and an excess of all 
four normal deoxyribonucleoside triphosphates 
(blue A, C, G, or T) are added to the primed DNA, 
which is then divided into four reaction tubes. 
Each of these tubes receives a small amount of a 
single chain-terminating dideoxyribonucleoside 
triphosphate (red A, C, G, or T). Because these will 
be incorporated only occasionally, each reaction 
produces a set of DNA copies that terminate at 
different points in the sequence. The products of 
these four reactions are separated by 
electrophoresis in four parallel lanes of a 
polyacrylamide gel (labeled here A, T, C, and G). In 
each lane, the bands represent fragments that 
have terminated at a given nucleotide but at 
different positions in the DNA. By reading off the 
bands in order, starting at the bottom of the gel 
and reading across all lanes, the DNA sequence of 
the newly synthesized strand can be determined 
(see Figure 8-25C). The sequence, which is given in 
the green arrow to the right of the gel, is 
complementary to the sequence of the original 
gray single-stranded DNA. 

Dideoxy sequencing, or Sanger sequencing (named 
after the scientist who invented it), uses DNA 
polymerase, along with special chain-terminating 
nucleotides called dideoxyribonucleoside 
triphosphates (left), to make partial copies of the 
DNA fragment to be sequenced. These ddNTPs are 
derivatives of the normal deoxyribonucleoside 
triphosphates that lack the 3’ hydroxyl group. 
When incorporated into a growing DNA strand, 
they block further elongation of that strand. 

Fully automated machines can run dideoxy sequencing reactions. 
(A) The automated method uses an excess amount of normal 
dNTPs plus a mixture of four different chain-terminating ddNTPs, 
each of which is labeled with a fluorescent tag of a different color. 
The reaction products are loaded onto a long, thin capillary gel 
and separated by electrophoresis. A camera (not shown) reads the 
color of each band as it moves through the gel and feeds the data 
to a computer that assembles the sequence. (B) A tiny part of the 
data from such an automated sequencing run. Each colored peak 
represents a nucleotide in the DNA sequence. 
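
The logic of the four-lane manual method described earlier in this panel can be mimicked with a short simulation. It is a sketch under simplifying assumptions: the template is hypothetical, the primer is taken to end exactly at the 3' end of the template, fragment lengths are counted from the end of the primer, and termination is assumed to occur at every possible position. The function names are invented for this example.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def dideoxy_lanes(template: str) -> dict:
    """Fragment lengths expected in each of the four lanes (A, C, G, T).

    template is written 5'->3'; the new strand is its reverse complement,
    and a fragment of length n ends with the nth nucleotide added.
    """
    new_strand = "".join(COMPLEMENT[base] for base in reversed(template))
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)  # a ddNTP of this base stops the chain here
    return lanes

def read_gel(lanes: dict) -> str:
    """Read the bands from the bottom of the gel (shortest) to the top (longest)."""
    bands = sorted((length, base) for base, lengths in lanes.items() for length in lengths)
    return "".join(base for _, base in bands)

template = "ATGCCGTA"  # hypothetical single-stranded DNA to be sequenced
lanes = dideoxy_lanes(template)
sequence = read_gel(lanes)
print(lanes)
print(sequence)  # sequence of the newly synthesized strand
```

As in the panel, the sequence read off the gel is that of the newly synthesized strand, which is complementary to the template.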









SEQUENCING WHOLE GENOMES 


Shotgun sequencing: To determine the nucleotide sequence of a whole genome, 
the genomic DNA is first fragmented into small pieces and a genomic library is 
constructed, typically using plasmids and bacteria (see Figure 8-30). In shotgun 
sequencing, the nucleotide sequence of tens of thousands of individual clones is 
determined; the full genome sequence is then reconstructed by stitching together 
(in silico) the nucleotide sequence of each clone, using the overlaps between clones 
as a guide. The shotgun method works well for small genomes (such as those of 

BAC clones: Most plant and animal genomes are large 
(often over 10⁹ nucleotide pairs) and contain extensive 
amounts of repetitive DNA spread throughout the 
genome. Because a nucleotide sequence of a fragment of 
repetitive DNA will “overlap” every instance of the 
repeated DNA, it is difficult, if not impossible, to assemble 
the fragments into a unique order solely by the shotgun 
method. 

To circumvent this problem, the human genome was first 
broken down into very large DNA fragments (each 
approximately 100,000 nucleotide pairs) and cloned into 
BACs (see p. 469). The order of the BACs along a 
chromosome was determined by comparing the pattern of 
restriction enzyme cleavage sites in a given BAC clone with 
that of the whole genome. In this way, a given BAC clone 
can be mapped, say, to the left arm of human 
Chromosome 3. Once a collection of BAC clones was 
obtained that spanned the entire genome, each individual BAC 
was sequenced by the shotgun method. At the end, the 
sequences of all the BAC inserts were stitched together using the 
knowledge of the position of each BAC insert in the human 
genome. In all, approximately 30,000 BAC clones were sequenced 
to complete the human genome. 
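
The in silico stitching step used in shotgun sequencing can be illustrated with a greedy overlap merge. This is a deliberately naive sketch, not the algorithm used by real assemblers: the reads are hypothetical and error-free, all come from the same strand, and the toy sequence contains no repeats (the very case in which, as noted above, the shotgun method works). The function names are invented here.

```python
def overlap(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0

def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        size, a, b = max(
            ((overlap(a, b, min_len), a, b)
             for a in contigs for b in contigs if a != b),
            key=lambda candidate: candidate[0],
        )
        if size == 0:  # nothing left to join
            break
        contigs.remove(a)
        contigs.remove(b)
        contigs.append(a + b[size:])
    return contigs

genome = "ATGCGTACGTTAGCCGATAGGCTA"              # hypothetical 24-nucleotide "genome"
reads = [genome[12:24], genome[0:10], genome[6:16]]  # overlapping fragments, shuffled
contigs = greedy_assemble(reads)
print(contigs)
```

With repetitive DNA, several reads would overlap equally well at many positions and a greedy merge like this one could join them in the wrong order, which is the problem the BAC-based strategy was designed to avoid.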
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SECOND-GENERATION SEQUENCING TECHNOLOGIES 





The dideoxy method made it possible to sequence the genomes 
of humans and most of the other organisms discussed in this 
book. But newer methods, developed since 2005, have made 
genome sequencing even more rapid, and very much cheaper. 
With these so-called second-generation sequencing methods, 
the cost of sequencing DNA has decreased dramatically. Not 
surprisingly, the number of genomes that have been sequenced 
has increased enormously. These rapid methods allow multiple 
genomes to be sequenced in parallel in a matter of weeks, 
enabling investigators to examine thousands of individual 
human genomes, catalog the variation in nucleotide sequences 
from people around the world, and uncover the mutations that 
increase the risk of various diseases, from cancer to autism. 
These methods have also made it possible to determine the 
genome sequence of extinct species, including Neanderthal man 
and the woolly mammoth. By sequencing genomes from many 
closely related species, they have also made it possible to 
understand the molecular basis of key evolutionary events in the 
tree of life, such as the “inventions” of multicellularity, vision, 
and language. The ability to rapidly sequence DNA has had 
major impacts on all branches of biology and medicine; it is 
almost impossible to imagine where we would be without it. 

ILLUMINA® SEQUENCING 


Several second-generation sequencing methods are now in wide 
use, and we will discuss two of the most common. Both rely on 
the construction of libraries of DNA fragments that 
represent, in toto, the DNA of the genome. Instead of using 
bacterial cells to generate these libraries (as we saw in Figure 
8-30), they are made using PCR amplification of billions of DNA 
fragments, each attached to a solid support. The amplification is 
carried out so that the PCR-generated copies, instead of floating 
away in solution, remain bound in proximity to the original DNA 
fragment. This process generates clusters of DNA fragments, 
where each cluster contains about 1000 identical copies of a 
small bit of the genome. These clusters (a billion of which can 
fit in a single slide or plate) are then sequenced at the same 
time; that is, in parallel. 

One method, known as Illumina sequencing, is based on the 
dideoxy method described above, but it incorporates several 
innovations. Here, each nucleotide is attached to a removable 
fluorescent molecule (a different color for each of the four 
bases) as well as a special chain-terminating chemical adduct: 
rather than simply lacking a 3’-OH group, as the chain terminators 
in conventional dideoxy sequencing do, 
the nucleotides carry a chemical group that blocks elongation by 
DNA polymerase but which can be removed chemically. 
Sequencing is then carried out as follows: the four fluorescently 
labeled nucleotides along with DNA polymerase are added to 
billions of DNA clusters immobilized on a slide. Only the 
appropriate nucleotide (that is complementary to the next 
nucleotide in the template) is covalently incorporated at each 

Principle behind Illumina sequencing. This reaction is carried out 
stepwise, on billions of DNA clusters at once. The method relies 
on a color digital camera that rapidly scans all the DNA clusters 
after each round of modified nucleotide incorporation. The DNA 
sequence of each cluster is then determined by the sequence of 
color changes it undergoes as the elongation reaction proceeds 
stepwise. Each round of modified nucleotide incorporation, 


A slide showing individual 
clusters of PCR-generated 
DNA molecules. Each 
cluster carries about 1000 
identical DNA molecules; 
the four colors are 
produced by incorporation 
of C, G, A, or T, each of 
which has a different color 
fluorophore. The image 
has been taken just after a 
fluorescent nucleotide has 
been incorporated into 
each growing DNA chain. 
(From Illumina Sequencing 
Overview, 2013.) 
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cluster; the unincorporated nucleotides are washed away, and 
a high-resolution digital camera takes an image that registers 
which of the four nucleotides was added to the chain at each 
cluster. The fluorescent label and the 3'-OH blocking group are 
then removed enzymatically, washed away, and the process is 
repeated many times. In this way, billions of sequencing 
reactions are carried out simultaneously. By keeping track of 
the color changes occuring at each cluster, the DNA sequence 
represented by each spot can be read. Although each 
individual sequence read is relatively short (approximately 

200 nucleotides), the billions that are carried out 
simultaneously can produce several human genomes worth 

of sequence in about a day. 





Wily 5’ 
yy YS no 
TTA R Of fluor 
next 
cycle 
n OH y 
= —_ free 3’ end 
ZA 


block fluor 
removed removed 


image acquisition, and removal of the 3’ block and the 
fluorescent group takes less than an hour. Each cluster on the 
slide contains many copies of different, random bits of a 
genome; in preparing the clusters, a DNA sequence (specified 
by the experimenter) is joined to each copy in every cluster, 
and a primer complementary to this sequence is used to begin 
the elongation reaction by DNA polymerase. 
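
The cycle-by-cycle readout used in Illumina sequencing can be illustrated with a short Python sketch. It is only a toy model of the idea, not the instrument's software: the color assigned to each base in `COLOR_TO_BASE`, the cluster names, and the function names are all assumptions made for the example. Each cluster is represented by the list of colors the camera records for it, one color per cycle.

```python
# Hypothetical color assignment: the text says only that each of the
# four bases carries a different colored fluorophore.
COLOR_TO_BASE = {"red": "T", "green": "A", "blue": "C", "yellow": "G"}

# Watson-Crick partner of each base.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def call_bases(colors_per_cycle):
    """Translate the colors imaged at one cluster, cycle by cycle, into
    the nucleotides added to the growing strand (5'-to-3')."""
    return "".join(COLOR_TO_BASE[color] for color in colors_per_cycle)


def template_of(read):
    """Template stretch that was copied, written 5'-to-3'
    (the reverse complement of the newly synthesized strand)."""
    return "".join(COMPLEMENT[base] for base in reversed(read))


# One entry per cluster; each list holds the color seen after each cycle.
images = {
    "cluster_1": ["green", "red", "red", "blue"],
    "cluster_2": ["yellow", "yellow", "green", "red"],
}
reads = {name: call_bases(colors) for name, colors in images.items()}
```

Because every cluster is imaged in the same cycle, the dictionary comprehension on the last line stands in for the massive parallelism of the real method: one pass over the images yields one read per cluster.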







































ION TORRENT™ SEQUENCING


Another widely used strategy for rapid DNA sequencing is called the ion torrent
method. Here, a genome is fragmented, and the individual fragments are attached
to microscopic beads. Using PCR, each DNA fragment is then amplified so that
copies of it eventually coat the bead to which it was initially attached. This
process produces a library of billions of individual beads, each covered with
identical copies of a particular DNA fragment.

Like eggs in a carton, the beads are placed into individual wells on an array
that can hold a billion beads in a square inch. Beginning with a primer, DNA
synthesis is then initiated on each bead. A hydrogen ion (H+) is released (along
with pyrophosphate) each time a nucleotide is incorporated into a growing DNA
chain (see Figure 5-3), and the ion torrent method is based on this simple fact.
Each of the four nucleotides is washed in, one at a time, over the array of
beads; when a nucleotide is incorporated in the DNA of a given bead, the release
of an H+ ion changes the pH, which is registered by a semiconductor chip placed
beneath the array of wells. In this way, the DNA sequence on a given bead can be
read from the pattern of pH changes observed as nucleotides are washed over it.
Like a high-resolution sensor in a digital camera, the ion torrent semiconductor
chip can register enormous amounts of information and can thus keep track of
billions of parallel sequencing reactions. With this technology it is currently
possible, using a single chip, to determine the nucleotide sequences of several
human genomes in just a few hours.

DNA sequencing by the ion torrent method. Beads, each coated with a DNA
molecule that has been amplified many times, are placed in wells along with
primers and DNA polymerase. As nucleotides are sequentially washed over the
beads, those incorporated by the polymerase cause a pH change. In the
example shown, an A is incorporated; thus, the template must have a T in this
position. As the four nucleotides are sequentially washed over the beads, the
sequence of the DNA on each bead can be “read” by the pattern of pH
fluctuations. Billions of beads are monitored at once by a voltage-sensitive
semiconductor chip placed below the array of beads.
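
Reading a sequence from the pattern of pH changes can be sketched as follows. This is a simplified illustration rather than the instrument's algorithm: the wash order `"TACG"`, the function name `read_from_flows`, and the idea that the signal has already been converted to a whole number of incorporations per wash (so that a run of identical bases gives a count greater than 1) are assumptions introduced for the example.

```python
def read_from_flows(flow_order, signals):
    """Rebuild the synthesized strand on one bead from per-wash signals.

    flow_order: the nucleotides washed over the bead, in order.
    signals:    one integer per wash; 0 means no pH change, n means that
                n copies of that nucleotide were incorporated (assumed
                here to handle runs of identical bases).
    """
    if len(flow_order) != len(signals):
        raise ValueError("need exactly one signal per nucleotide wash")
    return "".join(base * count for base, count in zip(flow_order, signals))


flows = "TACG" * 3                                   # hypothetical wash order
signals = [0, 1, 0, 0, 1, 0, 2, 0, 0, 0, 0, 1]       # made-up pH readout
synthesized = read_from_flows(flows, signals)
```

Most washes produce no signal, because only the nucleotide complementary to the next template position is incorporated; the sequence is recovered from the few washes that do change the pH.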




THE FUTURE OF DNA SEQUENCING 


Even newer, potentially faster, methods of sequencing DNA are being 
developed. Some of these “third-generation” technologies bypass the 
DNA amplification steps altogether and determine the sequence of single 
molecules of DNA. In one technique, a DNA molecule is pushed through a 
tiny channel, like a thread through the eye of a needle. As the DNA 
molecule moves through the pore, it generates electrical currents that 
depend on its sequence of nucleotides; the pattern of currents can then 
be used to deduce the nucleotide sequence. Other methods visualize 
single DNA molecules using electron or atomic force microscopy; the 
nucleotide sequence is read from the small differences in the 
“appearance” of the DNA as it is scanned. Finally, another method is 
based on immobilizing a single DNA polymerase molecule (with a 
template) and measuring the “dwell” time of each of the four 
nucleotides, which are labeled with different removable fluorescent dyes. 
Nucleotides that reside longer on the polymerase 
(before their dye is removed) are those 
incorporated by the polymerase. Although the 
two methods we have described in detail 
(Illumina and ion torrent) are now used 
extensively, it is likely that faster and cheaper 
methods will continue to be developed. 


Shown here is the cost of sequencing a human genome, which was $100 million in
2001 and about a thousand dollars by the end of 2014. (Data from the National
Human Genome Research Institute.)
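
The size of this drop is easier to appreciate with a little arithmetic. The sketch below uses only the two figures quoted in the legend; treating the decline as a steady exponential between 2001 and 2014 is a simplifying assumption made for the calculation, not a claim about the actual shape of the curve.

```python
import math

cost_2001 = 100_000_000   # dollars, from the figure legend
cost_2014 = 1_000         # dollars, "about a thousand"
years = 2014 - 2001

fold_drop = cost_2001 / cost_2014        # overall fold reduction in cost
halvings = math.log2(fold_drop)          # number of times the cost halved
halving_time_years = years / halvings    # average time per halving
```

Under that assumption the cost fell 100,000-fold, which corresponds to a halving roughly every nine to ten months on average.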
contain a stop codon about every 20 amino acids; protein-coding regions will, 
in contrast, usually contain much longer stretches without stop codons (Figure 
8-40). Known as open reading frames (ORFs), these usually signify bona fide 
protein-coding genes. This assignment is often “double-checked” by comparing 
the ORF amino acid sequence to the many databases of documented proteins 
from other species. If a match is found, even an imperfect one, it is very likely that 
the ORF will code for a functional protein (see Figure 8-23). 
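
The search for long stop-free stretches can be sketched in Python. The figure of about one stop codon per 20 amino acids follows from the genetic code: 3 of the 64 possible codons are stops, so in random sequence a stop is expected about once every 64/3, or roughly 21, codons. The code below is a minimal illustration, not a production gene finder: the function names and the example sequence are invented, and it reports every run of codons uninterrupted by a stop without requiring a start codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, written 5'-to-3'."""
    return dna.translate(COMPLEMENT)[::-1]


def stop_free_stretches(dna, min_codons=1):
    """Yield (strand, frame, start, end, n_codons) for every run of codons
    uninterrupted by a stop codon, in all six reading frames. start and
    end are 0-based offsets on the strand being read."""
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            run_start = frame
            last = frame
            for pos in range(frame, len(seq) - 2, 3):
                last = pos + 3
                if seq[pos:pos + 3] in STOP_CODONS:
                    n = (pos - run_start) // 3
                    if n >= min_codons:
                        yield strand, frame + 1, run_start, pos, n
                    run_start = pos + 3
            n = (last - run_start) // 3   # run that reaches the end of the sequence
            if n >= min_codons:
                yield strand, frame + 1, run_start, last, n


example = "ATGAAATAGCCC"                  # invented 12-nucleotide sequence
stretches = list(stop_free_stretches(example))
longest = max(stretches, key=lambda s: s[-1])
```

In practice one would set `min_codons` to a value far above 21, so that only stretches much longer than expected by chance are reported as candidate ORFs.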

This strategy works very well for compact genomes, where intron sequences 
are rare and ORFs often extend for many hundreds of amino acids. However, in 
many animals and plants, the average exon size is 150-200 nucleotide pairs (see 
Figure 6-31) and additional information is usually required to unambiguously 
locate all the exons of a gene. Although it is possible to search genomes for splic- 
ing signals and other features that help to identify exons (codon bias, for exam- 
ple), one of the most powerful methods is simply to sequence the total RNA pro- 
duced from the genome in living cells. As can be seen in Figure 7-3, this RNA-seq 
information, when mapped onto the genome sequence, can be used to accurately 
locate all the introns and exons of even complex genes. By sequencing total RNA 
from different cell types, it is also possible to identify cases of alternative splicing 
(see Figure 6-26). 
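
The idea of locating exons by mapping RNA-derived reads onto the genome can be illustrated with a toy example. It assumes the reads have already been aligned and are given as (start, end) positions on the genome; real RNA-seq analysis also has to handle reads that span exon junctions, which this sketch ignores. The function names and coordinates are invented for the illustration.

```python
def merge_covered_regions(read_alignments):
    """Merge overlapping or abutting (start, end) read alignments into
    contiguous covered regions (candidate exons). Coordinates are
    0-based and end-exclusive."""
    regions = []
    for start, end in sorted(read_alignments):
        if regions and start <= regions[-1][1]:
            regions[-1][1] = max(regions[-1][1], end)
        else:
            regions.append([start, end])
    return [tuple(region) for region in regions]


def gaps_between(regions):
    """Uncovered stretches between consecutive regions (candidate introns)."""
    return [(a_end, b_start)
            for (_, a_end), (b_start, _) in zip(regions, regions[1:])]


alignments = [(100, 150), (130, 180), (170, 220), (900, 950), (940, 1000)]
exons = merge_covered_regions(alignments)
introns = gaps_between(exons)
```

Regions of the genome covered by reads correspond to sequences present in the RNA, and the uncovered stretch between two covered regions of the same gene marks a candidate intron.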

RNA-seq also identifies noncoding RNAs produced by a genome. Although the 
function of some of them can be readily recognized (tRNAs or snoRNAs, for exam- 
ple), many have unknown functions and still others probably have no function at 
all (discussed in Chapter 7, pp. 429-436). The existence of the many noncoding 
RNAs and our relative ignorance of their function is the main reason that we know 
only the approximate number of genes in the human genome. 

But even for protein-coding genes that have been unambiguously identified, 
we still have much to learn. Thousands of genomes have been sequenced, and we 
know from comparative genomics that many organisms share the same basic set 
of proteins. However, the functions of a very large number of identified proteins 
remain unknown. Depending on the organism, approximately one-third of the 
proteins encoded by a sequenced genome do not clearly resemble any protein 
that has been studied biochemically. This observation underscores a limitation of 
the emerging field of genomics: although comparative analysis of genomes reveals 
a great deal of information about the relationships between genes and organisms, 
it often does not provide immediate information about how these genes function, 
or what roles they have in the physiology of an organism. Comparison of the full
gene complement of several thermophilic bacteria, for example, does not reveal
why these bacteria thrive at temperatures exceeding 70°C. And examination of the
genome of the incredibly radioresistant bacterium Deinococcus radiodurans does
not explain how this organism can survive a blast of radiation that can shatter
glass. Further biochemical and genetic studies, like those described in the other
sections of this chapter, are required to determine how genes, and the proteins
they produce, function in the context of living organisms.

Figure 8-40 Finding the regions in a DNA sequence that encode a protein.
(A) Any region of the DNA sequence can, in principle, code for six different
amino acid sequences, because any one of three different reading frames can be
used to interpret the nucleotide sequence on each strand. Note that a nucleotide
sequence is always read in the 5'-to-3' direction and encodes a polypeptide from
the N-terminus to the C-terminus. For a random nucleotide sequence read in a
particular frame, a stop signal for protein synthesis is encountered, on
average, about once every 20 amino acids. In this sample sequence of 48 base
pairs, each such signal (stop codon) is colored blue, and only reading frame 2
lacks a stop signal. (B) Search of a 1700-base-pair DNA sequence for a possible
protein-encoding sequence. The information is displayed as in (A), with each
stop signal for protein synthesis denoted by a blue line. In addition, all of
the regions between possible start and stop signals for protein synthesis (see
pp. 347-349) are displayed as red bars. Only reading frame 1 actually encodes a
protein, which is 475 amino acid residues long.


DNA Cloning Allows Any Protein to be Produced in Large 
Amounts 


In the last section, we saw how protein-coding genes can be identified in genome 
sequences. Using the genetic code (and provided the intron and exon boundaries 
are known), the amino acid sequence of any protein coded in a genome can be 
deduced. As was discussed earlier, this sequence can often provide an important 
clue to the protein’s function if found to be similar to the amino acid sequence of 
a protein that has already been studied (see Figure 8-23). Although this strategy 
is often successful, it typically provides only the likely biochemical function of the 
protein; for example, whether the protein resembles a kinase or a protease. It usu- 
ally remains for the experimenter to verify (or refute) this assignment and, most 
importantly, to discover the protein’s biological function in the whole organism; 
that is, to what attributes of the organism does the kinase or the protease contrib- 
ute and in what molecular pathways does it function? Nowadays, most new pro- 
teins are “discovered” through genome sequencing, and it often remains a great 
challenge to ascertain their functions. 

An important approach in determining gene function is to alter the gene (or in 
some cases, its expression pattern), to put the altered copy back into the germ line 
of the organism, and to deduce the function of the normal gene by the changes 
caused by its alteration. Various techniques to implement this strategy are dis- 
cussed in the next section of this chapter. But it is equally important to study the 
biochemical and structural properties of a gene product, as outlined in the first 
part of this chapter. One of the most important contributions of DNA cloning to 
cell and molecular biology is the ability to produce any protein, even the rare ones, 
in nearly unlimited amounts, as long as the gene coding for it is known. Such
high-level production is usually carried out in living cells using expression
vectors (Figure 8-41). These are generally plasmids that have been designed to
produce a large amount of stable mRNA that can be efficiently translated into
protein when the plasmid is introduced into bacterial, yeast, insect, or
mammalian cells. To prevent the high level of the foreign protein from
interfering with the cell’s growth, the expression vector is often designed to
delay the synthesis of the foreign mRNA and protein until shortly before the
cells are harvested and lysed (Figure 8-42).

Because the desired protein made from an expression vector is produced inside a
cell, it must be purified away from the host-cell proteins by chromatography
following cell lysis; but because it is such a plentiful species in the cell
(often 1-10% of the total cell protein), the purification is usually easy to
accomplish in only a few steps. As we saw in the first part of this chapter,
many expression vectors have been designed to add a molecular tag (a cluster of
histidine residues or a small marker protein) to the expressed protein to
facilitate easy purification by affinity chromatography (see Figure 8-11). A
variety of expression vectors is available, each engineered to function in the
type of cell in which the protein is to be made.

Figure 8-41 Production of large amounts of a protein from a protein-coding DNA
sequence cloned into an expression vector and introduced into cells. A plasmid
vector has been engineered to contain a highly active promoter, which causes
unusually large amounts of mRNA to be produced from an adjacent protein-coding
gene inserted into the plasmid vector. Depending on the characteristics of the
cloning vector, the plasmid is introduced into bacterial, yeast, insect, or
mammalian cells, where the inserted gene is efficiently transcribed and
translated into protein. If the gene to be overexpressed has no introns (typical
for genes from bacteria, archaea, and simple eukaryotes), it can simply be
cloned from genomic DNA by PCR. For cloned animal and plant genes, it is often
more convenient to obtain the gene as cDNA, either from a cDNA library (see
Figure 8-32) or cloned directly by PCR from RNA isolated from the organism (see
Figure 8-37). Alternatively, the DNA coding for the protein can be made by
chemical synthesis (see p. 472).

This technology is also used to make large amounts of many medically use- 
ful proteins, including hormones (such as insulin and growth factors) used as 
human pharmaceuticals, and viral coat proteins for use in vaccines. Expression 
vectors also allow scientists to produce many proteins of biological interest in 
large enough amounts for detailed structural studies. Nearly all three-dimen- 
sional protein structures depicted in this book are of proteins produced in this 
way. Recombinant DNA techniques thus allow scientists to move with ease from 
protein to gene, and vice versa, so that the functions of both can be explored on 
multiple fronts (Figure 8-43). 


Summary 


DNA cloning allows a copy of any specific part of a DNA or RNA sequence to be 
selected from the millions of other sequences in a cell and produced in unlimited 
amounts in pure form. DNA sequences can be amplified after breaking up chro- 
mosomal DNA and inserting the resulting DNA fragments into the chromosome of 
a self-replicating genetic element such as a plasmid. The resulting “genomic DNA 
library” is housed in millions of bacterial cells, each carrying a different cloned DNA 
fragment. Individual cells from this library that are allowed to proliferate produce 
large amounts of a single cloned DNA fragment. Bypassing cloning vectors and bac- 
terial cells altogether, the polymerase chain reaction (PCR) allows DNA cloning to 
be performed directly with a DNA polymerase and DNA primers—provided that 
the DNA sequence of interest is already known. 

The procedures used to obtain DNA clones that correspond in sequence to 
mRNA molecules are the same, except that a DNA copy of the mRNA sequence, 
called cDNA, is first made. Unlike genomic DNA clones, cDNA clones lack intron 
sequences, making them the clones of choice for analyzing the protein product of a 
gene. 

Nucleic acid hybridization reactions provide a sensitive means of detecting 
any nucleotide sequence of interest. The enormous specificity of this hybridization 
reaction allows any single-stranded sequence of nucleotides to be labeled with a 
radioisotope or chemical and used as a probe to find a complementary partner 
strand, even in a cell or cell extract that contains millions of different DNA and RNA 
sequences. DNA hybridization also makes it possible to use PCR to amplify any sec- 
tion of any genome once its sequence is known. 
Figure 8-42 Production of large amounts of a protein by using a plasmid
expression vector. In this example, an expression vector that overproduces a DNA
helicase has been introduced into bacteria. In this expression vector,
transcription from this coding sequence is under the control of a viral promoter
that becomes active only at a temperature of 37°C or higher. The total cell
protein, either from bacteria grown at 25°C (no helicase protein made) or after
a shift of the same bacteria to 42°C for up to 2 hours (helicase protein has
become the most abundant protein species in the lysate), has been analyzed by
SDS polyacrylamide-gel electrophoresis. (Courtesy of Jack Barry.)
Figure 8-43 Recombinant DNA techniques make it possible to move experimentally from gene to protein and from protein to gene. If a 
gene has been identified (right), its protein-coding sequence can be inserted into an expression vector to produce large quantities of the protein 
(see Figure 8-41), which can then be studied biochemically or structurally. If a protein has been purified based on its biochemical properties, 
mass spectrometry (see Figure 8-18) can be used to obtain a partial amino acid sequence, which is used to search a genome sequence for the 
corresponding nucleotide sequence. The complete gene can then be cloned by PCR from a sequenced genome (see Figure 8-37). The gene can 
also be manipulated and introduced into cells or organisms to study its function, a topic covered in the next section of this chapter. 




The nucleotide sequence of any genome can be determined rapidly and simply 
by using highly automated techniques based on several different strategies. Com- 
parison of the genome sequences of different organisms allows us to trace the evolu- 
tionary relationships among genes and organisms, and it has proved valuable for 
discovering new genes and predicting their functions. 

Taken together, these techniques for analyzing and manipulating DNA have 
made it possible to sequence, identify, and isolate genes from any organism of inter- 
est. Related technologies allow scientists to produce the protein products of these 
genes in the large quantities needed for detailed analyses of their structure and 
function, as well as for medical purposes. 


STUDYING GENE EXPRESSION AND FUNCTION 


Ultimately, one wishes to determine how genes—and the proteins they encode— 
function in the intact organism. Although it may seem counterintuitive, one of the 
most direct ways to find out what a gene does is to see what happens to the organ- 
ism when that gene is missing. Studying mutant organisms that have acquired 
changes or deletions in their nucleotide sequences is a time-honored practice 
in biology and forms the basis of the important field of genetics. Because muta- 
tions can disrupt cell processes, mutants often hold the key to understanding 
gene function. In the classical genetic approach, one begins by isolating mutants 
that have an interesting or unusual appearance: fruit flies with white eyes or curly 
wings, for example. Working backward from the phenotype—the appearance or 
behavior of the individual—one then determines the organism’s genotype, the 
form of the gene responsible for that characteristic (Panel 8-2). 

Today, with numerous genome sequences available, the exploration of gene 
function often begins with a DNA sequence. Here, the challenge is to translate 
sequence into function. One approach, discussed earlier in the chapter, is to 
search databases for well-characterized proteins that have similar amino acid 
sequences to the protein encoded by a new gene. From there, the protein (or for 
noncoding genes, the RNA molecule) can be overexpressed and purified and the 
methods described in the first part of this chapter can be employed to study its 
three-dimensional structure and its biochemical properties. But to determine 
directly a gene’s function in a cell or organism, the most effective approach 
involves studying mutants that either lack the gene or express an altered version 
of it. Determining which cell processes have been disrupted or compromised in 
such mutants will usually shed light on a gene’s biological role. 

In this section, we describe several approaches to determining a gene’s func- 
tion, starting either from an individual with an interesting phenotype or from a 
DNA sequence. We begin with the classical genetic approach, which starts with 
a genetic screen for isolating mutants of interest and then proceeds toward iden- 
tification of the gene or genes responsible for the observed phenotype. We then 
describe the set of techniques that are collectively called reverse genetics, in which 
one begins with a gene or gene sequence and attempts to determine its function.
This approach often involves some intelligent guesswork (searching for similar
sequences in other organisms or determining when and where a gene is expressed)
as well as generating mutant organisms and characterizing their phenotype.


Classical Genetics Begins by Disrupting a Cell Process by 
Random Mutagenesis 


Before the advent of gene cloning technology, most genes were identified by the 
abnormalities produced when the gene was mutated. Indeed, the very concept of 
the gene was deduced from the heritability of such abnormalities. This classical 
genetic approach—identifying the genes responsible for mutant phenotypes—is 
most easily performed in organisms that reproduce rapidly and are amenable to 
genetic manipulation, such as bacteria, yeasts, nematode worms, and fruit flies. 
Although spontaneous mutants can sometimes be found by examining extremely 
PANEL 8-2: Review of Classical Genetics


GENES AND PHENOTYPES


Gene: a functional unit of inheritance, usually corresponding to the segment of
DNA coding for a single protein.
Genome: all of an organism's DNA sequences.
Locus: the site of the gene in the genome.
Alleles: alternative forms of a gene.
Genotype: the specific set of alleles forming the genome of an individual.
Phenotype: the visible character of the individual.
Wild type: the normal, naturally occurring type.
Mutant: differing from the wild type because of a genetic change (a mutation).

Homozygous A/A, heterozygous a/A, homozygous a/a: allele A is dominant
(relative to a); allele a is recessive (relative to A).

In the example above, the phenotype of the heterozygote is the same as that of one of the
homozygotes; in cases where it is different from both, the two alleles are said to be co-dominant.
a chromosome near the end of the cell cycle, in metaphase; it is duplicated and
condensed, consisting of two identical sister chromatids (each containing one
DNA double helix) joined at the centromere.

A normal diploid chromosome set, as seen in a metaphase spread, prepared by
bursting open a cell at metaphase and staining the scattered chromosomes. In the
example shown schematically here, there are three pairs of autosomes
(chromosomes inherited symmetrically from both parents, regardless of sex) and
two sex chromosomes: an X from the mother and a Y from the father. The numbers
and types of sex chromosomes and their role in sex determination are variable
from one class of organisms to another, as is the

For simplicity, the cycle is shown for only one chromosome/chromosome pair.

haploid gametes (eggs or sperm)


The greater the distance between two loci on a single chromosome, the greater is
the chance that they will be separated by crossing over occurring at a site
between them. If two genes are thus reassorted in x% of gametes, they are said
to be separated on a chromosome by a genetic map distance of x map units (or
x centimorgans).
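
The relationship between reassortment frequency and map distance amounts to a one-line calculation: x% of reassorted gametes corresponds to a distance of x, in the units commonly called map units or centimorgans. The Python sketch below makes it explicit; the function name and the gamete counts are invented for illustration.

```python
def map_distance(parental, recombinant):
    """Genetic map distance between two loci: the percentage of gametes
    in which the two loci have been reassorted by crossing over."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("no gametes scored")
    return 100.0 * recombinant / total


# Invented counts: 80 reassorted gametes out of 1000 scored.
distance = map_distance(parental=920, recombinant=80)
```

With these counts, 8% of gametes are reassorted, so the two loci would be placed 8 map units apart.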





TYPES OF MUTATIONS

Point mutation: maps to a single site in the genome, corresponding to a single
nucleotide pair or a very small part of a single gene.

Inversion: inverts a segment of a chromosome.

Deletion: deletes a segment of a chromosome.

Translocation: breaks off a segment from one chromosome and attaches it to
another.

Lethal mutation: causes the developing organism to die prematurely.

Conditional mutation: produces its phenotypic effect only under certain
conditions, called the restrictive conditions. Under other conditions (the
permissive conditions) the effect is not seen. For a temperature-sensitive
mutation, the restrictive condition typically is high temperature, while the
permissive condition is low temperature.

Loss-of-function mutation: either reduces or abolishes the activity of the gene.
These are the most common class of mutations. Loss-of-function mutations are
usually recessive: the organism can usually function normally as long as it
retains at least one normal copy of the affected gene.

Null mutation: a loss-of-function mutation that completely abolishes the
activity of the gene.

Gain-of-function mutation: increases the activity of the gene or makes it active
in inappropriate circumstances; these mutations are usually dominant.

Dominant-negative mutation: a dominant-acting mutation that blocks gene
activity, causing a loss-of-function phenotype even in the presence of a normal
copy of the gene. This phenomenon occurs when the mutant gene product interferes
with the function of the normal gene product.

Suppressor mutation: suppresses the phenotypic effect of another mutation, so
that the double mutant seems normal. An intragenic suppressor mutation lies
within the gene affected by the first mutation; an extragenic suppressor
mutation lies in a second gene, often one whose product interacts directly with
the product of the first.


TWO GENES OR ONE? 


Given two mutations that produce the same phenotype, how can we tell whether
they are mutations in the same gene? If the mutations are recessive (as they
most often are), the answer can be found by a complementation test.

In the simplest type of complementation test, an individual who is homozygous
for one mutation is mated with an individual who is homozygous for the other.
The phenotype of the offspring gives the answer to the question.

NONCOMPLEMENTATION: TWO INDEPENDENT MUTATIONS IN THE SAME GENE
Homozygous mutant mother × homozygous mutant father: the hybrid offspring shows
the mutant phenotype, because no normal copies of the mutated gene are present.

COMPLEMENTATION: MUTATIONS IN TWO DIFFERENT GENES
Homozygous mutant mother × homozygous mutant father: the hybrid offspring shows
the normal phenotype, because one normal copy of each gene is present.
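
The logic of the complementation test can be captured in a few lines of Python. This is a deliberately simplified model: each parent is described only by the set of genes in which it is homozygous for a recessive loss-of-function mutation, and the gene names and function names are hypothetical.

```python
def offspring_phenotype(mother_mutant_genes, father_mutant_genes):
    """Phenotype of the hybrid offspring of two homozygous recessive mutants.

    The offspring inherits one copy of every gene from each parent, so a
    gene lacks any normal copy only if BOTH parents are mutant in it.
    """
    genes_without_normal_copy = mother_mutant_genes & father_mutant_genes
    return "mutant" if genes_without_normal_copy else "normal"


def mutations_in_same_gene(mother_mutant_genes, father_mutant_genes):
    """Noncomplementation (mutant offspring) implies the same gene is hit."""
    return offspring_phenotype(mother_mutant_genes, father_mutant_genes) == "mutant"


# Hypothetical gene names.
noncomplementing = offspring_phenotype({"geneX"}, {"geneX"})
complementing = offspring_phenotype({"geneX"}, {"geneY"})
```

The set intersection is the whole test: the offspring is mutant only when no normal copy of some gene is contributed by either parent.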
large populations (thousands or tens of thousands of individual organisms),
isolating mutant individuals is much more efficient if one generates mutations with 
chemicals or radiation that damage DNA. By treating organisms with such muta- 
gens, very large numbers of mutant individuals can be created quickly and then 
screened for a particular defect of interest, as we discuss shortly. 

An alternative approach to chemical or radiation mutagenesis is called inser- 
tional mutagenesis. This method relies on the fact that exogenous DNA inserted 
randomly into the genome can produce mutations if the inserted fragment inter- 
rupts a gene or its regulatory sequences. The inserted DNA, whose sequence is 
known, then serves as a molecular tag that aids in the subsequent identifica- 
tion and cloning of the disrupted gene (Figure 8-44). In Drosophila, the use of 
the transposable P element to inactivate genes has revolutionized the study of 
gene function in the fly. Transposable elements (see Table 5-4, p. 288) have also 
been used to generate mutations in bacteria, yeast, mice, and the flowering plant 
Arabidopsis. 


Genetic Screens Identify Mutants with Specific Abnormalities 


Once a collection of mutants in a model organism such as yeast or fly has been 
produced, one generally must examine thousands of individuals to find the 
altered phenotype of interest. Such a search is called a genetic screen, and the 
larger the genome, the less likely it is that any particular gene will be mutated. 
Therefore, the larger the genome of an organism, the bigger the screening task 
becomes. The phenotype being screened for can be simple or complex. Simple 
phenotypes are easiest to detect: one can screen many organisms rapidly, for 
example, for mutations that make it impossible for the organism to survive in the 
absence of a particular amino acid or nutrient. 

More complex phenotypes, such as defects in learning or behavior, may require 
more elaborate screens (Figure 8-45). But even genetic screens that are used to 
dissect complex physiological systems can be simple in design, which permits the 
simultaneous examination of large numbers of mutants. As an example, one par- 
ticularly elegant screen was designed to search for genes involved in visual pro- 
cessing in zebrafish. The basis of this screen, which monitors the fishes’ response 
to motion, is a change in behavior. Wild-type fish tend to swim in the direction of 
a perceived motion, whereas mutants with defects in their visual processing sys- 
tems swim in random directions—a behavior that is easily detected. One mutant 
discovered in this screen is called lakritz, which is missing 80% of the retinal gan- 
glion cells that help to relay visual signals from the eye to the brain. As the cellular 
organization of the zebrafish retina is similar to that of all vertebrates, the study 
of such mutants should also provide insights into visual processing in humans. 

Because defects in genes that are required for fundamental cell processes— 
RNA synthesis and processing or cell-cycle control, for example—are usu- 
ally lethal, the functions of these genes are often studied in individuals with 
Figure 8-44 Insertional mutant of the
snapdragon, Antirrhinum. A mutation
in a single gene coding for a regulatory
protein causes leafy shoots (left) to develop
in place of flowers, which occur in the 
normal plant (right). The mutation causes 
cells to adopt a character that would be 
appropriate to a different part of the normal 
plant, so instead of a flower, the cells 
produce a leafy shoot. (Courtesy of Enrico 
Coen and Rosemary Carpenter.) 


Figure 8-45 A behavioral phenotype 
detected in a genetic screen. (A) Wild- 
type C. elegans engage in social feeding. 
The worms migrate around until they 
encounter their neighbors and commence 
feeding on bacteria. (B) Mutant animals 
feed by themselves. (Courtesy of Cornelia 
Bargmann, Cell 94: cover, 1998. With 
permission from Elsevier.) 
conditional mutations. The mutant individuals function normally as long as
“permissive” conditions prevail, but demonstrate abnormal gene function when 
subjected to “nonpermissive” (restrictive) conditions. In organisms with tem- 
perature-sensitive mutations, for example, the abnormality can be switched on 
and off experimentally simply by changing the ambient temperature; thus, a cell 
containing a temperature-sensitive mutation in a gene essential for survival will 
die at a nonpermissive temperature but proliferate normally at the permissive 
temperature (Figure 8-46). The temperature-sensitive gene in such a mutant usu- 
ally contains a point mutation that causes a subtle change in its protein product; 
for example, the mutant protein may function normally at low temperatures but 
unfold at higher temperatures. 

Temperature-sensitive mutations were crucial for finding the bacterial genes that
encode the proteins required for DNA replication. The mutants were identified 
by screening populations of mutagen-treated bacteria for cells that stop making 
DNA when they are warmed from 30°C to 42°C. These mutants were later used 
to identify and characterize the corresponding DNA replication proteins (dis- 
cussed in Chapter 5). Similarly, screens for temperature-sensitive mutations led 
to the identification of many proteins involved in regulating the cell cycle, as well 
as many proteins involved in moving proteins through the secretory pathway 
in yeast. Related screening approaches demonstrated the function of enzymes 
involved in the principal metabolic pathways of bacteria and yeast (discussed in 
Chapter 2) and identified many of the gene products responsible for the orderly 
development of the Drosophila embryo (discussed in Chapter 21). 
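
The logic of a replica-plating screen for temperature-sensitive mutants can be sketched in a few lines of Python. This is only an illustration: the colony labels are invented, and find_ts_mutants is a hypothetical helper rather than part of any library. A colony is a candidate if it grows on the permissive plate but not on the nonpermissive one.

```python
def find_ts_mutants(grew_permissive, grew_nonpermissive):
    """Return colonies that grow at the permissive temperature only."""
    return sorted(set(grew_permissive) - set(grew_nonpermissive))

# Hypothetical replica plates: the same colonies scored at two temperatures.
plate_30C = {"c1", "c2", "c3", "c4", "c5"}   # permissive plate
plate_42C = {"c1", "c3", "c5"}               # nonpermissive plate

ts_candidates = find_ts_mutants(plate_30C, plate_42C)
print(ts_candidates)
```

In a real screen, each candidate would be retested, because a colony can fail to transfer to the replica plate for reasons unrelated to temperature.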


Mutations Can Cause Loss or Gain of Protein Function 


Gene mutations are generally classed as “loss of function” or “gain of function.” A 
loss-of-function mutation results in a gene product that either does not work or 
works too little; thus, it can reveal the normal function of the gene. A gain-of-func- 
tion mutation results in a gene product that works too much, works at the wrong 
time or place, or works in a new way (Figure 8-47). 

An important early step in the genetic analysis of any mutant cell or organ- 
ism is to determine whether the mutation causes a loss or a gain of function. A 
standard test is to determine whether the mutation is dominant or recessive. A 
dominant mutation is one that still causes the mutant phenotype in the presence 
of a single copy of the wild-type gene. A recessive mutation is one that is no longer 
able to cause the mutant phenotype in the presence of a single wild-type copy of 
the gene. Although cases have been described in which a loss-of-function muta- 
tion is dominant or a gain-of-function mutation is recessive, in the vast majority of 
cases, recessive mutations are loss of function and dominant mutations are gain 
Figure 8-46 Screening for temperature-sensitive
bacterial or yeast mutants.
Mutagenized cells are plated out at the
permissive temperature. They divide
and form colonies, which are transferred
to two identical Petri dishes by replica 
plating. One of these plates is incubated 
at the permissive temperature, the other 
at the nonpermissive temperature. Cells 
containing a temperature-sensitive 
mutation in a gene essential for proliferation 
can divide at the normal, permissive 
temperature but fail to divide at the 
elevated, nonpermissive temperature. 
Temperature-sensitive mutations of this 
type were especially useful for identifying 
genes needed for DNA replication, an 
essential process. 


Figure 8-47 Gene mutations that affect
their protein product in different ways.
In this example, the wild-type protein has
a specific cell function denoted by the red
rays. Mutations that eliminate this function
or inactivate it at higher temperatures are
shown. The conditional mutant protein
carries an amino acid substitution (red)
that prevents its proper folding at 37°C,
but allows the protein to fold and function
normally at 25°C. Such temperature- 
sensitive conditional mutations are 
especially useful for studying essential 
genes; the organism can be grown under 
the permissive condition and then be moved 
to the nonpermissive condition to study the 
consequences of losing the gene product. 
of function. It is easy to determine whether a mutation is dominant or recessive. One
simply mates a mutant with a wild type to obtain diploid cells or organisms. The
progeny from the mating will be heterozygous for the mutation. If the mutant phenotype
is no longer observed, one can conclude that the mutation is recessive and
is very likely to be a loss-of-function mutation (see Panel 8-2).


Complementation Tests Reveal Whether Two Mutations Are in the 
Same Gene or Different Genes 


A large-scale genetic screen can turn up many different mutations that show the 
same phenotype. These defects might lie in different genes that function in the 
same process, or they might represent different mutations in the same gene. Alter- 
native forms of the same gene are known as alleles. The most common difference 
between alleles is a substitution of a single nucleotide pair, but different alleles 
can also bear deletions, substitutions, and duplications. How can we tell, then, 
whether two mutations that produce the same phenotype occur in the same gene 
or in different genes? If the mutations are recessive—if, for example, they repre- 
sent a loss of function of a particular gene—a complementation test can be used 
to ascertain whether the mutations fall in the same gene or in different genes. To 
test complementation in a diploid organism, an individual that is homozygous 
for one mutation—that is, it possesses two identical alleles of the mutant gene in 
question—is mated with an individual that is homozygous for the other muta- 
tion. If the two mutations are in the same gene, the offspring show the mutant 
phenotype, because they still will have no normal copies of the gene in question 
(Figure 8-48). If, in contrast, the mutations fall in different genes, the resulting 
offspring show a normal phenotype, because they retain one normal copy (and 
one mutant copy) of each gene; the mutations thereby complement one another 
and restore a normal phenotype. Complementation testing of mutants identi- 
fied during genetic screens has revealed, for example, that 5 different genes are 
required for yeast to digest the sugar galactose, 20 genes are needed for E. coli to 
build a functional flagellum, 48 genes are involved in assembling bacteriophage 
T4 viral particles, and hundreds of genes are involved in the development of an 
adult nematode worm from a fertilized egg. 
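
As a rough sketch of how pairwise complementation results sort mutants into genes, consider the following Python example. It assumes that every mutation is recessive and that each mutant strain is homozygous for a single mutation; the mutant names and gene labels are hypothetical, and the hidden_gene table stands in for the crosses an experimenter would actually perform.

```python
def complementation_groups(mutants, fails_to_complement):
    """Group mutants by failure to complement.

    fails_to_complement(m1, m2) is True when the offspring of a cross
    between homozygous m1 and homozygous m2 still show the mutant phenotype.
    """
    groups = []
    for mutant in mutants:
        for group in groups:
            # Testing against one representative is enough in this simple model.
            if fails_to_complement(mutant, group[0]):
                group.append(mutant)
                break
        else:
            groups.append([mutant])
    return groups

# Hypothetical screen: the affected gene is unknown to the experimenter.
hidden_gene = {"m1": "geneA", "m2": "geneB", "m3": "geneA",
               "m4": "geneC", "m5": "geneB"}

def cross_shows_mutant_phenotype(m1, m2):
    # Offspring lack a normal copy only if both mutations hit the same gene.
    return hidden_gene[m1] == hidden_gene[m2]

groups = complementation_groups(sorted(hidden_gene), cross_shows_mutant_phenotype)
print(groups)
```

The number of groups gives a minimum estimate of the number of genes required for the process, which is how counts such as those quoted above are obtained.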


Gene Products Can Be Ordered in Pathways by Epistasis Analysis 


Once a set of genes involved in a particular biological process has been identified, 
the next step is often to determine in which order the genes function. Gene order 
is perhaps easiest to explain for metabolic pathways, where, for example, enzyme 
A is necessary to produce the substrate for enzyme B. In this case, we would say 
that the gene encoding enzyme A acts before (upstream of) the gene encoding 
enzyme B in the pathway. Similarly, where one protein regulates the activity of 
another protein, we would say that the former gene acts before the latter. Gene 
order can, in many cases, be determined purely by genetic analysis without any 
knowledge of the mechanism of action of the gene products involved. 

Suppose we have a biosynthetic process consisting of a sequence of steps, 
such that performance of step B is conditional on completion of the preceding 
step A; and suppose gene A is required for step A, and gene B is required for step 
B. Then a null mutation (a mutation that abolishes function) in gene A will arrest 
the process at step A, regardless of whether gene B is functional or not, whereas 
a null mutation in gene B will cause arrest at step B only if gene A is still active. 
In such a case, gene A is said to be epistatic to gene B. By comparing the pheno- 
types of the different combinations of mutations, we can therefore discover the 
order in which the genes act. This type of analysis is called epistasis analysis. As 
an example, the pathway of protein secretion in yeast has been analyzed in this 
way. Different mutations in this pathway cause proteins to accumulate aberrantly 
in the endoplasmic reticulum (ER) or in the Golgi apparatus. When a yeast cell is 
engineered to carry both a mutation that blocks protein processing in the ER and 
a mutation that blocks processing in the Golgi apparatus, proteins accumulate in 
Figure 8-48 A complementation test can
reveal that mutations in two different
genes are responsible for the same
abnormal phenotype. When an albino
(white) bird from one strain is bred with an
albino from a different strain, the resulting
offspring (bottom) have normal coloration.
This restoration of the wild-type plumage
indicates that the two white breeds lack
color because of recessive mutations
in different genes. (From W. Bateson,
Mendel’s Principles of Heredity, 1st ed. 
Cambridge, UK: Cambridge University 
Press, 1913.) 
the ER. This indicates that proteins must pass through the ER before being sent to
the Golgi and then secreted (Figure 8-49). Strictly speaking, an epistasis analysis
can only provide information about gene order in a pathway when both muta- 
tions are null alleles. When the mutations retain partial function, their epistasis 
interactions can be difficult to interpret. 
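
The reasoning behind epistasis analysis can be made concrete with a toy model in Python. The sketch below assumes a strictly linear pathway and null mutations, as in the secretion example; the function names are hypothetical. The rule it applies, that the double mutant resembles the mutant blocked at the earlier step, holds for this kind of substrate-dependent pathway and should not be carried over uncritically to regulatory pathways.

```python
PATHWAY = ["ER", "Golgi", "secreted"]  # compartment order, used only by the simulation

def accumulation_site(blocked):
    """Cargo stops at the first compartment whose exit is blocked."""
    for compartment in PATHWAY[:-1]:
        if compartment in blocked:
            return compartment
    return PATHWAY[-1]

def upstream_gene(pheno_a, pheno_b, pheno_double):
    """Infer which null mutant acts first from single- and double-mutant phenotypes."""
    if pheno_a == pheno_b:
        return None          # identical phenotypes are uninformative
    if pheno_double == pheno_a:
        return "A"
    if pheno_double == pheno_b:
        return "B"
    return None              # a new phenotype: not a simple linear pathway

pheno_a = accumulation_site({"ER"})             # mutant A blocks exit from the ER
pheno_b = accumulation_site({"Golgi"})          # mutant B blocks exit from the Golgi
pheno_ab = accumulation_site({"ER", "Golgi"})   # double mutant
order = upstream_gene(pheno_a, pheno_b, pheno_ab)
print(pheno_a, pheno_b, pheno_ab, order)
```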

Sometimes, a double mutant will show a new or more severe phenotype than 
either single mutant alone. This type of genetic interaction is called a synthetic 
phenotype, and if the phenotype is death of the organism, it is called synthetic 
lethality. In most cases, a synthetic phenotype indicates that the two genes act in 
two different parallel pathways, either of which is capable of mediating the same 
cell process. Thus, when both pathways are disrupted in the double mutant, the 
process fails altogether, and the synthetic phenotype is observed. 
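
A minimal Python sketch of the parallel-pathway explanation follows. It assumes the simplest case, in which either pathway alone is fully sufficient; the genotype labels are invented for illustration.

```python
def process_works(pathway1_ok, pathway2_ok):
    """The process succeeds if at least one of two parallel pathways is intact."""
    return pathway1_ok or pathway2_ok

genotypes = {
    "wild type":     (True, True),
    "mutant 1":      (False, True),
    "mutant 2":      (True, False),
    "double mutant": (False, False),
}

viability = {name: process_works(*state) for name, state in genotypes.items()}
for name, alive in viability.items():
    print(f"{name}: {'viable' if alive else 'inviable'}")
```

Only the double mutant fails, which is the signature of a synthetic phenotype: neither single mutation predicts the outcome on its own.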


Mutations Responsible for a Phenotype Can Be Identified Through 
DNA Analysis 


Once a collection of mutant organisms with interesting phenotypes has been 
obtained, the next task is to identify the gene or genes responsible for the altered 
phenotype. If the phenotype has been produced by insertional mutagenesis, 
locating the disrupted gene is fairly simple. DNA fragments containing the inser- 
tion (a transposon or a retrovirus, for example) are amplified by PCR, and the 
nucleotide sequence of the flanking DNA is determined. The gene affected by 
the insertion can then be identified by a computer-aided search of the complete 
genome sequence of the organism. 
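
The computer-aided search can be illustrated with a toy example in Python. Everything here is hypothetical: the short genome string, the gene coordinates, and the helper locate_insertion. The sketch also assumes that the flanking sequence lies on the forward strand immediately upstream of the insertion; real searches use much longer flanks, check both strands, and rely on alignment tools rather than exact string matching.

```python
def locate_insertion(genome, flank, genes):
    """Find the flanking sequence and report the gene containing the insertion.

    genes maps a gene name to (start, end), 0-based with end exclusive.
    Returns (junction_position, gene_name); either may be None.
    """
    pos = genome.find(flank)
    if pos == -1:
        return None, None
    junction = pos + len(flank)   # insertion lies immediately after the flank
    for name, (start, end) in genes.items():
        if start <= junction < end:
            return junction, name
    return junction, None

# Toy genome: two short "genes" separated by intergenic sequence.
genome = "TTGACA" + "ATGGCTAGCAAAGGT" + "CCCGGG" + "ATGTTTCACGATTGA" + "AATT"
genes = {"gene1": (6, 21), "gene2": (27, 42)}

hit = locate_insertion(genome, "TTTCAC", genes)
print(hit)
```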

If a DNA-damaging chemical was used to generate the mutations, identifying 
the inactivated gene is often more laborious, but there are several powerful strate- 
gies available. If the genome size of the organism is small (for example, for bacte- 
ria or simple eukaryotes), it is possible to simply determine the genome sequence 
of the mutant organism and identify the affected gene by comparison with the 
wild-type sequence. Because of the continuous accumulation of neutral muta- 
tions, there will probably be differences between the two genome sequences in 
addition to the mutation responsible for the phenotype. One way of proving that 
a mutation is causative is to introduce the putative mutation back into a normal 
organism and determine whether or not it causes the mutant phenotype. We will 
discuss how this is accomplished later in the chapter. 


Rapid and Cheap DNA Sequencing Has Revolutionized Human 
Genetic Studies 


Genetic screens in model experimental organisms have been spectacularly suc- 
cessful in identifying genes and relating them to various phenotypes, including 
many that are conserved between these organisms and humans. But how can we 
study humans directly? They do not reproduce rapidly, cannot be treated with 
mutagens, and, if they have a defect in an essential process such as DNA replica- 
tion, would die long before birth. 

Despite their limitations compared to model organisms, humans are becom- 
ing increasingly attractive subjects for genetic studies. Because the human 
Figure 8-49 Using genetics to determine
the order of function of genes. In normal 
cells, secretory proteins are loaded into 
vesicles, which fuse with the plasma 
membrane to secrete their contents into 
the extracellular medium. Two mutants, A 
and B, fail to secrete proteins. In mutant A, 
secretory proteins accumulate in the ER. In 
mutant B, secretory proteins accumulate 
in the Golgi. In the double mutant AB, 
proteins accumulate in the ER; this 
indicates that the gene defective in mutant 
A acts before the gene defective in mutant 
B in the secretory pathway. 
population is so large, spontaneous, nonlethal mutations have arisen in all human
genes—many times over. A substantial proportion of these remain in the genomes 
of present-day humans. The most deleterious of these mutations are discovered 
when the mutant individuals call attention to themselves by seeking medical help. 

With the recent advances that have enabled the sequencing of entire human 
genomes cheaply and quickly, we can now identify such mutations and study 
their evolution and inheritance in ways that were impossible even a few years ago. 
By comparing the sequences of thousands of human genomes from all around the 
world, we can begin to identify directly the DNA differences that distinguish one 
individual from another. These differences hold clues to our evolutionary origins 
and can be used to explore the roots of disease. 


Linked Blocks of Polymorphisms Have Been Passed Down from 
Our Ancestors 


When we compare the sequences of multiple human genomes, we find that any 
two individuals will differ in roughly 1 nucleotide pair in 1000. Most of these vari- 
ations are common and relatively harmless. When two sequence variants coexist 
in the population and both are common, the variants are called polymorphisms. 
The majority of polymorphisms are due to the substitution of a single nucleotide, 
called single-nucleotide polymorphisms or SNPs (Figure 8-50). The rest are due 
largely to insertions or deletions—called indels when the change is small, or copy 
number variations (CNVs) when it is large. Although these common variants can 
be found throughout the genome, they are not scattered randomly—or even inde- 
pendently. Instead, they tend to travel in groups called haplotype blocks—com- 
binations of polymorphisms that are inherited as a unit. 
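
To get a feel for what a difference of 1 nucleotide pair in 1000 means, a short calculation helps. The genome size used below (about 3.2 billion nucleotide pairs for one haploid human genome) is an assumption brought in for illustration and is not stated in this section.

```python
GENOME_SIZE_BP = 3_200_000_000   # assumption: approximate haploid human genome size
DIFFERENCE_RATE = 1 / 1000       # from the text: roughly 1 nucleotide pair in 1000

expected_differences = GENOME_SIZE_BP * DIFFERENCE_RATE
print(f"about {expected_differences:,.0f} differing nucleotide pairs")
```

On these assumptions, two individuals differ at a few million sites, which is why the way variants group into blocks matters so much in practice.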

To understand why such haplotype blocks exist, we need to consider our evo- 
lutionary history. It is thought that modern humans expanded from a relatively 
small population—perhaps around 10,000 individuals—that existed in Africa 
about 60,000 years ago. Among that small group of our ancestors, some individu- 
als will have carried one set of genetic variants, others a different set. The chromo- 
somes of a present-day human represent a shuffled combination of chromosome 
segments from different members of this small ancestral group of people. Because 
only about two thousand generations separate us from them, large segments of 
these ancestral chromosomes have passed from parent to child, unbroken by the 
crossover events that occur during meiosis. As described in Chapter 5, only a few 
crossovers occur between each set of homologous chromosomes during each 
meiosis (see Figure 5-53). 

As a result, certain sets of DNA sequences—and their associated polymor- 
phisms—have been inherited in linked groups, with little genetic rearrangement 
across the generations. These are the haplotype blocks. Like genes that exist in 
different allelic forms, haplotype blocks also come in a limited number of variants 
that are common in the human population, each representing a combination of 
DNA polymorphisms passed down from a particular ancestor long ago. 
Figure 8-50 Single-nucleotide polymorphisms (SNPs) are sites in the genome where two or
more alternative choices of a nucleotide are common in the population. Most such variations 
in the human genome occur at locations where they do not significantly affect a gene’s function. 


Polymorphisms Can Aid the Search for Mutations Associated with
Disease 


Mutations that give rise, in a reproducible way, to rare but clearly defined abnor- 
malities, such as albinism, hemophilia, or congenital deafness, can often be iden- 
tified by studies of affected families. Such single-gene, or monogenic, disorders 
are often referred to as Mendelian because their pattern of inheritance is easy to 
track. Moreover, individuals who inherit the causative mutation will exhibit the 
abnormality irrespective of environmental factors such as diet or exercise. But for 
many common diseases, the genetic roots are more complex. Instead of a single 
allele of a single gene, such disorders stem from a combination of contributions 
from multiple genes. And often, environmental factors have strong influences 
on the severity of the disorder. For these multigenic conditions, such as diabetes 
or arthritis, population studies are often helpful in tracking down the genes that 
increase the risk of getting the disease. 

In population studies, investigators collect DNA samples from a large number 
of people who have the disease and compare them to samples from a group of 
people who do not have the disease. They look for variants—SNPs, for example— 
that are more common among the people who have the disease. Because DNA 
sequences that are close together on a chromosome tend to be inherited together, 
the presence of such SNPs could indicate that an allele that increases the risk of 
the disease might lie nearby (Figure 8-51). Although, in principle, the disease 
could be caused by the SNP itself, the culprit is much more likely to be a change 
that is merely linked to the SNP as part of a haplotype block. 
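
The comparison at the heart of such a study can be sketched as a test on a 2 x 2 table of allele counts. The Python example below uses invented counts and a hypothetical helper, allele_association; it computes an odds ratio and a Pearson chi-square statistic without any continuity correction. The cutoff shown is the ordinary 5% critical value for one degree of freedom; genome-wide studies test so many SNPs that they apply far stricter thresholds.

```python
def allele_association(case_alt, case_ref, control_alt, control_ref):
    """Return (odds_ratio, chi_square) for a 2x2 table of allele counts."""
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    n = case_alt + case_ref + control_alt + control_ref
    # Pearson chi-square for a 2x2 table (1 degree of freedom).
    numerator = n * (case_alt * control_ref - case_ref * control_alt) ** 2
    denominator = ((case_alt + case_ref) * (control_alt + control_ref)
                   * (case_alt + control_alt) * (case_ref + control_ref))
    return odds_ratio, numerator / denominator

# Hypothetical allele counts: (case_alt, case_ref, control_alt, control_ref)
snps = {
    "snp_1": (520, 1480, 500, 1500),   # similar frequency in both groups
    "snp_2": (700, 1300, 500, 1500),   # enriched among affected individuals
}
CHI2_CUTOFF = 3.84  # 5% critical value for 1 degree of freedom (illustrative only)

results = {name: allele_association(*counts) for name, counts in snps.items()}
for name, (odds, chi2) in results.items():
    print(name, round(odds, 2), round(chi2, 1), chi2 > CHI2_CUTOFF)
```

A SNP that passes such a test marks a region of the chromosome, not necessarily the causative change itself, for the reasons given above.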

Such genome-wide association studies have been used to search for genes that 
predispose individuals to common diseases, including diabetes, coronary artery 
disease, rheumatoid arthritis, and even depression. For many of these condi- 
tions, the DNA polymorphisms identified increase the risk of disease only slightly. 
Moreover, environmental factors (diet and exercise, for example) play an important
role in the onset and severity of the disease. Nonetheless, the identification of 
genes affected by these polymorphisms is leading to a mechanistic understanding 
of some of our most common disorders. 


Genomics Is Accelerating the Discovery of Rare Mutations That 
Predispose Us to Serious Disease 


The genetic variants that have thus far allowed us to identify some of the genes 
that increase our risk of disease are common ones. They arose long ago in our evo- 
lutionary past and are now present, in one form or another, in a substantial por- 
tion (1% or more) of the population. Such polymorphisms are thought to account 
Figure 8-51 Genes that affect the
risk of developing a common disease
can often be tracked down through
linkage to SNPs. Here, the patterns
of SNPs are compared between two
sets of individuals—a set of healthy
controls and a set affected by a particular
common disease. A segment of a
typical chromosome is shown. For most
polymorphic sites in this segment, it is
a random matter whether an individual
has one SNP variant (red vertical bars)
or another (blue vertical bars); this same
randomness is seen both for the control
group and for the affected individuals.
However, in the part of the chromosome
that is shaded in darker gray, a bias is
seen: most normal individuals have the
blue SNP variants, whereas most affected
individuals have the red SNP variants.
This suggests that this region contains,
or is close to, a gene that is genetically
linked to these red SNP variants and that
predisposes individuals to the disease.
Using carefully selected controls and
thousands of affected individuals, this
approach can help track down disease-
related genes, even when they confer only
a slight increase in the risk of developing
the disease.
for about 90% of the differences between one person’s genome and another’s. But
when we try to tie these common variants to differences in disease susceptibility 
or other heritable traits, such as height, we find that they do not have as much 
predictive power as we had anticipated: thus, for example, most confer relatively 
small increases—less than twofold—in the risk of developing a common disease. 

In contrast to polymorphisms, rare DNA variants—those much less frequent 
in humans than SNPs—can have large effects on the risk of developing some com- 
mon diseases. For example, a number of different loss-of-function mutations, 
each individually rare, have been found to increase greatly the predisposition 
to autism and schizophrenia. Many of these are de novo mutations, which arose 
spontaneously in the germ-line cells of one or the other parent. The fact that these 
mutations arise spontaneously with some frequency could help explain why 
these common disorders—each observed in about 1% of the population—remain 
with us, even though the affected individuals leave few or no descendants. These 
rare mutations may arise in any one of hundreds of different genes, which could 
explain much of the clinical variability of autism and schizophrenia. Because they 
are kept rare by natural selection, most such variants with a large effect on risk 
would be missed by genome-wide association studies. 

Now that DNA sequencing has become fast and inexpensive, the most efficient 
and cost-effective way to identify these rare, large-effect mutations is by sequenc- 
ing the genomes of affected individuals, along with those of their parents and sib- 
lings as controls. 
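
The core comparison in such a family study can be sketched very simply: a candidate de novo mutation is a variant present in the affected child but in neither parent. The Python example below uses invented variant calls and a hypothetical helper, de_novo_variants. It ignores sequencing errors and uneven coverage, which in practice produce many false candidates that must be filtered out.

```python
def de_novo_variants(child, mother, father):
    """Return variants seen in the child but in neither parent."""
    return sorted(set(child) - set(mother) - set(father))

# Hypothetical variant calls: (chromosome, position, alternative base)
mother = {("chr1", 1050, "T"), ("chr2", 300, "G")}
father = {("chr1", 1050, "T"), ("chr7", 8800, "A")}
child = {("chr1", 1050, "T"), ("chr7", 8800, "A"), ("chr12", 4521, "C")}

candidates = de_novo_variants(child, mother, father)
print(candidates)
```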


Reverse Genetics Begins with a Known Gene and Determines 
Which Cell Processes Require Its Function 


As we have seen, classical genetics starts with a mutant phenotype (or, in the case 
of humans, a range of characteristics) and identifies the mutations, and conse- 
quently the genes, responsible for it. Recombinant DNA technology has made 
possible a different type of genetic approach, one that is used widely in a variety 
of genetically tractable species. Instead of beginning with a mutant organism and 
using it to identify a gene and its protein, an investigator can start with a particu- 
lar gene and proceed to make mutations in it, creating mutant cells or organisms 
so as to analyze the gene’s function. Because this approach reverses the tradi- 
tional direction of genetic discovery—proceeding from genes to mutations, rather 
than vice versa—it is commonly referred to as reverse genetics. And because the 
genome of the organism is deliberately altered in a particular way, this approach 
is also called genome engineering or genome editing. We shall see in this chapter 
that this approach can be scaled upward so that whole collections of organisms 
can be created, each of which has a different gene altered. 

There are several ways a gene of interest can be altered. In the simplest, the 
gene can simply be deleted from the genome, although in a diploid organism, 
this requires that both copies—one on each chromosome homolog—be deleted. 
Although somewhat counterintuitive, one of the best ways to discover the func- 
tion of a gene is by observing the effects of not having it. Such “gene knockouts” 
are especially useful if the gene is not essential. Through reverse genetics, the gene 
in question (even if it is essential) can also be replaced by one that is expressed in 
the wrong tissue or at the wrong time in development; this type of manipulation 
often provides important clues to the gene’s normal function. For example, a gene 
of interest can be modified to be expressed at will by the experimenter (Figure 
8-52). Finally, genes can also be engineered so that they are expressed normally 
in most cell types and tissues but deleted in certain cell types or tissues selected 
by the experimenter (see Figure 5-66). This approach is especially useful when a 
gene has different roles in different tissues. 

It is also possible to make subtler changes to a gene. It is sometimes useful to 
make slight changes in a protein’s structure so that one can begin to dissect which 
portions of a protein are important for its function. The activity of an enzyme, for 
example, can be studied by changing a single amino acid in its active site. It is 
also possible, through genome engineering, to create new types of proteins in an 
Figure 8-52 Engineered genes can be turned on and off with small molecules. Here, the DNA-binding portion of a bacterial
protein (the tetracycline, Tet, repressor) has been fused to a portion of a mammalian transcriptional activator and expressed
in cultured mammalian cells. The engineered gene X, present in place of the normal gene, has its usual gene control region
replaced by cis-regulatory sequences recognized by the tetracycline repressor. In the absence of doxycycline (a particularly 
stable version of tetracycline), the engineered gene is expressed; in the presence of doxycycline, the gene is turned off because 
the drug causes the tetracycline repressor to dissociate from the DNA. This strategy can also be used in mice by incorporating 
the engineered genes into the germ line. In many tissues, the gene can be turned on and off simply by adding or removing 
doxycycline from the animal’s water. If the tetracycline repressor construct is placed under the control of a tissue-specific gene 
control region, the engineered gene will be turned on and off only in that tissue. 


animal. For example, a gene can be fused to the gene for a fluorescent protein. 
When this altered gene is introduced into the genome, the protein can be tracked 
in the living organism by monitoring its fluorescence. 

Altered genes can be created in several ways. Perhaps the simplest is to chem- 
ically synthesize the DNA that makes up the gene. In this way, the investigator 
can specify any type of variant of the normal gene. It is also possible to construct 
altered genes using recombinant DNA technology, as described earlier in this 
chapter. Once obtained, altered genes can be introduced into cells in a variety 
of ways. DNA can be microinjected into mammalian cells with a glass micropi- 
pette or introduced by a virus that has been engineered to carry foreign genes. In 
plant cells, genes are frequently introduced by a technique called particle bom- 
bardment: DNA samples are painted onto tiny gold beads and then literally shot 
through the cell wall with a specially modified gun. Electroporation is the method 
of choice for introducing DNA into bacteria and some other cells. In this tech- 
nique, a brief electric shock renders the cell membrane temporarily permeable, 
allowing foreign DNA to enter the cytoplasm. 

To be most useful to experimenters, the altered gene, once it is introduced into 
a cell, must recombine with the cell’s genome so that the normal gene is replaced. 
In simple organisms such as bacteria and yeasts, this process occurs with high fre- 
quency using the cell’s own homologous recombination machinery, as described 
in Chapter 5. In more complex organisms that have elaborate developmental 
programs, the procedure is more complicated because the altered gene must be 
introduced into the germ line, as we next describe. 


Animals and Plants Can Be Genetically Altered 


Animals and plants that have been genetically engineered by gene deletion or 
gene replacement are called transgenic organisms, and any foreign or modified 
genes that are added are called transgenes. We discuss transgenic plants later in 
this chapter and, for now, concentrate our discussion on transgenic mice, as enormous
progress has been made in this area. If a DNA molecule carrying a mutated
mouse gene is transferred into a mouse cell, it often inserts into the chromosomes 
at random, but methods have been developed to direct the mutant gene to replace 
the normal gene by homologous recombination. By exploiting these “gene target- 
ing” events, any specific gene can be altered or inactivated in a mouse cell by a 
direct gene replacement. In the case in which both copies of the gene of interest 
are completely inactivated or deleted, the resulting animal is called a “knockout” 
mouse. The technique is summarized in Figure 8-53. 
The ability to prepare transgenic mice lacking a known normal gene has been
a major advance, and the technique has been used to determine the functions of 
many mouse genes (Figure 8-54). If the gene functions in early development, a 
knockout mouse will usually die before it reaches adulthood. These lethal defects 
can be carefully analyzed to help determine the function of the missing gene. 
As described in Chapter 5, an especially useful type of transgenic animal takes 
advantage of a site-specific recombination system to excise—and thus disable— 
the target gene in a particular place or at a particular time (see Figure 5-66). In this 
case, the target gene in embryonic stem (ES) cells is replaced by a fully functional 
version of the gene that is flanked by a pair of the short DNA sequences, called lox 
sites, that are recognized by the Cre recombinase protein. The transgenic mice that 
result are phenotypically normal. They are then mated with transgenic mice that 
express the Cre recombinase gene under the control of an inducible promoter. In 


Figure 8-53 Summary of the procedures 
used for making gene replacements 
in mice. In the first step (A), an altered 
version of the gene is introduced into 
cultured ES (embryonic stem) cells. These 
cells are described in detail in Chapter 
22. Only a few ES cells will have their 
corresponding normal genes replaced by 
the altered gene through a homologous 
recombination event. These cells can be 
identified by PCR and cultured to produce 
many descendants, each of which carries 
an altered gene in place of one of its two 
normal corresponding genes. In the next 
step of the procedure (B), these altered ES 
cells are injected into a very early mouse 
embryo; the cells are incorporated into the 
growing embryo, and a mouse produced 
by such an embryo will contain some 
somatic cells (indicated by orange) that 
carry the altered gene. Some of these mice 
will also contain germ-line cells that contain 
the altered gene; when bred with a normal 
mouse, some of the progeny of these mice 
will contain one copy of the altered gene in 
all of their cells. 

The mice with the transgene in their 
germ line are then bred to produce 
both a male and a female animal, each 
heterozygous for the gene replacement 
(that is, they have one normal and one 
mutant copy of the gene). When these two 
mice are mated (not shown), one-fourth of 
their progeny will be homozygous for the 
altered gene. 
the specific cells or tissues in which Cre is switched on, it catalyzes recombination 
between the lox sequences—excising a target gene and eliminating its activity 
(see Figure 22-5). 


The Bacterial CRISPR System Has Been Adapted to Edit 
Genomes in a Wide Variety of Species 


One of the difficulties in making transgenic mice by the procedure just described 
is that the introduced DNA molecule (bearing the experimentally altered gene) 
often inserts at random in the genome, and many ES cells must therefore be 
screened individually to find one that has the “correct” gene replacement. 

Creative use of the CRISPR system, discovered in bacteria as a defense against 
viruses, has largely solved this problem. As described in Chapter 7, the CRISPR 
system uses a guide RNA sequence to target (through complementary base-pair- 
ing) double-stranded DNA, which it then cleaves (see Figure 7-78). The gene cod- 
ing for the key component of this system, the bacterial Cas9 protein, has been 
transferred into a variety of organisms, where it greatly simplifies the process of 
making transgenic organisms (Figure 8-55A and B). The basic strategy is as fol- 
lows: Cas9 protein is expressed in ES cells along with a guide RNA designed by the 
experimenter to target a particular location on the genome. The Cas9 and guide 
RNA associate, the complex is brought to the matching sequence on the genome, 
and the Cas9 protein makes a double-strand break. As we saw in Chapter 5, dou- 
ble-strand breaks are often repaired by homologous recombination; here, the 
template chosen by the cell to repair the damage is often the altered gene, which 
is introduced to ES cells by the experimenter. In this way, the normal gene can be 
selectively damaged by the CRISPR system and replaced at high efficiency by the 
experimentally altered gene. 
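
The targeting rule described above can be illustrated with a short sketch that lists candidate Cas9 target sites in a DNA sequence. It rests on assumptions not stated in the text: the PAM is taken to be NGG (the motif recognized by the commonly used Cas9 from Streptococcus pyogenes), the target is taken to be 20 nucleotides long, and the function names are hypothetical. The variable part of the guide RNA is written as the target sequence with U in place of T.

```python
# Sketch: list candidate Cas9 target sites in a DNA sequence.
# Assumptions (not from the text): the PAM is "NGG", read 5'->3' immediately
# after a 20-nucleotide target on the same strand; all names are illustrative.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def find_target_sites(dna, target_len=20):
    """Return (strand, start, target, pam) for every target followed by NGG.

    `start` is the 0-based index of the target's first base on the strand
    being scanned ("+" is the given strand, "-" its reverse complement).
    """
    sites = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        # Stop early enough that a full target plus a 3-base PAM still fits.
        for i in range(len(seq) - target_len - 2):
            pam = seq[i + target_len : i + target_len + 3]
            if pam[1:] == "GG":  # first PAM base can be any nucleotide
                sites.append((strand, i, seq[i : i + target_len], pam))
    return sites


def guide_sequence(target):
    """Variable part of the guide RNA: the target sequence with U for T."""
    return target.replace("T", "U")


# Hypothetical sequence: 20 A's followed by a TGG PAM and some filler.
example = "A" * 20 + "TGG" + "C" * 5
for strand, start, target, pam in find_target_sites(example):
    print(strand, start, guide_sequence(target), pam)
```

A real design tool would also check the rest of the genome for near-matches to each candidate; this sketch only enumerates sites.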

The CRISPR system has a variety of other uses. Its particular power lies with its 
ability to target Cas9 to thousands of different positions across a genome through 
the simple rules of complementary base-pairing. Thus, if a catalytically inactive 
Cas9 protein is fused to a transcription activator or repressor, it is possible, in prin- 
ciple, to turn any gene on or off (Figure 8-55C and D). 

Figure 8-54 Transgenic mice engineered 
to express a mutant DNA helicase 
show premature aging. The helicase, 
encoded by the Xpd gene, is involved 
in both transcription and DNA repair. 
Compared with a wild-type mouse of the 
same age (A), a transgenic mouse that 
expresses a defective version of Xpd 
(B) exhibits many of the symptoms of 
premature aging, including osteoporosis, 
emaciation, early graying, infertility, and 
reduced life-span. The mutation in Xpd 
used here impairs the activity of the 
helicase and mimics a mutation that in 
humans causes trichothiodystrophy, a 
disorder characterized by brittle hair, 
skeletal abnormalities, and a very reduced 
life expectancy. These results indicate 
that an accumulation of DNA damage can 
contribute to the aging process in both 
humans and mice. (From J. de Boer et 
al., Science 296:1276-1279, 2002. With 
permission from AAAS.) 


Figure 8-55 Use of CRISPR to study 
gene function in a wide variety of 
species. (A) The Cas9 protein (artificially 
expressed in the species of interest) binds to 
a guide RNA, designed by the experimenter 
and also expressed. The portion of RNA 
in light blue is needed for association 
with Cas9; that in dark blue is specified by 
the experimenter to match a position on 
the genome. The only other requirement 
is that the adjacent genome sequence 
includes a short PAM (protospacer adjacent 
motif) that is needed for Cas9 to cleave. 
As described in Chapter 7, this sequence 
is how the CRISPR system in bacteria 
distinguishes its own genome from that of 
invading viruses. (B) When directed to make 
double-strand breaks, the CRISPR system 
greatly improves the ability to replace an 
endogenous gene with an experimentally 
altered gene since the altered gene is used 
to “repair” the double-strand break. (C, D) 
By using a mutant form of Cas9 that can no 
longer cleave DNA, Cas9 can be used to 
activate a normally dormant gene (C) or turn 
off an actively expressed gene (D). (Adapted 
from P. Mali et al., Nat. Methods 10:957- 
963, 2013. With permission from Macmillan 
Publishers Ltd.) 

The CRISPR system has several advantages over other strategies for experimentally 
manipulating gene expression. First, it is relatively easy for the experimenter 
to design the guide RNA: it simply follows standard base-pairing conventions. 
Second, the gene to be controlled does not have to be modified; the CRISPR 
strategy exploits DNA sequences already present in the genome. Third, numer- 
ous genes can be controlled simultaneously. Cas9 has to be expressed only once, 
but many guide RNAs can be expressed in the same cell; this strategy allows the 
experimenter to turn on or off a whole set of genes at once. 

The export of the CRISPR system from bacteria to virtually all other experi- 
mental organisms (including mice, zebrafish, worms, flies, rice, and wheat) has 
revolutionized the study of gene function. Like the earlier discovery of restriction 
enzymes, this breakthrough came from scientists studying a fascinating phenom- 
enon in bacteria without—at first—realizing the enormous impact these discov- 
eries would have on all aspects of biology. 


Large Collections of Engineered Mutations Provide a Tool for 
Examining the Function of Every Gene in an Organism 


Extensive collaborative efforts have produced comprehensive libraries of muta- 
tions in a variety of model organisms, including S. cerevisiae, C. elegans, Drosoph- 
ila, Arabidopsis, and even the mouse. The ultimate aim in each case is to produce a 
collection of mutant strains in which every gene in the organism has been system- 
atically deleted or altered in such a way that it can be conditionally disrupted. Col- 
lections of this type provide an invaluable resource for investigating gene function 
on a genomic scale. For example, a large collection of mutant organisms can be 
screened for a particular phenotype. Like the classic genetic approaches described 
earlier, this is one of the most powerful ways to identify the genes responsible for 
a particular phenotype. Unlike the classical genetic approach, however, the set of 
mutants is “pre-engineered,” so that there is no need to rely on chance events such 
as spontaneous mutations or transposon insertions. In addition, each of the indi- 
vidual mutations within the collection is often engineered to contain a distinct 
molecular “barcode” —in the form of a unique DNA sequence—designed to make 
identification of the altered gene rapid and routine (Figure 8-56). 

Figure 8-56 Making barcoded collections of mutant organisms. A deletion construct for use 
in yeast contains DNA sequences (red) homologous to each end of a target gene x, a selectable 
marker gene (blue), and a unique “barcode” sequence approximately 20 nucleotide pairs in 
length (green). This DNA is introduced into yeast cells, where it readily replaces the target gene 
by homologous recombination. Cells that carry a successful gene replacement are identified by 
expression of the selectable marker gene, typically a gene that provides resistance to a drug. By 
using a collection of such constructs, each specific for one gene, a library of yeast mutants was 
constructed containing a mutant for every gene. Essential genes cannot be studied this way, as 
their deletion from the genome causes the cells to die. In this case, the target gene is replaced by a 
version of the gene that can be regulated by the experimenter (see Figure 8-52). The gene can then 
be turned off and the effect of this can be monitored before the cells die. 

Figure 8-57 Genome-wide screens for fitness using a large pool of 
barcoded yeast deletion mutants. A large pool of yeast mutants, each 
with a different gene deleted and present in equal amounts, is grown under 
conditions selected by the experimenter. Some mutants (blue) grow normally, 
but others show reduced growth (orange and green) or no growth at all (red). 
The fitness of each mutant is experimentally determined in the following way. 
After the growth phase is completed, genomic DNA (isolated from the mixture 
of strains) is purified and the relative abundance of each mutant is determined 
by quantifying the level of the DNA barcode matched to each deletion. 
This can be done by sequencing the pooled genomic DNA or hybridizing 
it to microarrays (see Figure 8-64) that contain DNA oligonucleotides 
complementary to each barcode. In this way, the contribution of every gene 
to growth under the specified condition can be rapidly ascertained. This type 
of study has revealed that of the approximately 6000 coding genes in yeast, 
only about 1000 are essential under standard growth conditions. 


In S. cerevisiae, the task of generating a complete set of 6000 mutants, each 
missing only one gene, was accomplished several years ago. Because each mutant 
strain has an individual barcode sequence embedded in its genome, a large mix- 
ture of engineered strains can be grown under various selective test conditions— 
such as nutritional deprivation, a temperature shift, or the presence of various 
drugs—and the cells that survive can be rapidly identified by the unique sequence 
tags present in their genomes. By assessing how well each mutant in the mixture 
fares, one can begin to discern which genes are essential, useful, or irrelevant for 
growth under the various conditions (Figure 8-57). 
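
The barcode-counting logic can be sketched in a few lines. The example below is illustrative only: the read counts are invented, the function name is hypothetical, and the score used (the log2 change in each strain's share of the pool, with a small pseudocount) is one simple choice among several possible ones.

```python
import math


def barcode_fitness(counts_before, counts_after, pseudocount=1.0):
    """Return {strain: log2 change in that strain's share of the pool}.

    counts_before / counts_after map each strain (identified by its barcode)
    to the number of sequence reads carrying that barcode. The pseudocount
    keeps the logarithm defined for strains that drop to zero reads.
    """
    n = len(counts_before)
    total_before = sum(counts_before.values()) + pseudocount * n
    total_after = sum(counts_after.get(s, 0) for s in counts_before) + pseudocount * n
    scores = {}
    for strain, before in counts_before.items():
        share_before = (before + pseudocount) / total_before
        share_after = (counts_after.get(strain, 0) + pseudocount) / total_after
        scores[strain] = math.log2(share_after / share_before)
    return scores


# Invented read counts: four deletion strains start at equal abundance.
before = {"blue": 1000, "orange": 1000, "green": 1000, "red": 1000}
after = {"blue": 3000, "orange": 800, "green": 200, "red": 0}

for strain, score in sorted(barcode_fitness(before, after).items(),
                            key=lambda item: item[1], reverse=True):
    print(f"{strain:>6}: {score:+.2f}")
```

Because the scores are relative to the pool as a whole, a strain that grows normally gains share when others fall behind; the ranking, rather than the absolute value, is the informative part.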

The insights generated by examining mutant libraries can be considerable. 
For example, studies of an extensive collection of mutants in Mycoplasma gen- 
italium—the organism with the smallest known genome—have identified the 
minimum complement of genes essential for cellular life. Growth under labora- 
tory conditions requires about three-quarters of the 480 protein-coding genes in 
M. genitalium. Approximately 100 of these essential genes are of unknown func- 
tion, which suggests that a surprising number of the basic molecular mechanisms 
that underlie life have yet to be discovered. 

Collections of mutant organisms are also available for many animal and plant 
species. For example, it is possible to “order,” by phone or email from a consortium 
of investigators, a deletion or insertion mutant for almost all coding genes in Dro- 
sophila. Likewise, a nearly complete set of mutants exists for the “model” plant 
Arabidopsis. And the adaptation of the CRISPR system for use in mice means that, 
in the near future, we can expect to be able to turn on or off—at will—each gene 
in the mouse genome. Although we are still ignorant about the function of most 
genes in most organisms, these technologies allow an exploration of gene func- 
tion on a scale that was unimaginable a decade ago. 


RNA Interference Is a Simple and Rapid Way to Test Gene 
Function 


Although knocking out (or conditionally expressing) a gene in an organism and 
studying the consequences is the most powerful approach for understanding the 
functions of the gene, RNA interference (RNAi, for short) is an alternative, particularly 
convenient approach. As discussed in Chapter 7, this method exploits a 
natural mechanism used in many plants, animals, and fungi to protect themselves 
against viruses and transposable elements. The technique introduces into a cell or 
organism a double-stranded RNA molecule whose nucleotide sequence matches 
that of part of the gene to be inactivated. After the RNA is processed, it hybrid- 
izes with the target-gene RNA (either mRNA or noncoding RNA) and reduces its 
expression by the mechanisms shown in Figure 7-75. 

RNAi is frequently used to inactivate genes in Drosophila and mammalian cell 
culture lines. Indeed, a set of 15,000 Drosophila RNAi molecules (one for every 
coding gene) allows researchers, in several months, to test the role of every fly 
gene in any process that can be monitored using cultured cells. RNAi has also been 
widely used to study gene function in whole organisms, including the nematode 

Figure 8-58 Gene function can be tested by RNA interference. (A) Double-stranded RNA 
(dsRNA) can be introduced into C. elegans by (1) feeding the worms E. coli that express the dsRNA 
or (2) injecting the dsRNA directly into the animal’s gut. (B) In a wild-type worm embryo, the egg and 
sperm pronuclei (red arrowheads) come together in the posterior half of the embryo shortly after 
fertilization. (C) In an embryo in which a particular gene has been inactivated by RNAi, the pronuclei 
fail to migrate. This experiment revealed an important but previously unknown function of this gene 
in embryonic development. (B and C, from P. Gönczy et al., Nature 408:331-336, 2000. With 
permission from Macmillan Publishers Ltd.) 


C. elegans. When working with worms, introducing the double-stranded RNA is 
quite simple: the RNA can be injected directly into the intestine of the animal, or 
the worm can be fed with E. coli engineered to produce the RNA (Figure 8-58). The 
RNA is amplified (see p. 431) and distributed throughout the body of the worm, 
where it inhibits expression of the target gene in different tissue types. RNAi is 
being used to help in assigning functions to the entire complement of worm genes 
(Figure 8-59). 

A related technique has also been applied to mice. In this case, the RNAi mol- 
ecules are not injected or fed to the mouse; rather, recombinant DNA techniques 
are used to make transgenic animals that express the RNAi under the control of an 
inducible promoter. Often this is a specially designed RNA that can fold back on 
itself and, through base-pairing, produce a double-stranded region that is recog- 
nized by the RNAi machinery. In the simplest cases, the process inactivates only 
the genes that exactly match the RNAi sequence. Depending on the inducible 
promoter used, the RNAi can be produced only in a specified tissue or only at a 
particular time in development, allowing the functions of the target genes to be 
analyzed in elaborate detail. 

RNAi has made reverse genetics simple and efficient in many organisms, but 
it has several potential limitations compared with true genetic knockouts. For 

Figure 8-59 RNA interference provides 
a convenient method for conducting 
genome-wide genetic screens. In this 
experiment, each well in this 96-well 
plate is filled with E. coli that produce 
a different double-stranded RNA. Each 
interfering RNA matches the nucleotide 
sequence of a single C. elegans gene, 
thereby inactivating it. About 10 worms 
are added to each well, where they ingest 
the genetically modified bacteria. The plate 
is incubated for several days, which gives 
the RNAs time to inactivate their target 
genes—and the worms time to grow, mate, 
and produce offspring. The plate is then 
examined in a microscope, which can be 
controlled robotically, to screen for genes 
that affect the worms’ ability to survive, 
reproduce, develop, and behave. Shown 
here are normal worms alongside worms 
that show an impaired ability to reproduce 
due to inactivation of a particular “fertility” 
gene. (From B. Lehner et al., Nat. Genet. 
38:896-908, 2006. With permission from 
Macmillan Publishers Ltd.) 
unknown reasons, RNAi does not efficiently inactivate all genes. Moreover, within 
whole organisms, certain tissues may be resistant to the action of RNAi (for exam- 
ple, neurons in nematodes). Another problem arises because many organisms 
contain large gene families, the members of which exhibit sequence similarity. 
RNAi therefore sometimes produces “off-target” effects, inactivating related genes 
in addition to the targeted gene. One strategy to avoid such problems is to use 
multiple small RNA molecules matched to different regions of the same gene. 
Ultimately, the results of any RNAi experiment must be viewed as a strong clue to, 
but not necessarily a proof of, normal gene function. 
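
The off-target problem can be made concrete with a small sketch that asks which transcripts contain a site a given small RNA could pair with. Everything here is hypothetical: the sequences are invented, the function names are illustrative, and pairing is modeled only as a match to the reverse complement with an optional number of tolerated mismatches, which is far cruder than the behavior of the real RNAi machinery.

```python
# Sketch: which transcripts could a small interfering RNA pair with?
# Assumption: pairing is modeled as a (near-)exact match between the mRNA
# and the reverse complement of the guide strand. Sequences are invented.

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(rna):
    """Return the reverse complement of an upper-case RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def count_mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def transcripts_hit(guide, transcripts, max_mismatches=0):
    """Return names of transcripts containing a site the guide can pair with."""
    site = reverse_complement_rna(guide)
    hits = []
    for name, mrna in transcripts.items():
        for i in range(len(mrna) - len(site) + 1):
            if count_mismatches(site, mrna[i : i + len(site)]) <= max_mismatches:
                hits.append(name)
                break  # one matching site is enough to report the transcript
    return hits


# Two members of an invented gene family share a conserved 21-nucleotide
# region; geneC is unrelated.
conserved = "AUGGCUAGCUAGGAUCCGAUC"
unique_a = "GGAUACCUUAGCAAGUCCGUA"
family = {
    "geneA": unique_a + conserved,
    "geneB": "CCUAGGUUACGAUUCGAACGU" + conserved,
    "geneC": "UUUUCCCCAAAAGGGG" * 3,
}

guide_conserved = reverse_complement_rna(conserved)  # pairs with the shared region
guide_unique = reverse_complement_rna(unique_a)      # pairs with geneA only

print(transcripts_hit(guide_conserved, family))
print(transcripts_hit(guide_unique, family))
```

A guide matched to the shared region reports both family members, whereas a guide matched to a region unique to one gene reports only that gene, which is the reasoning behind choosing several small RNAs from different parts of the target.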


Reporter Genes Reveal When and Where a Gene Is Expressed 


In the preceding section, we discussed how genetic approaches can be used to 
assess a gene’s function in cultured cells or, even better, in the intact organism. 
Although this information is crucial to understanding gene function, it does not 
generally reveal the molecular mechanisms through which the gene product 
works in the cell. For example, genetics on its own rarely tells us all the places 
in the organism where the gene is expressed, or how its expression is controlled. 
It does not necessarily reveal whether the gene acts in the nucleus, the cytosol, 
on the cell surface, or in one of the numerous other compartments of the cell. 
And it does not reveal how a gene product might change its location or its expres- 
sion pattern when the external environment of the cell changes. Key insights into 
gene function can be obtained by simply observing when and where a gene is 
expressed. A variety of approaches, most involving some form of genetic engi- 
neering, can easily provide this critical information. 

As discussed in detail in Chapter 7, cis-regulatory DNA sequences, located 
upstream or downstream of the coding region, control gene transcription. These 
regulatory sequences, which determine precisely when and where the gene is 
expressed, can be easily studied by placing a reporter gene under their control 
and introducing these recombinant DNA molecules into cells (Figure 8-60). In 
this way, the normal expression pattern of a gene can be determined, as well as 

Figure 8-60 Using a reporter protein 
to determine the pattern of a gene’s 
expression. (A) In this example, the coding 
sequence for protein X is replaced by 
the coding sequence for reporter protein 
Y. The expression patterns for X and Y 
are the same. (B) Various fragments of 
DNA containing candidate cis-regulatory 
sequences are added in combinations to 
produce test DNA molecules encoding 
reporter gene Y. These recombinant DNA 
molecules are then tested for expression 
after introducing them into a variety of 
different types of mammalian cells. The 
results are summarized in (C). 

For experiments in eukaryotic cells, two 
commonly used reporter proteins are the 
enzyme β-galactosidase (β-gal) (see Figure 
7-28) and green fluorescent protein (GFP) 
(see Figure 9-22). 
the contribution of individual cis-regulatory sequences in establishing this pattern 
(see also Figure 7-29). 

Reporter genes also allow any protein to be tracked over time in living cells. 
Here, the reporter gene typically encodes a fluorescent protein, often green fluo- 
rescent protein (GFP), the molecule that gives luminescent jellyfish their green- 
ish glow. The GFP is simply attached—in the coding frame—to the protein-cod- 
ing gene of interest. The resulting GFP fusion protein often behaves in the same 
way the normal protein does and its location can be monitored by fluorescence 
microscopy, a topic that is discussed in the next chapter (see Figure 9-25). GFP 
fusion has become a standard strategy for tracking not only the location but also 
the movement of specific proteins in living cells. In addition, the use of multiple 
GFP variants that fluoresce at different wavelengths can provide insights into how 
different cells interact in a living tissue (Figure 8-61). 


In situ Hybridization Can Reveal the Location of mRNAs and 
Noncoding RNAs 


It is also possible to directly observe the time and place that an RNA product of a 
gene is expressed using in situ hybridization. For protein-coding genes, this strat- 
egy often provides the same general information as the reporter gene approaches 
described above; however, it is crucial for genes whose final product is RNA rather 
than protein. We encountered in situ hybridization earlier in the chapter (see Fig- 
ure 8-34); it relies on the basic principles of nucleic acid hybridization. Typically, 
tissues are gently fixed so that their RNA is retained in an exposed form that can 
hybridize with a labeled complementary DNA or RNA probe. In this way, the pat- 
terns of differential gene expression can be observed in tissues, and the location 
of specific RNAs can be determined (Figure 8-62). An advantage of in situ hybrid- 
ization over other approaches is that genetic engineering is not required. Thus, 
it is often simpler and faster and can be used for genetically intractable species. 


Expression of Individual Genes Can Be Measured Using 
Quantitative RT-PCR 


Although reporter genes and in situ hybridization accurately reveal patterns of 
gene expression, they are not the most powerful methods for quantifying amounts 
of individual RNAs in cells. We have seen that RNA sequencing can provide infor- 
mation about the relative abundance of different RNA molecules (see Figure 7-3). 
Here, the number of “sequence reads” (short bits of nucleotide sequence) is pro- 
portional to the abundance of the RNA species. But this method is limited to RNAs 


Figure 8-61 GFPs that fluoresce at 
different wavelengths help reveal the 
connections that individual neurons 
make within the brain. This image shows 
differently colored neurons in one region 
of a mouse brain. The neurons randomly 
express different combinations of differently 
colored GFPs (see Figure 9-13), making 
it possible to distinguish and trace many 
individual neurons within a population. 
These images were obtained by genetically 
engineering the genes for four different 
fluorescent proteins, each flanked by loxP 
sites of recombination (see Figure 5-66), 
and integrating them into the mouse 
germ line. When crossed to a mouse that 
produced the Cre recombinase in neuronal 
cells, the fluorescent protein genes were 
randomly excised, producing neurons that 
express many different combinations of 
the four fluorescent proteins. Over 100 
combinations of fluorescent protein can be 
produced, allowing scientists to distinguish 
one neuron from the next. The stunning 
appearance of these labeled neurons has 
earned these animals the colorful nickname 
“brainbow mice.” (From J. Livet et al., 
Nature 450:56-62, 2007. With permission 
from Macmillan Publishers Ltd.) 

Figure 8-62 In situ hybridization to 
mRNAs has been used to generate an 
atlas of gene expression in the mouse 
brain. This computer-generated image 
shows the expression of several different 
mRNAs specific to an area of the brain 
associated with learning and memory. 
Similar maps of expression patterns of 
all known genes in the mouse brain are 
compiled in the brain atlas project, which is 
available online. (From M. Hawrylycz et al., 
PLoS Comput. Biol. 7:e1001065, 2011.) 

Figure 8-63 RNA levels can be measured by quantitative RT-PCR. The 
fluorescence measured is generated by a dye that fluoresces only when 
bound to the double-stranded DNA products of the RT-PCR (see Figure 
8-36). The red sample has a higher concentration of the mRNA being 
measured than does the blue sample, since it requires fewer PCR cycles to 
reach the same half-maximal concentration of double-stranded DNA. Based 
on this difference, the relative amounts of the mRNA in the two samples can 
be precisely determined. 


that are expressed at reasonably high levels, and it is difficult to quantify (or even 
identify) rare RNAs. A more accurate method is based on the principles of PCR 
(Figure 8-63). Called quantitative RT-PCR (reverse transcription-polymerase 
chain reaction), this method begins with the total population of RNA molecules 
purified from a tissue or a cell culture. It is important that no DNA be present in 
the preparation; it must be purified away or enzymatically degraded. Two DNA 
primers that specifically match the mRNA of interest are added, along with reverse 
transcriptase, DNA polymerase, and the four deoxyribonucleoside triphosphates 
needed for DNA synthesis. The first round of synthesis is the reverse transcription 
of the RNA into DNA using one of the primers. Next, a series of heating and cool- 
ing cycles allows the amplification of that DNA strand by PCR (see Figure 8-36). 
The quantitative part of this method relies on a direct relationship between the 
rate at which the PCR product is generated and the original concentration of the 
mRNA species of interest. By adding chemical dyes to the PCR that fluoresce only 
when bound to double-stranded DNA, a simple fluorescence measurement can 
be used to track the progress of the reaction and thereby accurately deduce the 
starting concentration of the mRNA that is amplified. Although it seems compli- 
cated, this quantitative RT-PCR technique is relatively fast and simple to perform 
in the laboratory; it is currently the method of choice for accurately quantifying 
mRNA levels from any given gene. 
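
The arithmetic behind the comparison can be sketched as follows. If the amount of product is multiplied by a factor E in each cycle, a reaction that starts with N0 template molecules contains about N0 × E^c molecules after c cycles. Two samples that reach the same fluorescence threshold after c1 and c2 cycles therefore differ in starting amount by a factor of E^(c2 − c1). The sketch assumes perfect doubling (E = 2) by default, which is an idealization; the function name and the cycle numbers are hypothetical.

```python
def relative_abundance(cycles_sample, cycles_reference, efficiency=2.0):
    """Fold difference in starting template between two samples.

    Each argument is the number of PCR cycles a sample needs to reach the
    same fluorescence threshold. `efficiency` is the fold amplification per
    cycle; 2.0 assumes perfect doubling, which real reactions only approach.
    """
    return efficiency ** (cycles_reference - cycles_sample)


# Hypothetical readings: the "red" sample crosses the threshold at cycle 20,
# the "blue" sample at cycle 23.
fold = relative_abundance(cycles_sample=20, cycles_reference=23)
print(f"red sample started with {fold:g}-fold more template")  # 8-fold
```

Each cycle of difference corresponds to a roughly twofold difference in starting mRNA, which is why a sample that needs fewer cycles is the more concentrated one.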


Analysis of mRNAs by Microarray or RNA-seq Provides a 
Snapshot of Gene Expression 


As discussed in Chapter 7, a cell expresses only a subset of the many thousands 
of genes available in its genome; moreover, this subset differs from one cell type 
to another or, in the same cell, from one environment to the next. One way to 
determine which genes are being expressed by a population of cells or a tissue is 
to analyze which mRNAs are being produced. 

The first tool that allowed investigators to analyze simultaneously the thou- 
sands of different RNAs produced by cells or tissues was the DNA microarray. 
Developed in the 1990s, DNA microarrays are glass microscope slides that contain 
hundreds of thousands of DNA fragments, each of which serves as a probe for the 
mRNA produced by a specific gene. Such microarrays allow investigators to mon- 
itor the expression of every gene in a genome in a single experiment. To do the 
analysis, mRNAs are extracted from cells or tissues and converted to cDNAs (see 
Figure 8-31). The cDNAs are fluorescently labeled and allowed to hybridize to the 
fragments bound to the microarray. An automated fluorescence microscope then 
determines which mRNAs were present in the original sample based on the array 
positions to which the cDNAs are bound (Figure 8-64). 
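
When two differently labeled samples are hybridized to the same array, as in Figure 8-64, each spot can be scored from its pair of fluorescence intensities. The sketch below shows one simple way to do this. The background cutoff and the twofold threshold are arbitrary assumptions chosen for illustration, not values from the text, and the function name is hypothetical.

```python
import math


def classify_spot(red, green, background=50.0, fold=2.0):
    """Label one microarray spot from its two fluorescence intensities.

    red   : signal from the cDNA of sample 1 (red dye)
    green : signal from the cDNA of sample 2 (green dye)
    `background` and `fold` are illustrative thresholds, not standard values.
    """
    if red < background and green < background:
        return "dark"    # little or no expression in either sample
    # Adding 1 avoids division by zero when one channel is empty.
    log_ratio = math.log2((red + 1.0) / (green + 1.0))
    if log_ratio >= math.log2(fold):
        return "red"     # expressed at a higher level in sample 1
    if log_ratio <= -math.log2(fold):
        return "green"   # expressed at a higher level in sample 2
    return "yellow"      # about equal in both samples


# Invented intensities for four spots.
for red, green in [(4000, 500), (500, 4000), (1000, 1100), (10, 20)]:
    print(red, green, classify_spot(red, green))
```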

Although microarrays are relatively inexpensive and easy to use, they suffer 
from one obvious drawback: the sequences of the mRNA samples to be analyzed 
must be known in advance and represented by a corresponding probe on the 
array. With the development of improved sequencing technologies, investigators 
increasingly use RNA-seq, discussed earlier, as a more direct approach for cata- 
loging the RNAs produced by a cell. For example, this approach can readily detect 
alternative RNA splicing, RNA editing, and the many noncoding RNAs produced 
from a complex genome. 

DNA microarrays and RNA-seq analysis have been used to examine everything 
from the changes in gene expression that make strawberries ripen to the gene 
expression “signatures” of different types of human cancer cells; or from changes 

Figure 8-64 DNA microarrays are used to analyze the production of 
thousands of different mRNAs in a single experiment. In this example, 
mRNA is collected from two different cell samples—for example, cells 
treated with a hormone and untreated cells of the same type—to allow for 
a direct comparison of the specific genes expressed under both conditions. 
The mRNAs are converted to cDNAs that are labeled with a red fluorescent 
dye for one sample and a green fluorescent dye for the other. The labeled 
samples are mixed and then allowed to hybridize to the microarray. Each 
microscopic spot on the microarray is a 50-nucleotide DNA molecule of 
defined sequence made by chemical synthesis and spotted on the array. 
The DNA sequence represented by each spot is different, and the hundreds 
of thousands of such spots are designed to span the sequence of the 
genome. The DNA sequence of each spot is kept track of by computer. After 
incubation, the array is washed and the fluorescence scanned. Only a small 
proportion of the microarray, representing 676 genes, is shown. Red spots 
indicate that the gene in sample 1 is expressed at a higher level than the 
corresponding gene in sample 2, and the green spots indicate the opposite. 
Yellow spots reveal genes that are expressed at about equal levels in both cell 
samples. The intensity of the fluorescence provides an estimate of how much 
RNA is present from a gene. Dark spots indicate little or no expression of the 
gene whose probe is located at that position in the array. 
[Diagram labels: mRNA from sample 1; mRNA from sample 2; each converted to cDNA and labeled with a fluorochrome.] 
that occur as cells progress through the cell cycle to those made in response to 
sudden shifts in temperature. Indeed, because these approaches allow the simul- 
taneous monitoring of large numbers of RNAs, they can detect subtle changes in a 
cell, changes that might not be manifested in its outward appearance or behavior. 
Comprehensive studies of gene expression also provide information that is 
useful for predicting gene function. Earlier in this chapter, we discussed how iden- 
tifying a protein’s interaction partners can yield clues about that protein’s func- 
tion. A similar principle holds true for genes: information about a gene’s function 
can be deduced by identifying genes that share its expression pattern. Using an 
approach called cluster analysis, one can identify sets of genes that are coordi- 
nately regulated. Genes that are turned on or turned off together under different 
circumstances are likely to work in concert in the cell: they may encode proteins 
that are part of the same multiprotein machine, or proteins that are involved in a 
complex coordinated activity, such as DNA replication or RNA splicing. Charac- 
terizing a gene whose function is unknown by grouping it with known genes that 
share its transcriptional behavior is sometimes called “guilt by association.” Cluster 
analyses have been used to analyze the gene expression profiles that underlie 
many interesting biological processes, including wound healing in humans (Fig- 

Figure 8-65 Using cluster analysis to identify sets of genes that are coordinately regulated. Genes that have the same expression pattern
are likely to be involved in common pathways or processes. To perform a cluster analysis, RNA-seq or microarray data are obtained from cell
samples exposed to a variety of different conditions, and genes that show coordinate changes in their expression pattern are grouped together. In 
this experiment, human fibroblasts were deprived of serum for 48 hours; serum was then added back to the cultures at time 0 and the cells were
harvested for microarray analysis at different time points. Of the 8600 genes depicted here (each represented by a thin, vertical line), just over 300 
showed threefold or greater variation in their expression patterns in response to serum reintroduction. Here, red indicates an increase in expression; 
green is a decrease in expression. On the basis of the results of many other experiments, the 8600 genes have been grouped in clusters based
on similar patterns of expression. The results of this analysis show that genes involved in wound healing are turned on in response to serum, while
genes involved in regulating cell-cycle progression and cholesterol biosynthesis are shut down. (From M.B. Eisen et al., Proc. Natl Acad. Sci. USA
94:14863-14868, 1998. With permission from National Academy of Sciences.) 
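
The grouping step described in this caption can be illustrated with a small sketch. It is only a toy: the gene names and expression values are hypothetical, and the greedy correlation-threshold grouping used here is a deliberately simple stand-in, not necessarily the clustering algorithm used in the study shown.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two expression profiles of equal length."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cluster_genes(profiles, threshold=0.9):
    """Greedy grouping: a gene joins the first cluster whose founding gene
    it correlates with at or above the threshold; otherwise it founds a new cluster."""
    clusters = []
    for gene, profile in profiles.items():
        for cluster in clusters:
            if pearson(profile, profiles[cluster[0]]) >= threshold:
                cluster.append(gene)
                break
        else:
            clusters.append([gene])
    return clusters

# Hypothetical log2 expression ratios at five time points after serum addition.
profiles = {
    "geneA": [0.0, 1.0, 2.0, 2.5, 3.0],      # induced
    "geneB": [0.1, 0.9, 2.1, 2.4, 3.2],      # induced, similar to geneA
    "geneC": [0.0, -1.0, -2.0, -2.2, -3.0],  # repressed
    "geneD": [0.2, -0.8, -1.9, -2.5, -2.9],  # repressed, similar to geneC
    "geneE": [0.0, 0.1, -0.1, 0.05, 0.0],    # essentially unchanged
}

for members in cluster_genes(profiles):
    print(members)
```

In this toy data set the two induced genes fall into one group, the two repressed genes into another, and the unchanged gene stands alone. A gene of unknown function that landed in the first group would be a candidate for "guilt by association" with the induced genes.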

Figure 8-66 Chromatin immunoprecipitation. This method allows the
identification of all the sites in a genome that a transcription regulator 
occupies in vivo. The identities of the precipitated, amplified DNA fragments 
are determined by DNA sequencing. 


Genome-wide Chromatin Immunoprecipitation Identifies Sites on 
the Genome Occupied by Transcription Regulators 


We have discussed several strategies to measure the levels of individual RNAs in a 
cell and to monitor changes in their levels in response to external signals. But this 
information does not tell us how such changes are brought about. We saw in Chap- 
ter 7 that transcription regulators, by binding to cis-regulatory sequences in DNA, 
are responsible for establishing and changing patterns of transcription. Typically, 
these proteins do not occupy all of their potential cis-regulatory sequences in 
the genome under all conditions. For example, in some cell types, the regulatory 
protein may not be expressed, or it may be present but lack an obligatory part- 
ner protein, or it may be excluded from the nucleus until an appropriate signal is 
received from the cell’s environment. Even if the protein is present in the nucleus 
and is competent to bind DNA, other transcription regulators or components of 
chromatin can occupy overlapping DNA sequences and thereby occlude some of 
its cis-regulatory sequences in the genome. 

Chromatin immunoprecipitation provides a way to experimentally deter- 
mine all the cis-regulatory sequences in a genome that are occupied by a given 
transcription regulator under a particular set of conditions (Figure 8-66). In this 
approach, proteins are covalently cross-linked to DNA in living cells, the cells are 
broken open, and the DNA is mechanically sheared into small fragments. Anti- 
bodies directed against a given transcription regulator are then used to purify the 
DNA that became covalently cross-linked to that protein in the cell. This DNA is 
then sequenced using the rapid methods discussed earlier; the precise location 
of each precipitated DNA fragment along the genome is determined by compar- 
ing its DNA sequence to that of the whole genome sequence (Figure 8-67). In 
this way, all of the sites occupied by the transcription regulator in the cell sample 
can be mapped across the cell’s genome (see Figure 7-37). In combination with 
microarray or RNA-seq information, chromatin immunoprecipitation can iden- 
tify the key transcriptional regulator responsible for specifying a particular pat- 
tern of gene expression. 

Chromatin immunoprecipitation can also be used to deduce the cis-regula- 
tory sequences recognized by a given transcription regulator. Here, all the DNA 
sequences precipitated by the regulator are lined up (by computer) and features 
in common are tabulated to produce the spectrum of cis-regulatory sequences 
recognized by the protein (see Figure 7-9A). Chromatin immunoprecipitation is 
also used routinely to identify the positions along a genome that are bound by the 
various types of modified histones discussed in Chapter 4. In this case, antibodies 
specific to the particular histone modification are employed (see Figure 8-67). A 
variation of the technique can also be used to map positions of chromosomes that 
are in physical proximity (see Figure 4-48). 
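
The tabulation step used to deduce cis-regulatory sequences can be sketched in a few lines. This is a minimal illustration that assumes the precipitated sequences have already been aligned to equal length; the seven-nucleotide sites below are hypothetical and are not data from any experiment.

```python
from collections import Counter

def position_frequency_matrix(aligned_sites):
    """Count how often each base occurs at each position of the aligned sites."""
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all aligned sites must have the same length")
    matrix = []
    for i in range(length):
        counts = Counter(site[i] for site in aligned_sites)
        matrix.append({base: counts.get(base, 0) for base in "ACGT"})
    return matrix

def consensus(matrix):
    """Most frequent base at each position (ties resolved in A, C, G, T order)."""
    return "".join(max("ACGT", key=lambda base: column[base]) for column in matrix)

# Hypothetical aligned binding sites recovered from precipitated DNA fragments.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA", "TTACTCA"]

pfm = position_frequency_matrix(sites)
for position, column in enumerate(pfm, start=1):
    print(position, column)
print("consensus:", consensus(pfm))
```

The per-position counts, rather than the consensus alone, correspond to the "spectrum" of sequences recognized by the protein: positions where one base dominates are strongly constrained, whereas positions with mixed counts tolerate variation.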


Ribosome Profiling Reveals Which mRNAs Are Being Translated in 
the Cell 


In preceding sections, we discussed several ways that RNA levels in the cell can be 
monitored. But for mRNAs, this represents only one step in gene expression, and 
we are often more interested in the final level of the protein produced by the gene. 
As described in the first part of this chapter, mass-spectroscopy methods can be 
used to monitor the levels of all proteins in the cell, including modified forms of 
the proteins. However, if we want to understand how synthesis of proteins is con- 
trolled by the cell, we need to consider the translation step of gene expression. 
An approach called ribosome profiling provides an instantaneous map of the 
position of ribosomes on each mRNA in the cell and thereby identifies those 
mRNAs that are being actively translated. To accomplish this, total RNA from a
cell line or tissue is exposed to RNases under conditions where only those RNA
sequences covered by ribosomes are spared. The protected RNAs are released 
from ribosomes, converted to DNA, and the nucleotide sequence of each is deter- 
mined (Figure 8-68). When these sequences are mapped on the genome, the 
position of ribosomes across each mRNA species can be ascertained. 

Ribosome profiling has revealed many cases where mRNAs are abundant but
are not translated until the cell receives an external signal. It has also shown that 
many open reading frames (ORFs) that were too short to be annotated as genes 
are actively translated and probably encode functional, albeit very small, proteins 
(Figure 8-69). Finally, ribosome profiling has revealed the ways that cells rapidly 
and globally change their translation patterns in response to sudden changes in 
temperature, nutrient availability, or chemical stress. 


Recombinant DNA Methods Have Revolutionized Human Health 


We have seen that nucleic acid methodologies developed in the past 40 years have 
completely changed the way that cell and molecular biology is studied. But they 
have also had a profound effect on our day-to-day lives. Many human pharma- 
ceuticals in routine use (insulin, human growth hormone, blood-clotting factors, 
and interferon, for example) are based on cloning human genes and expressing 
the encoded proteins in large amounts. As DNA sequencing continues to drop 
in cost, more and more individuals will elect to have their genome sequenced; 
this information can be used to predict susceptibility to diseases (often with the 
option of minimizing this possibility by appropriate behavior) or to predict the 
way an individual will respond to a given drug. The genomes of tumor cells from 
an individual can be sequenced to determine the best type of anticancer treat- 
ment. And mutations that cause or greatly increase the risk of disease continue 
to be identified at an unprecedented pace. Using the recombinant DNA technol- 
ogies discussed in this chapter, these mutations can then be introduced into ani- 
mals, such as mice, that can be studied in the laboratory. The resulting transgenic 
animals, which often mimic some of the phenotypic abnormalities associated 
with the condition in patients, can be used to explore the cellular and molecular 
basis of the disease and to screen for drugs that could potentially be used thera- 
peutically in humans. 


Figure 8-67 Results of several chromatin 
immunoprecipitations showing proteins 
bound to the control region that controls
expression of the Oct4 gene. In this
series of chromatin immunoprecipitation
experiments, antibodies directed against
a transcription regulator (first three panels)
or a particular histone modification (fourth 
panel) were used to precipitate bound, 
cross-linked DNA. Precipitated DNA was 
sequenced, and the positions across the 
genome were mapped. (Only the small part 
of the mouse genome containing the Oct4 
gene is shown.) The results show that, in 
the embryonic stem cells analyzed in these 
experiments, Oct4 binds upstream of its 
own gene and that Sox2 and Nanog are 
bound in close proximity. Oct4, Sox2, and 
Nanog are key regulators in embryonic 
stem cells (discussed in Chapter 22) and 
this experiment reveals the position on
the genome through which they exert
their effects on Oct4 expression. In the
fourth panel, the positions of a histone
modification associated with actively
transcribed genes are shown (see Figure
4-39). Finally, the bottom panel shows the
RNA produced from the Oct4 gene under 
the same conditions used for the chromatin 
immunoprecipitations. Note that the introns 
and exons are relatively easy to identify 
from these RNA-seq data. 


Transgenic Plants Are Important for Agriculture


Although we tend to think of recombinant DNA research in terms of animal biol- 
ogy, these techniques have also had a profound impact on the study of plants. In 
fact, certain features of plants make them especially amenable to recombinant 
DNA methods. 

When a piece of plant tissue is cultured in a sterile medium containing nutri- 
ents and appropriate growth regulators, some of the cells are stimulated to pro- 
liferate indefinitely in a disorganized manner, producing a mass of relatively 
undifferentiated cells called a callus. If the nutrients and growth regulators are 
carefully manipulated, one can induce the formation of a shoot within the callus, 
and in many species a whole new plant can be regenerated from such shoots. In a 
number of plants—including tobacco, petunia, carrot, potato, and Arabidopsis—a 
single cell from such a callus (known as a totipotent cell) can be grown into a small 
clump of cells from which a whole plant can be regenerated (see Figure 7-2B). Just 
as mutant mice can be derived by the genetic manipulation of embryonic stem 

Figure 8-68 Ribosome profiling. RNA
is purified from cells and digested with
an RNase to leave only those portions
of the mRNAs that are protected by a
bound ribosome. These short pieces
of protected RNA (approximately 20
nucleotides in length) are converted to DNA
and sequenced. The resulting information 
is displayed as the number of sequence 
reads along each position of the genome. 
In the diagram here, the data for only one 
gene, whose mRNA is being efficiently 
translated, are shown. Ribosome profiling 
provides this type of information for every 
mRNA produced by the cell. 


Figure 8-69 Ribosome profiling can 
identify new genes. This experiment 
shows the discovery of a previously 
unrecognized gene—one that encodes a 
protein of only 20 amino acids. At the top is 
shown a portion of a viral genome with two 
previously annotated genes. Below are the 
results of a ribosome profiling experiment, 
displayed across the same section of the 
genome, after human cells were infected
with the virus. The results show that the
left-hand gene is not expressed under 
these conditions, the right-hand gene is 
expressed at low levels, and a previously 
unrecognized gene that lies between them 
is expressed at high levels. 

[Figure panel labels: discs removed from tobacco leaf; leaf discs incubated with genetically engineered Agrobacterium for 24 h]

Figure 8-70 Transgenic plants can
be made using recombinant DNA
techniques optimized for plants. A
disc is cut out of a leaf and incubated in
a culture of Agrobacterium that carries
a recombinant plasmid with both a
selectable marker and a desired genetically
engineered gene. The wounded plant cells
at the edge of the disc release substances 
that attract the bacteria, which inject their 
DNA into the plant cells. Only those plant 
cells that take up the appropriate DNA and 
express the selectable marker gene survive 
and proliferate and form a callus. The 
manipulation of growth factors supplied to 
the callus induces it to form shoots, which 
subsequently root and grow into adult 
plants carrying the engineered gene. 

cells in culture, so transgenic plants can be created from plant cells transfected
with DNA in culture (Figure 8-70). 

The ability to produce transgenic plants has greatly accelerated progress in 
many areas of plant cell biology. It has played an important part, for example, in 
isolating receptors for growth regulators and in analyzing the mechanisms of mor- 
phogenesis and of gene expression in plants. These techniques have also opened 
up many new possibilities in agriculture that could benefit both the farmer and 
the consumer. They have made it possible, for example, to modify the ratio of lipid, 
starch, and protein in seeds, to impart pest and virus resistance to plants, and 
to create modified plants that tolerate extreme habitats such as salt marshes or 
water-stressed soil. One variety of rice has been genetically engineered to produce 
β-carotene, the precursor of vitamin A. Were it to replace conventional rice, this
“golden rice” —so-called because of its faint yellow color—could help to alleviate 
severe vitamin A deficiency, which causes blindness in hundreds of thousands of 
children in the developing world each year. 


Summary 


Genetics and genetic engineering provide powerful tools for understanding the func- 
tion of individual genes in cells and organisms. In the classical genetic approach, 
random mutagenesis is coupled with screening to identify mutants that are defi- 
cient in a particular biological process. These mutants are then used to locate and 
study the genes responsible for that process. 

Gene function can also be ascertained by reverse genetic techniques. DNA engi- 
neering methods can be used to alter genes and to re-insert them into a cell’s chro- 
mosomes so that they become a permanent part of the genome. If the cell used for 
this gene transfer is a fertilized egg (for an animal) or a totipotent plant cell in cul- 
ture, transgenic organisms can be produced that express the mutant gene and pass 
it on to their progeny. Especially important for cell and molecular biology is the 
ability to alter cells and organisms in highly specific ways—allowing one to discern 
the effect on the cell or the organism of a designed change in a single protein or RNA 
molecule. For example, genomes can be altered so that the expression of any gene 
can be switched on or off by the experimenter. 


Many of these methods are being expanded to investigate gene function on a 
genome-wide scale. The generation of mutant libraries in which every gene in an 
organism has been systematically deleted, disrupted, or made controllable by the 
experimenter provides invaluable tools for exploring the role of each gene in the 
elaborate molecular collaboration that gives rise to life. Technologies such as RNA- 
seq and DNA microarrays can monitor the expression of tens of thousands of genes 
simultaneously, providing detailed, comprehensive snapshots of the dynamic pat- 
terns of gene expression that underlie complex cell processes. 


MATHEMATICAL ANALYSIS OF CELL FUNCTIONS 


Quantitative experiments combined with mathematical theory mark the begin- 
ning of modern science. Galileo, Kepler, Newton, and their contemporaries did 
more than set out some rules of mechanics and offer an explanation of the move- 
ments of the planets around the Sun: they showed how a quantitative mathemat- 
ical approach could provide a depth and precision of understanding, at least for 
physical systems, that had never before been dreamed to be possible. 

What is it that gives mathematics this almost magical power to explain the nat- 
ural world, and why has mathematics played so much more important a part in 
physical sciences than in biology? What do biologists need to know about math- 
ematics? 

Mathematics can be viewed as a tool for deriving logical consequences from 
propositions. It differs from ordinary intuitive reasoning in its insistence on rig- 
orous, accurate logic and the precise treatment of quantitative information. If the 
initial propositions are correct, then the deductions drawn from them by mathe- 
matics will be true. The surprising power of mathematics comes from the length 
of the chains of reasoning that rigorous logic and mathematical arguments make 
possible, and from the unexpectedness of the conclusions that can be reached, 
often revealing connections that one would not otherwise have guessed at. Revers- 
ing the argument, mathematics provides a way to test experimental hypotheses: if 
mathematical reasoning from a given hypothesis leads to a prediction that is not 
true, then the hypothesis is not true. 

Clearly, mathematics is not much use unless we can frame our ideas—our ini- 
tial hypotheses—about the given system in a precise, quantitative form. A math- 
ematical edifice raised on a rickety or—even worse—a vague or overcomplicated 
set of propositions is likely to lead us astray. For mathematics to be useful, we 
must focus our analysis on simple subsystems in which we can pick out key quan- 
titative parameters and frame well-defined hypotheses. This approach has been 
used with great success in physics for centuries, but it has been less common in 
biology. But times are changing, and more and more it is becoming possible for 
biologists to exploit the power of quantitative mathematical analysis. 

In this final section of our methods chapter, we do not attempt to teach read- 
ers every way in which mathematics can be fruitfully applied to biological prob- 
lems. Rather, we simply aim to give a sense of what mathematics and quantitative 
approaches can do for us in modern biology. We focus primarily on the important 
principles that mathematics teaches us about the dynamics of molecular interac- 
tions, and how mathematics can unveil surprising and useful features of complex 
systems containing feedback. We will illustrate these principles using the regula- 
tion of gene expression by transcription regulators like those discussed in Chapter 
7. The same principles apply to the post-transcriptional regulatory systems that 
govern cell signaling (Chapter 15), cell-cycle control (Chapter 17), and essentially 
all cell processes. 


Regulatory Networks Depend on Molecular Interactions 


Cell function and regulation depend on transient interactions among thousands 
of different macromolecules in the cell. We often summarize these interactions in 
this book with schematic cartoons. These diagrams are useful, but a complete pic- 
ture requires a deeper, more quantitative level of understanding. To meaningfully 
assess the biological impact of any interaction in the cell, we need to know in
precise terms how the molecules interact, how they catalyze reactions, and, most 
importantly, how the behaviors of the molecules change over time. If a cartoon 
shows that protein A activates protein B, for example, we cannot judge the impor- 
tance of this relationship without quantitative details about the concentrations, 
affinities, and kinetic behaviors of proteins A and B. 

Let us begin by defining two different types of regulatory interaction in our 
cartoons: one designating inhibition and the other designating activation. If the 
protein product of gene X is a transcription repressor that inhibits the expression 
of gene Z, we depict the relationship as a red bar-headed line (⊣) drawn between
genes X and Z (Figure 8-71). If the protein product of gene Y is a transcription
activator that induces the expression of gene Z, then a green arrow (→) is drawn
between genes Y and Z. 

The regulation of one gene’s expression by another is more complicated than 
a single arrow connecting them, and a complete understanding of this regulation 
requires that we tease apart the underlying biochemical processes. Figure 8-72A 
sketches some of the biochemical steps in the activation of gene expression by a 
transcription activator. A gene encoding the activator, designated as gene A, will 
produce its product, protein A, via an RNA intermediate. This protein A will then 
bind to px, the regulatory promoter of gene X, to form the complex A:px. Once the 
A:px complex forms, it stimulates the production of an RNA transcript that is sub- 
sequently translated to produce protein X. 

We will focus here on the binding interaction that lies at the heart of this reg- 
ulatory system: the interaction between protein A and the promoter px. Any mol- 
ecule of protein A that is bound to px can also dissociate from it. The steps repre- 
sented by the green activation arrow in Figure 8-72A include both the binding of 
A to px and the dissociation of the complex A:px to re-form A and px, as illustrated 

Figure 8-71 Diagrams that summarize
biochemical relationships. Here, a simple 
cartoon indicates that gene X represses 
gene Z (left) whereas gene Y activates gene 
Z (right). 


Figure 8-72 A simple transcriptional 
interaction. (A) Genes A and X each 
produce a protein, with the product of 
gene A serving as a transcription activator 
to stimulate expression of gene X. As 
indicated by the green arrow, stimulation 
depends in part on the binding of protein 
A to the promoter region of gene X, 
designated as px. (B) The binding of protein 
A to the gene promoter is determined
by the concentrations of the two binding
partners (denoted as [A] and [px], in units
of mol/liter, or M), the association rate
constant kon (in units of sec$^{-1}$ M$^{-1}$), and
the dissociation rate constant koff (in units
of sec$^{-1}$). (C) At steady state, the rates of
association and dissociation are equal, and
the concentration of the bound complex is
determined by Equation 8-1, in which the
two rate constants are combined in the
equilibrium constant K. (D) Equation 8-2
can be derived to calculate the steady-state
concentration of bound complex at a
known total concentration of the promoter
$[p_X^T]$. (E) Rearrangement of Equation
8-2 yields Equation 8-3, which allows
calculation of the fraction of promoter px
that is occupied by protein A.

by the notation in Figure 8-72B. This reaction notation is more informative than
the diagrams in our figures, but has its own limitations. Suppose that the
concentration of A increases by a factor of ten as a response to an environmental input.
If A increases, we intuitively know that A:px should increase too, but we cannot
determine the amount of the increase without additional information. We need to 
know the affinity of the binding interaction and the concentrations of the compo- 
nents. With this information in hand, we can rigorously derive the answer. 

As discussed earlier and in Chapter 3 (see Figure 3-44), we know that the for- 
mation of a complex between two binding partners, such as A and px, depends 
on a rate constant kon, which describes how many productive collisions occur per
unit time per protein at a given concentration of px. The rate of complex
formation equals the product of this rate constant kon and the concentrations of A and
px (see Figure 8-72B). Complex dissociation occurs at a rate koff multiplied by the
concentration of the complex. The rate constant koff can differ by orders of magni- 
tude for different DNA sequences because it depends on the strength of the non- 
covalent bonds formed between A and px. 

We are primarily interested in understanding the amount of bound promoter 
complex at equilibrium or steady state, where the rate of complex formation 
equals the rate of complex dissociation. Under these conditions, the concentra- 
tion of the promoter complex is specified by a simple equation that combines the 
two rate constants into a single equilibrium constant K = kon/koff (Equation 8-1; 
Figure 8-72C). K is sometimes called the association constant, $K_a$. The larger this
constant K, the stronger the interaction between A and px (see Figure 3-44). The
reciprocal of K is the dissociation constant, $K_d$.

To calculate the steady-state concentration of promoter complex using Equa- 
tion 8-1, we need to account for another complication: both A and px exist in two 
forms—free in solution and bound to each other. In most cases, we know the total 
concentration of px and not the free or bound concentrations, so we must find a 
way to use the total concentration in our calculations. To do this, we first specify 
that the total concentration of px ($[p_X^T]$) is the sum of the concentrations of free
([px]) and bound ([A:px]) forms (Figure 8-72D). This leads to a new equation that
allows us to use $[p_X^T]$ to calculate the steady-state concentration of the promoter
complex ([A:px]) (Equation 8-2, Figure 8-72D).
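
Written out, the relationships described so far take the following form. This is a sketch that assumes standard mass-action kinetics; the equation numbers follow the text's references to Figure 8-72C and D, and $[p_X^T]$ denotes the total promoter concentration.

```latex
\begin{align}
k_{on}[A][p_X] &= k_{off}[A{:}p_X]
  \;\Longrightarrow\;
  [A{:}p_X] = K[A][p_X], \qquad K = \frac{k_{on}}{k_{off}} \tag{8-1} \\
[p_X^T] &= [p_X] + [A{:}p_X] \notag \\
[A{:}p_X] &= K[A]\bigl([p_X^T] - [A{:}p_X]\bigr)
  \;\Longrightarrow\;
  [A{:}p_X] = \frac{K[A][p_X^T]}{1 + K[A]} \tag{8-2}
\end{align}
```

The second line substitutes the free promoter concentration, $[p_X] = [p_X^T] - [A{:}p_X]$, into Equation 8-1; collecting the $[A{:}p_X]$ terms on one side then gives Equation 8-2.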

Protein A also exists in two forms: free ([A]) and bound to px ([A:px]). In a cell, 
there are typically one or two copies of px (assuming there is only one gene X per 
haploid genome) and multiple copies of A. As a result, we can safely assume that 
from the viewpoint of A, [A:px] is negligible relative to the total $[A^T]$. This means
that $[A] \approx [A^T]$, and we can just plug in the values of total $[A^T]$ in Equation 8-2
without incurring appreciable error in the calculation of [A:px].

Now, we are ready to determine the effects of increasing the concentration of 
A. Suppose that K = $10^{8}$ M$^{-1}$, which is a typical value for many such interactions.
The starting concentration of A is $[A^T] = 10^{-9}$ M, and $[p_X^T] = 10^{-10}$ M (assuming
there is one copy of gene X in a haploid yeast cell, for example, with a volume
of around $2 \times 10^{-14}$ L). Using Equation 8-2, we find that a tenfold increase in the
concentration of A causes the amount of promoter complex [A:px] to increase 5.5-fold,
from $0.09 \times 10^{-10}$ M to $0.5 \times 10^{-10}$ M at steady state. The effects of a tenfold
increase in the concentration of A will vary dramatically depending on its starting 
concentration relative to the equilibrium constant. Only through this mathemat- 
ical approach can we achieve a thorough understanding of what these effects will 
be and what impact they will have on the biological response. 
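
The arithmetic in this example is easy to check numerically. The sketch below assumes that Equation 8-2 has the form [A:px] = K[A][pxT]/(1 + K[A]), where pxT stands for the total promoter concentration, and takes the free concentration of A to equal its total concentration, as argued above. The function and variable names are illustrative only.

```python
def bound_complex(K, A_total, pX_total):
    """Steady-state [A:pX] in M (Equation 8-2), taking free [A] ~ total [A]."""
    return K * A_total * pX_total / (1.0 + K * A_total)

K = 1e8           # equilibrium constant, M^-1 (value from the text)
pX_total = 1e-10  # total promoter concentration, M (value from the text)

before = bound_complex(K, 1e-9, pX_total)  # starting concentration of A
after = bound_complex(K, 1e-8, pX_total)   # after the tenfold increase
fold = after / before
print(f"[A:pX] rises from {before:.2e} M to {after:.2e} M ({fold:.1f}-fold)")

# The same tenfold jump from other starting concentrations (illustrative values).
for A_start in (1e-11, 1e-9, 1e-7):
    ratio = bound_complex(K, 10 * A_start, pX_total) / bound_complex(K, A_start, pX_total)
    print(f"start [A] = {A_start:.0e} M -> {ratio:.2f}-fold increase in [A:pX]")
```

When the starting concentration of A is far below 1/K, a tenfold increase in A produces nearly a tenfold increase in complex (about 9.9-fold here); when it starts far above 1/K, the promoter is already nearly saturated and the increase is only about 1.1-fold.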

To assess the biological impact of a change in transcription activator levels, it is 
also important in many cases to determine the fraction of the target gene promoter 
that is bound by the activator, since this number will be directly proportional to 
the activity of the gene’s promoter. In our case, we can calculate the fraction of the 
gene X promoter, px, that has protein A bound to it by rearranging Equation 8-2 
(Equation 8-3, Figure 8-72E). This fraction can be viewed as the probability that 
promoter px is occupied, averaged over time. It is also equal to the average
occupancy across a large population of cells at any instant in time. When there is no
protein A present, px is always free, the bound fraction is zero, and transcription is 
off. When [A] = 1/K, the promoter px has a 50% chance of being occupied. When
[A] greatly exceeds 1/K, the bound fraction is almost equal to one, meaning that 
px is fully occupied and transcription is maximal. 
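
These three regimes can be tabulated directly. The sketch below assumes that Equation 8-3 gives the bound fraction as K[A]/(1 + K[A]), which follows from dividing Equation 8-2 by the total promoter concentration; the chosen multiples of 1/K are illustrative.

```python
def bound_fraction(K, A):
    """Fraction of promoter pX occupied by A (Equation 8-3): K[A] / (1 + K[A])."""
    return K * A / (1.0 + K * A)

K = 1e8  # M^-1, the value used in the text's example
for multiple in (0.0, 0.1, 1.0, 10.0, 100.0):
    A = multiple / K  # concentration of A expressed as a multiple of 1/K
    print(f"[A] = {multiple:>5} x 1/K -> bound fraction = {bound_fraction(K, A):.3f}")
```

Note that the bound fraction depends only on the product K[A], so the same curve describes any activator once its concentration is expressed in units of 1/K.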


Differential Equations Help Us Predict Transient Behavior 


The most important and basic insights for which we, as biologists, depend on 
mathematics concern the behavior of regulatory systems over time. This is the 
central theme of dynamics, and it was for the solution of problems in dynamics 
that the techniques of calculus were developed, by Newton and Leibniz, in the 
seventeenth century. Briefly, the general problem is this: if we are given the rates 
of change of a set of variables that characterize the system at any instant, how can 
we compute its future state? The problem becomes especially interesting, and the 
predictions often remarkable, when the rates of change themselves depend on the 
values of the state variables, as in systems with feedback. 

Let us return to Equation 8-2 (Figure 8-72D), which tells us that when [A] 
changes, [A:px] at steady state will also change to a new concentration that we 
can calculate with precision. However, [A:px] does not change instantaneously to
this value. If we hope to understand the behavior of this system in detail, we must 
also ask how long it takes [A:px] to get to its new steady-state value inside the cell. 
Equation 8-2 cannot answer this question. We need calculus. 

The most common strategy for solving this problem is to use ordinary differen- 
tial equations. The equations that describe biochemical reactions have a simple 
premise: the rate of change in the concentration of any molecular species X (that 
is, d[X]/dt) is given by the balance of the rate of its appearance with that of its
disappearance. For our example, the rate of change in the concentration of the bound
promoter complex, [A:px], is determined by the rates of complex assembly and
disassembly. We can incorporate these rates into the differential equation shown in
Figure 8-73A (Equation 8-4). When [A] changes, Equation 8-4 can be solved
to generate the concentration of [A:px] as a function of time. Notice that when
kon [A] [px] = koff [A:px], then d[A:px]/dt = 0 and [A:px] stops changing. At this point,
the system has reached the steady state.
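
In symbols, the rate balance just described can be written as follows. The first line is Equation 8-4 as implied by the text; the closed-form solution in the last line is an added step that holds only under the assumption that [A] stays constant after its sudden change, with $[A{:}p_X]_0$ the concentration of complex at the moment of the change and $[A{:}p_X]_{ss}$ the new steady-state value from Equation 8-2.

```latex
\begin{align}
\frac{d[A{:}p_X]}{dt} &= k_{on}[A][p_X] - k_{off}[A{:}p_X] \tag{8-4} \\
                      &= k_{on}[A]\bigl([p_X^T] - [A{:}p_X]\bigr) - k_{off}[A{:}p_X] \notag \\
[A{:}p_X](t) &= [A{:}p_X]_{ss} + \bigl([A{:}p_X]_0 - [A{:}p_X]_{ss}\bigr)\,
                e^{-(k_{on}[A] + k_{off})\,t} \notag
\end{align}
```

The complex therefore approaches its new steady state exponentially, with a rate set by the sum $k_{on}[A] + k_{off}$ rather than by either rate constant alone.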

Calculation of all [A:px] values as a function of time, using Equation 8-4, allows 
us to determine the rate at which [A:px] reaches its steady-state value. Because
this value is attained asymptotically, it is often most useful to compare the times 
needed to get to 50, 90, or 99 percent of this new steady state. The simplest way to 
determine these values is to solve Equation 8-4 with a method called numerical 
integration, which involves plugging in values for all of the parameters (kon, koff,
etc.) and then using a computer to determine the values of [A:px] over time,
starting from given initial concentrations of [A] and [px]. For kon = $0.5 \times 10^{7}$ sec$^{-1}$ M$^{-1}$,
koff = $0.5 \times 10^{-1}$ sec$^{-1}$ (K = $10^{8}$ M$^{-1}$ as above), and $[p_X^T] = 10^{-10}$ M, it takes [A:px]
about 5, 20, and 40 seconds to reach 50, 90, and 99 percent of the new steady-state 
value following a sudden tenfold change in [A] (Figure 8-73B). Thus, a sudden 
jump in [A] does not have instantaneous effects, as we might have assumed from 
looking at the cartoon in Figure 8-72A. 
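
A minimal numerical integration of this kind is sketched below. It assumes that Equation 8-4 has the form d[A:px]/dt = kon[A][px] − koff[A:px], that [A] jumps from 10⁻⁹ M to 10⁻⁸ M at time zero and then stays constant, and that the complex starts at its old steady-state value. The forward-Euler method and the 1-millisecond time step are choices made for this illustration, not the method used to produce Figure 8-73B.

```python
def simulate_binding(k_on, k_off, A, pX_total, c0, t_end, dt=0.001):
    """Forward-Euler integration of Equation 8-4 with [A] held constant.

    Returns lists of times (sec) and [A:pX] concentrations (M).
    """
    times, values = [0.0], [c0]
    c = c0
    for step in range(1, int(round(t_end / dt)) + 1):
        free_pX = pX_total - c                  # promoter not bound by A
        dc_dt = k_on * A * free_pX - k_off * c  # assembly minus disassembly
        c += dc_dt * dt
        times.append(step * dt)
        values.append(c)
    return times, values

def time_to_reach(times, values, target):
    """First simulated time at which [A:pX] reaches the target concentration."""
    for t, c in zip(times, values):
        if c >= target:
            return t
    return None

k_on, k_off = 0.5e7, 0.5e-1  # sec^-1 M^-1 and sec^-1 (values from the text)
pX_total = 1e-10             # M (value from the text)
K = k_on / k_off
A_old, A_new = 1e-9, 1e-8    # M, before and after the tenfold jump

start = K * A_old * pX_total / (1 + K * A_old)   # old steady state (Equation 8-2)
steady = K * A_new * pX_total / (1 + K * A_new)  # new steady state (Equation 8-2)

times, values = simulate_binding(k_on, k_off, A_new, pX_total, start, t_end=60.0)
t50, t90, t99 = (time_to_reach(times, values, f * steady) for f in (0.5, 0.9, 0.99))
print(f"time to 50%, 90%, 99% of new steady state: {t50:.1f}, {t90:.1f}, {t99:.1f} sec")
```

With these parameters the simulation gives roughly 5, 21, and 44 seconds, in line with the rounded values of about 5, 20, and 40 seconds quoted above.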

Differential equations therefore allow us to understand the transient dynamics 
of biochemical reactions. This tool is critical for achieving a deep understanding 
of cell behavior, in part because it allows us to determine the dependence of the 
dynamics inside cells on parameters that are specific to the particular molecules 
involved. For example, if we double the values of both kon and koff, then
Equation 8-1 (Figure 8-72C) indicates that the steady-state value of [A:px] does not
change. However, the time it takes to reach 50% of this steady state after a tenfold

Figure 8-73 Using differential equations
to study the dynamics and steady- 
state behavior of a biological system. 
(A) Equation 8-4 is an ordinary differential 
equation for calculating the rate of change 
in the formation of bound promoter 
complex in response to a change in other 
components. (B) Formation of [A:p x] after 
a tenfold increase in [A], as determined by 
solving Equation 8-4. In blue is the solution 
corresponding to kon = 0.5 x 10” sec™! M7! 
and koş = 0.5 x 107! sec". In this case, it 
takes [A:py | about 5, 20, and 40 seconds 
to reach 50, 90, and 99 percent of the new 
steady-state value. For the red curve, the 
Kon and Ko values are doubled, and the 
system reaches the same steady state 
more rapidly. 
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change in [A] in our example changes from about 5 seconds to 2 seconds (see Fig- 
ure 8-73B). These insights are not accessible from either cartoons or equilibrium 
equations. This is an unusually simple example; mathematical descriptions such 
as differential equations become more indispensible for understanding biological 
interactions as the number of interactions increases. 
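
Why doubling both rate constants speeds the response without moving the steady state can be seen from a closed-form solution. The derivation below is a sketch under two assumptions that the text does not spell out: that Equation 8-4 reads d[A:pX]/dt = kon[A][pX] - koff[A:pX], and that [A] is constant after the jump, with total promoter [pX]tot = [pX] + [A:pX]. The symbols [pX]tot, [A:pX]0 (the starting value), and t1/2 (the time to cover half the distance to the new steady state) are introduced here for illustration.

```latex
\begin{aligned}
\frac{d[A{:}p_X]}{dt} &= k_{on}[A]\bigl([p_X]_{tot} - [A{:}p_X]\bigr) - k_{off}[A{:}p_X] \\
[A{:}p_X]_{st} &= [p_X]_{tot}\,\frac{K[A]}{1 + K[A]}, \qquad K = \frac{k_{on}}{k_{off}} \\
[A{:}p_X](t) &= [A{:}p_X]_{st} + \bigl([A{:}p_X]_{0} - [A{:}p_X]_{st}\bigr)\,e^{-(k_{on}[A] + k_{off})\,t} \\
t_{1/2} &= \frac{\ln 2}{k_{on}[A] + k_{off}}
\end{aligned}
```

The steady state depends only on the ratio K, whereas the relaxation rate is the sum kon[A] + koff. Doubling both constants therefore leaves the steady state unchanged and halves t1/2.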


Both Promoter Activity and Protein Degradation Affect the Rate of 
Change of Protein Concentration 


To understand our gene regulatory system further, we also need to describe the 
dynamics of protein X production in response to changes in the amount of tran- 
scription activator protein A. Here again, we use an ordinary differential equation 
for the rate of change of protein X concentration—determined by the balance of 
the rate of production of protein X through expression of gene X and the protein’s 
rate of degradation. 

Let us begin with the rate of protein X production, which is determined pri- 
marily by the occupancy of the promoter of gene X by protein A. The binding and 
dissociation of a transcription regulator at a promoter generally occurs on a much 
faster time scale than transcription initiation, causing many binding and unbind- 
ing events to occur before transcription proceeds. As a result, we can assume that 
the binding reaction is at equilibrium on the time scale of transcription, and we 
can calculate promoter occupancy by protein A using the equilibrium equation 
discussed earlier (Equation 8-3, Figure 8-72E). To determine transcription rate, 
we simply multiply the occupied promoter fraction by a transcription rate constant,
β, that represents the binding of RNA polymerase and the subsequent steps
that lead to production of mRNA and protein (Figure 8-74A). If each mRNA mol- 
ecule produces, on average, m molecules of protein product, then we can deter- 
mine protein production rate by multiplying the transcription rate by m (Figure 
8-74A). 

Now let us consider the factors that influence protein X degradation and
its dilution due to cell growth. Degradation generally results in an exponential
decline in protein levels, and the average time required for a specific protein to be
degraded is defined as its mean lifetime, τ. In our current example, the rate of
degradation of protein X depends on its mean lifetime τX, which takes into account
active degradation as well as its dilution as the cell grows. The degradation rate
depends on the concentration of protein X and is calculated by dividing this
concentration by the lifetime (Figure 8-74A).

Figure 8-74 Effect of protein lifetime on the timing of the response. (A) Equations
for calculation of the rates of gene X transcription, protein X production, and
protein X degradation, as explained in the text. (B) Equation 8-5 is an ordinary
differential equation for calculating the rate of change in protein X in response to
changes in other components. (C) When the rate of change in protein X is zero
(steady state), its concentration can be calculated with Equation 8-6, revealing
a direct relationship with protein lifetime (τ). (D) The solution of Equation 8-5
specifies the concentration of protein X over time as it approaches its steady-state
concentration. (E) Response time depends on protein lifetime. As described in the
text, the time that it takes a protein to reach a new steady state is greater when
the protein is more stable. Here, the blue line corresponds to a protein with a
lifetime that is 2.5-fold shorter than the lifetime of the protein in red.

With equations for rates of production and degradation in hand, we can now 
generate a differential equation to determine the rate of change of protein X as a 
function of time (Equation 8-5, Figure 8-74B). This equation can be solved by the 
numerical methods mentioned earlier. According to the solution of this equation, 
when transcription begins, the concentration of protein X rises to a steady-state 
level at which the concentration of X is not changing anymore; that is, its rate of 
change is zero. When this occurs, rearrangement of Equation 8-5 yields an equation
that can be used to determine the steady-state value of X, [Xst] (Equation 8-6,
Figure 8-74C). An important concept emerges from the mathematics: the steady- 
state concentration of a gene product is directly proportional to its lifetime. If life- 
time doubles, protein concentration doubles as well. 
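
The proportionality between steady-state level and lifetime can be checked with a short simulation. The sketch below builds the rate equation from the verbal description above: production is the transcription rate constant times m times the bound promoter fraction K[A]/(1 + K[A]), and loss is [X] divided by the lifetime. The parameter values are arbitrary illustrative numbers, and the function names are hypothetical.

```python
def production_rate(a, beta, m, k):
    """Protein production: beta * m * (fraction of promoters bound by A)."""
    bound_fraction = k * a / (1.0 + k * a)
    return beta * m * bound_fraction


def simulate_x(a, beta, m, k, tau, t_end, dt=0.001):
    """Forward-Euler solution of d[X]/dt = production - [X]/tau, from [X] = 0."""
    x = 0.0
    for _ in range(int(round(t_end / dt))):
        x += (production_rate(a, beta, m, k) - x / tau) * dt
    return x


def steady_state_x(a, beta, m, k, tau):
    """Setting d[X]/dt = 0 gives [Xst] = production * tau."""
    return production_rate(a, beta, m, k) * tau


# Illustrative values in arbitrary units (assumptions, not taken from the text).
params = dict(a=2.0, beta=1.0, m=10.0, k=1.0)
for tau in (1.0, 2.0):
    x_end = simulate_x(tau=tau, t_end=30.0, **params)
    print(tau, round(x_end, 3), round(steady_state_x(tau=tau, **params), 3))
```

Running the loop for two lifetimes shows the simulated end point agreeing with the algebraic steady state, and the steady state doubling when the lifetime doubles.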


The Time Required to Reach Steady State Depends on Protein 
Lifetime 


We can see from Equation 8-6 (see Figure 8-74C) that when the concentration 
of protein A rises, protein X increases to a new steady-state value, [Xst]. But this
cannot happen instantaneously. Instead, X changes dynamically according to the 
solution of its differential rate equation (Equation 8-5). The solution of this equa- 
tion reveals that the concentration of X over time is related to its steady-state con- 
centration according to the equation in Figure 8-74D. Once again, mathematics 
uncovers a simple but important concept that is not intuitively obvious: following 
a sudden increase in [A], [X] rises to a new steady state at an exponential rate that 
is inversely related to its lifetime; the faster X is degraded, the less time it takes it to 
reach its new steady-state value (Figure 8-74E). The faster response time comes at 
a higher metabolic cost, however, since proteins with a rapid response time must 
be produced and degraded at a high rate. For proteins that are not rapidly turned 
over, the response time is very long, and protein concentration is determined pri- 
marily by the dilution that results from cell growth and division. 
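
The link between lifetime and response time follows from solving the rate equation for protein X with [A] held constant after the jump. The lines below sketch that calculation, assuming [X] = 0 at t = 0 and writing the production term as [Xst]/τX (which follows from the steady-state condition). The symbol t1/2, the time needed to reach half of the new steady state, is introduced here for illustration.

```latex
\begin{aligned}
\frac{d[X]}{dt} &= \frac{[X_{st}] - [X]}{\tau_X} \\
[X](t) &= [X_{st}]\left(1 - e^{-t/\tau_X}\right) \\
t_{1/2} &= \tau_X \ln 2 \approx 0.69\,\tau_X
\end{aligned}
```

A protein with a 2.5-fold shorter lifetime therefore reaches half of its steady-state level 2.5 times sooner, at the price of a proportionally lower steady-state level unless production is raised to compensate.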


Quantitative Methods Are Similar for Transcription Repressors and 
Activators 


Positive control is not the only mechanism that cells use to regulate the expres- 
sion of their genes. As we discussed in Chapter 7, cells also actively shut off genes, 
often by employing transcription repressor proteins that bind to specific sites on 
target genes, thereby blocking access to RNA polymerase. We can analyze the 
function of these repressors by the same quantitative methods described above 
for transcription activators. If a repressor protein R binds to the regulatory region 
of gene X and represses its transcription, then the fraction of gene binding sites 
occupied by the repressor is specified by the same equation we used earlier for 
the transcription activator (Figure 8-75A). In this case, however, it is only when 
the DNA is free that RNA polymerase can bind to the promoter and transcribe the 
gene. Thus, the quantity of interest is the unbound fraction, which can be viewed 
as the probability that the site is free, averaged over multiple binding and unbind- 
ing events. When the repressor concentration is zero, the unbound fraction is 1 
and the promoter is fully active; when the repressor concentration greatly exceeds 
1/K, the unbound fraction approaches zero. Figures 8-75B and C compare these 
relationships for a transcription activator and a transcription repressor. 

We can create a differential equation that provides the rate of change in pro- 
tein X when repressor concentrations change (Equation 8-7, Figure 8-75D). As in 
the case of the transcription activator, the steady-state concentration of protein 
X increases as its lifetime increases, but it decreases as the concentration of the 
transcription repressor increases. 
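
The bound and unbound fractions are simple enough to tabulate directly. The following sketch assumes the single-site equilibrium form used for the activator, K[R]/(1 + K[R]) for the bound fraction, and takes gene activity under repression to be proportional to the unbound fraction. The affinity value and the function names are illustrative assumptions.

```python
def bound_fraction(conc, k):
    """Fraction of sites occupied by a regulator at concentration conc."""
    return k * conc / (1.0 + k * conc)


def unbound_fraction(conc, k):
    """Fraction of sites left free; this sets gene activity for a repressor."""
    return 1.0 / (1.0 + k * conc)


def steady_state_x_repressed(r, beta, m, k_r, tau):
    """[Xst] when production is beta * m * (unbound fraction) and loss is [X]/tau."""
    return beta * m * unbound_fraction(r, k_r) * tau


K_R = 1e8  # M^-1, illustrative affinity (assumption)
for r in (0.0, 1e-9, 1e-8, 1e-7, 1e-6):
    print(r, round(unbound_fraction(r, K_R), 3))
```

The table falls from 1 at zero repressor, through one half at [R] = 1/K, toward zero when [R] greatly exceeds 1/K, as described in the text.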


Negative Feedback Is a Powerful Strategy in Cell Regulation


Thus far, we have considered simple regulatory systems of just a few components. 
In most of the complex regulatory systems that govern cell behaviors, multiple 
modules are linked to produce larger circuits that we call network motifs, which 
can produce surprisingly complex and biologically useful responses whose prop- 
erties become apparent only through mathematical analysis. A particularly com- 
mon and important network motif is the negative feedback loop, which can have 
dramatically different functions depending on how it is structured. 

We take as a first example a network motif consisting of two linked modules 
(Figure 8-76A). Here, an input signal initiates the transcription of gene A, which 
produces a transcription activator protein A. This activates gene R, which synthe- 
sizes a transcription repressor protein R. Protein R in turn binds to the promoter 
of gene A to inhibit its expression. This cyclical organization creates a negative 
feedback loop that one can intuitively understand as a mechanism to prevent 
proteins from accumulating to high levels. But what can we learn about negative 
feedback loops, and their value in biology, by using mathematics to model them? 

The negative feedback loop in Figure 8-76A can be modeled using Equation
8-7 (see Figure 8-75D) for the repression of gene A and Equation 8-5 (see Figure
8-74B) for the activation of gene R. Thus, for proteins A and R, we use the set of
differential equations (Equation set 8-8) shown in Figure 8-76B. The two equations
in this set are coupled, which means that they must be solved together to describe
the behavior of A and R over time for any value of the input. As before, we plug in
values for the parameters (βA, τR, etc.) and then use a computer to determine the
values of [A] and [R] as a function of time after a sudden input activates gene A.

Figure 8-75 How promoter occupancy depends on the binding affinity of a
transcription regulator protein. (A) The fraction of a binding site that is occupied
by a transcription repressor R is determined by an equation that is similar to the
one we used for a transcription activator (see Figure 8-72E), except that in the case
of a repressor we are interested primarily in the unbound fraction. (B) For a
transcription activator A, half of the promoters are occupied when [A] = 1/KA. Gene
activity is proportional to this bound fraction. (C) For a transcription repressor R,
gene activity is proportional to the unbound fraction of promoters. As indicated,
this fraction is reduced to half of its maximal value when [R] = 1/KR. (D) As in the
case of the transcription activator A (see Figure 8-74), we can derive equations to
assess the timing of protein X production as a function of repressor concentrations.

Figure 8-76 A simple negative feedback motif. (A) Gene A negatively
regulates its own expression by activating gene R. The product of gene R is a
transcription repressor that inhibits gene A. (B) Equation set 8-8 can be solved
to determine the dynamics of system components over time. (C) A system with
negative feedback (blue) reaches its steady state faster than a system with
no feedback (red). The plots indicate the levels of protein A, expressed as a
fraction of the steady-state level. The blue line reflects the solution of Equation
set 8-8, which includes negative feedback of gene A by the repressor R. The
red line represents the solution when the rate of synthesis of A was set to a
constant value that is unaffected by the repressor R.

The results reveal several important properties of negative feedback. First, 
rather surprisingly, negative feedback increases the speed of the response to the 
activating input. As shown in Figure 8-76C, the system with negative feedback 
reaches its new steady state faster than the system with no feedback. 

Second, negative feedback is useful for protecting cells from perturbations 
that continuously arise in the cell’s internal environment—due either to random 
variations in the birth and death of molecules or to fluctuations in environmental 
variables such as temperature and nutritional supplies. Let us imagine, for example,
that βA, the transcription rate constant for gene A, fluctuates by 25% of its
value and ask whether and how much the levels of protein R are affected. The
results, shown in Figure 8-77, reveal that a change in βA causes a smaller change
in the steady-state value of R when the network has negative feedback. 
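
Both properties can be explored with a small simulation. The sketch below assumes that the two coupled equations have the form d[A]/dt = βA/(1 + KR[R]) - [A]/τA and d[R]/dt = βR·KA[A]/(1 + KA[A]) - [R]/τR, with the factor m absorbed into the rate constants. All parameter values are arbitrary illustrative numbers in arbitrary units, and the function names are hypothetical. The no-feedback control is clamped so that it shares the unperturbed steady state of the feedback loop; this is a design choice of the sketch, made so that the two systems are compared at the same operating point.

```python
def simulate(beta_a, feedback, clamp=1.0, t_end=40.0, dt=0.001,
             beta_r=4.0, k_a=1.0, k_r=1.0, tau_a=1.0, tau_r=1.0):
    """Forward-Euler integration of the assumed two-gene loop, from [A] = [R] = 0.

    With feedback, synthesis of A is beta_a / (1 + k_r*[R]).
    Without feedback, it is the constant beta_a * clamp.
    Returns (list of [A] over time, final [R]).
    """
    a, r = 0.0, 0.0
    a_trace = [a]
    for _ in range(int(round(t_end / dt))):
        free = 1.0 / (1.0 + k_r * r) if feedback else clamp
        da = beta_a * free - a / tau_a
        dr = beta_r * k_a * a / (1.0 + k_a * a) - r / tau_r
        a += da * dt
        r += dr * dt
        a_trace.append(a)
    return a_trace, r


def half_time(trace, dt=0.001):
    """Time at which the trace first reaches half of its final value."""
    half = 0.5 * trace[-1]
    for i, value in enumerate(trace):
        if value >= half:
            return i * dt
    return None


# Feedback loop at the normal and the perturbed transcription rate constant.
a_fb, r_fb = simulate(4.0, feedback=True)
_, r_fb_low = simulate(3.0, feedback=True)

# No-feedback control, clamped to the same unperturbed steady state (k_r = 1).
clamp = 1.0 / (1.0 + r_fb)
a_open, r_open = simulate(4.0, feedback=False, clamp=clamp)
_, r_open_low = simulate(3.0, feedback=False, clamp=clamp)

print("half-time with / without feedback:", half_time(a_fb), half_time(a_open))
print("relative drop in [R] with feedback:", (r_fb - r_fb_low) / r_fb)
print("relative drop in [R] without feedback:", (r_open - r_open_low) / r_open)
```

With feedback, synthesis of A starts at its full rate and is throttled only as R accumulates, so A reaches half of its final level sooner. When βA falls, the resulting drop in R relieves some of the repression, which partly compensates for the perturbation.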


Delayed Negative Feedback Can Induce Oscillations 


A beautiful thing happens when a negative feedback loop contains some delay 
mechanism that slows the feedback signal through the loop: rather than gener- 
ating a new stable state as in a rapid negative feedback loop, a delayed loop gen- 
erates pulses, or oscillations, in the levels of its components. This can be seen, 
for example, if the number of components in a negative feedback loop increases, 
which leads to delays in the amount of time required for the cycle of signals to be 
completed. Figure 8-78 compares the behavior of two network motifs—one with 
a three-stage and one with a five-stage negative feedback loop. Using the same 
kinetic parameters at each stage in the two loops, one finds that stable oscillations 
arise in the longer loop, while in the shorter loop the same parameters lead to 
relatively rapid convergence to a stable steady state. 
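
The effect of loop length can be illustrated with a toy model. The sketch below is not the circuit of Figure 8-78: it assumes a ring of n identical repressors, each repressing the next, so that an odd n gives net negative feedback. It also assumes cooperative repression with a Hill exponent of 2, a maximal production rate of 10, and a lifetime of 1, all in arbitrary units. These values and the function names are illustrative assumptions.

```python
def simulate_ring(n, beta=10.0, hill=2.0, t_end=200.0, dt=0.01):
    """Forward-Euler integration of a ring of n repressors.

    Each protein i is repressed by protein i-1 (protein 0 by protein n-1):
        d[x_i]/dt = beta / (1 + x_{i-1}**hill) - x_i
    With n odd, the loop as a whole is a negative feedback loop.
    Returns the time course of x_0.
    """
    x = [0.1 * (i + 1) for i in range(n)]   # unequal starting levels
    trace = [x[0]]
    for _ in range(int(round(t_end / dt))):
        dx = [beta / (1.0 + x[i - 1] ** hill) - x[i] for i in range(n)]
        x = [xi + dxi * dt for xi, dxi in zip(x, dx)]
        trace.append(x[0])
    return trace


def late_amplitude(trace, fraction=0.25):
    """Peak-to-trough swing over the last part of the run."""
    tail = trace[int(len(trace) * (1.0 - fraction)):]
    return max(tail) - min(tail)


for n in (3, 5):
    print(n, "stages: late swing in x_0 =", round(late_amplitude(simulate_ring(n)), 4))
```

With identical parameters at every stage, the three-stage ring settles to a steady level, while the five-stage ring keeps cycling, because the extra stages lengthen the delay around the loop.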

Changes in the parameters of a delayed negative feedback loop—binding 
affinities, transcription rates, or protein stabilities, for example—can change the 
amplitude and period of the oscillations, providing a remarkably versatile mech- 
anism for generating all sorts of oscillators that can be used for various purposes 
in the cell. Indeed, many naturally occurring oscillators, including the calcium 
oscillators described in Chapter 15 and the cell-cycle network described in Chap- 
ter 17, use delayed negative feedback as the basis for biologically important oscil- 
lations. Not all of the oscillations observed in cells are thought to have a function, 
however. Oscillations become inevitable in a highly complex, multicomponent 
biochemical pathway like glycolysis, due simply to the large number of feedback 
loops that appear to be required for its regulation. 


DNA Binding By a Repressor or an Activator Can Be Cooperative 


We have focused thus far on the binding of a single transcription regulator to a
single site in a gene promoter. Many promoters, however, contain multiple adjacent
binding sites for the same transcription regulator, and it is not uncommon for
these regulators to interact with each other on the DNA to form dimers or larger
oligomers. These interactions can result in a cooperative form of DNA binding,
such that DNA-binding affinity increases at higher concentrations of the
transcription regulator. Cooperativity produces a steeper transcriptional response to
increasing regulator concentration than the response that can be generated by the
binding of a monomeric protein to a single site. A steep transcriptional response
of this sort, when present in conjunction with positive feedback, is an important
ingredient for producing systems with the ability to switch between different
discrete phenotypic states. To begin to understand how this occurs, we need to
modify our equations to include cooperativity.

Figure 8-77 The effect of fluctuations in kinetic rate constants on a system
with negative feedback compared to one without feedback. The plot at left
represents the levels of protein R after a sudden activating stimulus, according to
the regulatory scheme in Figure 8-76A and determined by the solution of Equation set
8-8 (see Figure 8-76B). A perturbation was induced by changing βA from 4 M/min (red
line) to 3 M/min (blue line). The plot at right shows the results when negative
feedback was removed. The system with negative feedback deviates less from its normal
operation as βA changes than does the system with no feedback. Notice that, as
in Figure 8-76C, the system with negative feedback also reaches its steady state
more rapidly.

Cooperative binding events can produce steep S-shaped (or sigmoidal) rela- 
tionships between the concentration of regulatory protein and the amount bound 
on the DNA (see Figure 15-16). In this case, a number called the Hill coefficient (h)
describes the degree of cooperativity, and we can include this coefficient in our 
equations for calculating the bound fraction of promoter (Figure 8-79A). As the 
Hill coefficient increases, the dependence of binding on protein concentration 
becomes steeper (Figure 8-79B). In principle, the Hill coefficient is similar to the 
number of molecules that must come together to generate a reaction. In practice, 
however, cooperativity is rarely complete, and the Hill coefficient does not reach 
this number. 
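
How strongly the Hill coefficient sharpens the response can be quantified with a short calculation. The lines below assume that the bound fraction for a cooperatively binding activator takes the form (K[A])^h / (1 + (K[A])^h), which reduces to the single-site expression when h = 1. The symbol θ for the bound fraction and the notation [A]θ for the concentration that gives that fraction are introduced here for illustration.

```latex
\begin{aligned}
\theta([A]) &= \frac{(K[A])^{h}}{1 + (K[A])^{h}} \\
[A]_{\theta} &= \frac{1}{K}\left(\frac{\theta}{1-\theta}\right)^{1/h} \\
\frac{[A]_{0.9}}{[A]_{0.1}} &= \left(\frac{9}{1/9}\right)^{1/h} = 81^{1/h}
\end{aligned}
```

With h = 1, an 81-fold increase in [A] is needed to move from 10% to 90% occupancy. With h = 2 a 9-fold increase suffices, and with h = 4 only a 3-fold increase.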

Figure 8-78 Oscillations arising from delayed negative feedback.
A transcriptional circuit with three components (A, B) is less likely to oscillate
than a transcriptional circuit with five components (C, D). The X (light blue),
Y (dark blue), and Z (brown) here represent transcription regulatory proteins. For
the simulations in (B) and (D), the system was initiated from random initial
conditions for X, Y, and Z. Oscillations are produced by a delay induced as the
signal propagates through the loop.

Figure 8-79 How the cooperative binding of transcription regulatory
proteins affects the fraction of promoters bound. (A) Cooperativity is
incorporated into our mathematical models by including a Hill coefficient (h) in the
equations used previously to determine the fraction of bound promoter (see Figures
8-72E and 8-75A). When h is 1, the equations shown here become identical to
the equations used previously, and there is no cooperativity. (B) The left panel
depicts a cooperatively bound transcription activator and the right panel depicts a
cooperatively bound transcription repressor. Recall from Figure 8-75B that gene
activity is proportional to bound activator (left panel) or unbound repressor (right
panel). Note that the plots get steeper as the Hill coefficient increases.


Positive Feedback Is Important for Switchlike Responses and
Bistability


We turn now to positive feedback and its very important consequences. First and 
foremost, positive feedback can make a system bistable, enabling it to persist in 
either of two (or more) alternative steady states. The idea is simple and can be 
conveyed by drawing an analogy with a candle, which can exist either in a burning 
state or in an unlit state. The burning state is maintained by positive feedback: the 
heat generated by burning keeps the flame alight. The unlit state is maintained 
by the absence of this feedback signal: so long as sufficient heat has never been 
applied, the candle will stay unlit. 

For the biological system, as for the candle, bistability has an important corol- 
lary: it means that the system has a memory, such that its present state depends 
on its history. If we start with the system in an Off state and gradually rack up the 
concentration of the activator protein, there will come a point where autostimula- 
tion becomes self-sustaining (the candle lights), and the system moves rapidly to 
an On state. If we now intervene to decrease the level of activator, there will come 
a point where the same thing happens in reverse, and the system moves rapidly 
back to an Off state. But the transition points for switching on and switching off 
are different, and so the current state of the system depends on the route by which 
it has been taken in the past—a phenomenon called hysteresis. 

A simple case of positive feedback can be seen in a regulatory system in which 
a transcription regulator activates (directly or indirectly) its own expression, as in 
Figure 8-80A. Positive feedback can also arise in a circuit with many intervening 
repressors or activators, so long as the net overall effect of the interactions is acti- 
vation (Figure 8-80B and C). 

To illustrate how positive feedback can generate stable states, let us focus on a 
simple positive feedback loop containing two repressors, X and Y, each of which 
inhibits expression of the other (Figure 8-81A). As we saw with Equation set 8-8 
(Figure 8-76B) earlier, we can create differential equations describing the rate 
of change of [X] and [Y] (Equation set 8-9, Figure 8-81B). We can further mod- 
ify these equations to include cooperativity by adding Hill coefficients. As we did 
earlier, we can then create equations for calculating the concentrations of [X]
and [Y] when the system reaches a steady state (that is, when (d[X]/dt) = 0 and
(d[Y]/dt) = 0; Equations 8-10 and 8-11, Figure 8-81C).

Equations 8-10 and 8-11 can be used to carry out an intriguing mathematical
procedure called a nullcline analysis. These equations define the relationships
between the concentration of X at steady state, [Xst], and the concentration of Y
at steady state, [Yst], which must be simultaneously satisfied. We can plug in
different values for [Yst] in Equation 8-10, and calculate the corresponding [Xst] for
each of these values. We can then graph [Xst] as a function of [Yst]. Next, we repeat
the process by varying [Xst] in Equation 8-11 to graph the resulting [Yst]. The
intersections of these two graphs determine the theoretically possible steady states of
the system. For systems in which the Hill coefficients hX and hY are much larger
than 1, the lines in the two graphs intersect at three locations (Figure 8-81D). In
other systems that have the same arrangement of regulators but different parameters,
there might only be one intersection, indicating the presence of only a single
steady state. For example, when there is a low cooperativity of protein X binding
to the promoter of gene Y (that is, a small Hill coefficient, hX, in Equation 8-11),
the plot of [Y] is less curved (Figure 8-81E), and it is less likely that there will be
multiple intersections of the two curves.

Figure 8-80 Positive feedback of a gene onto itself through serially connected
interactions. A sequence of activators and repressors of any length can be connected
to produce a positive feedback loop, as long as the overall sign is positive. Because
the negative of a negative is positive, not only circuit (A) and (B) but also circuit
(C) create positive feedback.

Figure 8-81 A graphical nullcline analysis. (A) X inhibits Y and Y inhibits X, resulting
in a positive feedback loop. (B) Equation set 8-9 can be used to determine the rate of
change in the concentrations of proteins X and Y. (C) Equations 8-10 and 8-11 provide
the concentrations of proteins X and Y, respectively, when these concentrations reach
a steady state. (D, E) Blue curves (called nullclines) are plots of [Xst] calculated from
Equation 8-10 over a range of concentrations of [Yst]. Red curves indicate values of [Yst]
calculated from Equation 8-11 over a range of concentrations of [Xst]. At an intersection
of the two lines, both [X] and [Y] are at steady state. For plot (D), the binding of both
proteins to their target gene promoters was cooperative (hX and hY much larger than 1),
resulting in the presence of multiple intersections of the nullclines, suggesting that the
system can assume multiple discrete steady states. In plot (E), the binding of protein
X to the promoter of gene Y was not cooperative (hY close to 1), resulting in only one
nullcline intersection and thus just one likely steady state.
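
The nullcline procedure can be mimicked numerically by counting where the two curves cross. The sketch below assumes that the steady-state equations for two mutually repressing proteins have the form [Xst] = βXτX / (1 + (KY[Yst])^hY) and [Yst] = βYτY / (1 + (KX[Xst])^hX). It lumps the product of rate constant and lifetime into a single value of 3 and sets both affinities to 1, in arbitrary units. These values and the function names are illustrative assumptions.

```python
def x_nullcline(y, beta_tau=3.0, k_y=1.0, h_y=4.0):
    """Assumed Equation 8-10: steady-state [X] for a given [Y]."""
    return beta_tau / (1.0 + (k_y * y) ** h_y)


def y_nullcline(x, beta_tau=3.0, k_x=1.0, h_x=4.0):
    """Assumed Equation 8-11: steady-state [Y] for a given [X]."""
    return beta_tau / (1.0 + (k_x * x) ** h_x)


def count_intersections(h_x, h_y, x_max=6.0, points=60001):
    """Count crossings of the two nullclines by scanning [X] on a fine grid.

    A point (x, y_nullcline(x)) lies on both curves when
    x == x_nullcline(y_nullcline(x)), so we count sign changes of the mismatch.
    """
    crossings = 0
    previous = None
    for i in range(points):
        x = x_max * i / (points - 1)
        y = y_nullcline(x, h_x=h_x)
        mismatch = x - x_nullcline(y, h_y=h_y)
        if previous is not None and previous * mismatch < 0:
            crossings += 1
        previous = mismatch
    return crossings


print("h_X = 4, h_Y = 4:", count_intersections(4.0, 4.0))
print("h_X = 1, h_Y = 4:", count_intersections(1.0, 4.0))
```

With both Hill coefficients well above 1 the scan finds three crossings. Lowering hX to 1 flattens the second curve enough that only one crossing remains. A grid scan can miss crossings that lie closer together than the grid spacing, so the point count should be raised if the curves nearly touch.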

We emphasized earlier that positive feedback typically generates a bistable sys- 
tem with two stable steady states. Why does the system modeled in Figure 8-81D 
have three? This conundrum can be explained by solving the reaction rate equa- 
tions (Equation set 8-9, Figure 8-81B) for various different starting conditions of 
[X] and [Y], determining all values of [X] and [Y] as a function of time. Starting 
with each set of initial concentrations of [X] and [Y], these calculations produce 
a so-called trajectory of points, each indicated by a curved green line on Figure 
8-82A. A fascinating pattern emerges: each trajectory moves across the plot and 
settles in one of two steady states, but never in the third (middle steady state). 
We conclude that the middle steady state is unstable because it cannot “attract” 
any trajectories. The system therefore has only two stable steady states. Thus, the 
number of stable steady states in a system need not be equal to the total number 
of its theoretically possible steady states. In fact, stable steady states are usually 
separated by unstable ones, as in our example. 
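
Trajectories of this kind are straightforward to compute. The sketch below assumes the mutual-repression rate equations d[X]/dt = β/(1 + (K[Y])^h) - [X]/τ and d[Y]/dt = β/(1 + (K[X])^h) - [Y]/τ, with identical, arbitrary parameter values for both proteins (β = 3, τ = 1, K = 1, h = 4). The starting points and function names are illustrative assumptions.

```python
def step(x, y, dt, beta=3.0, tau=1.0, k=1.0, h=4.0):
    """One forward-Euler step of the assumed mutual-repression equations."""
    dx = beta / (1.0 + (k * y) ** h) - x / tau
    dy = beta / (1.0 + (k * x) ** h) - y / tau
    return x + dx * dt, y + dy * dt


def settle(x0, y0, t_end=50.0, dt=0.01):
    """Follow one trajectory from (x0, y0) and return where it ends up."""
    x, y = x0, y0
    for _ in range(int(round(t_end / dt))):
        x, y = step(x, y, dt)
    return x, y


starts = [(0.1, 0.2), (2.5, 2.0), (1.0, 1.3), (1.3, 1.0), (0.5, 3.0), (3.0, 0.5)]
for x0, y0 in starts:
    x_end, y_end = settle(x0, y0)
    state = "X high, Y low" if x_end > y_end else "Y high, X low"
    print((x0, y0), "->", (round(x_end, 3), round(y_end, 3)), state)
```

Because the two proteins have identical parameters in this sketch, whichever protein starts ahead wins: every start with [X] greater than [Y] ends in the X-high state and every start with [Y] greater than [X] ends in the Y-high state. Only a start exactly on the line [X] = [Y] would approach the middle steady state, which is why no generic trajectory settles there.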

Once this system adopts a fate by settling in one of the two steady states, does
it have the ability to switch to the other state? The numerical solution of Equation
set 8-9 can again provide an answer. In Figure 8-82B, we show the solution of this
equation set for two perturbations from the upper-left steady state. For a small
perturbation, the system returns to its original steady state. But the larger
perturbation causes the system to switch to the alternate steady state. Thus, this system
can be switched from one stable steady state to the other by subjecting it to an
input (or a perturbation) that is large enough to make the other steady state more
attractive. More generally, every stable steady state has a corresponding region of
attraction, which can be intuitively thought of as the range of perturbations (of [X]
or [Y] in this example) for which the dynamic trajectories converge back to that
particular steady state, rather than switch to the other one.

Figure 8-82 Analysis of the stability of a system's steady states. (A) The dotted
lines are the nullclines for the system shown in Figure 8-81. Also shown are dynamic
trajectories (green) that show the changes over time in [X] and [Y], starting at a
variety of different initial concentrations (determined by solution of Equation set
8-9; see Figure 8-81B). By plotting [X] versus [Y] at each time point, we find that,
although there are three possible steady states in this system, the dynamic
trajectories converge on only two of them. The middle steady state is avoided: it is
unstable, being unable to attract any trajectories. (B) Imagine that the system is at
the upper-left steady state and experiences a perturbation (black arrows), such as a
random fluctuation in the production rates of X and/or Y. If the perturbation is small
(arrow 1), the system will return to the same steady state. On the other hand, a
perturbation that drives the system beyond the unstable (middle) steady state (arrow 2)
causes it to switch to the lower-right steady state. The set of perturbations that a
system can withstand without switching from one steady state to the other is known as
the region of attraction of that steady state.

The concept of a region of attraction has interesting implications for the heritability
of transcriptional states and the transition rates between them. If the region
of attraction around one steady state is large, for example, then most cells in the 
population will assume this particular state. Furthermore, this state is likely to be 
inherited by daughter cells, since minor perturbations, like those ensuing from 
an asymmetric distribution of molecules during cell division, will rarely be suffi- 
cient to induce switching to the other steady state. We should expect that the use 
of positive feedback, coupled to cooperativity, will quite often be associated with 
systems requiring stable cell memory. 


Robustness Is an Important Characteristic of Biological Networks 


Biological regulatory systems are exposed to frequent and sometimes extreme 
variations in external conditions or the concentrations or activities of key compo- 
nents. The ability of these systems to function normally in the face of such pertur- 
bations is called robustness. If we understand a complex system to the extent that 
we can reproduce its behavior with a computational model, then the robustness of 
the system can be assessed by determining how well its normal function persists 
following changes in various parameters, such as rate constants and component 
concentrations. We have already seen, for example, how the presence of negative 
feedback reduces the sensitivity of the steady state to changes in the values of the 
system’s parameters (see Figure 8-77). Considerations of robustness also apply 
to dynamic behaviors. Thus, for example, when discussing negative feedback, we 
described how the behavior of a system tends to become more oscillatory as the 
number of components that constitute the feedback loop increases. If we use dif- 
ferent values of the parameters in models derived for systems like those in Figure 
8-78, we find that the system with the longer loop tends to exhibit stable oscilla- 
tions within a much broader range of parameters, indicating that this system pro- 
vides a more robust oscillator. We can perform similar calculations to determine 
the ability of different systems to achieve robust bistability arising from positive 
feedback. Thus, one benefit of computational models is that they allow us to probe 
the robustness of biological networks in a systematic and rigorous way. 


Two Transcription Regulators That Bind to the Same Gene 
Promoter Can Exert Combinatorial Control 


Thus far, we have discussed how one transcription regulator can modulate the 
expression level of a gene. Most genes, however, are controlled by more than one 
type of transcription regulator, providing combinatorial control that allows two or 
more inputs to influence the expression of one gene. We can use computational 
methods to unveil some of the important regulatory features of combinatorial 
control systems. 

Consider a gene whose promoter contains binding sites for two regulatory 
proteins, A and R, which bind to their individual sites independently. There are 
four possible binding configurations (Figure 8-83A). Suppose that A is a tran- 
scription activator, R is a transcription repressor, and the gene is only active when 
A is bound and R is not bound. We learned earlier that the probability that A is 
bound and the probability that R is not bound can be determined by the equa- 
tions in Figure 8-84A. The product of these two probabilities gives us the proba- 
bility of gene activation. 

This example illustrates an AND NOT logic function (A and not R) (see Figure
8-83A). Maximal activation of this gene is accomplished when [A] is high and [R]
is zero. However, intermediate levels of gene activation are also possible, depending
on the levels of A and R and also on the binding affinities of A and R for their
respective sites (that is, KA and KR). When KA ≫ KR, even a small concentration of
A is capable of overcoming repression by R. Conversely, if KA ≪ KR, then much
more A is needed to activate the gene (Figure 8-84B and C).

Many other logic functions can govern combinatorial gene regulation. For 
example, an AND logic gate results when two activators, A1 and A2, are both 
required for a gene to be transcribed (Figures 8-83B and 8-84D). In E. coli cells, 
the AraJ gene controls some aspects of arabinose sugar metabolism: its expres- 
sion requires two transcription regulators, one activated by arabinose and the 
other activated by the small molecule cAMP (Figure 8-84E). 
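
An AND gate can be sketched in the same spirit. The code below assumes that the two activators bind independently and that the promoter is active only when both are bound, so the output is the product of the two bound fractions. The helper names, the affinities, and the concentration grid are hypothetical choices for illustration.

```python
def fraction_bound(conc, k):
    """Equilibrium fraction of sites occupied (k is an association constant)."""
    return k * conc / (1.0 + k * conc)


def p_active_and(a1, a2, k1=1.0, k2=1.0):
    """Probability that both activators are bound (assumed independent)."""
    return fraction_bound(a1, k1) * fraction_bound(a2, k2)


# Print a coarse map of promoter activity over a logarithmic grid.
# Rows are [A2] (high to low); columns are [A1] (low to high).
levels = [0.01, 0.1, 1.0, 10.0, 100.0]
for a2 in reversed(levels):
    row = " ".join(f"{p_active_and(a1, a2):5.2f}" for a1 in levels)
    print(f"[A2]={a2:>6}: {row}")
```

Only the corner of the grid where both inputs are high approaches full activity, which is the qualitative pattern described for Figure 8-84D.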

Figure 8-83 Combinatorial control of gene expression. There are many ways in 
which gene expression can be controlled by two transcription regulators. To define 
precisely the relationship between the two inputs and the gene expression output, a 
regulatory circuit is often described as a specific type of logic gate, a term borrowed 
from electronic circuit design. A simple example is the OR logic gate (not shown 
here), in which a gene is controlled by two transcription activators, and one or the 
other can activate gene expression. (A) In a system with an activator A and repressor 
R, if transcription is turned on only when A is bound and R is not, then the result is an 
AND NOT logic gate. We saw an example of this logic in Chapter 7 (Figure 7-15). 
(B) An AND gate results when two transcription activators, A1 and A2, are 
both required to turn on a gene. 


Figure 8-84 How the quantitative output of a gene depends on both its combinatorial 
logic and the affinities of transcription regulators. (A) In a combinatorial gene 
regulatory system like that illustrated in Figure 8-83A, the fraction of promoters 
bound by activator A and the fraction not bound by repressor R are each determined 
as shown here. The product of these probabilities provides the probability, 
P(A, R), that a gene promoter is active. (B-E) In these four panels, red indicates 
high gene expression and blue indicates low gene expression. (B) and (C) depict gene 
expression from the system described in panel (A). The two panels demonstrate 
how the system behaves when the relative affinities of the two transcription regulators 
change as indicated above each panel. (D) Gene expression in a case where the 
gene turns on only at high levels of both activating inputs (A1 and A2), as shown in 
Figure 8-83B. (E) Experimental data showing measured expression of a gene in E. coli 
that is combinatorially regulated by two inputs: arabinose and cAMP. Note the close 
resemblance to panel (D). (E, adapted from S. Kaplan et al., Mol. Cell 29:786-792, 2008.) 


An Incoherent Feed-forward Interaction Generates Pulses 


Imagine that a sudden input signal immediately activates a transcription activator 
A and that the same input signal induces the much slower synthesis of a tran- 
scription repressor protein R that acts on the same gene X. If A and R control gene 
expression by an AND NOT logic function like that described above, our intuition 
tells us that this system should be able to generate a pulse of transcription: when 
A is activated (and R is absent), the transcription of gene X will begin and cause 
an increase in the concentration of protein X, but then transcription will shut off 
when the concentration of R increases to a sufficiently high value. 

Arrangements of this type are common in the cell. In E. coli, for example, galac- 
tose metabolic genes are positively regulated by the catabolite activator protein 
(CAP), which is activated at high levels of cAMP. The same genes are repressed 
by the GalS repressor protein, which is encoded by a gene whose transcription is 
likewise activated by CAP. Thus, an increase in input (cAMP) activates A (CAP), 
and transcription of the galactose genes begins. But activation of A also causes 
a subsequent buildup of R (GalS), which causes the same genes to be repressed 
after a delay. This results in an incoherent feed-forward motif (Figure 8-85A). 

The response of the incoherent feed-forward motif will vary, depending on 
the parameters of the system. Suppose, for example, that the transcription activator 
protein A binds more weakly to the gene regulatory region than does the 
transcription repressor protein R (K_A < K_R). In this case, there will be a transient 
burst of protein synthesized by the affected gene (gene X) in response to a sudden 
activating input (Figure 8-85B). In contrast, the output will be more sustained if 
K_A is much larger than K_R, because the repression will be too weak to overcome 
the gene activation (Figure 8-85C). Other properties of this network, such as the 
dependence of the amplitude of the pulse on the various rate constants in the system, 
can be explored with the same computational tools. Thus, our intuitive guess 
about how this system would behave was only partially correct; even the simplest 
of networks depends on precise interaction strengths, demonstrating yet again 
why mathematics is needed to complement cartoon drawings. 
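
The two regimes can be explored with a toy simulation. The sketch below is not the model behind Figure 8-85; it assumes that A becomes fully active the moment the input appears, that R is made slowly in response to A, and that gene X is transcribed at a rate proportional to the AND NOT probability. All rate constants are invented for illustration, and the equations are integrated with a simple Euler step.

```python
def simulate_incoherent_ffl(k_a, k_r, t_end=100.0, dt=0.01):
    """Euler integration of a toy incoherent feed-forward loop.

    The input switches on at t = 0 and stays on. Returns the time
    course of protein X as a list (one value per time step).
    """
    a = 1.0                      # active A, on immediately (arbitrary units)
    r, x = 0.0, 0.0              # repressor R and output protein X start at zero
    beta_r, gamma_r = 0.1, 0.1   # slow synthesis and degradation of R (assumed)
    beta_x, gamma_x = 1.0, 1.0   # faster synthesis and degradation of X (assumed)
    trace = []
    for _ in range(int(round(t_end / dt))):
        # AND NOT logic: A bound and R not bound.
        p_on = (k_a * a / (1.0 + k_a * a)) / (1.0 + k_r * r)
        r += dt * (beta_r * a - gamma_r * r)
        x += dt * (beta_x * p_on - gamma_x * x)
        trace.append(x)
    return trace


for label, k_a, k_r in [("K_A < K_R", 1.0, 10.0), ("K_A >> K_R", 100.0, 0.1)]:
    xs = simulate_incoherent_ffl(k_a, k_r)
    print(f"{label}: peak X = {max(xs):.2f}, final X = {xs[-1]:.2f}")
```

With these assumed parameters the first case rises and then falls well below its peak, whereas the second settles close to its peak, mirroring the pulse and sustained responses described above.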


A Coherent Feed-forward Interaction Detects Persistent Inputs 


In the bacterium E. coli, the sugar arabinose is only consumed when the preferred 
sugar, glucose, is scarce. The strategy that cells use to assess the presence of arabi- 
nose and absence of glucose involves a feed-forward arrangement that is different 
from the one just described. Here, depletion of glucose causes an increase 
of cAMP, which is sensed by the CAP transcription activator protein, as described 
previously. In this case, however, CAP also induces the synthesis of a second 
transcription activator, AraC. Both activator proteins are necessary to activate 
arabinose metabolic genes (the AND logic function in Figure 8-83B). 

Figure 8-85 How an incoherent feed-forward motif can generate a brief 
pulse of gene activation in response to a sustained input. (A) Diagram of an 
incoherent feed-forward motif in which the transcription activator A and the repressor 
R control the expression of gene X using the AND NOT logic of Figure 8-83A. 
(B) When K_A ≪ K_R, this motif generates a pulse of protein X expression, such that the 
output goes back down even if the input remains high. (C) When K_A ≫ K_R, the same 
motif responds to a sustained input by generating a sustained output. 

This arrangement, known as a coherent feed-forward motif, has the interesting 
characteristics illustrated in Figure 8-86. Imagine that two activators, A1 and 
A2, are both required to initiate transcription of a gene. The input to the network 
activates A1 directly, but only activates A2 through this A1 activation. Thus, for 
a protein to be synthesized from this gene, long-term inputs are required that 
allow both A1 and A2 to be produced in active form. Brief input pulses are either 
ignored or produce small outputs. The requirement for a long input is important if 
assurances about a signal are needed before a costly cellular program is triggered. 
For example, glucose is the sugar on which E. coli cells grow best. Before cells trigger 
arabinose metabolism in the example above, it might be beneficial to be sure 
that glucose has been depleted (a sustained CAP pulse), rather than inducing the 
arabinose program during a transient glucose fluctuation. 
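
A toy simulation illustrates this filtering of brief inputs. The sketch assumes that A1 is active exactly while the input is present, that A2 accumulates slowly in response to A1, and that gene X follows AND logic on the two bound fractions. The parameter values and the function name are hypothetical, and the integration is a plain Euler scheme.

```python
def simulate_coherent_ffl(input_duration, t_end=60.0, dt=0.01):
    """Return the peak level of protein X for an input pulse of given length."""
    a2, x = 0.0, 0.0
    k1, k2 = 10.0, 1.0             # assumed affinities of A1 and A2
    beta_a2, gamma_a2 = 0.1, 0.1   # slow synthesis and degradation of A2
    beta_x, gamma_x = 1.0, 1.0     # faster synthesis and degradation of X
    peak = 0.0
    for i in range(int(round(t_end / dt))):
        t = i * dt
        a1 = 1.0 if t < input_duration else 0.0   # A1 tracks the input
        # AND logic: both A1 and A2 must be bound.
        p_on = (k1 * a1 / (1.0 + k1 * a1)) * (k2 * a2 / (1.0 + k2 * a2))
        a2 += dt * (beta_a2 * a1 - gamma_a2 * a2)
        x += dt * (beta_x * p_on - gamma_x * x)
        peak = max(peak, x)
    return peak


brief = simulate_coherent_ffl(input_duration=1.0)
prolonged = simulate_coherent_ffl(input_duration=30.0)
print(f"peak X after brief input:     {brief:.3f}")
print(f"peak X after prolonged input: {prolonged:.3f}")
```

Because A1 disappears as soon as the input ends, production of X stops at once in this sketch, so the output can also switch off quickly after a long input.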


The Same Network Can Behave Differently in Different Cells Due 
to Stochastic Effects 


Up to this point, we have assumed that all cells in a population produce identical 
behaviors if they contain the same network. It is important, however, to account 
for the fact that cells often show considerable individuality in their responses. 
Consider a situation in which a single mother cell divides into two daughter cells 
of equal volume. If the mother cell has only one molecule of a given protein, then 
only one daughter will inherit it. The daughters, though genetically identical, are 
already different. This variability is most pronounced for molecules that are pres- 
ent in small numbers. Nevertheless, even when there are many copies of a partic- 
ular protein (or RNA), it is very unlikely that both daughter cells will end up with 
exactly the same number of molecules. 
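
The partitioning argument can be made quantitative under one simple assumption that the text does not state explicitly: each molecule is inherited by either daughter independently with probability 1/2, so the number inherited by one daughter follows a binomial distribution. The function names below are illustrative.

```python
from math import comb, sqrt


def p_equal_split(n):
    """Probability that n molecules are divided exactly evenly."""
    if n % 2:                      # an odd number cannot be split evenly
        return 0.0
    return comb(n, n // 2) / 2 ** n


def relative_spread(n):
    """Standard deviation of one daughter's share divided by its mean."""
    return 1.0 / sqrt(n)           # (sqrt(n) / 2) / (n / 2)


for n in (2, 10, 100, 1000):
    print(f"n = {n:4d}: P(equal split) = {p_equal_split(n):.4f}, "
          f"relative spread = {relative_spread(n):.3f}")
```

Under this assumption an exactly even split becomes less likely as the copy number grows, while the relative difference between the daughters shrinks, which is why variability matters most for molecules present in small numbers.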

This is just one illustration of a universal feature of cells: their behaviors are 
often stochastic, meaning that they display variability in their protein content and 
therefore exhibit variations in phenotypes. In addition to the asymmetric parti- 
tioning of molecules following cell division, variability can originate from many 
chemical reactions. Imagine, for example, that our mother cell contains a sim- 
ple gene regulatory circuit with a positive feedback loop like that shown in Fig- 
ure 8-80B. Even if both daughter cells receive a copy of this circuit, including one 
copy of the initial transcription activator protein, there will be variability in the 
time required for promoter binding, and it will be statistically nearly impossible 
for the genes in the two daughter cells to become activated at precisely the same 
time. If the system is bistable and poised near a switching point, then variability 
in the response might flip the switch in only one daughter cell. Two daughter cells 
that were born identical can thereby acquire, by chance, a dramatic difference in 
phenotype. 

More generally, isogenic populations of cells grown in the same environment 
display diversity in size, shape, cell-cycle position, and gene expression. These 
differences arise because biochemical reactions require probabilistic collisions 
between randomly moving molecules, with each event resulting in changes in the 
number of molecular species by integer amounts. The amplified effect of fluctuations 
in a molecular reactant, or the compounded effects of fluctuations across 
many molecular reactants, often accumulates as an observable phenotype. This 
can endow a cell with individuality and generate non-genetic cell-to-cell variability 
in a population. 

Figure 8-86 How a coherent feed-forward motif responds to various 
inputs. (A) Diagram of a coherent feed-forward motif in which the transcription 
activators A1 and A2 together activate expression of gene X using the AND logic 
of Figure 8-83B. (B) The response to a brief input can be either weak (as shown) or 
nonexistent. This allows the motif to ignore random fluctuations in the concentration of 
signaling molecules. (C) A prolonged input produces a strong response that can turn 
off rapidly. 

Non-genetic variability can be studied in the laboratory by single-cell measure- 
ments of fluorescent proteins expressed from genes under the control of a specific 
promoter. Live cells can be mounted on a slide and viewed through a fluorescence 
microscope, revealing the striking variability in protein expression levels (Figure 
8-87). Another approach is to use flow cytometry, which works by streaming a 
dilute suspension of cells past an illuminator and measuring the fluorescence of 
individual cells as they flow past the detector (see Figure 8-2). Fluorescence values 
can be used to build histograms that reveal the variability in a process across a 
population of cells, with a broad histogram indicating higher variability. 
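
As a small illustration of how such measurements might be summarized, the sketch below bins per-cell fluorescence values into a histogram and reports the coefficient of variation (standard deviation divided by mean), one common measure of histogram breadth. The two data sets are synthetic values drawn from a normal distribution purely for demonstration; they are not experimental data, and the function names are hypothetical.

```python
import random
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def histogram(values, bin_width):
    """Count values per bin; keys are the left edges of the bins."""
    counts = {}
    for v in values:
        left_edge = int(v // bin_width) * bin_width
        counts[left_edge] = counts.get(left_edge, 0) + 1
    return dict(sorted(counts.items()))


# Synthetic fluorescence values (arbitrary units) for two populations
# with the same mean but different cell-to-cell variability.
rng = random.Random(0)
narrow = [rng.gauss(100.0, 5.0) for _ in range(1000)]
broad = [rng.gauss(100.0, 30.0) for _ in range(1000)]

for name, cells in [("narrow", narrow), ("broad", broad)]:
    print(name, "CV =", round(coefficient_of_variation(cells), 3))
    for left_edge, count in histogram(cells, 20).items():
        print(f"  {left_edge:4d}  {'#' * (count // 20)}")
```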


Several Computational Approaches Can Be Used to Model the 
Reactions in Cells 


We have focused primarily on the use of ordinary differential equations to model 
the dynamics of simple regulatory circuits. These models are called deterministic, 
because they do not incorporate stochastic variability and will always produce 
the same result from a specific set of parameters. As we have seen, such models 
can provide useful insights, particularly in the detailed mechanistic analysis of 
small regulatory circuits. However, other types of computational approaches are 
also needed to comprehend the great complexity of cell behavior. Stochastic mod- 
els, for example, attempt to account for the very important problem of random 
variability in molecular networks. These models do not provide deterministic pre- 
dictions about the behavior of molecules; instead, they incorporate random vari- 
ation into molecule numbers and interactions, and the purpose of these models 
is to obtain a better understanding of the probability that a system will exist in a 
certain state over time. 
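
One widely used stochastic method, which the text does not name, is the Gillespie algorithm: it draws a random waiting time until the next reaction and then chooses which reaction occurs in proportion to the reaction rates. The sketch below applies it to an assumed minimal system in which a protein is made at a constant rate and each molecule is degraded at a constant rate per molecule. The rate values and the function name are illustrative assumptions.

```python
import random


def gillespie_birth_death(k_make, k_deg, t_end, seed=None):
    """Simulate synthesis (rate k_make) and degradation (rate k_deg * n).

    Returns two lists: event times and the molecule count after each event.
    """
    rng = random.Random(seed)
    t, n = 0.0, 0
    times, counts = [0.0], [0]
    while True:
        rate_make = k_make
        rate_deg = k_deg * n
        total = rate_make + rate_deg
        t += rng.expovariate(total)        # waiting time to the next event
        if t > t_end:
            break
        if rng.random() * total < rate_make:
            n += 1                         # one molecule synthesized
        else:
            n -= 1                         # one molecule degraded
        times.append(t)
        counts.append(n)
    return times, counts


# Three runs with identical parameters but different random seeds
# stand in for three genetically identical cells.
for seed in (1, 2, 3):
    _, counts = gillespie_birth_death(k_make=10.0, k_deg=1.0, t_end=20.0, seed=seed)
    print(f"cell {seed}: final count = {counts[-1]}")
```

The molecule count changes by exactly one at each event, in keeping with the integer changes in molecule number discussed earlier, and repeated runs differ from one another even though the parameters are the same.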

Numerous other modeling strategies have been or are being developed. Bool- 
ean networks are used for the qualitative analysis of complex gene regulatory 
networks containing large numbers of interacting components. In these models, 
each molecule is a node that can exist in either the active or inactive state, thereby 
affecting the state of the nodes it is linked to. Models of this sort provide insights 
into the flow of information through a network, and they were useful in helping 
us understand the complex gene regulatory network that controls the early devel- 
opment of the sea urchin (see Figure 7-43). Boolean networks therefore reduce 
complex networks to a highly simplified (and potentially inaccurate) form. At the 
other extreme are agent-based simulations, in which thousands of molecules (or 
“agents”) in a system are modeled individually, and their probable behaviors and 
interactions with each other over time are calculated on the basis of predicted 
physical and chemical behaviors, often while taking stochastic variation into 
account. Agent-based approaches are computationally demanding but have the 
potential to generate highly lifelike simulations of real biological systems. 
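
A Boolean network is simple enough to sketch directly. The example below encodes an incoherent feed-forward arrangement as four on/off nodes that are all updated at the same time in each step. The update rules, the synchronous scheme, and the one-step delay in R are modeling assumptions chosen for illustration.

```python
def step(state, rules):
    """Apply every rule to the current state at once (synchronous update)."""
    return {node: rule(state) for node, rule in rules.items()}


rules = {
    "input": lambda s: s["input"],              # input is held constant
    "A": lambda s: s["input"],                  # A follows the input
    "R": lambda s: s["A"],                      # R follows A one step later
    "X": lambda s: s["A"] and not s["R"],       # AND NOT logic
}

state = {"input": True, "A": False, "R": False, "X": False}
history = [state]
for _ in range(4):
    state = step(state, rules)
    history.append(state)

for t, s in enumerate(history):
    print(t, {node: int(value) for node, value in s.items()})
```

Even this crude on/off description reproduces the qualitative pulse in X, although it says nothing about how pulse size depends on binding affinities.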


Statistical Methods Are Critical For the Analysis of Biological Data 


Dynamics, differential equations, and theoretical modeling are not the be-all and 
end-all of mathematics. Other branches of the subject are no less important for 
biologists. Statistics—the mathematics of probabilistic processes and noisy data- 
sets—is an inescapable part of every biologist’s life. 

This is true in two main ways. First, imperfect measurement devices and other 
errors generate experimental noise in our data. Second, all cell-biological processes 
depend on the stochastic behavior of individual molecules, as we just discussed, 
and this results in biological noise in our results. How, in the face of all 
this noise, do we come to conclusions about the truth of hypotheses? The answer 
is statistical analysis, which shows how to move from one level of description to 
another: from a set of erratic individual data points to a simpler description of the 
key features of the data. 

Figure 8-87 Different levels of gene expression in individual cells within a 
population of E. coli bacteria. For this experiment, two different reporter proteins 
(one fluorescing green, the other red), controlled by a copy of the same promoter, 
have been introduced into all of the bacteria. Some cells express only one gene 
copy, and so appear either red or green, while others express both gene copies, 
and so appear yellow. This experiment reveals variable levels of fluorescence, 
indicating variable levels of gene expression within an apparently uniform population of 
cells. (From M.B. Elowitz et al., Science 297:1183-1186, 2002. With permission 
from AAAS.) 

Statistics teaches us that the more times we repeat our measurements, the bet- 
ter and more refined the conclusions we can draw from them. Given many repeti- 
tions, it becomes possible to describe our data in terms of variables that summa- 
rize the features that matter: the mean value of the measured variable, taken over 
the set of data points; the magnitude of the noise (the standard deviation of the 
set of data points); the likely error in our estimate of the mean value (the standard 
error of the mean); and, for specialists, the details of the probability distribution 
describing the likelihood that an individual measurement will yield a given value. 
For all these things, statistics provides recipes and quantitative formulas that biol- 
ogists must understand if they are to make rigorous conclusions on the basis of 
variable results. 
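
The summary quantities named above are easy to compute. The sketch below uses Python's standard statistics module on a small set of made-up measurements; the numbers are illustrative only, and the standard error of the mean is taken as the sample standard deviation divided by the square root of the number of measurements.

```python
import math
import statistics


def summarize(measurements):
    """Return the mean, sample standard deviation, and standard error of the mean."""
    n = len(measurements)
    mean = statistics.mean(measurements)
    sd = statistics.stdev(measurements)     # sample standard deviation (n - 1)
    sem = sd / math.sqrt(n)
    return mean, sd, sem


data = [2, 4, 4, 4, 5, 5, 7, 9]             # hypothetical repeated measurements
mean, sd, sem = summarize(data)
print(f"mean = {mean:.2f}, SD = {sd:.2f}, SEM = {sem:.2f}")
```

Because the standard error falls as the square root of the number of measurements, quadrupling the number of repeats roughly halves the uncertainty in the estimated mean.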


Summary 


Quantitative mathematical analysis can provide a powerful extra dimension in our 
understanding of cell regulation and function. Cell regulatory systems often depend 
on macromolecular interactions, and mathematical analysis of the dynamics of 
these interactions can unveil important insights into the importance of binding 
affinities and protein stability in the generation of transcriptional or other signals. 
Regulatory systems often employ network motifs that generate useful behaviors: 
a rapid negative feedback loop dampens the response to input signals; a delayed 
negative feedback loop creates a biochemical oscillator; positive feedback yields a 
system that alternates between two stable states; and feed-forward motifs provide 
systems that generate transient signal pulses or respond only to sustained inputs. 
The dynamic behavior of these network motifs can be dissected in detail with deter- 
ministic and stochastic mathematical modeling. 


WHAT WE DON’T KNOW 

• Many of the tools that revolutionized DNA technology were discovered by 
scientists studying basic biological problems that had no obvious 
applications. What are the best strategies to ensure that such crucially 
important technologies will continue to be discovered? 

• As the cost of DNA sequencing decreases and the amount of 
sequence data accumulates, how are we going to keep track of and 
meaningfully analyze this vast amount of information? What new questions 
will this information allow us to answer? 

• Can we develop tools to analyze each of the post-transcriptional 
modifications on the proteins in living cells, so as to follow all of their 
changes in real time? 

• Can we develop mathematical models to accurately describe the 
enormous complexity of cellular networks and to predict undiscovered 
components and mechanisms? 


PROBLEMS 


Which statements are true? Explain why or why not. 


8-1 Because a monoclonal antibody recognizes a specific antigenic site 
(epitope), it binds only to the specific protein against which it was made. 


8-2 Given the inexorable march of technology, it 
seems inevitable that the sensitivity of detection of molecules 
will ultimately be pushed beyond the yoctomole level (10⁻²⁴ mole). 


8-3 If each cycle of PCR doubles the amount of DNA 
synthesized in the previous cycle, then 10 cycles will give a 
10³-fold amplification, 20 cycles will give a 10⁶-fold amplification, 
and 30 cycles will give a 10⁹-fold amplification. 


8-4 To judge the biological importance of an interaction 
between protein A and protein B, we need to know 
quantitative details about their concentrations, affinities, 
and kinetic behaviors. 


8-5 The rate of change in the concentration of any 
molecular species X is given by the balance between its 
rate of appearance and its rate of disappearance. 


8-6 After a sudden increase in transcription, a protein 
with a slow rate of degradation will reach a new steady 
state level more quickly than a protein with a rapid rate of 
degradation. 


Discuss the following problems. 


8-7 A common step in the isolation of cells from a 
sample of animal tissue is to treat the tissue with trypsin, 
collagenase, and EDTA. Why is such a treatment necessary, 
and what does each component accomplish? And 
why does this treatment not kill the cells? 


8-8 Tropomyosin, at 93 kd, sediments at 2.6S, whereas 
the 65-kd protein, hemoglobin, sediments at 4.3S. (The 
sedimentation coefficient S is a linear measure of the rate 
of sedimentation.) These two proteins are drawn to scale 
in Figure Q8-1. How is it that the bigger protein sediments 
more slowly than the smaller one? Can you think of an 
analogy from everyday experience that might help you 
with this problem? 


Figure Q8-1 Scale models of tropomyosin and hemoglobin (Problem 8-8). 


8-9 Hybridoma technology allows one to generate 
monoclonal antibodies to virtually any protein. Why is 
it, then, that genetically tagging proteins with epitopes is 
such a commonly used technique, especially since an epitope 
tag has the potential to interfere with the function of 
the protein? 


8-10 How many copies of a protein need to be present 
in a cell in order for it to be visible as a band on an SDS 
gel? Assume that you can load 100 µg of cell extract onto 
a gel and that you can detect 10 ng in a single band by silver 
staining the gel. The concentration of protein in cells 
is about 200 mg/mL, and a typical mammalian cell has a 
volume of about 1000 µm³ and a typical bacterium a volume 
of about 1 µm³. Given these parameters, calculate 
the number of copies of a 120-kd protein that would need 
to be present in a mammalian cell and in a bacterium in 
order to give a detectable band on a gel. You might try an 
order-of-magnitude guess before you make the calculations. 


8-11 You have isolated the proteins from two adjacent 
spots after two-dimensional polyacrylamide-gel electro- 
phoresis and digested them with trypsin. When the masses 
of the peptides were measured by MALDI-TOF mass spec- 
trometry, the peptides from the two proteins were found to 
be identical except for one (Figure Q8-2). For this peptide, 
the mass-to-charge (m/z) values differed by 80, a value 
that does not correspond to a difference in amino acid 
sequence. (For example, glutamic acid instead of valine at 
one position would give an m/z difference of around 30.) 
Can you suggest a possible difference between the two 
peptides that might account for the observed m/z differ- 
ence? 


Figure Q8-2 Masses of peptides measured by MALDI-TOF mass spectrometry 
(Problem 8-11). Only the numbered peaks differ between the two protein 
samples. [Two spectra; numbered peaks at 3706 and 3786; horizontal axis: 
m/z (mass-to-charge ratio).] 


8-12 You want to amplify the DNA between the two 
stretches of sequence shown in Figure Q8-3. Of the listed 
primers, choose the pair that will allow you to amplify the 
DNA by PCR. 


8-13 In the very first round of PCR using genomic DNA, 
the DNA primers prime synthesis that terminates only 
when the cycle ends (or when a random end of DNA is 
encountered). Yet, by the end of 20 to 30 cycles (a typical 
amplification), the only visible product is defined precisely 
by the ends of the DNA primers. In what cycle is a 
double-stranded fragment of the correct size first generated? 


DNA to be amplified 

5'-GACCTGTGGAAGC......CATACGGGATTGA-3' 
3'-CTGGACACCTTCG......GTATGCCCTAACT-5' 

primers 

(1) 5'-GACCTGTGGAAGC-3'      (5) 5'-CATACGGGATTGA-3' 
(2) 5'-CTGGACACCTTCG-3'      (6) 5'-GTATGCCCTAACT-3' 
(3) 5'-CGAAGGTGTCCAG-3'      (7) 5'-TGTTAGGGCATAC-3' 
(4) 5'-GCTTCCACAGGTC-3'      (8) 5'-TCAATCCCGTATG-3' 

Figure Q8-3 DNA to be amplified and potential PCR primers 
(Problem 8-12). 


8-14 Explain the difference between a gain-of-function 
mutation and a dominant-negative mutation. Why are 
both these types of mutation usually dominant? 


8-15 Discuss the following statement: “We would 
have no idea today of the importance of insulin as a regulatory 
hormone if its absence were not associated with 
the human disease diabetes. It is the dramatic consequences 
of its absence that focused early efforts on the 
identification of insulin and the study of its normal role in 
physiology.” 


8-16 You have just gotten back the results from an RNA- 
seq analysis of mRNAs from liver. You had anticipated 
counting the number of reads of each mRNA to deter- 
mine the relative abundance of different mRNAs. But you 
are puzzled because many of the mRNAs have given you 
results like those shown in Figure Q8-4. How is it that different 
parts of an mRNA can be represented at different 
levels? 


8-17 Examine the network motifs in Figure Q8-5. 
Decide which ones are negative feedback loops and which 
are positive. Explain your reasoning. 


8-18 Imagine that a random perturbation positions a 
bistable system precisely at the boundary between two sta- 
ble states (at the orange dot in Figure Q8-6). How would 
the system respond? 


Figure Q8-4 RNA-seq reads for a liver mRNA (Problem 8-16). The 
exon structure of the mRNA is indicated, with protein-coding segments 
indicated in light blue and untranslated regions in dark blue. The 
numbers of sequencing reads are indicated by the heights of the 
vertical lines above the mRNA. [Plot: reads above exons 1 to 5.] 


Figure Q8-5 Network motifs composed of transcription activators and 
repressors (Problem 8-17). [Four diagrams, (A) to (D), each showing an 
activating input and gene X.] 


Figure Q8-6 Perturbations of a bistable system (Problem 8-18). As 
shown by the green lines, after perturbation 1 the system returns to 
its original stable state (green dot at left), and after perturbation 2, the 
system moves to the other stable state (green dot at right). Perturbation 
3 moves the system to the precise boundary between the two stable 
states (orange dot). [Plot axes: concentration of X and concentration of Y.] 


8-19 Detailed analysis of the regulatory region of the 
Lac operon has revealed surprising complexity. Instead 
of a single binding site for the Lac repressor, as might be 
expected, there are three sites termed operators: O1, O2, 
and O3, arrayed along the DNA as shown in Figure Q8-7. 
To probe the functions of these three sites, you make a 
series of constructs in which various combinations of operator 
sites are present. You examine their ability to repress 
expression of β-galactosidase, using either tetrameric 
(wild type) or dimeric (mutant) forms of the Lac repressor. 
The dimeric form of the repressor can bind to a single 
operator (with the same affinity as the tetramer) with each 
monomer binding to half the site. The tetramer, the form 
normally expressed in cells, can bind to two sites simultaneously. 
When you measure repression of β-galactosidase 
expression, you find the results shown in Figure Q8-7, with 
higher numbers indicating more effective repression. 

A. Which single operator site is the most important 
for repression? How can you tell? 

B. Do combinations of operator sites (Figure Q8-7, 
constructs 1, 2, 3, and 5) substantially increase repression 
by the dimeric repressor? Do combinations of operator 
sites substantially increase repression by the tetrameric 
repressor? If the two repressors behave differently, offer an 
explanation for the difference. 

C. The wild-type repressor binds O3 very weakly 
when it is by itself on a segment of DNA. However, if O1 is 
included on the same segment of DNA, the repressor binds 
O3 quite well. How can that be? 


Construct    Repression by 2-mer    Repression by 4-mer 
1            110                    6700 
2            90                     3900 
3            80                     1400 
4            60                     140 
5            1                      5 
6            1                      2 

[The diagram showing which operator sites are present in each construct is 
not reproduced here; the operator spacings are labeled 92 bp and 401 bp.] 

Figure Q8-7 Repression of β-galactosidase by promoter regions that 
contain different combinations of Lac repressor binding sites (Problem 
8-19). The base-pair (bp) separation of the three operator sites is 
shown. Numbers at right refer to the level of repression, with higher 
numbers indicating more effective repression by dimeric (2-mer) or 
tetrameric (4-mer) repressors. (From S. Oehler et al., EMBO J. 
9:973-979, 1990. With permission from John Wiley and Sons.) 


Visualizing Cells 


Understanding the structural organization of cells is essential for learning how 
they function. In this chapter, we briefly describe some of the principal micro- 
scopy methods used to study cells. Optical microscopy will be our starting point 
because cell biology began with the light microscope, and it is still an indispensable 
tool. The development of methods for the specific labeling and imaging of 
individual cellular constituents and the reconstruction of their three-dimensional 
architecture has meant that, far from falling into disuse, optical microscopy con- 
tinues to increase in importance. One advantage of optical microscopy is that light 
is relatively nondestructive. By tagging specific cell components with fluorescent 
probes, such as intrinsically fluorescent proteins, we can watch their movement, 
dynamics, and interactions in living cells. Although conventional optical micro- 
scopy is limited in resolution by the wavelength of visible light, new methods clev- 
erly bypass this limitation and allow the position of even single molecules to be 
mapped. By using a beam of electrons instead of visible light, electron microscopy 
can image the interior of cells, and their macromolecular components, at almost 
atomic resolution, and in three dimensions. 

This chapter is intended as a companion, rather than an introduction, to the 
chapters that follow; readers may wish to refer back to it as they encounter appli- 
cations of microscopy to basic biological problems in the later pages of the book. 


LOOKING AT CELLS IN THE LIGHT MICROSCOPE 


A typical animal cell is 10–20 µm in diameter, which is about one-fifth the size of 
the smallest object that we can normally see with the naked eye. Only after good 
light microscopes became available in the early part of the nineteenth century did 
Schleiden and Schwann propose that all plant and animal tissues were aggregates 
of individual cells. Their proposal in 1838, known as the cell doctrine, marks the 
formal birth of cell biology. 

Animal cells are not only tiny, but they are also colorless and translucent. The 
discovery of their main internal features, therefore, depended on the develop- 
ment, in the late nineteenth century, of a variety of stains that provided sufficient 
contrast to make those features visible. Similarly, the far more powerful electron 
microscope introduced in the early 1940s required the development of new tech- 
niques for preserving and staining cells before the full complexities of their inter- 
nal fine structure could begin to emerge. To this day, microscopy often relies as 
much on techniques for preparing the specimen as on the performance of the 
microscope itself. In the following discussions, we therefore consider both instru- 
ments and specimen preparation, beginning with the light microscope. 

The images in Figure 9-1 illustrate a stepwise progression from a thumb to a 
cluster of atoms. Each successive image represents a tenfold increase in magnification. 
The naked eye can see features in the first two panels, the light microscope 
allows us to see details corresponding to about the fourth or fifth panel, and the 
electron microscope takes us to about the seventh or eighth panel. Figure 9-2 
shows the sizes of various cellular and subcellular structures and the ranges of 
size that different types of microscopes can visualize. 


The Light Microscope Can Resolve Details 0.2 µm Apart 


For well over 100 years, all microscopes were constrained by a fundamental 
limitation: that a given type of radiation cannot be used to probe structural details 
much smaller than its own wavelength. A limit to the resolution of a light microscope 
was therefore set by the wavelength of visible light, which ranges from 
about 0.4 µm (for violet) to 0.7 µm (for deep red). In practical terms, bacteria and 
mitochondria, which are about 500 nm (0.5 µm) wide, are generally the smallest 
objects whose shape we can clearly discern in the light microscope; details 
smaller than this are obscured by effects resulting from the wavelike nature of 
light. To understand why this occurs, we must follow the behavior of a beam of 
light as it passes through the lenses of a microscope (Figure 9-3). 

Because of its wave nature, light does not follow the idealized straight ray 
paths that geometrical optics predicts. Instead, light waves travel through an 
optical system by many slightly different routes, like ripples in water, so that they 








Figure 9-1 A sense of scale between
living cells and atoms. Each diagram
shows an image magnified by a factor of
ten in an imaginary progression from a
thumb, through skin cells, to a ribosome,
to a cluster of atoms forming part of
one of the many protein molecules in
our body. Atomic details of biological
macromolecules, as shown in the last two
panels, are usually beyond the power of
the electron microscope. While color has
been used here in all the panels, it is not a
feature of objects much smaller than the
wavelength of light, so the last five panels
should really be in black and white.

Figure 9-2 Resolving power. Sizes of cells and their components are drawn on a logarithmic scale, indicating the range of
objects that can be readily resolved by the naked eye and in the light and electron microscopes. Note that new superresolution
microscopy techniques, discussed in detail later, allow an improvement in resolution by an order of magnitude compared with
conventional light microscopy.


interfere with one another and cause optical diffraction effects. If two trains of 
waves reaching the same point by different paths are precisely in phase, with crest 
matching crest and trough matching trough, they will reinforce each other so as 
to increase brightness. In contrast, if the trains of waves are out of phase, they will 
interfere with each other in such a way as to cancel each other partly or entirely 
(Figure 9-4). The interaction of light with an object changes the phase relation- 
ships of the light waves in a way that produces complex interference effects. At 
high magnification, for example, the shadow of an edge that is evenly illuminated 
with light of uniform wavelength appears as a set of parallel lines (Figure 9-5), 
whereas that of a circular spot appears as a set of concentric rings. For the same 
reason, a single point seen through a microscope appears as a blurred disc, and 
two point objects close together give overlapping images and may merge into one. 
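
The reinforcement and cancellation described above can be put into numbers. The following is a minimal sketch, assuming two idealized sinusoidal waves of the same wavelength; the function name and the unit amplitudes are illustrative and do not come from the text. For amplitudes a1 and a2 and a phase difference phi, the resultant amplitude is sqrt(a1^2 + a2^2 + 2 a1 a2 cos phi), and brightness scales as the square of the amplitude.

```python
import math

def resultant_amplitude(a1, a2, phase_difference):
    """Amplitude of the sum of two sinusoidal waves of the same wavelength.

    phase_difference is in radians. Hypothetical helper, for illustration only.
    """
    squared = a1**2 + a2**2 + 2 * a1 * a2 * math.cos(phase_difference)
    return math.sqrt(max(0.0, squared))  # guard against tiny negative rounding

in_phase = resultant_amplitude(1.0, 1.0, 0.0)          # crest matches crest
partial = resultant_amplitude(1.0, 1.0, math.pi / 2)   # a quarter cycle apart
out_of_phase = resultant_amplitude(1.0, 1.0, math.pi)  # crest matches trough

# Brightness (intensity) is proportional to the square of the amplitude.
for label, amp in [("in phase", in_phase),
                   ("quarter cycle apart", partial),
                   ("out of phase", out_of_phase)]:
    print(f"{label}: amplitude = {amp:.2f}, relative brightness = {amp**2:.2f}")
```

Two equal waves that are exactly in phase give four times the brightness of one wave alone, whereas two that are exactly out of phase cancel completely; intermediate phase differences give partial cancellation.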

The following units of length are
commonly employed in microscopy:

µm (micrometer) = 10⁻⁶ m
nm (nanometer) = 10⁻⁹ m
Å (Ångström unit) = 10⁻¹⁰ m


Figure 9-3 A light microscope. 

(A) Diagram showing the light path in a 
compound microscope. Light is focused on 
the specimen by lenses in the condenser. 
A combination of objective lenses, tube 
lenses, and eyepiece lenses is arranged to 
focus an image of the illuminated specimen 
in the eye. (B) A modern research light 
microscope. (B, courtesy of Carl Zeiss 
Microscopy, GmbH.) 

Although no amount of refinement of the lenses can overcome the diffraction
limit imposed by the wavelike nature of light, other ways of cleverly bypassing this 
limit have emerged, creating so-called superresolution imaging techniques that 
can even detect the position of single molecules. 

The limiting separation at which two objects appear distinct—the so-called 
limit of resolution—depends on both the wavelength of the light and the numeri- 
cal aperture of the lens system used. The numerical aperture affects the light-gath- 
ering ability of the lens and is related both to the angle of the cone of light that 
can enter it and to the refractive index of the medium the lens is operating in; 
the wider the microscope opens its eye, so to speak, the more sharply it can see 
(Figure 9-6). The refractive index is the ratio of the speed of light in a vacuum to 
the speed of light in a particular transparent medium. For example, for water this 
is 1.33, meaning that light travels 1.33 times slower in water than in a vacuum. 
Under the best conditions, with violet light (wavelength = 0.4 µm) and a numerical
aperture of 1.4, the basic light microscope can theoretically achieve a limit
of resolution of about 0.2 µm, or 200 nm. Some microscope makers at the end
of the nineteenth century achieved this resolution, and it is now routinely matched in
contemporary, factory-produced microscopes. Although it is possible to enlarge
an image as much as we want (for example, by projecting it onto a screen), it
is not possible, in a conventional light microscope, to resolve two objects
that are separated by less than about 0.2 µm; they will appear
as a single object. It is important, however, to distinguish between resolution and 
detection. If a small object, below the resolution limit, itself emits light, then we 
may still be able to see or detect it. Thus, we can see a single fluorescently labeled 
microtubule even though it is about ten times thinner than the resolution limit of 
the light microscope. Diffraction effects, however, will cause it to appear blurred 
and at least 0.2 µm thick (see Figure 9-16). In a similar way, we can see the stars in
the night sky, even though their diameters are far below the angular resolution of 
our unaided eyes: they all appear as similar, slightly blurred points of light, differ- 
ing only in their color and brightness. 
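
The figure of about 0.2 µm can be checked against the formula given in Figure 9-6, in which the limit of resolution equals 0.61 times the wavelength divided by the numerical aperture (n sin θ). The short sketch below simply evaluates that formula; the function name is our own, and the second case (white light with a dry lens of numerical aperture 1.0) is an illustrative choice built from values quoted in Figure 9-6.

```python
def resolution_limit_um(wavelength_um, numerical_aperture):
    """Limit of resolution = 0.61 * wavelength / (n sin theta), in micrometers."""
    return 0.61 * wavelength_um / numerical_aperture

best_case = resolution_limit_um(0.4, 1.4)    # violet light, oil-immersion lens
white_dry = resolution_limit_um(0.53, 1.0)   # white light, best possible dry lens

print(f"violet light, NA 1.4: {best_case:.2f} um")
print(f"white light,  NA 1.0: {white_dry:.2f} um")
```

The best case comes out slightly below 0.2 µm, which is why that value is quoted as the practical limit; with white light and a dry lens the limit is closer to 0.3 µm.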


Photon Noise Creates Additional Limits to Resolution When Light 
Levels Are Low 


Any image, whether produced by an electron microscope or by an optical micro- 
scope, is made by particles—electrons or photons—striking a detector of some 
sort. But these particles are governed by quantum mechanics, so the numbers 
reaching the detector are predictable only in a statistical sense. Finite samples, 
collected by imaging for a limited period of time (that is, by taking a snapshot), will 
show random variation: successive snapshots of the same scene will not be exactly 
identical. Moreover, every detection method has some level of background sig- 
nal or noise, adding to the statistical uncertainty. With bright illumination, corre- 
sponding to very large numbers of photons or electrons, the features of the imaged 


Figure 9-4 Interference between light 
waves. When two light waves combine in 
phase, the amplitude of the resultant wave 
is larger and the brightness is increased. 
Two light waves that are out of phase 
cancel each other partly and produce a 
wave whose amplitude, and therefore 
brightness, is decreased. 







Figure 9-5 Images of an edge and of a
point of light. (A) The interference effects,
or fringes, seen at high magnification when
light of a specific wavelength passes the
edge of a solid object placed between
the light source and the observer. (B) The
image of a point source of light. Diffraction
spreads this out into a complex, circular
pattern, whose width depends on the
numerical aperture of the optical system:
the smaller the aperture, the bigger (more
blurred) the diffracted image. Two point
sources can be just resolved when the
center of the image of one lies on the first
dark ring in the image of the other: this is
used to define the limit of resolution.

LENSES: the objective lens collects a cone of light rays to create an image; the condenser lens focuses a cone of light rays onto each point of the specimen.

RESOLUTION: the resolving power of the microscope depends on the width of the cone of illumination and therefore on both the condenser and the objective lens. It is calculated using the formula

$$\text{resolution} = \frac{0.61\,\lambda}{n \sin\theta}$$

where:

$\theta$ = half the angular width of the cone of rays collected by the objective lens from a typical point in the central region of the specimen (since the maximum width is 180°, $\sin\theta$ has a maximum value of 1)
$n$ = the refractive index of the medium (usually air or oil) separating the specimen from the objective and condenser lenses
$\lambda$ = the wavelength of light used (for white light a figure of 0.53 µm is commonly assumed)

NUMERICAL APERTURE: $n \sin\theta$ in the equation above is called the numerical aperture of the lens and is a function of its light-collecting ability. For dry lenses this cannot be more than 1, but for oil-immersion lenses it can be as high as 1.4. The higher the numerical aperture, the greater the resolution and the brighter the image (brightness is important in fluorescence microscopy). However, this advantage does necessitate very short working distances and a very small depth of field.





specimen are accurately determined based on the distribution of these particles 
at the detector. However, with smaller numbers of particles, the structural details 
of the specimen are obscured by the statistical fluctuations in the numbers of par- 
ticles detected in each region, which give the image a speckled appearance and 
limit its precision. The term noise describes this random variability. 
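
A small simulation makes the point about particle numbers concrete. The sketch below assumes that photon arrivals at one detector pixel follow Poisson statistics (an assumption on our part; the text says only that the numbers are predictable in a statistical sense) and ignores detector background noise. The function names and photon counts are illustrative.

```python
import math
import random

def simulate_pixel_counts(mean_photons, n_snapshots, rng):
    """Photon counts for one pixel over repeated snapshots.

    Assumes Poisson statistics, sampled with Knuth's multiplication method,
    which is adequate for the modest mean counts used here.
    """
    threshold = math.exp(-mean_photons)
    counts = []
    for _ in range(n_snapshots):
        k, product = 0, rng.random()
        while product > threshold:
            k += 1
            product *= rng.random()
        counts.append(k)
    return counts

def relative_noise(counts):
    """Standard deviation divided by the mean: the fractional fluctuation."""
    mean = sum(counts) / len(counts)
    variance = sum((c - mean) ** 2 for c in counts) / len(counts)
    return math.sqrt(variance) / mean

rng = random.Random(0)  # fixed seed so that the run is repeatable
for mean_photons in (4, 25, 100):
    measured = relative_noise(simulate_pixel_counts(mean_photons, 5000, rng))
    expected = 1 / math.sqrt(mean_photons)
    print(f"{mean_photons:>3} photons per pixel: "
          f"measured {measured:.2f}, expected {expected:.2f}")
```

Under this assumption the fractional fluctuation falls as one over the square root of the number of photons collected, so a pixel that receives 100 photons is about five times less noisy, in relative terms, than one that receives 4.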


Living Cells Are Seen Clearly in a Phase-Contrast or a 
Differential-Interference-Contrast Microscope 


There are many ways in which contrast in a specimen can be generated (Figure 
9-7A). While fixing and staining a specimen can generate contrast through color, 
microscopists have always been challenged by the possibility that some compo- 
nents of the cell may be lost or distorted during specimen preparation. The only 
certain way to avoid the problem is to examine cells while they are alive, without 
fixing or freezing. For this purpose, light microscopes with special optical systems 
are especially useful. 

In the normal bright-field microscope, light passing through a cell in culture 
forms the image directly. Another system, dark-field microscopy, exploits the fact 
that light rays can be scattered in all directions by small objects in their path. If
the condenser provides oblique lighting that does not directly enter the objective,
focused but unstained objects in a living cell can scatter the rays,
some of which then enter the objective to create a bright image against a black
background (Figure 9-7B).

When light passes through a living cell, the phase of the light wave is changed 
according to the cell’s refractive index: a relatively thick or dense part of the cell, 
such as a nucleus, slows the light passing through it. The phase of the light, con- 
sequently, is shifted relative to light that has passed through an adjacent thinner 
region of the cytoplasm (Figure 9-7C). The phase-contrast microscope and, in a 
more complex way, the differential-interference-contrast microscope increase 
these phase differences so that the waves are more nearly out of phase, produc- 
ing amplitude differences when the sets of waves recombine, thereby creating an 
image of the cell’s structure. Both types of light microscopy are widely used to 
visualize living cells (see Movie 17.2). Figure 9-8 compares images of the same 
cell obtained by four kinds of light microscopy. 
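
The size of the phase shift can be estimated with a short calculation. The sketch below uses the standard optical relation that the phase lag, in wave cycles, equals the optical path difference (refractive index difference times thickness) divided by the wavelength. The refractive indices and thickness are illustrative values that we have assumed, not measurements from the text.

```python
def phase_shift_cycles(thickness_um, n_structure, n_surroundings, wavelength_um):
    """Phase lag, in wave cycles, of light that crosses a structure relative
    to light that crosses the same thickness of its surroundings."""
    optical_path_difference = (n_structure - n_surroundings) * thickness_um
    return optical_path_difference / wavelength_um

# Illustrative values only: a 5 um thick nucleus of refractive index 1.39
# surrounded by cytoplasm of index 1.37, viewed in light of 0.53 um wavelength.
lag = phase_shift_cycles(5.0, 1.39, 1.37, 0.53)
print(f"phase lag: {lag:.2f} cycles ({lag * 360:.0f} degrees)")
```

With these assumed numbers the lag is a fraction of a cycle, which changes the brightness of the image very little on its own. This is why the phase-contrast and differential-interference-contrast optics are needed to convert such shifts into visible differences in amplitude.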

Figure 9-6 Numerical aperture. The path
of light rays passing through a transparent 
specimen in a microscope illustrates the 
concept of numerical aperture and its 
relation to the limit of resolution. 

Figure 9-7 Contrast in light microscopy. (A) The stained portion of the cell will absorb light of some wavelengths, which
depends on the stain, but will allow other wavelengths to pass through it. A colored image of the cell is thereby obtained that is
visible in the normal bright-field light microscope. (B) In the dark-field microscope, oblique rays of light focused on the specimen
do not enter the objective lens, but light that is scattered by components in the living cell can be collected to produce a bright
image on a dark background. (C) Light passing through the unstained living cell experiences very little change in amplitude,
and the structural details cannot be seen even if the image is highly magnified. The phase of the light, however, is altered by
its passage through either thicker or denser parts of the cell, and small phase differences can be made visible by exploiting
interference effects using a phase-contrast or a differential-interference-contrast microscope.


Phase-contrast, differential-interference-contrast, and dark-field microscopy 
make it possible to watch the movements involved in such processes as mitosis 
and cell migration. Since many cellular motions are too slow to be seen in real 
time, it is often helpful to make time-lapse movies in which the camera records 
successive frames separated by a short time delay, so that when the resulting pic- 
ture series is played at normal speed, events appear greatly speeded up. 
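
The speed-up in a time-lapse movie is simply the delay between recorded frames multiplied by the playback frame rate. The sketch below works through one case; the frame interval, playback rate, and duration of the recorded process are hypothetical settings chosen for illustration.

```python
def timelapse_speedup(seconds_between_frames, playback_frames_per_second):
    """How many times faster than real time events appear on playback."""
    return seconds_between_frames * playback_frames_per_second

# Hypothetical settings: one frame every 30 s, played back at 24 frames per second.
speedup = timelapse_speedup(30, 24)
recorded_minutes = 60  # assumed duration of the process being filmed
playback_seconds = recorded_minutes * 60 / speedup

print(f"speed-up: {speedup}x")
print(f"a {recorded_minutes}-minute process plays back in {playback_seconds:.0f} s")
```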


Images Can Be Enhanced and Analyzed by Digital Techniques 


In recent years, electronic, or digital, imaging systems, and the associated tech- 
nology of image processing, have had a major impact on light microscopy. Cer- 
tain practical limitations of microscopes relating to imperfections in the optical 
system have been largely overcome. Electronic imaging systems have also cir- 
cumvented two fundamental limitations of the human eye: the eye cannot see 
well in extremely dim light, and it cannot perceive small differences in light inten- 
sity against a bright background. To increase our ability to observe cells in these 
difficult conditions, we can attach a sensitive digital camera to a microscope. 
These cameras detect light by means of charge-coupled devices (CCDs) or
high-sensitivity complementary metal-oxide semiconductor (CMOS) sensors, similar
to those found in digital cameras. Such image sensors are 10 times more sensitive 
than the human eye and can detect 100 times more intensity levels. It is therefore 
possible to observe cells for long periods at very low light levels, thereby avoiding 
the damaging effects of prolonged bright light (and heat). Such low-light cam- 
eras are especially important for viewing fluorescent molecules in living cells, as 
explained below. 

Because images produced by digital cameras are in electronic form, they can 
be processed in various ways to extract latent information. Such image processing 
makes it possible to compensate for several optical faults in microscopes. More- 
over, by digital image processing, contrast can be greatly enhanced to overcome 

Figure 9-8 Four types of light microscopy. Four images are shown of the same fibroblast cell in culture. All images can
be obtained with most modern microscopes by interchanging optical components. (A) Bright-field microscopy, in which light
is transmitted straight through the specimen. (B) Phase-contrast microscopy, in which phase alterations of light transmitted
through the specimen are translated into brightness changes. (C) Differential-interference-contrast microscopy, which highlights
edges where there is a steep change of refractive index. (D) Dark-field microscopy, in which the specimen is lit from the side and
only the scattered light is seen.


the eye’s limitations in detecting small differences in light intensity, and back- 
ground irregularities in the optical system can be digitally subtracted. This proce- 
dure reveals small transparent objects that were previously impossible to distin- 
guish from the background. 
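
The two operations just described, background subtraction and contrast enhancement, can be illustrated on a toy image. The sketch below is a deliberately simplified version that works on small lists of pixel values; the function names, the linear stretch, and the pixel values are our own illustrative choices, not a description of any particular camera software.

```python
def subtract_background(image, background):
    """Pixel-by-pixel subtraction of a specimen-free background image."""
    return [[p - b for p, b in zip(image_row, background_row)]
            for image_row, background_row in zip(image, background)]

def stretch_contrast(image, out_max=255):
    """Rescale linearly so the darkest pixel maps to 0 and the brightest to out_max."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # uniform image: there is nothing to stretch
        return [[0 for _ in row] for row in image]
    scale = out_max / (hi - lo)
    return [[round((p - lo) * scale) for p in row] for row in image]

# Toy 3 x 4 image: uneven illumination (brighter toward the right) plus a
# faint object that raises two pixels by only 2 intensity units.
background = [[200, 210, 220, 230] for _ in range(3)]
image = [[200, 210, 220, 230],
         [200, 212, 222, 230],
         [200, 210, 220, 230]]

enhanced = stretch_contrast(subtract_background(image, background))
for row in enhanced:
    print(row)
```

In the raw image the object differs from its surroundings by less than the illumination gradient across the field; after subtraction and stretching it stands out at full contrast against a uniform background.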


Intact Tissues Are Usually Fixed and Sectioned Before Microscopy 


Because most tissue samples are too thick for their individual cells to be examined 
directly at high resolution, they are often cut into very thin transparent slices, or 
sections. To preserve the cells within the tissue they must be treated with a fixa- 
tive. Common fixatives include glutaraldehyde, which forms covalent bonds with 
the free amino groups of proteins, cross-linking them so they are stabilized and 
locked into position. 

Because tissues are generally soft and fragile, even after fixation, they need to 
be either frozen or embedded in a supporting medium before being sectioned. 
The usual embedding media are waxes or resins. In liquid form, these media both 
permeate and surround the fixed tissue; they can then be hardened (by cooling or 
by polymerization) to form a solid block, which is readily sectioned with a microtome.
This is a machine with a sharp blade, usually of steel or glass, which operates
like a meat-slicer (Figure 9-9). The sections (typically 0.5–10 µm thick) are
then laid flat on the surface of a glass microscope slide. 

There is little in the contents of most cells (which are 70% water by weight) to 
impede the passage of light rays. Thus, most cells in their natural state, even if fixed 
and sectioned, are almost invisible in an ordinary light microscope. We have seen 
that cellular components can be made visible by techniques such as phase-con- 
trast and differential-interference-contrast microscopy, but these methods tell us 
almost nothing about the underlying chemistry. There are three main approaches 
to working with thin tissue sections that reveal differences in types of molecules 
that are present. 

First, and traditionally, sections can be stained with organic dyes that have 
some specific affinity for particular subcellular components. The dye hematox- 
ylin, for example, has an affinity for negatively charged molecules and therefore 
reveals the distribution of DNA, RNA, and acidic proteins in a cell (Figure 9-10). 
The chemical basis for the specificity of many dyes, however, is not known. 

Figure 9-9 Making tissue sections.
This illustration shows how an embedded 
tissue is sectioned with a microtome in 
preparation for examination in the light 
microscope. 

Second, sectioned tissues can be used to visualize specific patterns of differential
gene expression. In situ hybridization, discussed earlier (see Figure 8-34),
reveals the cellular distribution and abundance of specific expressed RNA mol- 
ecules in sectioned material or in whole mounts of small organisms or organs. 
This is particularly effective when used in conjunction with fluorescent probes 
(Figure 9-11). 

A third and very sensitive approach, generally and widely applicable for local- 
izing proteins of interest, also depends on using fluorescent probes and markers, 
as we explain next. 


Specific Molecules Can Be Located in Cells by Fluorescence 
Microscopy 


Fluorescent molecules absorb light at one wavelength and emit it at another, lon- 
ger wavelength (Figure 9-12A). If we illuminate such a molecule at its absorbing 
wavelength and then view it through a filter that allows only light of the emitted 
wavelength to pass, it will glow against a dark background. Because the back- 
ground is dark, even a minute amount of the glowing fluorescent dye can be 
detected. In contrast, the same number of molecules of a nonfluorescent stain, 
viewed conventionally, would be practically indiscernible because the absorption 
of light by molecules in the stain would result in only the faintest tinge of color in 
the light transmitted through that part of the specimen. 
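
The shift to a longer wavelength corresponds to a loss of energy by each photon, since photon energy equals the Planck constant times the speed of light divided by the wavelength. The sketch below evaluates this relation; the two wavelengths are approximate values that we have assumed for a fluorescein-like dye, and the function name is our own.

```python
PLANCK_J_S = 6.62607015e-34       # Planck constant, in joule seconds
LIGHT_SPEED_M_S = 2.99792458e8    # speed of light in a vacuum, in meters per second
JOULES_PER_EV = 1.602176634e-19   # one electron volt, in joules

def photon_energy_ev(wavelength_nm):
    """Energy of a single photon, E = h * c / wavelength, in electron volts."""
    return PLANCK_J_S * LIGHT_SPEED_M_S / (wavelength_nm * 1e-9) / JOULES_PER_EV

# Assumed, approximate wavelengths for a fluorescein-like dye.
absorbed = photon_energy_ev(490)  # blue excitation light
emitted = photon_energy_ev(520)   # green emitted light

print(f"absorbed photon: {absorbed:.2f} eV")
print(f"emitted photon:  {emitted:.2f} eV")
print(f"energy lost per absorption and emission cycle: {absorbed - emitted:.2f} eV")
```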

The fluorescent dyes used for staining cells are visualized with a fluorescence 
microscope. This microscope is similar to an ordinary light microscope except 
that the illuminating light, from a very powerful source, is passed through two 
sets of filters—one to filter the light before it reaches the specimen and one to 





Figure 9-10 Staining of cell components.
(A) This section of cells in the urine-
collecting ducts of the kidney was stained
with hematoxylin and eosin, two dyes
commonly used in histology. Each duct
is made of closely packed cells (with
nuclei stained red) that form a ring.
The ring is surrounded by extracellular
matrix, stained purple. (B) This section
of a young plant root is stained with two
dyes, safranin and fast green. The fast
green stains the cellulosic cell walls while
the safranin stains the lignified xylem cell
walls bright red. (A, from P.R. Wheater et
al., Functional Histology, 2nd ed. London:
Churchill Livingstone, 1987; B, courtesy of
Stephen Grace.)


Figure 9-11 RNA in situ hybridization.
As described in Chapter 8 (see
Figure 8-62), it is possible to visualize
the distribution of different RNAs in
tissues using in situ hybridization. Here,
the transcription pattern of five different
genes involved in patterning the early fly
embryo is revealed in a single embryo.
Each RNA probe has been fluorescently
labeled in a different way, some directly
and some indirectly; the resulting images
are displayed each in a different color
(“false-colored”) and combined to give an
image where different color combinations
represent different sets of genes expressed.
The genes whose expression pattern
is revealed here are wingless (yellow),
engrailed (blue), short gastrulation (red),
intermediate neuroblasts defective (green),
and muscle specific homeobox (purple).
(From D. Kosman et al., Science 305:846,
2004. With permission from AAAS.)

3 second barrier filter: cuts out unwanted fluorescent signals, passing the specific green fluorescein emission between 520 and 560 nm

Figure 9-12 Fluorescence and the fluorescence microscope. (A) An orbital electron of a fluorochrome molecule can be
raised to an excited state following the absorption of a photon. Fluorescence occurs when the electron returns to its ground
state and emits a photon of light at a longer wavelength. Too much exposure to light, or too bright a light, can also destroy
the fluorochrome molecule, in a process called photobleaching. (B) In the fluorescence microscope, a filter set consists of two
barrier filters (1 and 3) and a dichroic (beam-splitting) mirror (2). This example shows the filter set for detection of the fluorescent
molecule fluorescein. High-numerical-aperture objective lenses are especially important in this type of microscopy because,
for a given magnification, the brightness of the fluorescent image is proportional to the fourth power of the numerical aperture
(see also Figure 9-6).


filter the light obtained from the specimen. The first filter passes only the wave- 
lengths that excite the particular fluorescent dye, while the second filter blocks 
out this light and passes only those wavelengths emitted when the dye fluoresces 
(Figure 9-12B). 
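
The logic of the two filters can be expressed as a pair of wavelength tests. The sketch below is a simplified model that leaves out the dichroic mirror and treats each filter as a perfectly sharp band. The 520 to 560 nm emission band is the one quoted for fluorescein in Figure 9-12; the 450 to 490 nm excitation band and the class and method names are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class FilterSet:
    """A simplified fluorescence filter set; all wavelengths are in nanometers."""
    excitation_band: tuple  # range passed by the first filter
    emission_band: tuple    # range passed by the second filter

    def excites(self, wavelength):
        """True if illuminating light of this wavelength reaches the specimen."""
        low, high = self.excitation_band
        return low <= wavelength <= high

    def reaches_detector(self, wavelength):
        """True if light of this wavelength leaving the specimen is passed on."""
        low, high = self.emission_band
        return low <= wavelength <= high

# Emission band as quoted in Figure 9-12; excitation band is an assumed value.
fluorescein_set = FilterSet(excitation_band=(450, 490), emission_band=(520, 560))

print(fluorescein_set.excites(480))            # blue light reaches the specimen
print(fluorescein_set.reaches_detector(480))   # stray blue light is blocked
print(fluorescein_set.reaches_detector(530))   # green fluorescence is passed
```

Because the two bands do not overlap, none of the exciting light can reach the detector, which is what gives the dark background against which the fluorescence is seen.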

Fluorescence microscopy is most often used to detect specific proteins or 
other molecules in cells and tissues. A very powerful and widely used technique 
is to couple fluorescent dyes to antibody molecules, which then serve as highly 
specific and versatile staining reagents that bind selectively to the particular mac- 
romolecules they recognize in cells or in the extracellular matrix. Two fluorescent 
dyes that have been commonly used for this purpose are fluorescein, which emits 
an intense green fluorescence when excited with blue light, and rhodamine, which 
emits deep red fluorescence when excited with green-yellow light (Figure 9-13). 
By coupling one antibody to fluorescein and another to rhodamine, the distribu- 
tions of different molecules can be compared in the same cell; the two molecules 
are visualized separately in the microscope by switching back and forth between 
two sets of filters, each specific for one dye. As shown in Figure 9-14, three flu- 
orescent dyes can be used in the same way to distinguish among three types of 
molecules in the same cell. Many newer fluorescent dyes, such as Cy3, Cy5, and 
the Alexa dyes, have been specifically developed for fluorescence microscopy 


Figure 9-13 Fluorescent probes. The maximum excitation and emission
wavelengths of several commonly used fluorescent probes are shown in
relation to the corresponding colors of the spectrum. The photon emitted
by a fluorescent molecule is necessarily of lower energy (longer wavelength)
than the absorbed photon and this accounts for the difference between the
excitation and emission peaks. CFP, GFP, YFP, and RFP are cyan, green,
yellow, and red fluorescent proteins, respectively. DAPI is widely used as a
general fluorescent DNA probe, which absorbs ultraviolet light and fluoresces
bright blue. FITC is an abbreviation for fluorescein isothiocyanate, a widely
used derivative of fluorescein, which fluoresces bright green. The other
probes are all commonly used to fluorescently label antibodies and other
proteins. The use of fluorescent proteins will be discussed later in the chapter.

Figure 9-14 Different fluorescent probes
can be visualized in the same cell. In this 
composite micrograph of a cell in mitosis, 
three different fluorescent probes have 
been used to label three different cellular 
components (Movie 9.1). The spindle 
microtubules are revealed with a green 
fluorescent antibody, centromeres with a 
red fluorescent antibody, and the DNA of 
the condensed chromosomes with the 
blue fluorescent dye DAPI. (Courtesy of 
Kevin F. Sullivan.) 
(see Figure 9-13) but, like many organic fluorochromes, they fade fairly rapidly
when continuously illuminated. More stable fluorochromes have been developed 
based on inorganic chemistry. Tiny crystals of semiconductor material, called 
nanoparticles, or quantum dots, can be excited to fluoresce by a broad spectrum 
of blue light. Their emitted light has a color that depends on the exact size of the 
nanocrystal, between 2 and 10 nm in diameter, and additionally the fluorescence 
fades only slowly with time (Figure 9-15). These nanoparticles, when coupled to 
other probes such as antibodies, are therefore ideal for tracking molecules over 
time. If introduced into a living cell, in an embryo for example, the progeny of that 
cell can be followed many days later by their fluorescence, allowing cell lineages 
to be tracked. 

Figure 9-15 Fluorescent nanoparticles or quantum dots. (A) Quantum dots are tiny particles
of cadmium selenide, a semiconductor, with a coating to make them water-soluble. They can
be coupled to protein molecules such as antibodies or streptavidin and, when introduced into
a cell, will bind to a target protein of interest. Different-sized quantum dots emit light of different
colors (the larger the dot, the longer the wavelength), but they are all excited by the same blue
light. Quantum dots can keep shining for weeks, unlike most fluorescent organic dyes. (B) In this
cell, microtubules are labeled (green) with an organic fluorescent dye (Alexa 488), while a nuclear
protein is stained (red) with quantum dots bound to streptavidin. On continuous exposure to strong
blue light, the fluorescent dye fades quickly while the quantum dots continue to shine. (C) In this
cell, the labeling pattern is reversed; a nuclear protein is labeled (green) with an organic fluorescent
dye (Alexa 488), while microtubules are labeled (red) with quantum dots. Again, the quantum dots
far outlast the fluorescent dye. (B and C, from L. Medintz et al., Nat. Mater. 4:435-446, 2005. With
permission from Macmillan Publishers Ltd.)

Later in the chapter, additional fluorescence microscopy methods will be discussed
that can be used to monitor changes in the concentration and location of
specific molecules inside living cells. 


Antibodies Can Be Used to Detect Specific Molecules 


Antibodies are proteins produced by the vertebrate immune system as a defense 
against infection (discussed in Chapter 24). They are unique among proteins in 
that they are made in billions of different forms, each with a different binding site 
that recognizes a specific target molecule (or antigen). The precise antigen speci- 
ficity of antibodies makes them powerful tools for the cell biologist. When labeled 
with fluorescent dyes, antibodies are invaluable for locating specific molecules 
in cells by fluorescence microscopy (Figure 9-16); labeled with electron-dense 
particles such as colloidal gold spheres, they are used for similar purposes in the 
electron microscope (discussed below). The antibodies employed in microscopy 
are commonly either purified from antiserum so as to remove all nonspecific anti- 
bodies, or they are specific monoclonal antibodies that only recognize the target 
molecule. 

When we use antibodies as probes to detect and assay specific molecules in 
cells, we frequently use chemical methods to amplify the fluorescent signal they 
produce. For example, although a marker molecule such as a fluorescent dye 
can be linked directly to an antibody—the primary antibody—a stronger signal 
is achieved by using an unlabeled primary antibody and then detecting it with a 
group of labeled secondary antibodies that bind to it (Figure 9-17). This process is
called indirect immunocytochemistry. 

Some amplification methods use an enzyme as a marker molecule attached 
to the secondary antibody. The enzyme alkaline phosphatase, for example, in the 
presence of appropriate chemicals, produces inorganic phosphate that in turn 

Figure 9-16 Immunofluorescence.
(A) A transmission electron micrograph of
the periphery of a cultured epithelial cell
showing the distribution of microtubules
and other filaments. (B) The same area
stained with fluorescent antibodies against
tubulin, the protein that assembles to
form microtubules, using the technique of
indirect immunocytochemistry (see Figure
9-17). Red arrows indicate individual
microtubules that are readily recognizable
in both images. Note that, because of
diffraction effects, the microtubules in the
light microscope appear 0.2 µm wide
rather than their true width of 0.025 µm.
(From M. Osborn, R. Webster and
K. Weber, J. Cell Biol. 77:R27-R84,
1978. With permission from The
Rockefeller University Press.)


Figure 9-17 Indirect immunocytochemistry. 
This detection method is very sensitive 
because many molecules of the secondary 
antibody recognize each primary antibody. 
The secondary antibody is covalently 
coupled to a marker molecule that makes 
it readily detectable. Commonly used 
marker molecules include fluorescent 
probes (for fluorescence microscopy), the 
enzyme horseradish peroxidase (for either 
conventional light microscopy or electron 
microscopy), colloidal gold spheres (for 
electron microscopy), and the enzymes 
alkaline phosphatase or peroxidase (for 
biochemical detection). 
leads to the local formation of a colored precipitate. This reveals the location of
the secondary antibody and hence the location of the antibody-antigen complex. 
Since each enzyme molecule acts catalytically to generate many thousands of 
molecules of product, even tiny amounts of antigen can be detected. Although 
the enzyme amplification makes enzyme-linked methods sensitive, diffusion of 
the colored precipitate away from the enzyme limits the spatial resolution of this 
method for microscopy, and fluorescent labels are usually used for the most sen- 
sitive and precise optical localization. 


Imaging of Complex Three-Dimensional Objects Is Possible with 
the Optical Microscope 


For ordinary light microscopy, as we have seen, a tissue has to be sliced into thin 
sections to be examined; the thinner the section, the crisper the image. Since 
information about the third dimension is lost upon sectioning, how, then, can we 
get a picture of the three-dimensional architecture of a cell or tissue, and how can 
we view the microscopic structure of a specimen that, for one reason or another, 
cannot first be sliced into sections? Although an optical microscope is focused on 
a particular focal plane within a three-dimensional specimen, all the other parts 
of the specimen, above and below the plane of focus, are also illuminated and 
the light originating from these regions contributes to the image as “out-of-focus” 
blur. This can make it very hard to interpret the image in detail and can lead to fine 
image structure being obscured by the out-of-focus light. 

Two distinct but complementary approaches solve this problem: one is com- 
putational, the other optical. These three-dimensional microscopic imaging 
methods make it possible to focus on a chosen plane in a thick specimen while 
rejecting the light that comes from out-of-focus regions above and below that 
plane. Thus one sees a crisp, thin optical section. From a series of such optical 
sections taken at different depths and stored in a computer, a three-dimensional 
image can be reconstructed. The methods do for the microscopist what the com- 
puted tomography (CT) scanner does (by different means) for the radiologist 
investigating a human body: both machines give detailed sectional views of the 
interior of an intact structure. 

The computational approach is often called image deconvolution. To under- 
stand how it works, remember that the wavelike nature of light means that the 
microscope lens system produces a small blurred disc as the image of a point 
light source (see Figure 9-5), with increased blurring if the point source lies above 
or below the focal plane. This blurred image of a point source is called the point 
spread function (see Figure 9-36). An image of a complex object can then be 
thought of as being built up by replacing each point of the specimen by a corre- 
sponding blurred disc, resulting in an image that is blurred overall. For decon- 
volution, we first obtain a series of (blurred) images, usually with a cooled CCD 
camera or more recently a CMOS camera, focusing the microscope in turn on a 
series of focal planes—in effect, a (blurred) three-dimensional image. Digital pro- 
cessing of the stack of digital images then removes as much of the blur as pos- 
sible. In essence, the computer program uses the measured point spread func- 
tion of a point source of light from that microscope to determine what the effect 
of the blurring would have been on the image, and then applies an equivalent 
“deblurring” (deconvolution), turning the blurred three-dimensional image into 
a series of clean optical sections, albeit still constrained by the diffraction limit. 
Figure 9-18 shows an example. 
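
The blurring and deblurring steps described above can be illustrated with a toy one-dimensional calculation. This is only a sketch under stated assumptions: the five-value point spread function is hypothetical rather than measured, the "specimen" is two point sources on a dark line, and the iterative Richardson-Lucy scheme used here is one common deconvolution algorithm, not necessarily the one used to produce Figure 9-18.

```python
def convolve_same(signal, kernel):
    """Convolve a 1-D signal with an odd-length kernel (zero padding, same length)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        total = 0.0
        for k, weight in enumerate(kernel):
            j = i + half - k  # position in the signal that contributes through kernel[k]
            if 0 <= j < len(signal):
                total += signal[j] * weight
        out.append(total)
    return out


def richardson_lucy(blurred, psf, iterations=200):
    """Iteratively estimate the unblurred object from a blurred image and a known PSF."""
    psf_mirror = psf[::-1]
    # Start from a flat guess that carries the same total light as the image.
    estimate = [sum(blurred) / len(blurred)] * len(blurred)
    for _ in range(iterations):
        reblurred = convolve_same(estimate, psf)
        ratio = [b / r if r > 0 else 0.0 for b, r in zip(blurred, reblurred)]
        correction = convolve_same(ratio, psf_mirror)
        estimate = [e * c for e, c in zip(estimate, correction)]
    return estimate


# Toy specimen: two point sources three pixels apart on a dark background.
specimen = [0.0] * 21
specimen[9] = 1.0
specimen[12] = 1.0

# Hypothetical point spread function, normalized so that total light is conserved.
psf = [0.1, 0.2, 0.4, 0.2, 0.1]

blurred = convolve_same(specimen, psf)    # what the camera would record
restored = richardson_lucy(blurred, psf)  # estimate after deconvolution

for i in range(7, 15):
    print(i, round(blurred[i], 3), round(restored[i], 3))
```

In the blurred trace the two sources merge into a broad hump with only a shallow dip between them; after deconvolution the light is reassigned toward the two original positions. Real deconvolution works on three-dimensional image stacks and must also cope with camera noise, which this sketch ignores.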


The Confocal Microscope Produces Optical Sections by Excluding 
Out-of-Focus Light 


The confocal microscope achieves a result similar to that of deconvolution, but 
does so by manipulating the light before it is measured; it is an analog technique 
rather than a digital one. The optical details of the confocal microscope are com- 
plex, but the basic idea is simple, as illustrated in Figure 9-19, and the results are 
Figure 9-18 Image deconvolution. 

(A) A light micrograph of the large polytene 
chromosomes from Drosophila, stained 
with a fluorescent DNA-binding dye. 

(B) The same field of view after image 
deconvolution clearly reveals the banding 
pattern on the chromosomes. Each band 
is about 0.25 μm thick, approaching the 
diffraction limit of the light microscope. 
(Courtesy of the John Sedat Laboratory.) 
Figure 9-19 The confocal fluorescence 
microscope. (A) This simplified diagram 
shows that the basic arrangement of 
optical components is similar to that of the 
standard fluorescence microscope shown 
in Figure 9-12, except that a laser is used 
to illuminate a small pinhole whose image 
is focused at a single point in the 
three-dimensional (3-D) specimen. (B) Emitted 
fluorescence from this focal point in the 
specimen is focused at a second (confocal) 
pinhole. (C) Emitted light from elsewhere 
in the specimen is not focused at the 
pinhole and therefore does not contribute 
to the final image. By scanning the beam 
of light across the specimen, a very sharp 
two-dimensional image of the exact plane 
of focus is built up that is not significantly 
degraded by light from other regions of the 
specimen. 
far superior to those obtained by conventional light microscopy (Figure 9-20A 
and B). 
The confocal microscope is generally used with fluorescence optics (see Fig- 
ure 9-12), but instead of illuminating the whole specimen at once, in the usual 
way, the optical system at any instant focuses a spot of light onto a single point at 
a specific depth in the specimen. This requires a source of pinpoint illumination 
that is usually supplied by a laser whose light has been passed through a pinhole. 
The fluorescence emitted from the illuminated material is collected at a suitable 
light detector and used to generate an image. A pinhole aperture is placed in front 
of the detector, at a position that is confocal with the illuminating pinhole—that 
is, precisely where the rays emitted from the illuminated point in the specimen 
come to a focus. Thus, the light from this point in the specimen converges on this 
aperture and enters the detector. 
By contrast, the light from regions out of the plane of focus of the spotlight is 
also out of focus at the pinhole aperture and is therefore largely excluded from 
the detector (see Figure 9-19). To build up a two-dimensional image, data from 
each point in the plane of focus are collected sequentially by scanning across the 
field from left to right in a regular pattern of pixels and are displayed on a computer 
screen. Although not shown in Figure 9-19, the scanning is usually done by 
deflecting the beam with an oscillating mirror placed between the dichroic mirror 
and the objective lens in such a way that the illuminating spotlight and the 

Figure 9-20 Confocal fluorescence 
microscopy produces clear optical 
sections and three-dimensional data 
sets. The first two micrographs are of the 
same intact gastrula-stage Drosophila 
embryo, which has been stained with a 
fluorescent probe for actin filaments. 

(A) The conventional, unprocessed image 
is blurred by the presence of fluorescent 
structures above and below the plane 

of focus. (B) In the confocal image, this 
out-of-focus information is removed, 
resulting in a crisp optical section of 

the cells in the embryo. (C) A three- 
dimensional reconstruction of an object 
can be assembled from a stack of such 
optical sections. In this case, the complex 
branching structure of the mitochondrial 
compartment in a single live yeast cell 
is shown. (A and B, courtesy of Richard 
Warn and Peter Shaw; C, courtesy of 
Stefan Hell.) 
Figure 9-21 Multiphoton imaging. Infrared laser light causes less damage to living cells than 
visible light and can also penetrate further, allowing microscopists to peer deeper into living tissues. 
The two-photon effect, in which a fluorochrome can be excited by two coincident infrared photons 
instead of a single high-energy photon, allows us to see nearly 0.5 mm inside the cortex of a live 
mouse brain. A dye, whose fluorescence changes with the calcium concentration, reveals active 
synapses (yellow) on the dendritic spines (red) that change as a function of time; in this case, there 
is a day between each image. (Courtesy of Thomas Oertner and Karel Svoboda.) 





confocal pinhole at the detector remain strictly in register. Variations in design 
now allow the rapid collection of data at video rates. 

The confocal microscope has been used to resolve the structure of numerous 
complex three-dimensional objects (Figure 9-20C) including the networks of 
cytoskeletal fibers in the cytoplasm and the arrangements of chromosomes and 
genes in the nucleus. 

The relative merits of deconvolution methods and confocal microscopy for 
three-dimensional optical microscopy depend on the specimen being imaged. 
Confocal microscopes tend to be better for thicker specimens with high levels of 
out-of-focus light. They are also generally easier to use than deconvolution sys- 
tems and the final optical sections can be seen quickly. In contrast, the cooled 
CCD or CMOS cameras used for deconvolution systems are extremely efficient at 
collecting small amounts of light, and they can be used to make detailed three-di- 
mensional images from specimens that are too weakly stained or too easily dam- 
aged by the bright light used for confocal microscopy. 

Both methods, however, have another drawback; neither is good at coping 
with very thick specimens. Deconvolution methods quickly become ineffective 
any deeper than about 40 μm into a specimen, while confocal microscopes can 
only obtain images up to a depth of about 150 μm. Special microscopes can now 
take advantage of the way in which fluorescent molecules are excited, to probe 
even deeper into a specimen. Fluorescent molecules are usually excited by a sin- 
gle high-energy photon, of shorter wavelength than the emitted light, but they can 
in addition be excited by the absorption of two (or more) photons of lower energy, 
as long as they both arrive within a femtosecond or so of each other. The use of 
this longer-wavelength excitation has some important advantages. In addition 
to reducing background noise, red or near-infrared light can penetrate deeper 
into a specimen. Multiphoton microscopes, constructed to take advantage of this 
two-photon effect, can obtain sharp images, sometimes even at a depth of 250 μm 
within a specimen. This is particularly valuable for studies of living tissues, nota- 
bly in imaging the dynamic activity of synapses and neurons just below the sur- 
face of living brains (Figure 9-21). 
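
The energy bookkeeping behind two-photon excitation follows from the relation E = hc/λ: a photon of twice the wavelength carries half the energy, so two of them arriving together can supply what one shorter-wavelength photon would. The short calculation below is a sketch of that arithmetic; the 488 nm and 976 nm wavelengths are illustrative choices, and real fluorochromes have two-photon absorption spectra that are not simply the one-photon spectrum at doubled wavelength.

```python
PLANCK_J_S = 6.62607015e-34       # Planck constant, J*s
LIGHT_SPEED_M_S = 2.99792458e8    # speed of light in vacuum, m/s


def photon_energy_joules(wavelength_nm):
    """Energy of a single photon of the given vacuum wavelength."""
    return PLANCK_J_S * LIGHT_SPEED_M_S / (wavelength_nm * 1e-9)


# Illustrative wavelengths: one blue photon versus two near-infrared photons.
one_blue_photon = photon_energy_joules(488.0)
two_infrared_photons = 2 * photon_energy_joules(976.0)

print(one_blue_photon, two_infrared_photons)
```

Because both photons must be absorbed almost simultaneously, excitation is only likely where the light is most intense, which in practice means the focal point of the laser.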


Individual Proteins Can Be Fluorescently Tagged in Living Cells 
and Organisms 


Even the most stable cell structures must be assembled, disassembled, and reor- 
ganized during the cell’s life cycle. Other structures, often enormous on the molec- 
ular scale, rapidly change, move, and reorganize themselves as the cell conducts 
its internal affairs and responds to its environment. Complex, highly organized 
pieces of molecular machinery move components around the cell, controlling 
traffic into and out of the nucleus, from one organelle to another, and into and out 
of the cell itself. 

Various techniques have been developed to visualize the specific components 
involved in such dynamic phenomena. Many of these methods use fluorescent 
proteins, and they require a trade-off between structural preservation and effi- 
cient labeling. All of the fluorescent molecules discussed so far are made outside 
the cell and then artificially introduced into it. But use of genes coding for protein 
molecules that are themselves inherently fluorescent also enables the creation of 
organisms and cell lines that make their own visible tags and labels, without the 
introduction of foreign molecules. These cellular exhibitionists display their inner 
workings in glowing fluorescent color. 

Foremost among the fluorescent proteins used for these purposes by cell biol- 
ogists is the green fluorescent protein (GFP), isolated from the jellyfish Aequo- 
rea victoria. This protein is encoded by a single gene, which can be cloned and 
introduced into cells of other species. The freshly translated protein is not fluo- 
rescent, but within an hour or so (less for some alleles of the gene, more for oth- 
ers) it undergoes a self-catalyzed post-translational modification to generate an 
efficient fluorochrome, shielded within the interior of a barrel-like protein, which 
will now fluoresce when illuminated appropriately with blue light (Figure 9-22). 
Extensive site-directed mutagenesis performed on the original gene sequence 
has resulted in multiple variants that can be used effectively in organisms ranging 
from animals and plants to fungi and microbes. The fluorescence efficiency has 
also been improved, and variants have been generated with altered absorption 
and emission spectra from the blue-green, like blue fluorescent protein or BFP, to 
the far visible red. Other, related fluorescent proteins have since been discovered 
(for example, in corals) that also extend the range into the red region of the spec- 
trum, like red fluorescent protein or RFP. 

One of the simplest uses of GFP is as a reporter molecule, a fluorescent probe to 
monitor gene expression. A transgenic organism can be made with the GFP-cod- 
ing sequence placed under the transcriptional control of the promoter belonging 
to a gene of interest, giving a directly visible readout of the gene’s expression pat- 
tern in the living organism (Figure 9-23). In another application, a peptide loca- 
tion signal can be added to the GFP to direct it to a particular cell compartment, 
such as the endoplasmic reticulum or a mitochondrion, lighting up these organ- 
elles so they can be observed in the living state (see Figure 12-31). 

The GFP DNA coding sequence can also be inserted at the beginning or end of 
the gene for another protein, yielding a chimeric product consisting of that pro- 
tein with a GFP domain attached. In many cases, this GFP fusion protein behaves 
in the same way as the original protein, directly revealing its location and activ- 
ities by means of its genetically encoded fluorescence (Figure 9-24). It is often 
possible to prove that the GFP fusion protein is functionally equivalent to the 
untagged protein, for example by using it to rescue a mutant lacking that protein. 
GFP tagging is the clearest and most unequivocal way of showing the distribution 
and dynamics of a protein in a living organism (Figure 9-25 and see Movie 16.8). 


Protein Dynamics Can Be Followed in Living Cells 


Fluorescent proteins are now exploited not just to see where in a cell a partic- 
ular protein is located, but also to uncover its kinetic properties and to find out 
whether it might interact with other molecules. We now describe three techniques 
in which fluorescent proteins are used in this way. 

First, interactions between one protein and another can be monitored by 
fluorescence resonance energy transfer, also called Förster resonance energy 


Figure 9-23 Green fluorescent protein (GFP) as a reporter. For this 
experiment, carried out in the fruit fly, the GFP gene was joined (using 
recombinant DNA techniques) to a fly promoter that is active only in a 
specialized set of neurons. This image of a live fly embryo was captured 

by a fluorescence microscope and shows approximately 20 neurons, each 
with long projections (axons and dendrites) that communicate with other 
(nonfluorescent) cells. These neurons are located just under the surface of the 
animal and allow it to sense its immediate environment. (From W.B. Grueber 
et al., Curr. Biol. 13:618-626, 2003. With permission from Elsevier.) 







Figure 9-22 Green fluorescent protein 
(GFP). The structure of GFP, shown here 
schematically, highlights the eleven 
β strands that form the staves of a barrel. 
Buried within the barrel is the active 
chromophore (dark green) that is formed 
post-translationally from the protruding side 
chains of three amino acid residues. (From 
M. Ormö et al., Science 273:1392-1395, 
1996. With permission from AAAS.) 
Figure 9-24 GFP-tagged proteins. This living cell from a tobacco plant 

is expressing high levels of green fluorescent protein, fused to a protein 

that is targeted to mitochondria, which accordingly appear green. The 
mitochondria are seen to cluster around the chloroplasts, whose chlorophyll 
autofluorescence marks them out in red. (Courtesy of Olivier Grandjean.) 


transfer, both abbreviated FRET. In this technique, two molecules of interest are 
each labeled with a different fluorochrome, chosen so that the emission spectrum 
of one fluorochrome, the donor, overlaps with the absorption spectrum of the 
other, the acceptor. If the two proteins bind so as to bring their fluorochromes into 
very close proximity (closer than about 5 nm), one fluorochrome, when excited, 
can transfer energy from the absorbed light directly (by resonance, nonradiatively) 
to the other. Thus, when the complex is illuminated at the excitation wavelength of 
the first fluorochrome, fluorescent light is produced at the emission wavelength of 
the second. This method can be used with two different spectral variants of GFP 
as fluorochromes to monitor processes such as the interaction of signaling mol- 
ecules with their receptors, or proteins in macromolecular complexes at specific 
locations inside living cells (Figure 9-26). The FRET can be measured by quantify- 
ing the reduction of the donor fluorescence in the presence of the acceptor. 
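
The steep distance dependence that makes FRET useful as a proximity test can be sketched numerically. The standard Förster relation gives the transfer efficiency as E = 1 / (1 + (r/R0)^6), where R0 is the distance at which half the donor's excitation energy is transferred. The 5 nm value used for R0 below is an assumption chosen to match the distance scale quoted in the text; the actual value depends on the particular pair of fluorochromes. The second function expresses the donor-quenching measurement mentioned above.

```python
def fret_efficiency_from_distance(distance_nm, forster_radius_nm=5.0):
    """Forster relation: efficiency falls off with the sixth power of distance.

    forster_radius_nm is an assumed, illustrative value for R0.
    """
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


def fret_efficiency_from_donor(donor_with_acceptor, donor_alone):
    """Efficiency estimated from the drop in donor fluorescence when the acceptor is present."""
    return 1.0 - donor_with_acceptor / donor_alone


for distance in (2.5, 5.0, 7.5, 10.0):
    print(distance, round(fret_efficiency_from_distance(distance), 3))

# Hypothetical intensities: donor signal drops from 100 to 30 units when the acceptor binds.
print(fret_efficiency_from_donor(30.0, 100.0))
```

Doubling the separation from R0 to 2 × R0 reduces the efficiency from one-half to less than 2%, which is why a FRET signal is taken as evidence that the two labeled proteins are in very close proximity.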

A second example of a fluorescence-tagging technique that allows detailed 
observations of proteins within cells involves synthesizing an inactive form of the 
fluorescent molecule of interest, introducing it into the cell, and then activating 
it suddenly at a chosen site in the cell by focusing a spot of light on it. This pro- 
cess is referred to as photoactivation. Many inactive photosensitive precursors 
of this type, often called caged molecules, have been made based on a variety of 
fluorescent molecules. A microscope can be used to focus a strong pulse of light 
from a laser on any tiny region of the cell, so that the experimenter can control 
exactly where and when the fluorescent molecule is photoactivated. The tech- 
nique allows us to follow complex and rapid intracellular processes, such as the 
actions of signaling molecules or the movements of cytoskeletal proteins. 

When a photoactivatable fluorescent tag is attached to a purified protein, it 
is important that the modified protein remain biologically active: labeling with a 
caged fluorescent dye adds a bulky group to the surface of a protein, which can 
easily change the protein’s properties. A satisfactory labeling protocol is usually 
found by trial and error. Once a biologically active labeled protein has been pro- 
duced, it needs to be introduced into the living cell where its behavior can be fol- 
lowed. Tubulin labeled with caged fluorescein, for example, can be injected into 
a dividing cell, where it is incorporated into microtubules of the mitotic spindle. 
When a small region of the spindle is illuminated with a laser, the labeled tubulin 


(Panels: 0 min, 45 min, 90 min, 135 min; scale bar, 5 μm.) 


Figure 9-25 Dynamics of GFP tagging. This sequence of micrographs shows a set of three- 
dimensional images of a living nucleus taken over the course of 135 minutes. Tobacco cells have 
been stably transformed with GFP fused to a spliceosomal protein that is concentrated in small 
nuclear bodies called Cajal bodies (see Figure 6-46). The fluorescent Cajal bodies, easily visible in 
a living cell with confocal microscopy, are dynamic structures that move around within the nucleus. 
(Courtesy of Kurt Boudonck, Liam Dolan, and Peter Shaw.) 
becomes fluorescent, so that its movement along the spindle microtubules can be 
readily followed (Figure 9-27). 

A further development in photoactivation is the discovery that the genes 
encoding GFP and related fluorescent proteins can be engineered to produce 
protein variants, usually with one or more amino acid changes, that fluoresce 
only weakly under normal excitation conditions, but can be induced to fluoresce 
either more strongly or with a color shift (for example, from green to red) by acti- 
vating them with a strong pulse of light at a different wavelength. In principle, 
the microscopist can then follow the local in vivo behavior of any protein that 
can be expressed as a fusion with one of these GFP variants. These genetically 
encoded, photoactivatable fluorescent proteins allow the lifetime and behavior 
of any protein to be studied independently of other newly synthesized proteins 
(Figure 9-28). 

A third way to exploit GFP fused to a protein of interest is known as fluores- 
cence recovery after photobleaching (FRAP). Here, one uses a strong focused 
beam of light from a laser to extinguish the GFP fluorescence in a specified region 
of the cell, after which one can analyze the way in which remaining unbleached 
fluorescent protein molecules move into the bleached area as a function of time. 
Figure 9-26 Fluorescence resonance 
energy transfer (FRET). To determine 
whether (and when) two proteins interact 
inside a cell, the proteins are first produced 
as fusion proteins attached to different 
color variants of green fluorescent protein 
(GFP). (A) In this example, protein X is 
coupled to a blue fluorescent protein, which 
is excited by violet light (370-440 nm) and 
emits blue light (440-480 nm); protein Y 

is coupled to a green fluorescent protein, 
which is excited by blue light (440-480 nm) 
and emits green light (510 nm). (B) If protein 
X and Y do not interact, illuminating the 
sample with violet light yields fluorescence 
from the blue fluorescent protein only. 

(C) When protein X and protein Y interact, 
the resonance transfer of energy, FRET, 
can now occur. Illuminating the sample 
with violet light excites the blue fluorescent 
protein, which transfers its energy to the 
green fluorescent protein, resulting in an 
emission of green light. The fluorochromes 
must be quite close together — within about 
1-5 nm of one another—for FRET to occur. 
Because not every molecule of protein X 
and protein Y is bound at all times, some 
blue light may still be detected. But as the 
two proteins begin to interact, emission 
from the donor blue fluorescent protein 
falls as the emission from the acceptor 
GFP rises. 


Figure 9-27 Determining microtubule 
flux in the mitotic spindle with caged 
fluorescein linked to tubulin. 

(A) A metaphase spindle formed in vitro 
from an extract of Xenopus eggs has 
incorporated three fluorescent markers: 
rhodamine-labeled tubulin (red) to mark 
all the microtubules, a blue DNA-binding 
dye that labels the chromosomes, and 
caged-fluorescein-labeled tubulin, which is 
also incorporated into all the microtubules 
but is invisible because it is nonfluorescent 
until activated by ultraviolet (UV) light. 

(B) A beam of UV light activates, or 
“uncages,” the caged-fluorescein-labeled 
tubulin locally, mainly just to the left side 
of the metaphase plate. Over the next 
few minutes, after 1.5 minutes in (C) and 
after 2.5 minutes in (D), the 
uncaged-fluorescein-tubulin signal moves toward 
the left spindle pole, indicating that tubulin 
is continuously moving poleward even 
though the spindle (visualized by the red 
rhodamine-labeled tubulin fluorescence) 
remains largely unchanged. (From 

K.E. Sawin and T.J. Mitchison, J. Cell Biol. 
112:941-954, 1991. With permission 
from The Rockefeller University Press.) 
This technique is usually carried out with a confocal microscope and, like photo- 
activation, can deliver valuable quantitative data about a protein’s kinetic param- 
eters, such as diffusion coefficients, active transport rates, or binding and dissoci- 
ation rates from other proteins (Figure 9-29). 
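
As a rough illustration of how quantitative data are extracted from a FRAP experiment, the sketch below computes two commonly reported quantities from a recovery curve: the mobile fraction (how much of the bleached signal eventually returns) and the half-time of recovery. The curve is synthetic, generated from an assumed single-exponential recovery with a 20-second time constant; real data are noisy and are usually fitted to a model appropriate to the transport process being studied. The function names are hypothetical.

```python
import math


def mobile_fraction(pre_bleach, post_bleach, plateau):
    """Fraction of the bleached fluorescence that eventually recovers."""
    return (plateau - post_bleach) / (pre_bleach - post_bleach)


def recovery_half_time(times, intensities):
    """Time at which the signal is halfway from its first value to its final value.

    Uses linear interpolation between neighboring samples and assumes that the
    last sample lies on the recovery plateau.
    """
    start, plateau = intensities[0], intensities[-1]
    halfway = start + 0.5 * (plateau - start)
    for k in range(1, len(times)):
        if intensities[k] >= halfway:
            t0, t1 = times[k - 1], times[k]
            f0, f1 = intensities[k - 1], intensities[k]
            return t0 + (halfway - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("signal never reaches the halfway level")


# Synthetic data (assumed): intensity normalized to 1.0 before bleaching,
# 0.2 immediately after bleaching, recovering toward 0.8.
TAU_S = 20.0
times = [float(t) for t in range(0, 201)]
curve = [0.2 + 0.6 * (1.0 - math.exp(-t / TAU_S)) for t in times]

fraction = mobile_fraction(pre_bleach=1.0, post_bleach=curve[0], plateau=curve[-1])
t_half = recovery_half_time(times, curve)
print(round(fraction, 3), round(t_half, 2))
```

A mobile fraction below 1 indicates that some of the tagged protein is immobile on the time scale of the experiment, for example because it is bound to a fixed structure.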








Light-Emitting Indicators Can Measure Rapidly Changing 
Intracellular Ion Concentrations 


One way to study the chemistry of a single living cell is to insert the tip of a fine, 
glass, ion-sensitive microelectrode directly into the cell interior through the 
plasma membrane. This technique is used to measure the intracellular concentrations 
of common inorganic ions, such as H⁺, Na⁺, K⁺, Cl⁻, and Ca²⁺. However, 
ion-sensitive microelectrodes reveal the ion concentration only at one point 
in a cell, and for an ion present at a very low concentration, such as Ca²⁺, their 
responses are slow and somewhat erratic. Thus, these microelectrodes are not 
ideally suited to record the rapid and transient changes in the concentration of 
cytosolic Ca²⁺ that have an important role in allowing cells to respond to extracellular 
signals. Such changes can be analyzed with ion-sensitive indicators, whose 
light emission reflects the local concentration of the ion. Some of these indica- 
tors are luminescent (emitting light spontaneously), while others are fluorescent 
(emitting light on exposure to light). 
Figure 9-28 Photoactivation. 
Photoactivation is the light-induced 
activation of an inert molecule to an 

active state. In this experiment, shown 
schematically in (A), a photoactivatable 
variant of GFP is expressed in a cultured 
animal cell. (B) Before activation (time 

0 sec), little or no GFP fluorescence is 
detected in the selected region (red circle) 
when excited by blue light at 488 nm. After 
activation of the GFP with an ultraviolet 
laser pulse at 413 nm, it rapidly fluoresces 
brightly in the selected region (green). The 
movement of GFP, as it diffuses out of this 
region, can be measured. Since only the 
photoactivated proteins are fluorescent 
within the cell, the trafficking, turnover, and 
degradative pathways of proteins can be 
monitored. (B, from J. Lippincott-Schwartz 
and G.H. Patterson, Science 300:87-91, 
2003.) 


Figure 9-29 Fluorescence recovery 
after photobleaching (FRAP). A 

strong focused pulse of laser light will 
extinguish, or bleach, the fluorescence 

of GFP. By selectively photobleaching 

a set of fluorescently tagged protein 
molecules within a defined region of a cell, 
the microscopist can monitor recovery 
over time, as the remaining fluorescent 
molecules move into the bleached region 
(see Movie 10.6). (A) The experiment 
shown uses monkey cells in culture that 
express galactosyltransferase, an enzyme 
that constantly recycles between the Golgi 
apparatus and the endoplasmic reticulum 
(ER). The Golgi apparatus in one of the 
two cells is selectively photobleached, 
while the production of new fluorescent 
protein is blocked by treating the cells with 
cycloheximide. The recovery, resulting from 
fluorescent enzyme molecules moving from 
the ER to the Golgi, can then be followed 
over a period of time. (B) Schematic 
diagram of the experiment shown in (A). 
(A, from J. Lippincott-Schwartz, 
Histochem. Cell Biol. 116:97-107, 2001. 
With permission from Springer-Verlag.) 
Figure 9-30 Aequorin, a luminescent protein. The luminescent protein 
aequorin emits blue light in the presence of free Ca²⁺. Here, an egg of the 
medaka fish has been injected with aequorin, which has diffused throughout 
the cytosol, and the egg has then been fertilized with a sperm and examined 
with the help of a very sensitive camera. The four photographs were taken 
looking down on the site of sperm entry at intervals of 10 seconds and 
reveal a wave of release of free Ca²⁺ into the cytosol from internal stores 
just beneath the plasma membrane. This wave sweeps across the egg 
starting from the site of sperm entry, as indicated in the diagrams on the 

left. (Photographs reproduced from J.C. Gilkey, L.F. Jaffe, E.B. Ridgway and 
G.T. Reynolds, J. Cell Biol. 76:448-466, 1978. With permission from The 
Rockefeller University Press.) 


Aequorin is a luminescent protein isolated from the same marine jellyfish 
that produces GFP. It emits blue light in the presence of Ca²⁺ and responds to 
changes in Ca²⁺ concentration in the range of 0.5-10 μM. If microinjected into an 
egg, for example, aequorin emits a flash of light in response to the sudden localized 
release of free Ca²⁺ into the cytoplasm that occurs when the egg is fertilized 
(Figure 9-30). Aequorin has also been expressed transgenically in plants and 
other organisms to provide a method of monitoring Ca²⁺ in all their cells without 
the need for microinjection, which can be a difficult procedure. 

Bioluminescent molecules like aequorin emit tiny amounts of light—at best, 
a few photons per indicator molecule—that are difficult to measure. Fluorescent 
indicators produce orders of magnitude more photons per molecule; they are 
therefore easier to measure and can give better spatial resolution. Genetically 
encoded fluorescent Ca²⁺ indicators have been synthesized that bind Ca²⁺ 
tightly and are excited by or emit light at slightly different wavelengths when 
they are free of Ca²⁺ than when they are in their Ca²⁺-bound form. By measuring 
the ratio of fluorescence intensity at two excitation or emission wavelengths, 
we can determine the concentration ratio of the Ca²⁺-bound indicator to the 
Ca²⁺-free indicator, thereby providing an accurate measurement of the free 
Ca²⁺ concentration (see Movie 15.4). Indicators of this type are widely used for 
second-by-second monitoring of changes in intracellular Ca²⁺ concentration, or 
other ion concentrations, in the different parts of a cell viewed in a fluorescence 
microscope (Figure 9-31). 
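
The conversion from a measured fluorescence ratio to a free Ca²⁺ concentration is often done with a calibration relation of the form [Ca²⁺] = Kd × β × (R − Rmin) / (Rmax − R), where Rmin and Rmax are the ratios measured for the Ca²⁺-free and Ca²⁺-saturated indicator. The sketch below applies that relation; the dissociation constant, the scaling factor β, and the calibration ratios are all illustrative assumptions, and in practice each is determined for the specific indicator and microscope.

```python
def free_calcium_nM(ratio, r_min, r_max, kd_nM=224.0, beta=1.0):
    """Estimate free Ca2+ (in nM) from a two-wavelength fluorescence ratio.

    r_min and r_max are the ratios for Ca2+-free and Ca2+-saturated indicator.
    kd_nM and beta are assumed, illustrative calibration constants.
    """
    if not r_min < ratio < r_max:
        raise ValueError("ratio must lie strictly between r_min and r_max")
    return kd_nM * beta * (ratio - r_min) / (r_max - ratio)


# Hypothetical calibration: Ca2+-free indicator gives 0.5, saturated indicator gives 8.5.
for measured_ratio in (1.0, 4.5, 8.0):
    print(measured_ratio, round(free_calcium_nM(measured_ratio, 0.5, 8.5), 1))
```

Because the result depends on a ratio rather than on absolute brightness, it is largely insensitive to differences in indicator concentration or cell thickness from one part of the image to another.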

Similar fluorescent indicators measure other ions; some detect H⁺, for example, 
and hence measure intracellular pH. Some of these indicators can enter cells 
by diffusion and thus need not be microinjected; this makes it possible to mon- 
itor large numbers of individual cells simultaneously in a fluorescence micro- 
scope. New types of indicators, used in conjunction with modern image-process- 
ing methods, make possible similarly rapid and precise methods for analyzing 
changes in the concentrations of many types of small molecules in cells. 


Single Molecules Can Be Visualized by Total Internal Reflection 
Fluorescence Microscopy 


In ordinary microscopes, single fluorescent molecules such as tagged proteins 
cannot be reliably detected. The limitation has nothing to do with the resolution 
limit, but instead arises from the strong background due to light emitted or scat- 
tered by out-of-focus molecules. This tends to blot out the fluorescence from the 


Figure 9-31 Visualizing intracellular Ca²⁺ concentrations by using a 
fluorescent indicator. The branching tree of dendrites of a Purkinje cell in 
the cerebellum receives more than 100,000 synapses from other neurons. 
The output from the cell is conveyed along the single axon seen leaving the 
cell body at the bottom of the picture. This image of the intracellular Ca²⁺ 
concentration in a single Purkinje cell (from the brain of a guinea pig) was 
taken with a low-light camera and the Ca²⁺-sensitive fluorescent indicator 
fura-2. The concentration of free Ca²⁺ is represented by different colors, red 
being the highest and blue the lowest. The highest Ca²⁺ levels are present in 
the thousands of dendritic branches. (Courtesy of D.W. Tank, J.A. Connor, 
M. Sugimori, and R.R. Llinas.) 
Figure 9-32 TIRF microscopy allows the detection of single fluorescent molecules. (A) TIRF microscopy uses excitatory 
laser light to illuminate the cover-slip surface at the critical angle at which all the light is reflected by the glass-water interface. 
Some electromagnetic energy extends a short distance across the interface as an evanescent wave that excites just those 
molecules that are attached to the cover slip or are very close to its surface. (B) TIRF microscopy is used here to image 
individual myosin-GFP molecules (green dots) attached to nonfluorescent actin filaments (C), which are invisible but stuck to the 
surface of the cover slip. (Courtesy of Dmitry Cherny and Clive R. Bagshaw.) 


particular molecule of interest. This problem can be solved by the use of a special 
optical technique called total internal reflection fluorescence (TIRF) microscopy. 
In a TIRF microscope, laser light shines onto the cover-slip surface at the precise 
critical angle at which total internal reflection occurs (Figure 9-32A). Because of 
total internal reflection, the light does not enter the sample, and the majority of 
fluorescent molecules are not, therefore, illuminated. However, electromagnetic 
energy does extend, as an evanescent field, for a very short distance beyond the 
surface of the cover slip and into the specimen, allowing just those molecules in 
the layer closest to the surface to become excited. When these molecules fluo- 
resce, their emitted light is no longer competing with out-of-focus light from the 
overlying molecules, and can now be detected. TIRF has allowed several dramatic 
experiments, for instance imaging of single motor proteins moving along micro- 
tubules or single actin filaments forming and branching. At present, the tech- 
nique is restricted to a thin layer within only 100-200 nm of the cell surface (Figure 
9-32B and C). 
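
The thickness of the illuminated layer can be estimated with a short calculation. The sketch below is illustrative only: it assumes typical refractive indices for cover-slip glass (1.52) and an aqueous specimen (1.33), a 488 nm laser, and the standard optics expression for the depth at which the evanescent intensity falls to 1/e. None of these values is taken from the text, and the function names are hypothetical.

```python
import math

def critical_angle_deg(n_glass, n_water):
    """Angle of incidence (degrees) at which total internal reflection begins."""
    return math.degrees(math.asin(n_water / n_glass))

def penetration_depth_nm(wavelength_nm, angle_deg, n_glass, n_water):
    """Depth at which the evanescent intensity has decayed to 1/e."""
    s = n_glass * math.sin(math.radians(angle_deg))
    if s <= n_water:
        raise ValueError("angle must exceed the critical angle")
    return wavelength_nm / (4 * math.pi * math.sqrt(s**2 - n_water**2))

N_GLASS, N_WATER = 1.52, 1.33          # assumed refractive indices
theta_c = critical_angle_deg(N_GLASS, N_WATER)
depth = penetration_depth_nm(488, 65, N_GLASS, N_WATER)  # assumed laser and angle

print(f"critical angle: {theta_c:.1f} degrees")
print(f"1/e depth at 65 degrees: {depth:.0f} nm")
```

Under these assumptions the decay depth comes out on the order of 100 nm, in line with the thin layer quoted above. In this model, steeper angles of incidence give a shallower field.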


Individual Molecules Can Be Touched, Imaged, and Moved Using 
Atomic Force Microscopy 


While TIRF allows single molecules to be visualized under certain conditions, it is 
strictly a passive observation method. In order to probe molecular function, it is 
ultimately useful to be able to manipulate individual molecules themselves, and 
atomic force microscopy (AFM) provides a method to do just that. In an AFM device, 
an extremely small and sharply pointed tip, often of silicon or silicon nitride, is 
made using nanofabrication methods similar to those used in the semiconductor 
industry. The tip of the AFM probe is attached to a springy cantilever arm mounted 
on a highly precise positioning system that allows it to be moved over very small 
distances. In addition to this precise movement capability, the AFM device is 
able to collect information about a variety of forces that it encounters—including 
electrostatic, van der Waals, and mechanical forces—which are felt by its tip as 
it moves close to or touches the surface (Figure 9-33A). When AFM was first 
developed, it was intended as an imaging technology to measure molecular-scale 

Figure 9-33 Single molecules can be imaged and manipulated by atomic force microscopy. (A) Schematic diagram of the key components of 
an atomic force microscope (AFM), showing the force-sensing tip attached to one end of a single protein molecule, as in the experiment described in 
(D). (B) and (C) An AFM in imaging mode created these images of a single heteroduplex DNA molecule with a MutS protein dimer (larger white regions) 
bound near its center, at the point of a mismatched base pair. MutS is the first protein that binds to DNA when the mismatch repair process is initiated 
(see Figure 5-19). The smaller white dots are single streptavidin molecules, used to label the two ends of each DNA molecule. (D) Titin is an enormous 
protein molecule that provides muscle with its passive elasticity (see Figure 16-34). The extensibility of this protein can be directly tested using a short, 
artificially produced protein that contains eight repeated immunoglobulin (Ig) domains from one region of the titin protein. In this experiment, the tip 
of the AFM is used to pick up, and progressively stretch, a single molecule until it eventually ruptures. As force is applied, each Ig domain suddenly 
begins to unfold, and the force needed in each case (about 200 pN) can be recorded. The region of the force-extension curve shaded green records 
the sequential unfolding event for each of the eight protein domains. (B and C, from Y. Jiang and P.E. Marszalek, EMBO J. 30:2881-2893, 2011. 
Reprinted with permission of John Wiley & Sons; D, adapted from W.A. Linke et al., J. Struct. Biol. 137:194-205, 2002. With permission from Elsevier.) 


features on a surface. When used in this mode, the probe is scanned over the 
surface, moving up and down as necessary to maintain a constant interaction force 
with the surface, thus revealing any objects such as proteins or other molecules 
that might be present on the otherwise flat surface (Figure 9-33B and C). AFM is 
not limited to simply imaging surfaces, however, and can also be used to pick up 
and move single molecules that adsorb strongly to the tip. Using this technology, 
the mechanical properties of individual protein molecules can be measured in 
detail. For example, AFM has been used to unfold a single protein molecule in 
order to measure the energetics of domain folding (Figure 9-33D). 
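
The sawtooth shape of such a force-extension record can be imitated with a toy simulation. The sketch below is a hedged illustration, not the analysis used in the experiment: it assumes the chain behaves as a worm-like chain (a common polymer model that the text does not name), and the persistence length, starting contour length, and length gained per unfolded domain are illustrative guesses. Only the eight domains and the unfolding force of about 200 pN come from the figure legend.

```python
KT_PN_NM = 4.11  # thermal energy at room temperature, in pN*nm

def wlc_force(extension_nm, contour_nm, persistence_nm=0.4):
    """Worm-like chain interpolation formula for the restoring force (pN)."""
    r = extension_nm / contour_nm
    return (KT_PN_NM / persistence_nm) * (0.25 / (1 - r) ** 2 - 0.25 + r)

def stretch(n_domains=8, unfold_force=200.0, contour_nm=50.0,
            gain_per_domain_nm=28.0, step_nm=0.1):
    """Pull the tip away in small steps; a domain unfolds whenever the
    force reaches unfold_force, lengthening the chain and relaxing it."""
    x, folded = 0.0, n_domains
    trace, events = [], []
    while x + step_nm < contour_nm:
        x += step_nm
        force = wlc_force(x, contour_nm)
        if force >= unfold_force:
            if folded == 0:
                break                      # nothing left to unfold: stop pulling
            folded -= 1
            contour_nm += gain_per_domain_nm
            events.append((x, force))      # peak of one sawtooth tooth
            force = wlc_force(x, contour_nm)
        trace.append((x, force))
    return trace, events

trace, events = stretch()
for extension, force in events:
    print(f"domain unfolded at {extension:6.1f} nm, {force:5.1f} pN")
```

Each printed line corresponds to one tooth of the sawtooth: the force climbs to the unfolding threshold, a domain gives way, and the force drops as the chain suddenly becomes longer.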


Superresolution Fluorescence Techniques Can Overcome 
Diffraction-Limited Resolution 


The variations on light microscopy we have described so far are all constrained 
by the classic diffraction limit to resolution described earlier; that is, to about 200 
nm (see Figure 9-6). Yet many cellular structures—from nuclear pores to nucle- 
osomes and clathrin-coated pits—are much smaller than this and so are unre- 
solvable by conventional light microscopy. Several approaches, however, are now 
available that bypass the limit imposed by the diffraction of light, and successfully 
allow objects as small as 20 nm to be imaged and clearly resolved: a remarkable, 
order-of-magnitude improvement. 
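
The figure of about 200 nm can be checked with one line of arithmetic. The sketch below assumes the commonly used form of the limit, 0.61 × wavelength / numerical aperture, together with illustrative values for green light and a high-quality oil-immersion objective; the exact expression given in Figure 9-6 is not reproduced here.

```python
def diffraction_limit_nm(wavelength_nm, numerical_aperture):
    """Approximate smallest resolvable separation for a conventional lens."""
    return 0.61 * wavelength_nm / numerical_aperture

limit = diffraction_limit_nm(500, 1.4)   # assumed: green light, NA 1.4
print(f"diffraction limit: about {limit:.0f} nm")
print(f"gain needed to resolve 20 nm objects: about {limit / 20:.0f}-fold")
```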


The first of these so-called superresolution approaches, structured illumi- 
nation microscopy (SIM), is a fluorescence imaging method with a resolution of 
about 100 nm, or twice the resolution of conventional bright-field and confocal 
microscopy. SIM overcomes the diffraction limit by using a grated or structured 
pattern of light to illuminate the sample. The microscope’s physical set-up and 
operation are quite complex, but the general principle can be thought of as similar 
to creating a moiré pattern, an interference pattern created by overlaying two 
grids with different angles or mesh sizes (Figure 9-34). In a similar way to creat- 
ing a moiré pattern, the illuminating grid and the sample features combine into 
an interference pattern, from which the original high-resolution contributions 
to the image of features beyond the classical resolution limit can be calculated. 
Illumination by a grid means that the parts of the sample in the dark stripes of 
the grid are not illuminated and therefore not imaged, so the imaging is repeated 
several times (usually three) after translating the grid through a fraction of the 
grid spacing between each image. As the interference effect is strongest for image 
components close to the direction of the grid bars, the whole process is repeated 
with the grid pattern rotated through a series of angles to obtain an equivalent 
enhancement in all directions. Finally, mathematically combining all these sepa- 
rate images by computer creates an enhanced superresolution image. SIM is ver- 
satile because it can be used with any fluorescent dye or protein, and combining 
SIM images captured at consecutive focal planes can create three-dimensional 
data sets (Figure 9-35). 
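
The moiré idea can be demonstrated numerically in one dimension. In the sketch below, which is a simplified illustration rather than a real SIM reconstruction, the "sample" contains a stripe pattern too fine for the optics to pass, and the "illumination" is a coarser stripe pattern that the optics can produce. Multiplying the two creates a component at the difference frequency, which lies within the passband. The frequencies, the cutoff, and the helper name `amplitude` are all assumptions chosen for the example.

```python
import math

N = 1024          # samples across the field of view
CUTOFF = 80       # assumed highest spatial frequency the optics can pass
F_SAMPLE = 90     # fine detail in the sample, beyond the cutoff
F_ILLUM = 70      # illumination stripes, within the cutoff

def stripes(freq):
    """Non-negative sinusoidal pattern with 'freq' cycles across the field."""
    return [1 + math.cos(2 * math.pi * freq * i / N) for i in range(N)]

def amplitude(signal, freq):
    """Amplitude of the sinusoidal component at 'freq' (one DFT bin)."""
    re = sum(s * math.cos(2 * math.pi * freq * i / N) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * i / N) for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / N

sample = stripes(F_SAMPLE)
emitted = [s * m for s, m in zip(sample, stripes(F_ILLUM))]  # fluorescence ~ sample x light

moire = F_SAMPLE - F_ILLUM   # difference frequency, well below the cutoff
print("uniform illumination, amplitude at moire frequency:", round(amplitude(sample, moire), 6))
print("striped illumination, amplitude at moire frequency:", round(amplitude(emitted, moire), 6))
```

With uniform light the fine detail leaves no trace inside the passband, whereas with striped light it reappears at the lower moiré frequency, where it can be recorded and computationally shifted back. Because the stripes themselves cannot be finer than the cutoff, the largest possible shift equals the cutoff, which is why the gain in resolution is about twofold.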


Figure 9-34 Structured illumination 
microscopy. The principle, illustrated here, 
is to illuminate a sample with patterned 
light and measure the moiré pattern. 
Shown are (A) the pattern from an unknown 
structure and (B) a known pattern. 
(C) When these are combined, the resulting 
moiré pattern contains more information 
than is easily seen in (A), the original 
pattern. If the known pattern (B) has higher 
spatial frequencies, then better resolution 
will result. However, because the spatial 
patterns that can be created optically 
are also diffraction-limited, SIM can only 
improve the resolution by about a factor 
of two. (From B.O. Leung and K.C. Chou, 
Appl. Spectrosc. 65:967-980, 2011.) 

Figure 9-35 Structured illumination microscopy can be used to create three-dimensional data. These three-dimensional projections of the 
meiotic chromosomes at pachytene in a maize cell show the paired lateral elements of the synaptonemal complexes. (A) The chromosome set has 
been stained with a fluorescent antibody to cohesin and is viewed here by conventional fluorescence microscopy. Because the distance between 
the two lateral elements is about 200 nm, the diffraction limit, the two lateral elements that make up each complex are not resolved. (B) In the 
three-dimensional SIM image, the improved resolution enables each lateral element, about 100 nm across, to be clearly resolved, and the two 
chromosomes can clearly be seen to coil around each other. (C) Because the complete three-dimensional data set for the whole nucleus is available, 
the path of each separate pair of chromosomes can be traced and artificially assigned a different color. (Courtesy of C.J. Rachel Wang, Peter Carlton, 
and Zacheus Cande.) 

To get around the diffraction limit, the other two superresolution techniques 
exploit aspects of the point spread function, a property of the optical system men- 
tioned earlier. The point spread function is the distribution of light intensity within 
the three-dimensional, blurred image that is formed when a single point source 
of light is brought to a focus with a lens. Instead of being identical to the point 
source, the image has an intensity distribution that is approximately described 
by a Gaussian distribution, which in turn determines the resolution of the lens 
system (Figure 9-36). Two points that are closer than the width at half-maximum 
height of this distribution will become hard to resolve because their images over- 
lap too much (see Figure 9-36C). 
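
The effect of overlap can be made concrete by adding two Gaussian profiles and asking whether any dip remains between them. The sketch below assumes an ideal Gaussian point spread function with a width at half-maximum height of 200 nm and two equally bright point sources; the function names are hypothetical.

```python
import math

FWHM_NM = 200.0
SIGMA_NM = FWHM_NM / (2 * math.sqrt(2 * math.log(2)))  # Gaussian width parameter

def psf(x_nm, center_nm):
    """Intensity at x from one point source, modeled as a Gaussian."""
    return math.exp(-((x_nm - center_nm) ** 2) / (2 * SIGMA_NM ** 2))

def dip_depth(separation_nm):
    """Fractional dip at the midpoint of two equal sources (0 means no dip)."""
    a, b = -separation_nm / 2, separation_nm / 2
    profile = [psf(x, a) + psf(x, b) for x in range(-600, 601)]  # 1 nm grid
    midpoint = psf(0, a) + psf(0, b)
    return 1 - midpoint / max(profile)

for separation in (100, 200, 400):
    print(f"{separation} nm apart: dip of {100 * dip_depth(separation):.0f}%")
```

In this idealized model, sources 100 nm apart merge into a single peak, sources 200 nm apart are separated by only a shallow dip that noise could easily hide, and sources 400 nm apart are cleanly separated.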

In fluorescence microscopy, the excitation light is focused to a spot on the 
specimen by the objective lens, which then captures the photons emitted by any 
fluorescent molecule that the beam has raised from a ground state to an excited 
state. Because the excitation spot is blurred according to the point spread func- 
tion, fluorescent molecules that are closer than about 200 nm will be imaged as 
a single blurred spot. One approach to increasing the resolution is to switch all 
the fluorescent molecules at the periphery of the blurry excitation spot back to 
their ground state, or to a state where they no longer fluoresce in the normal way, 
leaving only those at the very center to be recorded. This can be done in practice 
by adding a second, very bright laser beam that wraps around the excitation beam 
like a torus. The wavelength and intensity of this second beam are adjusted so as 
to switch the fluorescent molecules off everywhere except at the very center of 
the point spread function, a region that can be as small as 20 nm across (Figure 
9-37). The fluorescent probes used must be in a special class that is photoswitch- 
able: their emission can be reversibly switched on and off with lights of different 
wavelengths. As the specimen is scanned with this arrangement of lasers, fluo- 
rescent molecules are switched on and off, and the small point spread function at 
each location is recorded. The diffraction limit is breached because the technique 
ensures that similar but very closely spaced molecules are in one of two different 
states, either fluorescing or dark. This approach is called STED (stimulated emis- 
sion depletion microscopy) and various microscopes using versions of the general 
method are now in wide use. Resolutions of 20 nm have been achieved in biologi- 
cal specimens, and even higher resolution attained with nonbiological specimens 
(see Figure 9-37). 
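
How strongly the depletion beam must be driven can be estimated from a widely quoted approximation for STED, in which the effective spot width shrinks as the square root of the depletion intensity. This relationship is not given in the text and is used here as an assumption; the saturation factor is the depletion intensity expressed in units of the intensity that switches off half of the molecules.

```python
import math

def sted_resolution_nm(diffraction_limited_nm, saturation_factor):
    """Approximate effective spot width with the depletion beam switched on."""
    return diffraction_limited_nm / math.sqrt(1 + saturation_factor)

def required_saturation_factor(diffraction_limited_nm, target_nm):
    """Depletion intensity (relative to saturation) needed for a target width."""
    return (diffraction_limited_nm / target_nm) ** 2 - 1

needed = required_saturation_factor(200, 20)
print(f"saturation factor needed for 20 nm: about {needed:.0f}")
print(f"resulting spot width: {sted_resolution_nm(200, needed):.0f} nm")
```

Under this approximation a tenfold gain in resolution requires roughly a hundredfold excess of depletion intensity, which is consistent with the need for a very bright second beam.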


Superresolution Can Also Be Achieved Using Single-Molecule 
Localization Methods 


If a single fluorescent molecule is imaged, it appears as a circular blurry disc, but 
if sufficient photons have contributed to this image, the precise mathematical 
center of the disclike image can be determined very accurately, often to within a 
few nanometers. But the problem with a specimen that contains a large number 

Figure 9-36 The point spread function of 
a lens determines resolution. (A) When a 
point source of light is brought to a focus 
by a lens system, diffraction effects mean 
that, instead of being imaged as a point, 
it is blurred in all dimensions. (B) In the 
plane of the image, the distribution of light 
approximates a Gaussian distribution, 
whose width at half-maximum height under 
ideal conditions is about 200 nm. (C) Two 
point sources that are about 200 nm apart 
can still just be distinguished as separate 
objects in the image, but if they are any 
nearer than that, their images will overlap 
and not be resolvable. 

Figure 9-37 Superresolution microscopy can be achieved by reducing 
the size of the point spread function. (A) The size of a normal focused 
beam of excitatory light. (B) An extremely strong superimposed laser beam, 
at a different wavelength and in the shape of a torus, depletes emitted 
fluorescence everywhere in the specimen except right in the center of the 
beam, reducing the effective width of the point spread function (C). As 
the specimen is scanned, this small point spread function can then build 
up a crisp image in a process called STED (stimulated emission depletion 
microscopy). (D) Synaptic vesicles in live cultured neurons, fluorescently 
labeled and imaged by ordinary confocal microscopy, with a resolution of 
260 nm. (E) The same vesicles imaged by STED, with a resolution of 60 
nm, which allows single vesicles to be resolved. (F) Fluorescently labeled 
replication factories in the nucleus of a cultured cell, imaged by ordinary 
confocal microscopy. (G) The same replication factories imaged by STED. 
Single, discrete replication sites can be resolved by STED that cannot be 
seen in the confocal image. (A, B, and C, from G. Donnert et al., Proc. Natl 
Acad. Sci. USA 103:11440-11445, 2006. With permission from National 
Academy of Sciences; D and E, from V. Westphal et al., Science 320:246- 
249, 2008. With permission from AAAS; F and G, from Z. Cseresnyes, 
U. Schwarz and C.M. Green, BMC Cell Biol. 10:88, 2009.) 


of adjacent fluorescent molecules, as we saw earlier, is that they each contribute 
blurry, overlapping point spread functions to the image, making the exact posi- 
tion of any one molecule impossible to resolve. Another way round this limitation 
is to arrange for only a very few, clearly separated molecules to actively fluoresce 
at any one moment. The exact position of each of these can then be computed, 
before subsequent sets of molecules are examined. 
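
Why more photons give a more accurate center can be seen from a small simulation. The sketch below assumes each detected photon lands at a position drawn from a Gaussian point spread function about 200 nm wide at half-maximum height (a width parameter of roughly 85 nm), and it ignores background light and camera pixelation. Under those assumptions, the uncertainty of the estimated center is approximately the width parameter divided by the square root of the photon count.

```python
import math
import random

SIGMA_NM = 85.0   # assumed Gaussian width for a spot about 200 nm across

def expected_precision_nm(n_photons):
    """Standard error of the centroid for n independent photons."""
    return SIGMA_NM / math.sqrt(n_photons)

def estimate_center_nm(true_center_nm, n_photons, rng):
    """Simulate photon arrival positions and return their mean position."""
    hits = [rng.gauss(true_center_nm, SIGMA_NM) for _ in range(n_photons)]
    return sum(hits) / len(hits)

rng = random.Random(0)    # fixed seed so the run is repeatable
TRUE_CENTER_NM = 1000.0
for n in (10, 100, 3000):
    estimate = estimate_center_nm(TRUE_CENTER_NM, n, rng)
    print(f"{n:5d} photons: expected precision {expected_precision_nm(n):5.1f} nm, "
          f"error in this run {abs(estimate - TRUE_CENTER_NM):5.1f} nm")
```

With a few thousand photons the expected precision in this simplified model is between 1 and 2 nm, far smaller than the width of the blurred spot itself.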

In practice, this can be achieved by using lasers to sequentially switch on a 
sparse subset of fluorescent molecules in a specimen containing photoactivatable 
or photoswitchable fluorescent labels. Labels are activated, for example, by illu- 
mination with near-ultraviolet light, which modifies a small subset of molecules 
so that they fluoresce when exposed to an excitation beam at another wavelength. 
These are then imaged before bleaching quenches their fluorescence and a new 
subset is activated. Each molecule emits a few thousand photons in response to 
the excitation before switching off, and the switching process can be repeated 
hundreds or even thousands of times, allowing the exact coordinates of a very 
large set of single molecules to be determined. The full set can be combined and 
digitally displayed as an image in which the computed location of each individual 
molecule is exactly marked (Figure 9-38). This class of methods has been vari- 
ously termed photoactivated localization microscopy (PALM) or stochastic optical 
reconstruction microscopy (STORM). 
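
The benefit of switching molecules on at different times can be illustrated with two molecules only 50 nm apart. The sketch below is a deliberately minimal model and not the algorithm of any particular PALM or STORM instrument: photon positions are drawn from an assumed Gaussian spot, each molecule emits an assumed 2000 photons, and localization is done by taking the mean photon position.

```python
import random

SIGMA_NM = 85.0                 # assumed Gaussian width of the blurred spot
PHOTONS_PER_MOLECULE = 2000     # assumed photon yield before bleaching
TRUE_POSITIONS_NM = [0.0, 50.0]

rng = random.Random(1)          # fixed seed so the run is repeatable

def detect(center_nm):
    """Photon arrival positions from one molecule while it is switched on."""
    return [rng.gauss(center_nm, SIGMA_NM) for _ in range(PHOTONS_PER_MOLECULE)]

def localize(photons):
    """Estimate the center of a spot as the mean photon position."""
    return sum(photons) / len(photons)

# Both molecules fluoresce in the same frame: one merged spot, one center.
merged = localize(detect(TRUE_POSITIONS_NM[0]) + detect(TRUE_POSITIONS_NM[1]))

# Each molecule fluoresces in its own frame: two separate localizations.
separate = [localize(detect(position)) for position in TRUE_POSITIONS_NM]

print(f"same frame: a single center at {merged:.1f} nm")
print(f"separate frames: centers at {separate[0]:.1f} nm and {separate[1]:.1f} nm")
```

When both molecules are on together, the only recoverable position is a point midway between them. When they are imaged in separate frames, their 50 nm separation is recovered to within a few nanometers.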

By switching the fluorophores off and on sequentially in different regions 
of the specimen as a function of time, all the superresolution imaging methods 
described above allow the resolution of molecules that are much closer together 
than the 200 nm diffraction limit. In STED, the locations of the molecules are 
determined by using optical methods to define exactly where their fluorescence 
will be on or off. In PALM and STORM, individual fluorescent molecules are 
switched on and off at random over a period of time, allowing their positions to 
be accurately determined. PALM and STORM techniques have depended on the 

Figure 9-38 Single fluorescent molecules can be located with great accuracy. (A) Determining the exact mathematical 
center of the blurred image of a single fluorescent molecule becomes more accurate the more photons contribute to the final 
image. The point spread function described in the text dictates that the size of the molecular image is about 200 nm across, 
but in very bright specimens, the position of its center can be pinpointed to within a nanometer. (B) In this imaginary specimen, 
sparse subsets of fluorescent molecules are individually switched on briefly and then bleached. The exact positions of all these 
well-spaced molecules can be gradually built up into an image at superresolution. (C) In this portion of a cell, the microtubules 
have been fluorescently labeled and imaged at the top in a TIRF microscope (see Figure 9-32) and below, at superresolution, 
in a PALM microscope. The diameter of the microtubules in the lower panel now resembles their true size, about 25 nm, 
rather than the 250 nm in the blurred image at the top. (A, from A.L. McEvoy et al., BMC Biol. 8:106, 2010; C, courtesy of 
Carl Zeiss Ltd.) 


development of novel fluorescent probes that exhibit the appropriate switching 
behavior. All these methods are now being extended to incorporate multicolor 
imaging, three-dimensional imaging (Figure 9-39), and live-cell imaging in real 
time. Ending the long reign of the diffraction limit has certainly reinvigorated light 
microscopy and its place in cell biology research. 


Figure 9-39 Small fluorescent structures can be imaged in three 
dimensions with superresolution. (A) The image of two touching 
180-nm-diameter clathrin-coated pits on the plasma membrane of a cultured 
cell is diffraction-limited, and the individual pits cannot be distinguished in 
this conventional fluorescence image. (B) Using STORM superresolution 
microscopy, however, the pits are clearly resolvable. Not only can such pits 
be imaged using probes of different colors, but additional three-dimensional 
information can also be obtained. (C) and (D) Shown are two different 
orthogonal views of one single coated pit. The clathrin is labeled red and 
transferrin—the cargo within the pit—is labeled green. Images of this sort can 
be acquired in less than one second, making possible dynamic observations 
on living cells. These techniques depend heavily on the development of new, 
very fast-switching, and extremely bright fluorescent probes. (A and 
B, from M. Bates et al., Science 317:1749-1753, 2007; C and D, from 
S.A. Jones et al., Nat. Methods 8:499-508, 2011. With permission from 
Macmillan Publishers Ltd.) 
[Panel labels: transferrin cargo (green); clathrin-coated pit (red); scale bar, 200 nm.] 

Summary 


Many light-microscope techniques are available for observing cells. Cells that have 
been fixed and stained can be studied in a conventional light microscope, whereas 
antibodies coupled to fluorescent dyes can be used to locate specific molecules in 
cells in a fluorescence microscope. Living cells can be seen with phase-contrast, 
differential-interference-contrast, dark-field, or bright-field microscopes. All forms 
of light microscopy are facilitated by digital image-processing techniques, which 
enhance sensitivity and refine the image. Confocal microscopy and image decon- 
volution both provide thin optical sections and can be used to reconstruct three-di- 
mensional images. 

Techniques are now available for detecting, measuring, and following almost 
any desired molecule in a living cell. Fluorescent indicator dyes can be introduced 
to measure the concentrations of specific ions in individual cells or in different parts 
of a cell. Virtually any protein of interest can be genetically engineered as a fluores- 
cent fusion protein, and then imaged in living cells by fluorescence microscopy. The 
dynamic behavior and interactions of many molecules can be followed in living 
cells by variations on the use of fluorescent protein tags, in some cases at the level of 
single molecules. Various superresolution techniques can circumvent the diffraction 
limit and resolve molecules separated by distances as small as 20 nm. 


LOOKING AT CELLS AND MOLECULES IN THE 
ELECTRON MICROSCOPE 


Light microscopy is limited in the fineness of detail that it can reveal. Microscopes 
using other types of radiation—in particular, electron microscopes—can resolve 
much smaller structures than is possible with visible light. This higher resolution 
comes at a cost: specimen preparation for electron microscopy is complex and 
it is harder to be sure that what we see in the image corresponds precisely to the 
original living structure. It is possible, however, to use very rapid freezing to pre- 
serve structures faithfully for electron microscopy. Digital image analysis can be 
used to reconstruct three-dimensional objects by combining information either 
from many individual particles or from multiple tilted views of a single object. 
Together, these approaches extend the resolution and scope of electron micros- 
copy to the point at which we can faithfully image the structures of individual 
macromolecules and the complexes they form. 


The Electron Microscope Resolves the Fine Structure of the Cell 


The formal relationship between the diffraction limit to resolution and the wave- 
length of the illuminating radiation (see Figure 9-6) holds true for any form of 
radiation, whether it is a beam of light or a beam of electrons. With electrons, how- 
ever, the limit of resolution is very small. The wavelength of an electron decreases 
as its velocity increases. In an electron microscope with an accelerating voltage 
of 100,000 V, the wavelength of an electron is 0.004 nm. In theory, the resolution 
of such a microscope should be about 0.002 nm, which is 100,000 times finer than that of 
the light microscope. Because the aberrations of an electron lens are considerably 
harder to correct than those of a glass lens, however, the practical resolving power 
of modern electron microscopes is, even with careful image processing to correct 
for lens aberrations, about 0.05 nm (0.5 Å) (Figure 9-40). This is because only the 
very center of the electron lenses can be used, and the effective numerical aper- 
ture is tiny. Furthermore, problems of specimen preparation, contrast, and radia- 
tion damage have generally limited the normal effective resolution for biological 
objects to 1 nm (10 Å). This is nonetheless about 200 times better than the 
resolution of the light microscope. Moreover, the performance of electron microscopes 
is improved by electron illumination sources called field emission guns. These 
very bright and coherent sources substantially improve the resolution achieved. 
In overall design, the transmission electron microscope (TEM) is similar to a 
light microscope, although it is much larger and “upside down” (Figure 9-41). 





Figure 9-40 The resolution of the 
electron microscope. This transmission 
electron micrograph of a monolayer of 
graphene resolves the individual carbon 
atoms as bright spots in a hexagonal 
lattice. Graphene is a single isolated atomic 
plane of graphite and forms the basis of 
carbon nanotubes. The distance between 
adjacent bonded carbon atoms is 

0.14 nm (1.4 A). Such resolution can only 
be obtained in a specially built transmission 
electron microscope in which all lens 
aberrations are carefully corrected, and 
with optimal specimens; it cannot be 
achieved with most conventional biological 
specimens. (From A. Dato et al., Chem. 
Commun. 40:6095-6097, 2009. With 
permission from The Royal Society of 
Chemistry.) 

The source of illumination is a filament or cathode that emits electrons at the top 
of a cylindrical column about 2 m high. Since electrons are scattered by collisions 
with air molecules, air must first be pumped out of the column to create a vac- 
uum. The electrons are then accelerated from the filament by a nearby anode and 
allowed to pass through a tiny hole to form an electron beam that travels down the 
column. Magnetic coils placed at intervals along the column focus the electron 
beam, just as glass lenses focus the light in a light microscope. The specimen is put 
into the vacuum, through an airlock, into the path of the electron beam. As in light 
microscopy, the specimen is usually stained—in this case, with electron-dense 
material. Some of the electrons passing through the specimen are scattered by 
structures stained with the electron-dense material; the remainder are focused 
to form an image, in a manner analogous to the way an image is formed in a light 
microscope. The image can be observed on a phosphorescent screen or recorded 
with a high-resolution digital camera. Because the scattered electrons are lost 
from the beam, the dense regions of the specimen show up in the image as areas 
of reduced electron flux, which look dark. 
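
The electron wavelength quoted earlier for an accelerating voltage of 100,000 V can be checked from first principles. The sketch below uses the de Broglie relation with the relativistic correction and standard values of the physical constants; the function name is hypothetical, and the calculation says nothing about lens aberrations, which in practice limit resolution far more than wavelength does.

```python
import math

H = 6.62607015e-34           # Planck constant, J*s
M_E = 9.1093837e-31          # electron rest mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C
C = 2.99792458e8             # speed of light, m/s

def electron_wavelength_nm(voltage):
    """Relativistic de Broglie wavelength of an electron accelerated through 'voltage' volts."""
    energy = E_CHARGE * voltage                       # kinetic energy gained, J
    momentum = math.sqrt(2 * M_E * energy * (1 + energy / (2 * M_E * C**2)))
    return H / momentum * 1e9                         # meters to nanometers

for volts in (50_000, 100_000, 300_000):
    print(f"{volts:>7,} V: {electron_wavelength_nm(volts):.4f} nm")
```

At 100,000 V the result is about 0.0037 nm, which rounds to the 0.004 nm given in the text, and the wavelength shortens further as the voltage, and hence the electron's velocity, increases.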


Biological Specimens Require Special Preparation for Electron 
Microscopy 


In the early days of its application to biological materials, the electron microscope 
revealed many previously unimagined structures in cells. But before these discov- 
eries could be made, electron microscopists had to develop new procedures for 
embedding, cutting, and staining tissues. 

Since the specimen is exposed to a very high vacuum in the electron micro- 
scope, living tissue is usually killed and preserved by fixation—first with glutar- 
aldehyde, which covalently cross-links protein molecules to their neighbors, and 
then with osmium tetroxide, which binds to and stabilizes lipid bilayers as well as 
proteins (Figure 9-42). Because electrons have very limited penetrating power, 
the fixed tissues normally have to be cut into extremely thin sections (25-100 nm 
thick, about 1/200 the thickness of a single cell) before they are viewed. This is 
achieved by dehydrating the specimen, permeating it with a monomeric resin 
that polymerizes to form a solid block of plastic, then cutting the block with a fine 
glass or diamond knife on a special microtome. The resulting thin sections, free of 
water and other volatile solvents, are supported on a small metal grid for viewing 
in the microscope (Figure 9-43). 

Figure 9-41 The principal features of 
a light microscope and a transmission 
electron microscope. These drawings 
emphasize the similarities of overall 
design. Whereas the lenses in the light 
microscope are made of glass, those in the 
electron microscope are magnetic coils. 
The electron microscope requires that the 
specimen be placed in a vacuum. The inset 
shows a transmission electron microscope 
in use. (Photograph courtesy of JEOL Ltd.) 

Figure 9-42 Two common chemical 
fixatives used for electron microscopy. 
The two reactive aldehyde groups of 
glutaraldehyde enable it to cross-link 
various types of molecules, forming 
covalent bonds between them. Osmium 
tetroxide forms cross-linked complexes 
with many organic compounds, and in the 
process becomes reduced. This reaction is 
especially useful for fixing cell membranes, 
since the C=C double bonds present 
in many fatty acids react with osmium 
tetroxide. 




The steps required to prepare biological material for electron microscopy are 
challenging. How can we be sure that the image of the fixed, dehydrated, resin-embedded 
specimen bears any relation to the delicate, aqueous biological system 
present in the living cell? The best current approaches to this problem depend on 
rapid freezing. If an aqueous system is cooled fast enough and to a low enough 
temperature, the water and other components in it do not have time to rearrange 
themselves or crystallize into ice. Instead, the water is supercooled into a rigid but 
noncrystalline state, a “glass,” called vitreous ice. This state can be achieved by 
slamming the specimen onto a polished copper block cooled by liquid helium, by 
plunging it into or spraying it with a jet of a coolant such as liquid propane, or by 
cooling it at high pressure. 

Some rapidly frozen specimens can be examined directly in the electron micro- 
scope using a special cooled specimen holder. In other cases, the frozen block can 
be fractured to reveal interior cell surfaces, or the surrounding ice can be sublimed 
away to expose external surfaces. However, we often want to examine thin sec- 
tions. A compromise is therefore to rapid-freeze the tissue, replace the water with 
organic solvents, embed the tissue in plastic resin, and finally cut sections and 
stain. Although technically still difficult, this approach stabilizes and preserves the 
tissue in a condition very close to its original living state (Figure 9-44). 

Image clarity in an electron micrograph depends upon having a range of con- 
trasting electron densities within the specimen. Electron density in turn depends 
on the atomic number of the atoms that are present: the higher the atomic num- 
ber, the more electrons are scattered and the darker that part of the image. Bio- 
logical tissues are composed mainly of atoms of very low atomic number (pri- 
marily carbon, oxygen, nitrogen, and hydrogen). To make them visible, tissues 
are usually impregnated (before or after sectioning) with the salts of heavy metals 
such as uranium, lead, and osmium. The degree of impregnation, or “staining,” 
with these salts will vary for different cell constituents. Lipids, for example, tend 
to stain darkly after osmium fixation, revealing the location of cell membranes. 


Specific Macromolecules Can Be Localized by Immunogold 
Electron Microscopy 


We have seen how antibodies can be used in conjunction with fluorescence micros- 
copy to localize specific macromolecules. An analogous method—immunogold 
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Figure 9-43 The metal grid that supports 
the thin sections of a specimen ina 
transmission electron microscope. 


Figure 9-44 Thin section of a cell. This 
thin section is of a yeast cell that has been 
very rapidly frozen and the vitreous ice 
replaced by organic solvents and then by 
plastic resin. The nucleus, mitochondria, cell 
wall, Golgi stacks, and ribosomes can all 

be readily seen in a state that is presumed 
to be as lifelike as possible. (Courtesy of 
Andrew Staehelin.) 
Figure 9-45 Localizing proteins in 
electron microscopy. Immunogold 
electron microscopy is used here to find 
the specific location of four different protein 
components within the spindle pole body 
of yeast. At the top is a thin section of a 
yeast mitotic spindle showing the spindle 
microtubules that cross the nucleus and 
connect at each end to spindle pole 
bodies embedded in the nuclear envelope. 
A diagram of the components of a single 
spindle pole body is shown below. On 
separate sections, antibodies against 
four different proteins of the spindle pole 
body are used, together with colloidal 
gold particles (black dots), to reveal where 
within the complex structure each protein is 
located. (Courtesy of John Kilmartin.) 
electron microscopy—can be used in the electron microscope. The usual proce- 
dure is to incubate a thin section first with a specific primary antibody, and then 
with a secondary antibody to which a colloidal gold particle has been attached. 
The gold particle is electron-dense and can be seen as a black dot in the electron 
microscope (Figure 9-45). Different antibodies can be conjugated to different 
sized gold particles so multiple proteins can be localized in a single sample. 

A complication for immunogold labeling is that the antibodies and colloidal 
gold particles do not penetrate into the resin used for embedding; therefore, they 
detect antigens only at the surface of the section. This means that the method’s 
sensitivity is low, since antigen molecules in the deeper parts of the section are 
not detected. Furthermore, we may get a false impression regarding which struc- 
tures contain the antigen and which do not. One solution is to label the specimen 
before embedding it in plastic, when cells and tissues are still fully accessible to 
labeling reagents. Extremely small gold particles, about 1 nm in diameter, work 
best for this procedure. Such small gold particles are usually not easily visible in 
the final sections, so additional silver or gold is nucleated around the tiny 1 nm 
gold particles in a chemical process very much like photographic development. 


Different Views of a Single Object Can Be Combined to Give a 
Three-Dimensional Reconstruction 


Thin sections often fail to convey the three-dimensional arrangement of cellular 
components viewed in a TEM, and the image can be very misleading: a linear 
structure such as a microtubule may appear in section as a pointlike object, for 
example, and a section through protruding parts of a single irregularly shaped 
solid body may give the appearance of two or more separate objects (Figure 9-46). 
The third dimension can be reconstructed from serial sections, but this is a lengthy 
and tedious process. Even thin sections, however, have a significant depth com- 
pared with the resolution of the electron microscope, so the TEM image can also 
be misleading in an opposite way, through the superimposition of objects that lie 
at different depths. 
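
The way a section can split one body into apparently separate objects can be illustrated with a toy example, reduced by one dimension for brevity. Here a hypothetical U-shaped object drawn on a small grid stands in for the solid body, and each grid row stands in for one thin section. The grid, the function names, and the choice of 4-connectivity are assumptions made purely for illustration.

```python
def count_profiles(section):
    """Count separate runs of '#' in one row, i.e. the objects seen in one section."""
    profiles, inside = 0, False
    for cell in section:
        if cell == "#" and not inside:
            profiles += 1
        inside = (cell == "#")
    return profiles


def count_bodies(grid):
    """Count 4-connected bodies in the whole grid using an iterative flood fill."""
    seen = set()
    bodies = 0
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell != "#" or (r, c) in seen:
                continue
            bodies += 1
            stack = [(r, c)]
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < len(grid) and 0 <= nx < len(grid[ny])
                            and grid[ny][nx] == "#" and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return bodies


# Hypothetical U-shaped object; each string is one "section" through it.
u_shape = [
    "#...#",
    "#...#",
    "#...#",
    "#####",
]

profiles_per_section = [count_profiles(row) for row in u_shape]
total_bodies = count_bodies(u_shape)
print(profiles_per_section, total_bodies)
```

Most sections show two profiles even though the object is a single connected body, which is the same trap described for the branched mitochondrion.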
Because of the large depth of field of electron microscopes, all the parts of the 
three-dimensional specimen are in focus, and the resulting image is a projection 
(a superimposition of layers) of the structure along the viewing direction. The lost 
information in the third dimension can be recovered if we have views of the same 
specimen from many different directions. The computational methods for this 
technique are widely used in medical CT scans. In a CT scan, the imaging equip- 
ment is moved around the patient to generate the different views. In electron-mi- 
croscope (EM) tomography, the specimen holder is tilted in the microscope, 
which achieves the same result. In this way, we can arrive at a three-dimensional 
reconstruction, in a chosen standard orientation, by combining different views 
of a single object. Each individual view will be very noisy but by combining them 
in three dimensions and taking an average, the noise can be largely eliminated. 
Starting with thick plastic sections of embedded material, three-dimensional 
reconstructions, or tomograms, are used extensively to describe the detailed anat- 
omy of specific regions of the cell, such as the Golgi apparatus (Figure 9-47) or 
the cytoskeleton. Increasingly, microscopists are also applying EM tomography 
to unstained frozen, hydrated sections, and even to rapidly frozen whole cells 
or organelles (Figure 9-48). Electron microscopy now provides a robust bridge 
between the scale of the single molecule and that of the whole cell. 
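
The idea of recovering lost depth information from several projections can be sketched with the simplest possible case: a two-dimensional grid viewed from just two perpendicular directions and reconstructed by plain (unfiltered) back-projection. This is a minimal illustration under stated assumptions, not the algorithm used in practice; real EM tomography combines many tilt angles and uses more refined reconstruction methods. The specimen and all function names are hypothetical.

```python
def project_rows(obj):
    """View along the horizontal direction: one summed density value per row."""
    return [sum(row) for row in obj]


def project_columns(obj):
    """View along the vertical direction: one summed density value per column."""
    return [sum(col) for col in zip(*obj)]


def back_project(row_view, col_view):
    """Smear each view back across the grid along its viewing direction and add."""
    return [[r + c for c in col_view] for r in row_view]


def brightest(grid):
    """Return (row, column) of the largest value in the grid."""
    return max(
        ((y, x) for y in range(len(grid)) for x in range(len(grid[0]))),
        key=lambda p: grid[p[0]][p[1]],
    )


# Hypothetical specimen: a single dense feature at row 1, column 3.
specimen = [[0] * 5 for _ in range(5)]
specimen[1][3] = 1

rows = project_rows(specimen)        # locates the feature's row only
cols = project_columns(specimen)     # locates the feature's column only
reconstruction = back_project(rows, cols)
print(brightest(reconstruction))
```

Neither view alone fixes the position of the feature, but the two smeared views reinforce each other only where it actually lies.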


Images of Surfaces Can Be Obtained by Scanning Electron 
Microscopy 


A scanning electron microscope (SEM) directly produces an image of the 
three-dimensional structure of the surface of a specimen. The SEM is usually 
smaller, simpler, and cheaper than a transmission electron microscope. Whereas 
the TEM uses the electrons that have passed through the specimen to form an 








Figure 9-46 A three-dimensional 
reconstruction from serial sections. 
Single thin sections in the electron 
microscope sometimes give misleading 
impressions. In this example, most sections 
through a cell containing a branched 
mitochondrion seem to contain two or 
three separate mitochondria (compare 
Figure 9-44). Sections 4 and 7, moreover, 
might be interpreted as showing a 
mitochondrion in the process of dividing. 
The true three-dimensional shape can 

be reconstructed from a complete set of 
serial sections. 


Figure 9-47 Electron-microscope (EM) 
tomography. Samples that have been 
rapidly frozen, and then freeze-substituted 
and embedded in plastic, preserve their 
structure in a condition that is very close 
to their original living state (Movie 9.2). 
This example shows the three-dimensional 
structure of the Golgi apparatus from 

a rat kidney cell. Several thick sections 
(250 nm) of the cell were tilted in a high- 
voltage electron microscope, along two 
different axes, and about 160 different 
views recorded. The digital data allow 
individual thin slices of the complete three- 
dimensional data set, or tomogram, to 

be viewed; for example, the serial slices, 
each only 4 nm thick, are shown in (A) and 
(B). Very little changes from one slice to 
the next, but using the full data set, and 
manually color-coding the membranes (B), 
one can obtain a full three-dimensional 
reconstruction, at a resolution of about 

7 nm, of the complete Golgi complex and 
its associated vesicles (C). (From M.S. 
Ladinsky et al., J. Cell Biol. 144:1135-1149, 
1999. With permission from the authors.) 
image, the SEM uses electrons that are scattered or emitted from the specimen’s 
surface. The specimen to be examined is fixed, dried, and coated with a thin layer 
of heavy metal. Alternatively, it can be rapidly frozen, and then transferred to a 
cooled specimen stage for direct examination in the microscope. Often an entire 
plant part or small animal can be put into the microscope with very little prepara- 
tion (Figure 9-49). The specimen is scanned with a very narrow beam of electrons. 
The quantity of electrons scattered or emitted as this primary beam bombards 
each successive point of the metallic surface is measured and used to control the 
intensity of a second beam, which moves in synchrony with the primary beam 
and forms an image on a computer screen. Eventually a highly enlarged image of 
the surface as a whole is built up (Figure 9-50). 
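
The point-by-point way in which an SEM image is assembled can be sketched as a raster loop. In this toy model the "surface" is a small table of heights and the detector signal is simply the height change to the next point along the scan line. That signal function is an assumption chosen for illustration; in a real instrument the signal depends on the surface tilt and composition.

```python
def toy_signal(surface, x, y):
    """Toy detector signal: height change to the next point along the scan line."""
    row = surface[y]
    nxt = row[x + 1] if x + 1 < len(row) else row[x]
    return abs(nxt - row[x])


def raster_scan(surface, signal=toy_signal):
    """Visit every point line by line; each measured signal sets one pixel."""
    image = []
    for y in range(len(surface)):
        line = [signal(surface, x, y) for x in range(len(surface[y]))]
        image.append(line)
    return image


# Hypothetical surface heights, two scan lines of four points each.
heights = [
    [0, 0, 2, 2],
    [0, 1, 1, 1],
]
image = raster_scan(heights)
print(image)
```

Bright pixels appear only where the surface changes height, a crude stand-in for the highlights seen at edges in real scanning micrographs.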

The SEM technique provides great depth of field; moreover, since the amount 
of electron scattering depends on the angle of the surface relative to the beam, 
the image has highlights and shadows that give it a three-dimensional appear- 
ance (see Figure 9-49 and Figure 9-51). Only surface features can be examined, 
however, and in most forms of SEM, the resolution attainable is not very high 
(about 10 nm, with an effective magnification of up to 20,000 times). As a result, 
the technique is usually used to study whole cells and tissues rather than subcel- 
lular organelles (see Movie 21.3). Very-high-resolution SEMs have, however, been 
developed with a bright coherent-field emission gun as the electron source. This 
type of SEM can produce images that rival the resolution possible with a TEM 
(Figure 9-52). 
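
The link between a 10 nm resolution and a magnification of about 20,000 times can be checked with a one-line calculation: magnification is useful only up to the point where the smallest resolved detail becomes visible to the eye. The sketch below assumes an unaided-eye resolution of about 0.2 mm; that value and the function name are assumptions introduced for this example.

```python
EYE_RESOLUTION_NM = 0.2e6   # assumed: about 0.2 mm, expressed in nanometers


def useful_magnification(instrument_resolution_nm, eye_resolution_nm=EYE_RESOLUTION_NM):
    """Magnification at which the smallest resolved detail becomes just visible."""
    return eye_resolution_nm / instrument_resolution_nm


sem_magnification = useful_magnification(10)   # 10 nm SEM resolution from the text
print(round(sem_magnification))
```

Magnifying further would enlarge the image without revealing any additional detail.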


Negative Staining and Cryoelectron Microscopy Both Allow 
Macromolecules to Be Viewed at High Resolution 


If they are shadowed with a heavy metal to provide contrast, isolated macromol- 
ecules such as DNA or large proteins can be visualized readily in the electron 
microscope, but negative staining allows finer detail to be seen. In this tech- 
nique, the molecules are supported on a thin film of carbon and mixed with a 
solution of a heavy-metal salt such as uranyl acetate. After the sample has dried, 
a very thin film of metal salt covers the carbon film everywhere except where it 
has been excluded by the presence of an adsorbed macromolecule. Because the 
macromolecule allows electrons to pass through it much more readily than does 
the surrounding heavy-metal stain, a reverse or negative image of the molecule is 
Figure 9-48 Combining cryoelectron- 
microscope tomography and single- 
particle reconstruction. Small, unfixed, 
rapidly frozen specimens can be examined 
while still frozen. In this example, the 

small nuclei of the amoeba Dictyostelium 
were gently isolated and then very rapidly 
frozen before a series of angled views 
were recorded with the aid of a tilting 
microscope stage. These digital views 

are combined by EM tomography methods to 
produce a three-dimensional tomogram. 
Two thin digital slices (10 nm) through 

this tomogram show (A) top views and 

(B) side views of individual nuclear pores 
(white arrows). (C) In the three-dimensional 
model, a surface rendering of the pores 
(blue) is seen embedded in the nuclear 
envelope (yellow). From a series of 
tomograms it was possible to extract data 
sets for nearly 300 separate nuclear pores, 
whose structures could then be averaged 
using the techniques of single-particle 
reconstruction. The surface-rendered view 
of one of these reconstructed pores is 
shown (D) from the nuclear face and (E) in 
cross section (compare with Figure 12-8). 
The pore complex is colored blue and 

the nuclear basket brown. (From M. Beck 
et al., Science 306:1387-1390, 2004. 
With permission from AAAS.) 
Figure 9-49 A developing wheat flower, 
or spike. This delicate flower spike was 
rapidly frozen, coated with a thin metal film, 
and examined in the frozen state with an 
SEM. This low-magnification micrograph 
demonstrates the large depth of focus of 
an SEM. (Courtesy of Kim Findlay.) 
created. Negative staining is especially useful for viewing large macromolecular 
aggregates such as viruses or ribosomes, and for seeing the subunit structure of 
protein filaments (Figure 9-53). 

Shadowing and negative staining can provide high-contrast surface views of 
small macromolecular assemblies, but the size of the smallest metal particles in 
the shadow or stain used limits the resolution of both techniques. An alternative 
that allows us to visualize directly at high resolution even the interior features 
of three-dimensional structures such as viruses and organelles is cryoelectron 
microscopy, in which rapid freezing to form vitreous ice is again the key. A very 
thin (about 100 nm) film of an aqueous suspension of virus or purified macromo- 
lecular complex is prepared on a microscope grid and is then rapidly frozen by 








Figure 9-50 The scanning electron 
microscope. In an SEM, the specimen is 
scanned by a beam of electrons brought 
to a focus on the specimen by the 
electromagnetic coils that act as lenses. 
The detector measures the quantity of 
electrons scattered or emitted as the beam 
bombards each successive point on the 
surface of the specimen and controls the 
intensity of successive points in an image 
built up on a screen. The SEM creates 
striking images of three-dimensional 
objects with great depth of focus and 

a resolution between 3 nm and 20 nm 
depending on the instrument. (Photograph 
courtesy of Andrew Davies.) 


Figure 9-51 Scanning electron 
microscopy. (A) A scanning electron 
micrograph of the stereocilia projecting 
from a hair cell in the inner ear of a bullfrog. 
For comparison, the same structure is 
shown by (B) differential-interference- 
contrast light microscopy (Movie 9.3) 

and (C) thin-section transmission electron 
microscopy. (Courtesy of Richard Jacobs 
and James Hudspeth.) 
being plunged into a coolant. A special sample holder keeps this hydrated speci- 
men at -160°C in the vacuum of the microscope, where it can be viewed directly 
without fixation, staining, or drying. Unlike negative staining, in which what we 
see is the envelope of stain exclusion around the particle, hydrated cryoelectron 
microscopy produces an image from the macromolecular structure itself. How- 
ever, the contrast in this image is very low, and to extract the maximum amount 
of structural information, special image-processing techniques must be used, as 
we describe next. 


Multiple Images Can Be Combined to Increase Resolution 


As we saw earlier (p. 532), noise is important in light microscopy at low light levels, 
but it is a particularly severe problem for electron microscopy of unstained mac- 
romolecules. A protein molecule can tolerate a dose of only a few tens of electrons 
per square nanometer without damage, and this dose is orders of magnitude 
below what is needed to define an image at atomic resolution. 
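
A rough calculation shows why such a low dose is a problem. If electrons arrive at random (Poisson counting statistics), the signal-to-noise ratio of a pixel is the square root of the average number of electrons it receives. The dose of 20 electrons per square nanometer and the 0.1 nm pixel size used below are assumed, illustrative values, and the function names are hypothetical.

```python
import math


def electrons_per_pixel(dose_per_nm2, pixel_size_nm):
    """Average number of electrons landing on one square pixel."""
    return dose_per_nm2 * pixel_size_nm ** 2


def shot_noise_snr(mean_count):
    """Signal-to-noise ratio for Poisson counting statistics: sqrt(mean count)."""
    return math.sqrt(mean_count)


# Assumed illustrative values: 20 electrons/nm^2 and 0.1 nm pixels.
count = electrons_per_pixel(20, 0.1)
print(count, shot_noise_snr(count))
```

With these numbers each atom-sized pixel receives, on average, well under one electron, so a single image cannot define detail at that scale.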

The solution is to obtain images of many identical molecules—perhaps tens 
of thousands of individual images—and combine them to produce an averaged 
image, revealing structural details that are hidden by the noise in the original 
images. This procedure is called single-particle reconstruction. Before com- 
bining all the individual images, however, they must be aligned with each other. 
Sometimes it is possible to induce proteins and complexes to form crystalline 
arrays, in which each molecule is held in the same orientation in a regular lattice. 
In this case, the alignment problem is easily solved, and several protein structures 
have been determined at atomic resolution by this type of electron crystallogra- 
phy. In principle, however, crystalline arrays are not absolutely required. With the 
help of a computer, the digital images of randomly distributed and unaligned mol- 
ecules can be processed and combined to yield high-resolution reconstructions 
(see Movie 13.1). Although structures that have some intrinsic symmetry make 
the task of alignment easier and more accurate, this technique has also been used 
for objects like ribosomes, with no symmetry. Figure 9-54 shows the structure of 
Figure 9-52 The nuclear pore. Rapidly 
frozen nuclear envelopes were imaged in 
a high-resolution SEM, equipped with a 
field emission gun as the electron source. 
These views of each side of a nuclear pore 
represent the limit of resolution of the SEM 
(compare with Figure 12-8). (Courtesy of 
Martin Goldberg and Terry Allen.) 


WHAT WE DON’T KNOW 


• We know in detail about many cell 
processes, such as DNA replication 
and transcription and RNA translation, 
but will we ever be able to visualize 
such rapid molecular processes in 
action in cells? 


• Will we ever be able to image 
intracellular structures at the resolution 
of the electron microscope in living 
cells? 


• How can we improve crystallization 
and single-particle cryoelectron 
microscopy techniques to obtain high- 
resolution structures of all important 
membrane channels and transporters? 
What new concepts might these 
structures reveal? 


Figure 9-53 Negatively stained actin 
filaments. In this transmission electron 
micrograph, each filament is about 8 nm in 
diameter and is seen, on close inspection, 
to be composed of a helical chain of 
globular actin molecules. (Courtesy of 
Roger Craig.) 
the protein capsid inside a human immunodeficiency virus (HIV) that has been 
determined at high resolution by the combination of many particles, multiple 
views, and molecular modeling. 
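
The benefit of averaging many images can be demonstrated with a small simulation. The sketch below assumes the images are already aligned, uses a hypothetical one-dimensional "particle," and adds independent Gaussian noise to each copy; all names and parameter values are chosen for illustration. Under these assumptions the noise in the average falls roughly as the square root of the number of images combined.

```python
import math
import random


def noisy_copy(signal, sigma, rng):
    """One simulated image: the true signal plus independent Gaussian noise."""
    return [s + rng.gauss(0.0, sigma) for s in signal]


def average(images):
    """Pixel-by-pixel mean of a list of already aligned images."""
    n = len(images)
    return [sum(pixels) / n for pixels in zip(*images)]


def rms_error(image, signal):
    """Root-mean-square difference between an image and the true signal."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(image, signal)) / len(signal))


rng = random.Random(0)                                      # fixed seed for repeatability
truth = [1.0 if 20 <= i < 40 else 0.0 for i in range(64)]   # hypothetical 1-D "particle"
images = [noisy_copy(truth, 1.0, rng) for _ in range(400)]

single_error = rms_error(images[0], truth)
averaged_error = rms_error(average(images), truth)
print(single_error, averaged_error)
```

With 400 copies the expected improvement is about twentyfold; real data sets need the alignment step first, which this sketch deliberately leaves out.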

A resolution of 0.3 nm has been achieved by electron microscopy—enough 
to begin to see the internal atomic arrangements in a protein and to rival x-ray 
crystallography in resolution. Although electron microscopy is unlikely to super- 
sede x-ray crystallography (discussed in Chapter 8) as a method for macromolec- 
ular structure determination, it has some very clear advantages. First, it does not 
absolutely require crystalline specimens. Second, it can deal with extremely large 
complexes—structures that may be too large or too variable to crystallize satis- 
factorily. Third, it allows the rapid analysis of different conformations of protein 
complexes. 

The analysis of large and complex macromolecular structures is helped consid- 
erably if the atomic structure of one or more of the subunits is known, for example 
from x-ray crystallography. Molecular models can then be mathematically “fitted” 
into the envelope of the structure determined at lower resolution using the elec- 
tron microscope (see Figures 16-16D and 16-46). Figure 9-55 shows the structure 
of a ribosome with the location of a bound release factor displayed in this way (see 
also Figure 6-72). 


Summary 


Discovering the detailed structure of membranes and organelles requires the higher 
resolution attainable in a transmission electron microscope. Specific macromole- 
cules can be localized after being labeled with colloidal gold linked to antibodies. 
Three-dimensional views of the surfaces of cells and tissues are obtained by scanning 
electron microscopy. The shapes of isolated molecules can be readily determined by 
electron microscopy techniques involving fast freezing or negative staining. Elec- 
tron tomography and single-particle reconstruction use computational manipula- 
tions of data obtained from multiple images and multiple viewing angles to pro- 
duce detailed reconstructions of macromolecules and molecular complexes. The 
resolution obtained with these methods means that atomic structures of individual 
Figure 9-54 Single-particle 
reconstruction. The structure of a 
complete human immunodeficiency virus 
(HIV) capsid has been determined by a 
combination of cryoelectron microscopy, 
protein structure determination, and 
modeling. (A) A single 4 nm slice from an 
EM tomographic model (see also Figure 
9-48) of an intact HIV particle with its 
membrane outer envelope and its internal, 
irregularly shaped protein capsid that 
houses its RNA genome. (B) Electron 
microscopy of capsid subunits that have 
self-assembled into a helical tube can be 
used to derive an electron-density map 

at a resolution of 8 nm, in which details of 
the hexamers are clearly visible. (C) Using 
the known atomic coordinates of a single 
subunit of the hexamer, the structure 

has been modeled into the electron- 
density map from (B). (D) A molecular 
reconstruction of the entire HIV capsid, 
based on the detailed structures shown 

in (A) and (C). This capsid contains 216 
hexamers (blue) and 12 pentamers (yellow). 
(Adapted from G. Zhao et al., Nature 
497:643-646, 2013. With permission from 
Macmillan Publishers Ltd. C, PDB code: 
3J34.) 


Figure 9-55 Single-particle 
reconstruction and molecular model 
fitting. Bacterial ribosomes, with and 
without the release factor required for 
peptide release from the ribosome, 

were used to derive high-resolution, 
three-dimensional cryoelectron microscopy 
maps at a resolution of better than 1 

nm. Images of nearly 20,000 separate 
ribosomes preserved in ice were used to 
produce single-particle reconstructions. 

(A) The 30S ribosomal subunit (yellow) and 
the 50S subunit (blue) can be distinguished 
from the additional electron density that 
can be attributed to the release factor RF2 
(purple). (B) The known molecular structure 
of RF2 modeled into the electron density 
from (A). (From U.B.S. Rawat et al., Nature 
421:87-90, 2003. With permission from 
Macmillan Publishers Ltd.) 
macromolecules can often be “fitted” to the images derived by electron microscopy. 
In this way, the TEM is increasingly able to bridge the gap between structures dis- 
covered by x-ray crystallography and those discovered with the light microscope. 


PROBLEMS 


Which statements are true? Explain why or why not. 


9-1 Because the DNA double helix is only 2 nm wide— 
well below the limit of resolution of the light microscope— 
it is impossible to see chromosomes in living cells without 
special stains. 


9-2 A fluorescent molecule, having absorbed a single 
photon of light at one wavelength, always emits it at a lon- 
ger wavelength. 


Discuss the following problems. 


9-3 The diagrams in Figure Q9-1 show the paths of 
light rays passing through a specimen with a dry lens and 
with an oil-immersion lens. Offer an explanation for why 
oil-immersion lenses should give better resolution. Air, 
glass, and oil have refractive indices of 1.00, 1.51, and 1.51, 
respectively. 


[Figure Q9-1 panels: DRY LENS and OIL-IMMERSION LENS. Labels: objective lens, air, oil, cover slip, slide.] 


Figure Q9-1 Paths of light rays through dry and oil-immersion lenses 
(Problem 9-3). The red circle at the origin of the light rays is the 
specimen. 


9-4 Figure Q9-2 shows a diagram of the human eye. The 
refractive indices of the components in the light path are: 
cornea 1.38, aqueous humor 1.33, crystalline lens 1.41, and 
vitreous humor 1.38. Where does the main refraction—the 
main focusing—occur? What role do you suppose the lens 
plays? 

Figure Q9-2 Diagram of the human eye (Problem 9-4). 
[Labels: iris, vitreous humor, retina, cornea, aqueous humor.] 

9-5 Why do humans see so poorly under water? And why 
do goggles help? 


9-6 Explain the difference between resolution and mag- 
nification. 


9-7 Antibodies that bind to specific proteins are import- 
ant tools for defining the locations of molecules in cells. 
The sensitivity of the primary antibody—the antibody that 
reacts with the target molecule—is often enhanced by 
using labeled secondary antibodies that bind to it. What 
are the advantages and disadvantages of using secondary 
antibodies that carry fluorescent tags versus those that 
carry bound enzymes? 


9-8 Figure Q9-3 shows a series of modified fluorescent 
proteins that emit light in a range of colors. How do you 
suppose the exact same chromophore can fluoresce at so 
many different wavelengths? 





Figure Q9-3 A rainbow of colors produced by modified fluorescent 
proteins (Problem 9-8). (Courtesy of Nathan Shaner, Paul Steinbach 
and Roger Tsien.) 


9-9 Consider a fluorescent detector designed to report 
the cellular location of active protein tyrosine kinases. A 
blue (cyan) fluorescent protein (CFP) and a yellow fluo- 
rescent protein (YFP) were fused to either end of a hybrid 
protein domain. The hybrid protein segment consisted of 
a substrate peptide recognized by the Abl protein tyro- 
sine kinase and a phosphotyrosine-binding domain 
(Figure Q9-4A). Stimulation of the CFP domain does not 
cause emission by the YFP domain when the domains are 
Figure Q9-4 Fluorescent reporter protein designed to detect tyrosine 
phosphorylation (Problem 9-9). (A) Domain structure of reporter protein. 
Four domains are indicated: CFP, YFP, tyrosine kinase substrate peptide, 
and a phosphotyrosine-binding domain. (B) FRET assay. YFP/CFP 

is normalized to 1.0 at time zero. The reporter was incubated in the 
presence (or absence) of Abl and ATP for the indicated times. Arrow 
indicates time of addition of a tyrosine phosphatase. (From A.Y. Ting, 
K.H. Kain, R.L. Klemke and R.Y. Tsien, Proc. Natl Acad. Sci. USA 
98:15003-15008, 2001. With permission from National Academy of 
Sciences.) 
separated. When the CFP and YFP domains are brought 
close together, however, fluorescence resonance energy 
transfer (FRET) allows excitation of CFP to stimulate emis- 
sion by YFP. FRET shows up experimentally as an increase 
in the ratio of emission at 526 nm versus 476 nm (YFP/ 
CFP) when CFP is excited by 434 nm light. 
Membrane Structure 


Cell membranes are crucial to the life of the cell. The plasma membrane encloses 
the cell, defines its boundaries, and maintains the essential differences between 
the cytosol and the extracellular environment. Inside eukaryotic cells, the mem- 
branes of the nucleus, endoplasmic reticulum, Golgi apparatus, mitochondria, 
and other membrane-enclosed organelles maintain the characteristic differences 
between the contents of each organelle and the cytosol. Ion gradients across 
membranes, established by the activities of specialized membrane proteins, can 
be used to synthesize ATP, to drive the transport of selected solutes across the 
membrane, or, as in nerve and muscle cells, to produce and transmit electrical 
signals. In all cells, the plasma membrane also contains proteins that act as sen- 
sors of external signals, allowing the cell to change its behavior in response to 
environmental cues, including signals from other cells; these protein sensors, or 
receptors, transfer information—rather than molecules—across the membrane. 
Despite their differing functions, all biological membranes have a common 
general structure: each is a very thin film of lipid and protein molecules, held 
together mainly by noncovalent interactions (Figure 10-1). Cell membranes 





Figure 10-1 Two views of a cell membrane. (A) An electron 
micrograph of a segment of the plasma membrane of a 
human red blood cell seen in cross section, showing its bilayer 
structure. (B) A three-dimensional schematic view of a cell 
membrane and the general disposition of its lipid and protein 
constituents. (A, courtesy of Daniel S. Friend.) 
are dynamic, fluid structures, and most of their molecules move about in the 
plane of the membrane. The lipid molecules are arranged as a continuous dou- 
ble layer about 5 nm thick. This lipid bilayer provides the basic fluid structure of 
the membrane and serves as a relatively impermeable barrier to the passage of 
most water-soluble molecules. Most membrane proteins span the lipid bilayer and 
mediate nearly all of the other functions of the membrane, including the trans- 
port of specific molecules across it, and the catalysis of membrane-associated 
reactions such as ATP synthesis. In the plasma membrane, some transmembrane 
proteins serve as structural links that connect the cytoskeleton through the lipid 
bilayer to either the extracellular matrix or an adjacent cell, while others serve as 
receptors to detect and transduce chemical signals in the cell’s environment. It 
takes many kinds of membrane proteins to enable a cell to function and interact 
with its environment, and it is estimated that about 30% of the proteins encoded 
in an animal’s genome are membrane proteins. 

In this chapter, we consider the structure and organization of the two main 
constituents of biological membranes—the lipids and the proteins. Although 
we focus mainly on the plasma membrane, most concepts discussed apply to 
the various internal membranes of eukaryotic cells as well. The functions of cell 
membranes are considered in later chapters: their role in energy conversion and 
ATP synthesis, for example, is discussed in Chapter 14; their role in the transmem- 
brane transport of small molecules in Chapter 11; and their roles in cell signaling 
and cell adhesion in Chapters 15 and 19, respectively. In Chapters 12 and 13, we 
discuss the internal membranes of the cell and the protein traffic through and 
between them. 


THE LIPID BILAYER 


The lipid bilayer provides the basic structure for all cell membranes. It is easily 
seen by electron microscopy, and its bilayer structure is attributable exclusively 
to the special properties of the lipid molecules, which assemble spontaneously 
into bilayers even under simple artificial conditions. In this section, we discuss 
the different types of lipid molecules found in cell membranes and the general 
properties of lipid bilayers. 


Phosphoglycerides, Sphingolipids, and Sterols Are the Major 
Lipids in Cell Membranes 


Lipid molecules constitute about 50% of the mass of most animal cell membranes, 
nearly all of the remainder being protein. There are approximately 5 × 10⁶ lipid 
molecules in a 1 μm × 1 μm area of lipid bilayer, or about 10⁹ lipid molecules in 
the plasma membrane of a small animal cell. All of the lipid molecules in cell 
membranes are amphiphilic—that is, they have a hydrophilic (“water-loving”) or 
polar end and a hydrophobic (“water-fearing”) or nonpolar end. 
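
The two lipid counts can be related by simple arithmetic. Taking the figures as 5 × 10⁶ lipid molecules per square micrometer of bilayer and 10⁹ lipid molecules per plasma membrane, the sketch below works out the membrane area they imply and, under the added assumption that the cell is a smooth sphere, the corresponding cell diameter. The spherical shape and the variable names are assumptions made for illustration.

```python
import math

LIPIDS_PER_UM2 = 5e6      # lipid molecules per square micrometer of bilayer
LIPIDS_PER_CELL = 1e9     # lipid molecules in the plasma membrane of a small cell

membrane_area_um2 = LIPIDS_PER_CELL / LIPIDS_PER_UM2

# Assumption for illustration: treat the cell as a smooth sphere, area = pi * d^2.
diameter_um = math.sqrt(membrane_area_um2 / math.pi)

print(membrane_area_um2, round(diameter_um, 1))
```

The result, about 200 square micrometers of membrane and a diameter near 8 micrometers, is consistent with a small animal cell.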

The most abundant membrane lipids are the phospholipids. These have a 
polar head group containing a phosphate group and two hydrophobic hydrocar- 
bon tails. In animal, plant, and bacterial cells, the tails are usually fatty acids, and 
they can differ in length (they normally contain between 14 and 24 carbon atoms). 
One tail typically has one or more cis-double bonds (that is, it is unsaturated), 
while the other tail does not (that is, it is saturated). As shown in Figure 10-2, 
each cis-double bond creates a kink in the tail. Differences in the length and sat- 
uration of the fatty acid tails influence how phospholipid molecules pack against 
one another, thereby affecting the fluidity of the membrane, as we discuss later. 

The main phospholipids in most animal cell membranes are the phospho- 
glycerides, which have a three-carbon glycerol backbone (see Figure 10-2). Two 
long-chain fatty acids are linked through ester bonds to adjacent carbon atoms 
of the glycerol, and the third carbon atom of the glycerol is attached to a phos- 
phate group, which in turn is linked to one of several types of head group. By 
combining several different fatty acids and head groups, cells make many dif- 
ferent phosphoglycerides. Phosphatidylethanolamine, phosphatidylserine, and 
phosphatidylcholine are the most abundant ones in mammalian cell membranes 
(Figure 10-3A-C). 
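
How quickly combinations of head groups and fatty acids multiply can be seen by simple enumeration. The sketch below uses the three head groups named in the text and an assumed, illustrative selection of four common fatty acids; a real cell draws on a wider range of tails, so the count here only indicates the scale of the combinatorics.

```python
from itertools import product

# Head groups named in the text.
HEAD_GROUPS = ["ethanolamine", "serine", "choline"]

# Assumed, illustrative selection of fatty acids (carbons:double bonds).
FATTY_ACIDS = ["palmitic (16:0)", "stearic (18:0)", "oleic (18:1)", "linoleic (18:2)"]

# One head group plus a fatty acid on each of the two glycerol carbons.
phosphoglycerides = list(product(HEAD_GROUPS, FATTY_ACIDS, FATTY_ACIDS))

print(len(phosphoglycerides))
print(phosphoglycerides[0])
```

Even this small selection gives 48 distinct molecules when the two tail positions are counted separately.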


Another important class of phospholipids are the sphingolipids, which are built 
from sphingosine rather than glycerol (Figure 10-3D-E). Sphingosine is a long acyl 
chain with an amino group (NH2) and two hydroxyl groups (OH) at one end. In 
sphingomyelin, the most common sphingolipid, a fatty acid tail is attached to the 
amino group, and a phosphocholine group is attached to the terminal hydroxyl 
group. Together, the phospholipids phosphatidylcholine, phosphatidylethanol- 
amine, phosphatidylserine, and sphingomyelin constitute more than half the 
mass of lipid in most mammalian cell membranes (see Table 10-1, p. 571). 
Figure 10-2 The parts of a typical 
phospholipid molecule. This example 
is a phosphatidylcholine, represented (A) 
schematically, (B) by a formula, (C) as a 
space-filling model (Movie 10.1), and 
(D) as a symbol. 


Figure 10-3 Four major phospholipids in 
mammalian plasma membranes. Different 
head groups are represented by different 
colors in the symbols. The lipid molecules 
shown in (A-C) are phosphoglycerides, 
which are derived from glycerol. The 
molecule in (D) is sphingomyelin, which 

is derived from sphingosine (E) and is 
therefore a sphingolipid. Note that only 
phosphatidylserine carries a net negative 
charge, the importance of which we discuss 
later; the other three are electrically neutral at 
physiological pH, carrying one positive and 
one negative charge. 

In addition to phospholipids, the lipid bilayers in many cell membranes contain 
glycolipids and cholesterol. Glycolipids resemble sphingolipids, but, instead 
of a phosphate-linked head group, they have sugars attached. We discuss glyco- 
lipids later. Eukaryotic plasma membranes contain especially large amounts of 
cholesterol—up to one molecule for every phospholipid molecule. Cholesterol 
is a sterol. It contains a rigid ring structure, to which is attached a single polar 
hydroxyl group and a short nonpolar hydrocarbon chain (Figure 10-4). The cho- 
lesterol molecules orient themselves in the bilayer with their hydroxyl group close 
to the polar head groups of adjacent phospholipid molecules (Figure 10-5). 


Phospholipids Spontaneously Form Bilayers 


The shape and amphiphilic nature of the phospholipid molecules cause them to 
form bilayers spontaneously in aqueous environments. As discussed in Chapter 
2, hydrophilic molecules dissolve readily in water because they contain charged 
groups or uncharged polar groups that can form either favorable electrostatic 
interactions or hydrogen bonds with water molecules (Figure 10-6A). Hydro- 
phobic molecules, by contrast, are insoluble in water because all, or almost all, 
of their atoms are uncharged and nonpolar and therefore cannot form energeti- 
cally favorable interactions with water molecules. If dispersed in water, they force 
the adjacent water molecules to reorganize into icelike cages that surround the 
hydrophobic molecule (Figure 10-6B). Because these cage structures are more 
ordered than the surrounding water, their formation increases the free energy. 
This free-energy cost is minimized, however, if the hydrophobic molecules (or 
the hydrophobic portions of amphiphilic molecules) cluster together so that the 
smallest number of water molecules is affected. 

When amphiphilic molecules are exposed to an aqueous environment, they 
behave as you would expect from the above discussion. They spontaneously 
aggregate to bury their hydrophobic tails in the interior, where they are shielded 
from the water, and they expose their hydrophilic heads to water. Depending 
on their shape, they can do this in either of two ways: they can form spherical 
micelles, with the tails inward, or they can form double-layered sheets, or bilay- 
ers, with the hydrophobic tails sandwiched between the hydrophilic head groups 
(Figure 10-7). 

The same forces that drive phospholipids to form bilayers also provide a 
self-sealing property. A small tear in the bilayer creates a free edge with water; 
because this is energetically unfavorable, the lipids tend to rearrange sponta- 
neously to eliminate the free edge. (In eukaryotic plasma membranes, the fusion 
of intracellular vesicles repairs larger tears.) The prohibition of free edges has a 
profound consequence: the only way for a bilayer to avoid having edges is by clos- 
ing in on itself and forming a sealed compartment (Figure 10-8). This remarkable 


Figure 10-4 The structure of cholesterol. 
Cholesterol is represented (A) by a formula, 
(B) by a schematic drawing, and (C) as a 


Figure 10-5 Cholesterol in a lipid 

behavior, fundamental to the creation of a living cell, follows directly from the 
shape and amphiphilic nature of the phospholipid molecule. 

A lipid bilayer also has other characteristics that make it an ideal structure for 
cell membranes. One of the most important of these is its fluidity, which is crucial 
to many membrane functions (Movie 10.2). 


The Lipid Bilayer Is a Two-dimensional Fluid 


Around 1970, researchers first recognized that individual lipid molecules are able 
to diffuse freely within the plane of a lipid bilayer. The initial demonstration came 
from studies of synthetic (artificial) lipid bilayers, which can be made in the form 
of spherical vesicles, called liposomes (Figure 10-9); or in the form of planar 
bilayers formed across a hole in a partition between two aqueous compartments 
or on a solid support. 

Various techniques have been used to measure the motion of individual lipid 
molecules and their components. One can construct a lipid molecule, for exam- 
ple, with a fluorescent dye or a small gold particle attached to its polar head group 
and follow the diffusion of even individual molecules in a membrane. Alterna- 
tively, one can modify a lipid head group to carry a “spin label,” such as a nitroxide 

Figure 10-7 Packing arrangements of amphiphilic molecules in an aqueous environment. 
(A) These molecules spontaneously form micelles or bilayers in water, depending on their shape. 
Cone-shaped amphiphilic molecules (above) form micelles, whereas cylinder-shaped amphiphilic 
molecules such as phospholipids (below) form bilayers. (B) A micelle and a lipid bilayer seen in 
cross section. Note that micelles of amphiphilic molecules are thought to be much more irregular 
than drawn here (see Figure 10-26C). 

Figure 10-6 How hydrophilic and 
hydrophobic molecules interact differently 
with water. (A) Because acetone is polar, 
it can form hydrogen bonds (red) and 
favorable electrostatic interactions (yellow) 
with water molecules, which are also polar. 
Thus, acetone readily dissolves in water. 
(B) By contrast, 2-methylpropane is entirely 
hydrophobic. Because it cannot form 
favorable interactions with water, it forces 
adjacent water molecules to reorganize into 
icelike cage structures, which increases the 
free energy. This compound is therefore 
virtually insoluble in water. The symbol 
δ⁻ indicates a partial negative charge, and 
δ⁺ indicates a partial positive charge. Polar 
atoms are shown in color and nonpolar 
groups are shown in gray. 





Figure 10-8 The spontaneous closure of 
a phospholipid bilayer to form a sealed 
compartment. The closed structure is 
stable because it avoids the exposure of 
the hydrophobic hydrocarbon tails to water, 
which would be energetically unfavorable. 

Figure 10-9 Liposomes. (A) An electron micrograph of unfixed, unstained, 
synthetic phospholipid vesicles (liposomes) in water, which have been 
rapidly frozen at liquid-nitrogen temperature. (B) A drawing of a small 
spherical liposome seen in cross section. Liposomes are commonly used as 
model membranes in experimental studies, especially to study incorporated 
membrane proteins. (A, from P. Frederik and D. Hubert, Methods Enzymol. 
391:431-448, 2005. With permission from Elsevier.) 


group (=N-O); this contains an unpaired electron whose spin creates a paramag- 
netic signal that can be detected by electron spin resonance (ESR) spectroscopy, 
the principles of which are similar to those of nuclear magnetic resonance (NMR), 
discussed in Chapter 8. The motion and orientation of a spin-labeled lipid in a 
bilayer can be deduced from the ESR spectrum. Such studies show that phospho- 
lipid molecules in synthetic bilayers very rarely migrate from the monolayer (also 
called a leaflet) on one side to that on the other. This process, known as “flip-flop,” 
occurs on a time scale of hours for any individual molecule, although cholesterol 
is an exception to this rule and can flip-flop rapidly. In contrast, lipid molecules 
rapidly exchange places with their neighbors within a monolayer (~10⁷ times per 
second). This gives rise to a rapid lateral diffusion, with a diffusion coefficient (D) 
of about 10⁻⁸ cm²/sec, which means that an average lipid molecule diffuses the 
length of a large bacterial cell (~2 μm) in about 1 second. These studies have also 
shown that individual lipid molecules rotate very rapidly about their long axis and 
have flexible hydrocarbon chains. Computer simulations show that lipid mole- 
cules in synthetic bilayers are very disordered, presenting an irregular surface of 
variously spaced and oriented head groups to the water phase on either side of the 
bilayer (Figure 10-10). 
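
The one-second estimate quoted above can be checked with a short calculation. The sketch below assumes that the estimate follows the standard relation for diffusion in two dimensions, in which the mean-square displacement equals 4Dt; that relation and the function name are assumptions made for illustration, whereas the values of D and of the cell length are the ones given in the text.

```python
# Values quoted in the text
D_CM2_PER_S = 1e-8       # lateral diffusion coefficient of a lipid, in cm^2/sec
CELL_LENGTH_UM = 2.0     # length of a large bacterial cell, in micrometers


def diffusion_time_2d(distance_cm: float, d_cm2_per_s: float) -> float:
    """Time at which the 2D mean-square displacement 4*D*t equals distance^2.

    The relation <r^2> = 4*D*t is assumed here; the text does not state it.
    """
    return distance_cm ** 2 / (4.0 * d_cm2_per_s)


distance_cm = CELL_LENGTH_UM * 1e-4   # 1 micrometer = 1e-4 cm
t_seconds = diffusion_time_2d(distance_cm, D_CM2_PER_S)
print(f"Time to diffuse {CELL_LENGTH_UM} um: about {t_seconds:.1f} s")
```

Under these assumptions the result is close to 1 second. Because the time grows with the square of the distance, doubling the distance quadruples the time.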

Similar mobility studies on labeled lipid molecules in isolated biological 
membranes and in living cells give results similar to those in synthetic bilayers. 
They demonstrate that the lipid component of a biological membrane is a two-di- 
mensional liquid in which the constituent molecules are free to move laterally. As 
in synthetic bilayers, individual phospholipid molecules are normally confined 
to their own monolayer. This confinement creates a problem for their synthesis. 
Phospholipid molecules are manufactured in only one monolayer of a membrane, 
mainly in the cytosolic monolayer of the endoplasmic reticulum membrane. If 
none of these newly made molecules could migrate reasonably promptly to the 
noncytosolic monolayer, new lipid bilayer could not be made. The problem is 
solved by a special class of membrane proteins called phospholipid translocators, 
or flippases, which catalyze the rapid flip-flop of phospholipids from one mono- 
layer to the other, as discussed in Chapter 12. 

Despite the fluidity of the lipid bilayer, liposomes do not fuse spontaneously 
with one another when suspended in water. Fusion does not occur because the 
polar lipid head groups bind water molecules that need to be displaced for the 
bilayers of two different liposomes to fuse. The hydration shell that keeps lipo- 
somes apart also insulates the many internal membranes in a eukaryotic cell 
and prevents their uncontrolled fusion, thereby maintaining the compartmen- 
tal integrity of membrane-enclosed organelles. All cell membrane fusion events 

Figure 10-10 The mobility of 
phospholipid molecules in an artificial 
lipid bilayer. Starting with a model of 100 
phosphatidylcholine molecules arranged 
in a regular bilayer, a computer calculated 
the position of every atom after 300 
picoseconds of simulated time. From these 
theoretical calculations, a model of the lipid 
bilayer emerges that accounts for almost all 
of the measurable properties of a synthetic 
lipid bilayer, including its thickness, 
number of lipid molecules per membrane 
area, depth of water penetration, and 
unevenness of the two surfaces. Note that 
the tails in one monolayer can interact with 
those in the other monolayer, if the tails 
are long enough. (B) The different motions 
of a lipid molecule in a bilayer. (A, based 
on S.W. Chiu et al., Biophys. J. 69:1230- 
1245, 1995. With permission from the 
Biophysical Society.) 

are catalyzed by tightly regulated fusion proteins, which force appropriate membranes 
into tight proximity, squeezing out the water layer that keeps the bilayers 
apart, as we discuss in Chapter 13. 


The Fluidity of a Lipid Bilayer Depends on Its Composition 


The fluidity of cell membranes has to be precisely regulated. Certain membrane 
transport processes and enzyme activities, for example, cease when the bilayer 
viscosity is experimentally increased beyond a threshold level. 

The fluidity of a lipid bilayer depends on both its composition and its tempera- 
ture, as is readily demonstrated in studies of synthetic lipid bilayers. A synthetic 
bilayer made from a single type of phospholipid changes from a liquid state to a 
two-dimensional rigid crystalline (or gel) state at a characteristic temperature. This 
change of state is called a phase transition, and the temperature at which it occurs 
is lower (that is, the membrane becomes more difficult to freeze) if the hydrocar- 
bon chains are short or have double bonds. A shorter chain length reduces the 
tendency of the hydrocarbon tails to interact with one another, in both the same 
and opposite monolayer, and cis-double bonds produce kinks in the chains that 
make them more difficult to pack together, so that the membrane remains fluid at 
lower temperatures (Figure 10-11). Bacteria, yeasts, and other organisms whose 
temperature fluctuates with that of their environment adjust the fatty acid com- 
position of their membrane lipids to maintain a relatively constant fluidity. As the 
temperature falls, for instance, the cells of those organisms synthesize fatty acids 
with more cis-double bonds, thereby avoiding the decrease in bilayer fluidity that 
would otherwise result from the temperature drop. 

Cholesterol modulates the properties of lipid bilayers. When mixed with phos- 
pholipids, it enhances the permeability-barrier properties of the lipid bilayer. 
Cholesterol inserts into the bilayer with its hydroxyl group close to the polar head 
groups of the phospholipids, so that its rigid, platelike steroid rings interact with— 
and partly immobilize—those regions of the hydrocarbon chains closest to the 
polar head groups (see Figure 10-5 and Movie 10.3). By decreasing the mobility of 
the first few CH2 groups of the chains of the phospholipid molecules, cholesterol 
makes the lipid bilayer less deformable in this region and thereby decreases the 
permeability of the bilayer to small water-soluble molecules. Although choles- 
terol tightens the packing of the lipids in a bilayer, it does not make membranes 
any less fluid. At the high concentrations found in most eukaryotic plasma mem- 
branes, cholesterol also prevents the hydrocarbon chains from coming together 
and crystallizing. 

Table 10-1 compares the lipid compositions of several biological membranes. 
Note that bacterial plasma membranes are often composed of one main type 
of phospholipid and contain no cholesterol. In archaea, lipids usually contain 


TABLE 10-1 
(Figure 10-11 panel labels: unsaturated hydrocarbon chains with cis-double bonds; saturated hydrocarbon chains) 


Figure 10-11 The influence of cis- 
double bonds in hydrocarbon chains. 
The double bonds make it more difficult to 
pack the chains together, thereby making 
the lipid bilayer more difficult to freeze. In 
addition, because the hydrocarbon chains 
of unsaturated lipids are more spread 
apart, lipid bilayers containing them are 
thinner than bilayers formed exclusively 
from saturated lipids. 

20-25-carbon-long prenyl chains instead of fatty acids; prenyl and fatty acid 
chains are similarly hydrophobic and flexible (see Figure 10-20F); in thermo- 
philic archaea, the longest lipid chains span both leaflets, making the membrane 
particularly stable to heat. Thus, lipid bilayers can be built from molecules with 
similar features but different molecular designs. The plasma membranes of most 
eukaryotic cells are more varied than those of bacteria and archaea, not only 
in containing large amounts of cholesterol but also in containing a mixture of dif- 
ferent phospholipids. 

Analysis of membrane lipids by mass spectrometry has revealed that the lipid 
composition of a typical eukaryotic cell membrane is much more complex than 
originally thought. These membranes contain a bewildering variety of perhaps 
500-2000 different lipid species, with even the simple plasma membrane of a red 
blood cell containing well over 150. While some of this complexity reflects the 
combinatorial variation in head groups, hydrocarbon chain lengths, and desat- 
uration of the major phospholipid classes, some membranes also contain many 
structurally distinct minor lipids, at least some of which have important functions. 
The inositol phospholipids, for example, are present in small quantities in animal 
cell membranes and have crucial functions in guiding membrane traffic and in 
cell signaling (discussed in Chapters 13 and 15, respectively). Their local synthesis 
and destruction are regulated by a large number of enzymes, which create both 
small intracellular signaling molecules and lipid docking sites on membranes that 
recruit specific proteins from the cytosol, as we discuss later. 
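
The combinatorial origin of this diversity can be illustrated with a small count. The sketch below is only a toy model: the three head groups are the ones named earlier in the chapter, but the list of fatty acid tails is a hypothetical selection chosen for illustration, and the function name is likewise an assumption. Because the two fatty acids sit on distinct carbon atoms of the glycerol, each ordered pair of tails is counted as a separate molecule.

```python
from itertools import product

# Head groups of the abundant phosphoglycerides named in the text
head_groups = ["choline", "ethanolamine", "serine"]

# Hypothetical fatty acid tails, written as carbons:double bonds (illustration only)
fatty_acids = ["16:0", "18:0", "18:1", "18:2", "20:4"]


def count_phosphoglycerides(heads, tails):
    """Count head-group/tail combinations; the two tail positions are distinct."""
    return sum(1 for _ in product(heads, tails, tails))


n_species = count_phosphoglycerides(head_groups, fatty_acids)
print(f"{len(head_groups)} head groups x {len(fatty_acids)}^2 tail pairs = {n_species}")
```

Even this small, assumed set of building blocks gives 75 distinct phosphoglycerides, which suggests how a modest number of head groups and fatty acids could account for much of the variety detected by mass spectrometry.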


Despite Their Fluidity, Lipid Bilayers Can Form Domains of Different 
Compositions 


Because a lipid bilayer is a two-dimensional fluid, we might expect most types 
of lipid molecules in it to be well mixed and randomly distributed in their own 
monolayer. The van der Waals attractive forces between neighboring hydrocarbon 
tails are not selective enough to hold groups of phospholipid molecules together. 
With certain lipid mixtures in artificial bilayers, however, one can observe phase 
segregations in which specific lipids come together in separate domains (Figure 
10-12). 

There has been a long debate among cell biologists about whether the lipid 
molecules in the plasma membrane of living cells similarly segregate into spe- 
cialized domains, called lipid rafts. Although many lipids and membrane pro- 
teins are not distributed uniformly, large-scale lipid phase segregations are rarely 
seen in living cell membranes. Instead, specific membrane proteins and lipids are 
seen to concentrate in a more temporary, dynamic fashion facilitated by protein- 
protein interactions that allow the transient formation of specialized membrane 
regions (Figure 10-13). Such clusters can be tiny nanoclusters on a scale of a few 
molecules, or larger assemblies that can be seen with electron microscopy, such 
as the caveolae involved in endocytosis (discussed in Chapter 13). The tendency 
of mixtures of lipids to undergo phase partitioning, as seen in artificial bilayers 
(see Figure 10-12), may help create rafts in living cell membranes—organizing 
and concentrating membrane proteins either for transport in membrane vesicles 

Figure 10-12 Lateral phase separation 
in artificial lipid bilayers. (A) Giant 
liposomes produced from a 1:1 mixture 
of phosphatidylcholine and sphingomyelin 
form uniform bilayers. (B) By contrast, 
liposomes produced from a 1:1:1 mixture 
of phosphatidylcholine, sphingomyelin, 
and cholesterol form bilayers with two 
separate phases. The liposomes are 
stained with trace concentrations of a 
fluorescent dye that preferentially partitions 
into one of the two phases. The average 
size of the domains formed in these giant 
artificial liposomes is much larger than that 
expected in cell membranes, where “lipid 
rafts” (see text) may be as small as a few 
nanometers in diameter. (A, from N. Kahya 
et al., J. Struct. Biol. 147:77-89, 2004. 
With permission from Elsevier; B, courtesy 
of Petra Schwille.) 

(discussed in Chapter 13) or for working together in protein assemblies, such 
as when they convert extracellular signals into intracellular ones (discussed in 
Chapter 15). 


Lipid Droplets Are Surrounded by a Phospholipid Monolayer 


Most cells store an excess of lipids in lipid droplets, from where they can be 
retrieved as building blocks for membrane synthesis or as a food source. Fat 
cells, or adipocytes, are specialized for lipid storage. They contain a giant lipid 
droplet that fills up most of their cytoplasm. Most other cells have many smaller 
lipid droplets, the number and size varying with the cell’s metabolic state. Fatty 
acids can be liberated from lipid droplets on demand and exported to other cells 
through the bloodstream. Lipid droplets store neutral lipids, such as triacylglycer- 
ols and cholesterol esters, which are synthesized from fatty acids and cholesterol 
by enzymes in the endoplasmic reticulum membrane. Because these lipids do not 
contain hydrophilic head groups, they are exclusively hydrophobic molecules, 
and therefore aggregate into three-dimensional droplets rather than into bilayers. 

Lipid droplets are unique organelles in that they are surrounded by a single 
monolayer of phospholipids, which contains a large variety of proteins. Some of 
the proteins are enzymes involved in lipid metabolism, but the functions of most 
are unknown. Lipid droplets form rapidly when cells are exposed to high con- 
centrations of fatty acids. They are thought to form from discrete regions of the 
endoplasmic reticulum membrane where many enzymes of lipid metabolism are 
concentrated. Figure 10-14 shows one model of how lipid droplets may form and 
acquire their surrounding monolayer of phospholipids and proteins. 


The Asymmetry of the Lipid Bilayer Is Functionally Important 


The lipid compositions of the two monolayers of the lipid bilayer in many 
membranes are strikingly different. In the human red blood cell (erythrocyte) 
membrane, for example, almost all of the phospholipid molecules that have 
choline, (CH3)3N+CH2CH2OH, in their head group (phosphatidylcholine and 

Figure 10-13 A model of a raft domain. 
Weak protein-protein, protein-lipid, and 
lipid-lipid interactions reinforce one another 
to partition the interacting components into 
raft domains. Cholesterol, sphingolipids, 
glycolipids, glycosylphosphatidylinositol 
(GPI)-anchored proteins, and some 
transmembrane proteins are enriched 
in these domains. Note that because of 
their composition, raft domains have an 
increased membrane thickness. We discuss 
glycolipids, GPI-anchored proteins, and 
oligosaccharide linkers later. (Adapted 
from D. Lingwood and K. Simons, Science 
327:46-50, 2010.) 


Figure 10-14 A model for the formation of 
lipid droplets. Neutral lipids are deposited 
between the two monolayers of the 
endoplasmic reticulum membrane. There, 
they aggregate into a three-dimensional 
droplet, which buds and pinches off from 
the endoplasmic reticulum membrane as 
a unique organelle, surrounded by a single 
monolayer of phospholipids and associated 
proteins. (Adapted from S. Martin and R.G. 
Parton, Nat. Rev. Mol. Cell Biol. 7:373-378, 
2006. With permission from Macmillan 
Publishers Ltd.) 

sphingomyelin) are in the outer monolayer, whereas almost all that contain a 
terminal primary amino group (phosphatidylethanolamine and phosphatidylser- 
ine) are in the inner monolayer (Figure 10-15). Because the negatively charged 
phosphatidylserine is located in the inner monolayer, there is a significant dif- 
ference in charge between the two halves of the bilayer. We discuss in Chapter 12 
how membrane-bound phospholipid translocators generate and maintain lipid 
asymmetry. 

Lipid asymmetry is functionally important, especially in converting extra- 
cellular signals into intracellular ones (discussed in Chapter 15). Many cytosolic 
proteins bind to specific lipid head groups found in the cytosolic monolayer of 
the lipid bilayer. The enzyme protein kinase C (PKC), for example, which is acti- 
vated in response to various extracellular signals, binds to the cytosolic face of the 
plasma membrane, where phosphatidylserine is concentrated, and requires this 
negatively charged phospholipid for its activity. 

In other cases, specific lipid head groups must first be modified to create pro- 
tein-binding sites at a particular time and place. One example is phosphatidyli- 
nositol (PI), one of the minor phospholipids that are concentrated in the cytosolic 
monolayer of cell membranes (see Figure 13-10A-C). Various lipid kinases can 
add phosphate groups at distinct positions on the inositol ring, creating binding 
sites that recruit specific proteins from the cytosol to the membrane. An important 
example of such a lipid kinase is phosphoinositide 3-kinase (PI 3-kinase), which is 
activated in response to extracellular signals and helps to recruit specific intracel- 
lular signaling proteins to the cytosolic face of the plasma membrane (see Figure 
15-53). Similar lipid kinases phosphorylate inositol phospholipids in intracellular 
membranes and thereby help to recruit proteins that guide membrane transport. 

Phospholipids in the plasma membrane are used in yet another way to con- 
vert extracellular signals into intracellular ones. The plasma membrane contains 
various phospholipases that are activated by extracellular signals to cleave spe- 
cific phospholipid molecules, generating fragments of these molecules that act 
as short-lived intracellular mediators. Phospholipase C, for example, cleaves an 
inositol phospholipid in the cytosolic monolayer of the plasma membrane to gen- 
erate two fragments, one of which remains in the membrane and helps activate 
protein kinase C, while the other is released into the cytosol and stimulates the 
release of Ca²⁺ from the endoplasmic reticulum (see Figure 15-28). 

Animals exploit the phospholipid asymmetry of their plasma membranes to 
distinguish between live and dead cells. When animal cells undergo apoptosis (a 
form of programmed cell death, discussed in Chapter 18), phosphatidylserine, 
which is normally confined to the cytosolic (or inner) monolayer of the plasma 
membrane lipid bilayer, rapidly translocates to the extracellular (or outer) mono- 
layer. The phosphatidylserine exposed on the cell surface signals neighboring 
cells, such as macrophages, to phagocytose the dead cell and digest it. The trans- 
location of the phosphatidylserine in apoptotic cells is thought to occur by two 
mechanisms: 





1. The phospholipid translocator that normally transports this lipid from the 
outer monolayer to the inner monolayer is inactivated. 

2. A “scramblase” that transfers phospholipids nonspecifically in both direc- 
tions between the two monolayers is activated. 


Figure 10-15 The asymmetrical 
distribution of phospholipids and 
glycolipids in the lipid bilayer of human 
red blood cells. The colors used for 
the phospholipid head groups are those 
introduced in Figure 10-3. In addition, 
glycolipids are drawn with hexagonal 
polar head groups (blue). Cholesterol (not 
shown) is distributed roughly equally in both 
monolayers. 


Glycolipids Are Found on the Surface of All Eukaryotic 
Plasma Membranes 


Sugar-containing lipid molecules called glycolipids have the most extreme asym- 
metry in their membrane distribution: these molecules, whether in the plasma 
membrane or in intracellular membranes, are found exclusively in the monolayer 
facing away from the cytosol. In animal cells, they are made from sphingosine, just 
like sphingomyelin (see Figure 10-3). These intriguing molecules tend to self-as- 
sociate, partly through hydrogen bonds between their sugars and partly through 
van der Waals forces between their long and straight hydrocarbon chains, which 
causes them to partition preferentially into lipid raft phases (see Figure 10-13). 
The asymmetric distribution of glycolipids in the bilayer results from the addi- 
tion of sugar groups to the lipid molecules in the lumen of the Golgi apparatus. 
Thus, the compartment in which they are manufactured is topologically equiv- 
alent to the exterior of the cell (discussed in Chapter 12). As they are delivered 
to the plasma membrane, the sugar groups are exposed at the cell surface (see 
Figure 10-15), where they have important roles in interactions of the cell with its 
surroundings. 

Glycolipids probably occur in all eukaryotic cell plasma membranes, where 
they generally constitute about 5% of the lipid molecules in the outer monolayer. 
They are also found in some intracellular membranes. The most complex of the 
glycolipids, the gangliosides, contain oligosaccharides with one or more sialic 
acid moieties, which give gangliosides a net negative charge (Figure 10-16). The 
most abundant of the more than 40 different gangliosides that have been iden- 
tified are in the plasma membrane of nerve cells, where gangliosides constitute 
5-10% of the total lipid mass; they are also found in much smaller quantities in 
other cell types. 

Hints as to the functions of glycolipids come from their localization. In the 
plasma membrane of epithelial cells, for example, glycolipids are confined to the 
exposed apical surface, where they may help to protect the membrane against 
the harsh conditions frequently found there (such as low pH and high concen- 
trations of degradative enzymes). Charged glycolipids, such as gangliosides, may 
be important because of their electrical effects: their presence alters the electrical 
field across the membrane and the concentrations of ions (especially Ca²⁺) at 
the membrane surface. Glycolipids also function in cell-recognition processes, 

Figure 10-16 Glycolipid molecules. 
(A) Galactocerebroside is called a neutral 
glycolipid because the sugar that forms its 
head group is uncharged. (B) A ganglioside 
always contains one or more negatively 
charged sialic acid moieties. There are 
various types of sialic acid; in human cells, 
it is mostly N-acetylneuraminic acid (or 
NANA), whose structure is shown in (C). 
Whereas in bacteria and plants almost all 
glycolipids are derived from glycerol, as are 
most phospholipids, in animal cells almost 
all glycolipids are based on sphingosine, as 
is the case for sphingomyelin (see Figure 
10-3). Gal = galactose; Glc = glucose; 
GalNAc = N-acetylgalactosamine; these 
three sugars are uncharged. 

in which membrane-bound carbohydrate-binding proteins (lectins) bind to the 
sugar groups on both glycolipids and glycoproteins in the process of cell-cell 
adhesion (discussed in Chapter 19). Mutant mice that are deficient in all of their 
complex gangliosides show abnormalities in the nervous system, including axo- 
nal degeneration and reduced myelination. 

Some glycolipids provide entry points for certain bacterial toxins and viruses. 
The ganglioside GM1 (see Figure 10-16), for example, acts as a cell-surface receptor 
for the bacterial toxin that causes the debilitating diarrhea of cholera. Cholera 
toxin binds to and enters only those cells that have GM1 on their surface, including 
intestinal epithelial cells. Its entry into a cell leads to a prolonged increase in the 
concentration of intracellular cyclic AMP (discussed in Chapter 15), which in turn 
causes a large efflux of Cl⁻, leading to the secretion of Na⁺, K⁺, HCO₃⁻, and water 
into the intestine. Polyomaviruses also enter the cell after binding initially to gan- 
gliosides. 


Summary 


Biological membranes consist of a continuous double layer of lipid molecules in 
which membrane proteins are embedded. This lipid bilayer is fluid, with individual 
lipid molecules able to diffuse rapidly within their own monolayer. The membrane 
lipid molecules are amphiphilic. When placed in water, they assemble sponta- 
neously into bilayers, which form sealed compartments. 

Although cell membranes can contain hundreds of different lipid species, the 
plasma membrane in animal cells contains three major classes—phospholipids, 
cholesterol, and glycolipids. Because of their different backbone structure, phos- 
pholipids fall into two subclasses—phosphoglycerides and sphingolipids. The lipid 
compositions of the inner and outer monolayers are different, reflecting the different 
functions of the two faces of a cell membrane. Different mixtures of lipids are found 
in the membranes of cells of different types, as well as in the various membranes of 
a single eukaryotic cell. Inositol phospholipids are a minor class of phospholipids, 
which in the cytosolic leaflet of the plasma membrane lipid bilayer play an import- 
ant part in cell signaling: in response to extracellular signals, specific lipid kinases 
phosphorylate the head groups of these lipids to form docking sites for cytosolic sig- 
naling proteins, whereas specific phospholipases cleave certain inositol phospho- 
lipids to generate small intracellular signaling molecules. 


MEMBRANE PROTEINS 


Although the lipid bilayer provides the basic structure of biological membranes, 
the membrane proteins perform most of the membrane’s specific tasks and 
therefore give each type of cell membrane its characteristic functional properties. 
Accordingly, the amounts and types of proteins in a membrane are highly vari- 
able. In the myelin membrane, which serves mainly as electrical insulation for 
nerve-cell axons, less than 25% of the membrane mass is protein. By contrast, in 
the membranes involved in ATP production (such as the internal membranes of 
mitochondria and chloroplasts), approximately 75% is protein. A typical plasma 
membrane is somewhere in between, with protein accounting for about half of its 
mass. Because lipid molecules are small compared with protein molecules, how- 
ever, there are always many more lipid molecules than protein molecules in cell 
membranes—about 50 lipid molecules for each protein molecule in cell mem- 
branes that are 50% protein by mass. Membrane proteins vary widely in structure 
and in the way they associate with the lipid bilayer, which reflects their diverse 
functions. 
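
The figure of about 50 lipid molecules per protein molecule follows from simple
arithmetic once average molecular masses are assumed. The sketch below uses
hypothetical round values (an average lipid of 800 daltons and an average protein
of 40,000 daltons); these are illustrative assumptions, not values given in the text.

```python
def lipids_per_protein(protein_mass_fraction, protein_mass_da, lipid_mass_da):
    """Return the number of lipid molecules per protein molecule.

    protein_mass_fraction: fraction of membrane mass that is protein (0 to 1).
    protein_mass_da, lipid_mass_da: assumed average molecular masses in daltons.
    """
    if not 0.0 < protein_mass_fraction < 1.0:
        raise ValueError("protein_mass_fraction must lie strictly between 0 and 1")
    lipid_mass_fraction = 1.0 - protein_mass_fraction
    # The number of molecules is proportional to mass divided by molecular mass.
    lipid_molecules = lipid_mass_fraction / lipid_mass_da
    protein_molecules = protein_mass_fraction / protein_mass_da
    return lipid_molecules / protein_molecules


if __name__ == "__main__":
    # Assumed averages (hypothetical): protein 40,000 Da, lipid 800 Da.
    for label, fraction in [("myelin-like", 0.25),
                            ("typical plasma membrane", 0.50),
                            ("mitochondrial inner membrane-like", 0.75)]:
        ratio = lipids_per_protein(fraction, 40_000, 800)
        print(f"{label}: about {ratio:.0f} lipids per protein")
```

With these assumed masses, a membrane that is 50% protein by mass comes out at
50 lipid molecules per protein molecule; the ratio rises steeply as the protein
fraction falls.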


Membrane Proteins Can Be Associated with the Lipid Bilayer in 
Various Ways 


Figure 10-17 shows the different ways in which proteins can associate with the 
membrane. Like their lipid neighbors, membrane proteins are amphiphilic,
having hydrophobic and hydrophilic regions. Many membrane proteins extend
through the lipid bilayer, and hence are called transmembrane proteins, with 
part of their mass on either side (Figure 10-17, examples 1, 2, and 3). Their hydro- 
phobic regions pass through the membrane and interact with the hydrophobic 
tails of the lipid molecules in the interior of the bilayer, where they are seques- 
tered away from water. Their hydrophilic regions are exposed to water on either 
side of the membrane. The covalent attachment of a fatty acid chain that inserts 
into the cytosolic monolayer of the lipid bilayer increases the hydrophobicity of 
some of these transmembrane proteins (see Figure 10-17, example 1). 

Other membrane proteins are located entirely in the cytosol and are attached 
to the cytosolic monolayer of the lipid bilayer, either by an amphiphilic α helix
exposed on the surface of the protein (Figure 10-17, example 4) or by one or more 
covalently attached lipid chains (Figure 10-17, example 5). Yet other membrane 
proteins are entirely exposed at the external cell surface, being attached to the lipid 
bilayer only by a covalent linkage (via a specific oligosaccharide) to a lipid anchor 
in the outer monolayer of the plasma membrane (Figure 10-17, example 6). 

The lipid-linked proteins in example 5 in Figure 10-17 are made as soluble pro- 
teins in the cytosol and are subsequently anchored to the membrane by the cova- 
lent attachment of the lipid group. The proteins in example 6, however, are made 
as single-pass membrane proteins in the endoplasmic reticulum (ER). While 
still in the ER, the transmembrane segment of the protein is cleaved off and a 
glycosylphosphatidylinositol (GPI) anchor is added, leaving the protein bound 
to the noncytosolic surface of the ER membrane solely by this anchor (discussed 
in Chapter 12); transport vesicles eventually deliver the protein to the plasma 
membrane (discussed in Chapter 13). 

In contrast to these examples, membrane-associated proteins do not extend
into the hydrophobic interior of the lipid bilayer at all; they are instead bound to 
either face of the membrane by noncovalent interactions with other membrane 
proteins (Figure 10-17, examples 7 and 8). Many of the proteins of this type can 
be released from the membrane by relatively gentle extraction procedures, such 
as exposure to solutions of very high or low ionic strength or of extreme pH, which 
interfere with protein-protein interactions but leave the lipid bilayer intact; these 
proteins are often referred to as peripheral membrane proteins. Transmembrane 
proteins and many proteins held in the bilayer by lipid groups or hydrophobic 
polypeptide regions that insert into the hydrophobic core of the lipid bilayer can- 
not be released in these ways. 


Figure 10-17 Various ways in which proteins associate with the lipid bilayer.
Most membrane proteins are thought to extend across the bilayer as (1) a single
α helix, (2) multiple α helices, or (3) a rolled-up β sheet (a β barrel). Some
of these “single-pass” and “multipass” proteins have a covalently attached fatty
acid chain inserted in the cytosolic lipid monolayer (1). Other membrane proteins
are exposed at only one side of the membrane. (4) Some of these are anchored to
the cytosolic surface by an amphiphilic α helix that partitions into the cytosolic
monolayer of the lipid bilayer through the hydrophobic face of the helix. (5) Others
are attached to the bilayer solely by a covalently bound lipid chain, either a fatty
acid chain or a prenyl group (see Figure 10-18), in the cytosolic monolayer or,
(6) via an oligosaccharide linker, to phosphatidylinositol in the noncytosolic
monolayer (called a GPI anchor). (7, 8) Finally, membrane-associated proteins are
attached to the membrane only by noncovalent interactions with other membrane
proteins. The way in which the structure in (5) is formed is illustrated in
Figure 10-18, while the way in which the GPI anchor shown in (6) is formed is
illustrated in Figure 12-52. The details of how membrane proteins become associated
with the lipid bilayer are discussed in Chapter 12.


Lipid Anchors Control the Membrane Localization of Some
Signaling Proteins


How a membrane protein is associated with the lipid bilayer reflects the function
of the protein. Only transmembrane proteins can function on both sides of
the bilayer or transport molecules across it. Cell-surface receptors, for example,
are usually transmembrane proteins that bind signal molecules in the extracel- 
lular space and generate different intracellular signals on the opposite side of the 
plasma membrane. To transfer small hydrophilic molecules across a membrane, 
a membrane transport protein must provide a path for the molecules to cross the 
hydrophobic permeability barrier of the lipid bilayer; the molecular architecture 
of multipass transmembrane proteins (Figure 10-17, examples 2 and 3) is ideally 
suited for this task, as we discuss in Chapter 11. 

Proteins that function on only one side of the lipid bilayer, by contrast, are 
often associated exclusively with either the lipid monolayer or a protein domain 
on that side. Some intracellular signaling proteins, for example, that help relay 
extracellular signals into the cell interior are bound to the cytosolic half of the 
plasma membrane by one or more covalently attached lipid groups, which can 
be fatty acid chains or prenyl groups (Figure 10-18). In some cases, myristic acid, 
a saturated 14-carbon fatty acid, is added to the N-terminal amino group of the 
protein during its synthesis on a ribosome. All members of the Src family of cyto- 
plasmic protein tyrosine kinases (discussed in Chapter 15) are myristoylated in 
this way. Membrane attachment through a single lipid anchor is not very strong, 
however, and a second lipid group is often added to anchor proteins more firmly 
to a membrane. For most Src kinases, the second lipid modification is the attachment
of palmitic acid, a saturated 16-carbon fatty acid, to a cysteine side chain of
the protein. This modification occurs in response to an extracellular signal and 
helps recruit the kinases to the plasma membrane. When the signaling pathway is 
turned off, the palmitic acid is removed, allowing the kinase to return to the cyto- 
sol. Other intracellular signaling proteins, such as the Ras family small GTPases 
(discussed in Chapter 15), use a combination of prenyl group and palmitic acid 
attachment to recruit the proteins to the plasma membrane. 

Many proteins attach to membranes transiently. Some are classical peripheral 
membrane proteins that associate with membranes by regulated protein-pro- 
tein interactions. Others undergo a transition from soluble to membrane protein 
by a conformational change that exposes a hydrophobic peptide or covalently 
attached lipid anchor. Many of the small GTPases of the Rab protein family that 
regulate intracellular membrane traffic (discussed in Chapter 13), for example, 
switch depending on the nucleotide that is bound to the protein. In their GDP- 
bound state they are soluble and free in the cytosol, whereas in their GTP-bound 
state their lipid anchor is exposed and tethers them to membranes. They are
membrane proteins at one moment and soluble proteins at the next. Such highly
dynamic interactions greatly expand the repertoire of membrane functions.

Figure 10-18 Membrane protein attachment by a fatty acid chain or a prenyl group.
The covalent attachment of either type of lipid can help localize a water-soluble
protein to a membrane after its synthesis in the cytosol. (A) A fatty acid chain
(myristic acid) is attached via an amide linkage to an N-terminal glycine. (B) A
fatty acid chain (palmitic acid) is attached via a thioester linkage to a cysteine.
(C) A prenyl chain (either farnesyl or a longer geranylgeranyl chain) is attached
via a thioether linkage to a cysteine residue that is initially located four
residues from the protein’s C-terminus. After prenylation, the terminal three
amino acids are cleaved off, and the new C-terminus is methylated before insertion
of the anchor into the membrane (not shown). The structures of the lipid anchors
are shown below: (D) a myristoyl anchor (derived from a 14-carbon saturated fatty
acid chain), (E) a palmitoyl anchor (a 16-carbon saturated fatty acid chain), and
(F) a farnesyl anchor (a 15-carbon unsaturated hydrocarbon chain).

Figure 10-19 A segment of a membrane-spanning polypeptide chain crossing the lipid
bilayer as an α helix. Only the α-carbon backbone of the polypeptide chain is shown,
with the hydrophobic amino acids in green and yellow. The polypeptide segment shown
is part of the bacterial photosynthetic reaction center, the structure of which was
determined by x-ray diffraction. (Based on data from J. Deisenhofer et al.,
Nature 318:618-624, 1985, and H. Michel et al., EMBO J. 5:1149-1158, 1986.)


In Most Transmembrane Proteins, the Polypeptide Chain Crosses
the Lipid Bilayer in an α-Helical Conformation


A transmembrane protein always has a unique orientation in the membrane. This 
reflects both the asymmetric manner in which it is inserted into the lipid bilayer 
in the ER during its biosynthesis (discussed in Chapter 12) and the different func- 
tions of its cytosolic and noncytosolic domains. These domains are separated by 
the membrane-spanning segments of the polypeptide chain, which contact the 
hydrophobic environment of the lipid bilayer and are composed largely of amino 
acids with nonpolar side chains. Because the peptide bonds themselves are polar 
and because water is absent, all peptide bonds in the bilayer are driven to form 
hydrogen bonds with one another. The hydrogen-bonding between peptide 
bonds is maximized if the polypeptide chain forms a regular α helix as it crosses
the bilayer, and this is how most membrane-spanning segments of polypeptide 
chains traverse the bilayer (Figure 10-19). 

In single-pass transmembrane proteins, the polypeptide chain crosses only 
once (see Figure 10-17, example 1), whereas in multipass transmembrane pro- 
teins, the polypeptide chain crosses multiple times (see Figure 10-17, example 2). 
An alternative way for the peptide bonds in the lipid bilayer to satisfy their
hydrogen-bonding requirements is for multiple transmembrane strands of a polypeptide
chain to be arranged as a β sheet that is rolled up into a cylinder (a so-called
β barrel; see Figure 10-17, example 3). This protein architecture is seen in the
porin proteins that we discuss later.

Progress in the x-ray crystallography of membrane proteins has enabled the
determination of the three-dimensional structure of many of them. The structures
confirm that it is often possible to predict from the protein’s amino acid sequence
which parts of the polypeptide chain extend across the lipid bilayer. Segments
containing about 20-30 amino acids, with a high degree of hydrophobicity, are
long enough to span a lipid bilayer as an α helix, and they can often be identified
in hydropathy plots (Figure 10-20). [...] of an organism’s proteins are
transmembrane proteins, emphasizing their importance. Hydropathy plots cannot
identify the membrane-spanning segments of a β barrel, as 10 amino acids or fewer
are sufficient to traverse a lipid bilayer as an extended β strand and only every
other amino acid side chain is hydrophobic.

Figure 10-20 Using hydropathy plots to localize potential α-helical
membrane-spanning segments in a polypeptide chain. The free energy needed to
transfer [...] is calculated from the amino acid composition of each segment using
data obtained from model compounds. This calculation is [...] each successive
amino acid in the chain.
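
The sliding-window idea behind a hydropathy plot can be sketched in a few lines.
The example below is an illustration under stated assumptions, not the calculation
used for Figure 10-20: it substitutes the widely used Kyte-Doolittle hydropathy
scale for the free-energy data from model compounds, and the window size (19) and
threshold (1.6) are conventional choices for that scale. The function names and
the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic). Using this scale
# is an assumption; the figure in the text is based on transfer free energies.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of every window, starting at each successive residue."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    if window < 1 or window > len(scores):
        raise ValueError("window must be between 1 and the sequence length")
    running = sum(scores[:window])
    profile = [running / window]
    for i in range(window, len(scores)):
        # Slide the window by one residue: add the new one, drop the oldest.
        running += scores[i] - scores[i - window]
        profile.append(running / window)
    return profile


def candidate_segments(profile, window=19, threshold=1.6):
    """Return (start, end) residue ranges (0-based, end exclusive) covered by
    consecutive windows whose mean hydropathy exceeds the threshold."""
    segments = []
    start = None
    end = None
    for i, value in enumerate(profile):
        if value > threshold:
            if start is None:
                start = i
            end = i + window
        elif start is not None:
            segments.append((start, end))
            start = None
    if start is not None:
        segments.append((start, end))
    return segments


if __name__ == "__main__":
    # Toy sequence: a 20-residue leucine stretch flanked by charged residues.
    toy = "DDDDDKKKKK" + "L" * 20 + "KKKKKDDDDD"
    profile = hydropathy_profile(toy)
    print(candidate_segments(profile))
```

Because each window averages over about 19 residues, the reported range is wider
than the hydrophobic stretch itself. As the text notes, a scan of this kind is not
expected to find the short, alternating strands of a β barrel.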

The strong drive to maximize hydrogen-bonding in the absence of water 
means that a polypeptide chain that enters the lipid bilayer is likely to pass entirely 
through it before changing direction, since chain bending requires a loss of reg- 
ular hydrogen-bonding interactions. But multipass transmembrane proteins can 
also contain regions that fold into the membrane from either side, squeezing into 
spaces between transmembrane α helices without contacting the hydrophobic
core of the lipid bilayer. Because such regions interact only with other polypep- 
tide regions, they do not need to maximize hydrogen-bonding; they can therefore 
have a variety of secondary structures, including helices that extend only part way 
across the lipid bilayer (Figure 10-21). Such regions are important for the func- 
tion of some membrane proteins, including water channel and ion channel pro- 
teins, in which the regions contribute to the walls of the pores traversing the mem- 
brane and confer substrate specificity on the channels, as we discuss in Chapter 
11. These regions cannot be identified in hydropathy plots and are only revealed 
by x-ray crystallography or electron crystallography (a technique similar to x-ray 
diffraction but performed on two-dimensional arrays of proteins) of the protein’s 
three-dimensional structure. 


Transmembrane α Helices Often Interact with One Another


The transmembrane α helices of many single-pass membrane proteins do not
contribute to the folding of the protein domains on either side of the membrane. 
As a consequence, it is often possible to engineer cells to produce just the cyto- 
solic or extracellular domains of these proteins as water-soluble molecules. This 
approach has been invaluable for studying the structure and function of these 
domains, especially the domains of transmembrane receptor proteins (discussed 
in Chapter 15). A transmembrane α helix, even in a single-pass membrane protein,
however, often does more than just anchor the protein to the lipid bilayer.
Many single-pass membrane proteins form homo- or heterodimers that are held 
together by noncovalent, but strong and highly specific, interactions between the 
two transmembrane α helices; the sequence of the hydrophobic amino acids of
these helices contains the information that directs the protein-protein interac- 
tion. 

Similarly, the transmembrane α helices in multipass membrane proteins
occupy specific positions in the folded protein structure that are determined by 
interactions between the neighboring helices. These interactions are crucial for 
the structure and function of the many channels and transporters that move mol- 
ecules across cell membranes. 

In these proteins, neighboring transmembrane helices in the folded structure 
of the protein shield many of the other transmembrane helices from the mem- 
brane lipids. Why, then, are these shielded helices nevertheless composed pri- 
marily of hydrophobic amino acids? The answer lies in the way in which multi- 
pass proteins are integrated into the membrane during their biosynthesis. As we 
discuss in Chapter 12, transmembrane α helices are inserted into the lipid bilayer
sequentially by a protein translocator. After leaving the translocator, each helix 
is transiently surrounded by lipids, which requires that the helix be hydropho- 
bic. It is only as the protein folds up into its final structure that contacts are made 
between adjacent helices, and protein-protein contacts replace some of the pro- 
tein-lipid contacts (Figure 10-22). 


Figure 10-21 Two short α helices in the aquaporin water channel, each of which
spans only halfway through the lipid bilayer. In the plasma membrane, four
monomers, one of which is shown here, form a tetramer. Each monomer has a
hydrophilic pore at its center, which allows water molecules to cross the membrane
in single file (see Figure 11-20 and Movie 11.6). The two short colored helices
are buried at an interface formed by protein-protein interactions. The mechanism
by which the channel allows the passage of water molecules is discussed in more
detail in Chapter 11.


Some β Barrels Form Large Channels


Multipass membrane proteins that have their transmembrane segments arranged
as β barrels rather than as α helices are comparatively rigid and therefore tend
to form crystals readily when isolated. Thus, some of them were among the first
multipass membrane protein structures to be determined by x-ray crystallography.
The number of β strands in a β barrel varies widely, from as few as 8 strands
to as many as 22 (Figure 10-23).

β-barrel proteins are abundant in the outer membranes of bacteria, mitochondria,
and chloroplasts. Some are pore-forming proteins, which create water-filled
channels that allow selected small hydrophilic molecules to cross the membrane.
The porins are well-studied examples (example 3 in Figure 10-23C). Many porin
barrels are formed from a 16-strand, antiparallel β sheet rolled up into a
cylindrical structure. Polar amino acid side chains line the aqueous channel on
the inside, while nonpolar side chains project from the outside of the barrel to
interact with the hydrophobic core of the lipid bilayer. Loops of the polypeptide
chain often protrude into the lumen of the channel, narrowing it so that only
certain solutes can pass. Some porins are therefore highly selective: maltoporin,
for example, preferentially allows maltose and maltose oligomers to cross the
outer membrane of E. coli.

The FepA protein is a more complex example of a β-barrel transport protein
(Figure 10-23D). It transports iron ions across the bacterial outer membrane. It
is constructed from 22 β strands, and a large globular domain completely fills the
inside of the barrel. Iron ions bind to this domain, which by an unknown mechanism
moves or changes its conformation to transfer the iron across the membrane.

Not all β-barrel proteins are transport proteins. Some form smaller barrels that
are completely filled by amino acid side chains that project into the center of the 
barrel. These proteins function as receptors or enzymes (Figure 10-23A and B); 
the barrel serves as a rigid anchor, which holds the protein in the membrane and 
orients the cytosolic loops that form binding sites for specific intracellular mole- 
cules. 

Most multipass membrane proteins in eukaryotic cells and in the bacterial
plasma membrane are constructed from transmembrane α helices. The helices
can slide against each other, allowing conformational changes in the protein that
can open and shut ion channels, transport solutes, or transduce extracellular
signals into intracellular ones. In β-barrel proteins, by contrast, hydrogen bonds
bind each β strand rigidly to its neighbors, making conformational changes within
the wall of the barrel unlikely.

Figure 10-22 Steps in the folding of a multipass transmembrane protein. When a
newly synthesized transmembrane α helix is released into the lipid bilayer, it is
initially surrounded by lipid molecules. As the protein folds, contacts between the
helices displace some of the lipid molecules surrounding the helices.

Figure 10-23 β barrels formed from different numbers of β strands. (A) The E. coli
OmpA protein serves as a receptor for a bacterial virus. (B) The E. coli OMPLA
protein is an enzyme (a lipase) that hydrolyzes lipid molecules. The amino acids
that catalyze the enzymatic reaction (shown in red) protrude from the outside
surface of the barrel. (C) A porin from the bacterium Rhodobacter capsulatus forms
a water-filled pore across the outer membrane. The diameter of the channel is
restricted by loops (shown in blue) that protrude into the channel. (D) The E. coli
FepA protein transports iron ions. The inside of the barrel is completely filled by
a globular protein domain (shown in blue) that contains an iron-binding site (not
shown).

Figure 10-24 A single-pass transmembrane protein. Note that the polypeptide chain
traverses the lipid bilayer as a right-handed α helix and that the oligosaccharide
chains and disulfide bonds are all on the noncytosolic surface of the membrane. The
sulfhydryl groups in the cytosolic domain of the protein do not normally form
disulfide bonds because the reducing environment in the cytosol maintains these
groups in their reduced (-SH) form.


Many Membrane Proteins Are Glycosylated 


Most transmembrane proteins in animal cells are glycosylated. As in glycolip- 
ids, the sugar residues are added in the lumen of the ER and the Golgi appara- 
tus (discussed in Chapters 12 and 13). For this reason, the oligosaccharide chains 
are always present on the noncytosolic side of the membrane. Another important 
difference between proteins (or parts of proteins) on the two sides of the mem- 
brane results from the reducing environment of the cytosol. This environment 
decreases the likelihood that intrachain or interchain disulfide (S-S) bonds will 
form between cysteines on the cytosolic side of membranes. These bonds form 
on the noncytosolic side, where they can help stabilize either the folded structure 
of the polypeptide chain or its association with other polypeptide chains (Figure 
10-24). 

Because the extracellular parts of most plasma membrane proteins are glycosylated,
carbohydrates extensively coat the surface of all eukaryotic cells. These
carbohydrates occur as oligosaccharide chains covalently bound to membrane 
proteins (glycoproteins) and lipids (glycolipids). They also occur as the polysac- 
charide chains of integral membrane proteoglycan molecules. Proteoglycans, 
which consist of long polysaccharide chains linked covalently to a protein core, 
are found mainly outside the cell, as part of the extracellular matrix (discussed in 
Chapter 19). But, for some proteoglycans, the protein core either extends across 
the lipid bilayer or is attached to the bilayer by a glycosylphosphatidylinositol 
(GPI) anchor. 

The terms cell coat or glycocalyx are sometimes used to describe the carbohy- 
drate-rich zone on the cell surface. This carbohydrate layer can be visualized by 
various stains, such as ruthenium red (Figure 10-25A), as well as by its affinity for 
carbohydrate-binding proteins called lectins, which can be labeled with a fluo- 
rescent dye or some other visible marker. Although most of the sugar groups are 
attached to intrinsic plasma membrane molecules, the carbohydrate layer also 
contains both glycoproteins and proteoglycans that have been secreted into the 
extracellular space and then adsorbed onto the cell surface (Figure 10-25B). Many 
of these adsorbed macromolecules are components of the extracellular matrix, so 
that the boundary between the plasma membrane and the extracellular matrix is 
often not sharply defined. One of the many functions of the carbohydrate layer is 
to protect cells against mechanical and chemical damage; it also keeps various 
other cells at a distance, preventing unwanted cell-cell interactions. 

The oligosaccharide side chains of glycoproteins and glycolipids are enor- 
mously diverse in their arrangement of sugars. Although they usually contain 
fewer than 15 sugars, the chains are often branched, and the sugars can be bonded 
together by various kinds of covalent linkages—unlike the amino acids in a poly- 
peptide chain, which are all linked by identical peptide bonds. Even three sugars 
can be put together to form hundreds of different trisaccharides. Both the diversity 
and the exposed position of the oligosaccharides on the cell surface make them 
especially well suited to function in specific cell-recognition processes. As we
discuss in Chapter 19, plasma-membrane-bound lectins that recognize specific
oligosaccharides on cell-surface glycolipids and glycoproteins mediate a variety of
transient cell-cell adhesion processes, including those occurring in lymphocyte
recirculation and inflammatory responses (see Figure 19-28).


Membrane Proteins Can Be Solubilized and Purified in Detergents 


In general, only agents that disrupt hydrophobic associations and destroy the 
lipid bilayer can solubilize membrane proteins. The most useful of these for the 
membrane biochemist are detergents, which are small amphiphilic molecules of 
variable structure (Movie 10.4). Detergents are much more soluble in water than 
lipids. Their polar (hydrophilic) ends can be either charged (ionic), as in sodium 
dodecyl sulfate (SDS), or uncharged (nonionic), as in octylglucoside and Triton 
(Figure 10-26A). At low concentration, detergents are monomeric in solution, 
but when their concentration is increased above a threshold, called the critical 
micelle concentration (CMC), they aggregate to form micelles (Figure 10-26B-D). 
Above the CMC, detergent molecules rapidly diffuse in and out of micelles, keep- 
ing the concentration of monomer in the solution constant, no matter how many 
micelles are present. Both the CMC and the average number of detergent mol- 
ecules in a micelle are characteristic properties of each detergent, but they also 
depend on the temperature, pH, and salt concentration. Detergent solutions are 
therefore complex systems and are difficult to study. 
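
The statement that the monomer concentration stays constant above the CMC can be
illustrated with a simple idealized model in which all detergent beyond the CMC is
treated as micellar. The sketch below is such a model under stated assumptions, not
a description of real detergent behavior: the function name is hypothetical, and
the example values (a CMC of 8 mM and 60 molecules per micelle) are rough,
SDS-like illustrative numbers that in practice vary with temperature, pH, and salt
concentration, as noted above.

```python
def detergent_species(total_mM, cmc_mM, aggregation_number):
    """Split total detergent into monomer and micelle concentrations (mM).

    Idealized assumption: below the CMC all detergent is monomeric; above it,
    the monomer concentration is fixed at the CMC and the excess forms micelles
    that each contain `aggregation_number` detergent molecules.
    """
    if total_mM < 0 or cmc_mM <= 0 or aggregation_number <= 0:
        raise ValueError("concentrations and aggregation number must be positive")
    monomer_mM = min(total_mM, cmc_mM)
    micellized_mM = max(total_mM - cmc_mM, 0.0)
    micelle_mM = micellized_mM / aggregation_number
    return monomer_mM, micelle_mM


if __name__ == "__main__":
    # Illustrative, approximate values only (assumed, not from the text).
    for total in (2.0, 8.0, 20.0, 50.0):
        monomer, micelles = detergent_species(total, cmc_mM=8.0, aggregation_number=60)
        print(f"total {total:5.1f} mM -> monomer {monomer:4.1f} mM, "
              f"micelles {micelles:.3f} mM")
```

In this model the monomer concentration rises with total detergent until it
reaches the CMC and then stays flat, while the micelle concentration grows
linearly with any further addition.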

When mixed with membranes, the hydrophobic ends of detergents bind to the
hydrophobic regions of the membrane proteins, where they displace lipid molecules
with a collar of detergent molecules. Since the other end of the detergent
molecule is polar, this binding tends to bring the membrane proteins into solution
as detergent-protein complexes (Figure 10-27). Usually, some lipid molecules
also remain attached to the protein.

Figure 10-25 The carbohydrate layer on the cell surface. (A) This electron
micrograph of the surface of a lymphocyte stained with ruthenium red emphasizes the
thick carbohydrate-rich layer surrounding the cell. (B) The carbohydrate layer is
made up of the oligosaccharide side chains of membrane glycolipids and membrane
glycoproteins and the polysaccharide chains on membrane proteoglycans. In addition,
adsorbed glycoproteins and adsorbed proteoglycans (not shown) contribute to the
carbohydrate layer in many cells. Note that all of the carbohydrate is on the
extracellular surface of the membrane. (A, courtesy of Audrey M. Glauert and
G.M.W. Cook.)

Figure 10-26 The structure and function of detergents. (A) Three commonly used
detergents are sodium dodecyl sulfate (SDS), an anionic detergent, and Triton X-100
and β-octylglucoside, two nonionic detergents. Triton X-100 is a mixture of
compounds in which the region in brackets is repeated between 9 and 10 times. The
hydrophobic portion of each detergent is shown in yellow, and the hydrophilic
portion is shown in orange. (B) At low concentration, detergent molecules are
monomeric in solution. As their concentration is increased beyond the critical
micelle concentration (CMC), some of the detergent molecules form micelles. Note
that the concentration of detergent monomer stays constant above the CMC.
(C) Because they have both polar and nonpolar ends, detergent molecules are
amphiphilic; and because they are cone-shaped, they form micelles rather than
bilayers (see Figure 10-7). Detergent micelles are thought to have irregular
shapes, and, due to packing constraints, the hydrophobic tails are partially
exposed to water. (D) The space-filling model shows the structure of a micelle
composed of 20 β-octylglucoside molecules, predicted by molecular dynamics
calculations. The head groups are shown in red and the hydrophobic tails in gray.
(B, adapted from G. Gunnarsson, B. Jönsson and H. Wennerström, J. Phys. Chem.
84:3114-3121, 1980; C, from S. Bogusz, R.M. Venable and R.W. Pastor, J. Phys.
Chem. B 104:5462-5470, 2000.)

Strong ionic detergents, such as SDS, can solubilize even the most hydropho- 
bic membrane proteins. This allows the proteins to be analyzed by SDS polyacryl- 
amide-gel electrophoresis (discussed in Chapter 8), a procedure that has revolu- 
tionized the study of proteins. Such strong detergents, however, unfold (denature) 
proteins by binding to their internal “hydrophobic cores,” thereby rendering the
proteins inactive and unusable for functional studies. Nonetheless, proteins can 
be readily separated and purified in their SDS-denatured form. In some cases, 
removal of the SDS allows the purified protein to renature, with recovery of func- 
tional activity. 

Many membrane proteins can be solubilized and then purified in an active
form by the use of mild detergents. These detergents cover the hydrophobic
regions on membrane-spanning segments that become exposed after lipid
removal but do not unfold the protein. If the detergent concentration of a solution
of solubilized membrane proteins is reduced (by dilution, for example), membrane
proteins do not remain soluble. In the presence of an excess of phospholipid
molecules in such a solution, however, membrane proteins incorporate into
small liposomes that form spontaneously. In this way, functionally active membrane
protein systems can be reconstituted from purified components, providing
a powerful means of analyzing the activities of membrane transporters, ion
channels, signaling receptors, and so on (Figure 10-28). Such functional
reconstitution, for example, provided proof for the hypothesis that the enzymes
that make ATP (ATP synthases) use H+ gradients in mitochondrial, chloroplast, and
bacterial membranes to produce ATP.

Figure 10-27 Solubilizing a membrane protein with a mild nonionic detergent. The
detergent disrupts the lipid bilayer and brings the protein into solution as
protein-lipid-detergent complexes. The phospholipids in the membrane are also
solubilized by the detergent, as lipid-detergent micelles.

Figure 10-28 The use of mild nonionic detergents for solubilizing, purifying, [...]
discussed in Chapter 11.

Membrane proteins can also be reconstituted from detergent solution into 
nanodiscs, which are small, uniformly sized patches of membrane that are sur- 
rounded by a belt of protein, which covers the exposed edge of the bilayer to keep 
the patch in solution (Figure 10-29). The belt is derived from high-density lipo- 
proteins (HDL), which keep lipids soluble for transport in the blood. In nanodiscs 
the membrane protein of interest can be studied in its native lipid environment 
and is experimentally accessible from both sides of the bilayer, which is useful, 
for example, for ligand-binding experiments. Proteins contained in nanodiscs can 
also be analyzed by single particle electron microscopy techniques to determine 
their structure. By this rapidly improving technique (discussed in Chapter 9), the
structure of a membrane protein can be determined to high resolution without
requiring the protein of interest to crystallize into a regular lattice, which is
often hard to achieve for membrane proteins.

Detergents have also played a crucial part in the purification and crystalliza- 
tion of membrane proteins. The development of new detergents and new expres- 
sion systems that produce large quantities of membrane proteins from cDNA 
clones has led to a rapid increase in the number of three-dimensional structures 
of membrane proteins and protein complexes that are known, although they are 
still few compared to the known structures of water-soluble proteins and protein 
complexes. 


Bacteriorhodopsin Is a Light-driven Proton (H+) Pump That
Traverses the Lipid Bilayer as Seven α Helices


In Chapter 11, we consider how multipass transmembrane proteins mediate 
the selective transport of small hydrophilic molecules across cell membranes. 
But a detailed understanding of how such a membrane transport protein works 
requires precise information about its three-dimensional structure in the bilayer. 
Bacteriorhodopsin was the first membrane transport protein whose structure was 
determined, and it has remained the prototype of many multipass membrane 
proteins with a similar structure. 

The “purple membrane” of the archaeon Halobacterium salinarum is a
specialized patch in the plasma membrane that contains a single species of
protein molecule, bacteriorhodopsin (Figure 10-30A). The protein functions as
a light-activated H+ pump that transfers H+ out of the archaeal cell. Because
the bacteriorhodopsin molecules are tightly packed and arranged as a planar
two-dimensional crystal (Figure 10-30B and C), it was possible to determine
their three-dimensional structure by combining electron microscopy and
electron diffraction analysis, a procedure called electron crystallography,
which we mentioned earlier. This method has provided the first structural
views of many membrane proteins that were found to be difficult to crystallize
from detergent solutions. For bacteriorhodopsin, the structure was later
confirmed and extended to very high resolution by x-ray crystallography.

Figure 10-29 Model of a membrane protein reconstituted into a nanodisc. When
detergent is removed from a solution containing a multipass membrane protein,
lipids, and a protein subunit of the high-density lipoprotein (HDL), the
membrane protein becomes embedded in a small patch of lipid bilayer, which is
surrounded by a belt of the HDL protein. In such nanodiscs, the hydrophobic
edges of the bilayer patch are shielded by the protein belt, which renders
the assembly water-soluble.

Each bacteriorhodopsin molecule is folded into seven closely packed
transmembrane α helices and contains a single light-absorbing group, or chromophore
(in this case, retinal), which gives the protein its purple color. Retinal is vitamin A 
in its aldehyde form and is identical to the chromophore found in rhodopsin of 
the photoreceptor cells of the vertebrate eye (discussed in Chapter 15). Retinal is 
covalently linked to a lysine side chain of the bacteriorhodopsin protein. When 
activated by a single photon of light, the excited chromophore changes its shape 
and causes a series of small conformational changes in the protein, resulting in 
the transfer of one H+ from the inside to the outside of the cell (Figure 10-31A). In 
bright light, each bacteriorhodopsin molecule can pump several hundred protons 
per second. The light-driven proton transfer establishes an H+ gradient across
the plasma membrane, which in turn drives the production of ATP by a second
protein in the cell’s plasma membrane. The energy stored in the H+ gradient also
drives other energy-requiring processes in the cell. Thus, bacteriorhodopsin
converts solar energy into an H+ gradient, which provides energy to the archaeal cell.

The high-resolution crystal structure of bacteriorhodopsin reveals many
lipid molecules bound in specific places on the protein surface (Figure 10-31B).
Interactions with specific lipids are thought to help stabilize many membrane
proteins, which work best and sometimes crystallize more readily if some of the
lipids remain bound during detergent extraction, or if specific lipids are added
back to the proteins in detergent solutions. The specificity of these lipid-protein
interactions helps explain why eukaryotic membranes contain such a variety of
lipids, with head groups that differ in size, shape, and charge. We can think of the
membrane lipids as constituting a two-dimensional solvent for the proteins in the
membrane, just as water constitutes a three-dimensional solvent for proteins in
an aqueous solution: some membrane proteins can function only in the presence
of specific lipid head groups, just as many enzymes in aqueous solution require a
particular ion for activity.

Figure 10-30 Patches of purple membrane, which contain bacteriorhodopsin in
the archaeon Halobacterium salinarum. (A) These archaea live in saltwater
pools, where they are exposed to sunlight. They have evolved a variety of
light-activated proteins, including bacteriorhodopsin, which is a
light-activated H+ pump in the plasma membrane. (B) The bacteriorhodopsin
molecules in the purple membrane patches are tightly packed into
two-dimensional crystalline arrays. (C) Details of the molecular surface
visualized by atomic force microscopy. With this technique, individual
bacteriorhodopsin molecules can be seen. (D) Outline of the approximate
location of the bacteriorhodopsin monomer and the individual α helices in the
image shown in (C). (B-C, courtesy of Dieter Oesterhelt; D, PDB code: 2BRD.)

Figure 10-31 The three-dimensional structure of a bacteriorhodopsin molecule.
(Movie 10.5) (A) The polypeptide chain crosses the lipid bilayer seven times
as α helices. The location of the retinal chromophore (purple) and the
probable pathway taken by H+ during the light-activated pumping cycle are
shown. The first and key step is the passing of an H+ from the chromophore to
the side chain of aspartic acid 85 (red, located next to the chromophore)
that occurs upon absorption of a photon by the chromophore. Subsequently,
other H+ transfers, in the numerical order indicated and utilizing the
hydrophilic amino acid side chains that line a path through the membrane,
complete the pumping cycle and return the enzyme to its starting state. Color
code: glutamic acid (orange), aspartic acid (red), arginine (blue). (B) The
high-resolution crystal structure of bacteriorhodopsin shows many lipid
molecules (yellow with red head groups) that are tightly bound to specific
places on the surface of the protein. (A, adapted from H. Luecke et al.,
Science 286:255-261, 1999. With permission from AAAS; B, from H. Luecke et
al., J. Mol. Biol. 291:899-911, 1999. With permission from Academic Press.)

Bacteriorhodopsin is a member of a large superfamily of membrane proteins 
with similar structures but different functions. For example, rhodopsin in rod cells 
of the vertebrate retina and many cell-surface receptor proteins that bind
extracellular signal molecules are also built from seven transmembrane α helices. These
proteins function as signal transducers rather than as transporters: each responds 
to an extracellular signal by activating a GTP-binding protein (G protein) inside the 
cell and they are therefore called G-protein-coupled receptors (GPCRs), as we dis- 
cuss in Chapter 15 (see Figure 15-6B). Although the structures of bacteriorhodop- 
sins and GPCRs are strikingly similar, they show no sequence similarity and thus 
probably belong to two evolutionarily distant branches of an ancient protein family.
A related class of membrane proteins, the channelrhodopsins that green algae use 
to detect light, form ion channels when they absorb a photon. When engineered so 
that they are expressed in animal brains, these proteins have become invaluable 
tools in neurobiology because they allow specific neurons to be stimulated experi- 
mentally by shining light on them, as we discuss in Chapter 11 (Figure 11-32). 


Membrane Proteins Often Function as Large Complexes 


Many membrane proteins function as part of multicomponent complexes, sev- 
eral of which have been studied by x-ray crystallography. One is a bacterial pho- 
tosynthetic reaction center, which was the first membrane protein complex to be 
crystallized and analyzed by x-ray diffraction. In Chapter 14, we discuss how such 
photosynthetic complexes function to capture light energy and use it to pump 
H+ across the membrane. Many of the membrane protein complexes involved in
photosynthesis, proton pumping, and electron transport are even larger than the 
photosynthetic reaction center. The enormous photosystem II complex from cya- 
nobacteria, for example, contains 19 protein subunits and well over 60 transmem- 
brane helices (see Figure 14-49). Membrane proteins are often arranged in large 
complexes, not only for harvesting various forms of energy, but also for transduc- 
ing extracellular signals into intracellular ones (discussed in Chapter 15). 


Many Membrane Proteins Diffuse in the Plane of the Membrane 


Like most membrane lipids, membrane proteins do not tumble (flip-flop) across 
the lipid bilayer, but they do rotate about an axis perpendicular to the plane of the 
bilayer (rotational diffusion). In addition, many membrane proteins are able to 
move laterally within the membrane (lateral diffusion). An experiment in which 
mouse cells were artificially fused with human cells to produce hybrid cells (het- 
erokaryons) provided the first direct evidence that some plasma membrane pro- 
teins are mobile in the plane of the membrane. Two differently labeled antibodies 
were used to distinguish selected mouse and human plasma membrane proteins. 
Although at first the mouse and human proteins were confined to their own 
halves of the newly formed heterokaryon, the two sets of proteins diffused and 
mixed over the entire cell surface in about half an hour (Figure 10-32). 

The lateral diffusion rates of membrane proteins can be measured by using 
the technique of fluorescence recovery after photobleaching (FRAP). The method 
usually involves marking the membrane protein of interest with a specific
fluorescent group. This can be done either with a fluorescent ligand such as a
fluorophore-labeled antibody that binds to the protein or with recombinant DNA
technology to express the protein fused to a fluorescent protein such as green flu- 
orescent protein (GFP) (discussed in Chapter 9). The fluorescent group is then 
bleached in a small area of membrane by a laser beam, and the time taken for 
adjacent membrane proteins carrying unbleached ligand or GFP to diffuse into 
the bleached area is measured (Figure 10-33). From FRAP measurements, we can 
estimate the diffusion coefficient for the marked cell-surface protein. The values 
of the diffusion coefficients for different membrane proteins in different cells are 
highly variable, because interactions with other proteins impede the diffusion 
of the proteins to varying degrees. Measurements of proteins that are minimally 
impeded in this way indicate that cell membranes have a viscosity comparable to 
that of olive oil. 
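
As a rough illustration of how a diffusion coefficient might be extracted from a FRAP recovery curve, the following Python sketch assumes a uniformly bleached circular spot and uses the commonly quoted Soumpasis approximation, D ≈ 0.224 r²/t½, where r is the radius of the bleached spot and t½ is the half-time of recovery. The function names, the synthetic recovery curve, and all numerical values are hypothetical and are not taken from the experiments described here; a real analysis would also have to account for an immobile fraction and for bleaching during imaging.

```python
def half_recovery_time(times, intensity):
    """Time at which fluorescence is halfway between its first post-bleach
    value and its final plateau (taken here as the last data point)."""
    f0, f_inf = intensity[0], intensity[-1]
    target = f0 + 0.5 * (f_inf - f0)
    for i in range(1, len(times)):
        lo, hi = intensity[i - 1], intensity[i]
        if lo <= target <= hi:
            if hi == lo:
                return times[i]
            # Linear interpolation between the two bracketing samples.
            frac = (target - lo) / (hi - lo)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("recovery curve never reaches its half-maximum")


def diffusion_coefficient(spot_radius_um, t_half_s):
    """Soumpasis approximation for a uniformly bleached circular spot.
    Returns D in square micrometers per second."""
    return 0.224 * spot_radius_um ** 2 / t_half_s


# Hypothetical, noise-free recovery curve sampled every 0.1 s for 200 s.
# The saturating shape is chosen for illustration only.
times = [0.1 * i for i in range(2001)]
curve = [0.2 + 0.8 * t / (t + 2.0) for t in times]

t_half = half_recovery_time(times, curve)
d_frap = diffusion_coefficient(1.0, t_half)  # assumed spot radius: 1 um
print(f"t_half = {t_half:.2f} s, D = {d_frap:.3f} um^2/s")
```

The sketch makes the qualitative point of the FRAP method explicit: for a given spot size, a shorter half-time of recovery corresponds to a larger diffusion coefficient.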

One drawback to the FRAP technique is that it monitors the movement of 
large populations of molecules in a relatively large area of membrane; one cannot 
follow individual protein molecules. If a protein fails to migrate into a bleached 
area, for example, one cannot tell whether the molecule is truly immobile or just 
restricted in its movement to a very small region of membrane—perhaps by cyto- 
skeletal proteins. Single-particle tracking techniques overcome this problem by 
labeling individual membrane molecules with antibodies coupled to fluorescent 
dyes or tiny gold particles and tracking their movement by video microscopy. 
Using single-particle tracking, one can record the diffusion path of a single mem- 
brane protein molecule over time. Results from all of these techniques indicate 
that plasma membrane proteins differ widely in their diffusion characteristics, as 
we now discuss. 
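
The way a diffusion coefficient can be read out of a single-particle track may be sketched as follows. For unobstructed diffusion in two dimensions, the mean squared displacement (MSD) grows linearly with time, MSD(t) = 4Dt, so D can be estimated from the displacements between successive video frames. The simulated track, the frame interval, and the value of D used below are hypothetical and serve only to illustrate the calculation; real tracks are noisier and often deviate from free diffusion.

```python
import random


def simulate_track(n_steps, diff_coeff, dt, rng):
    """2D Brownian track: each coordinate step is Gaussian with variance 2*D*dt."""
    sigma = (2.0 * diff_coeff * dt) ** 0.5
    x = y = 0.0
    track = [(x, y)]
    for _ in range(n_steps):
        x += rng.gauss(0.0, sigma)
        y += rng.gauss(0.0, sigma)
        track.append((x, y))
    return track


def mean_squared_displacement(track, lag):
    """Average squared displacement over all pairs of points `lag` frames apart."""
    count = len(track) - lag
    total = 0.0
    for i in range(count):
        dx = track[i + lag][0] - track[i][0]
        dy = track[i + lag][1] - track[i][1]
        total += dx * dx + dy * dy
    return total / count


def estimate_diffusion_coefficient(track, dt, lag=1):
    """For free diffusion in two dimensions, MSD(t) = 4*D*t."""
    return mean_squared_displacement(track, lag) / (4.0 * lag * dt)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
track = simulate_track(n_steps=20000, diff_coeff=0.1, dt=0.01, rng=rng)
d_est = estimate_diffusion_coefficient(track, dt=0.01)
print(f"estimated D = {d_est:.3f} um^2/s (track simulated with D = 0.1)")
```

A track whose MSD stops growing at longer lag times, rather than increasing linearly, would suggest that the molecule is confined to a small region, which is exactly the distinction that FRAP alone cannot make.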
Figure 10-32 An experiment demonstrating the diffusion of proteins in the
plasma membrane of mouse-human hybrid cells. In this experiment, a mouse and
a human cell were fused to create a hybrid cell, which was then stained with
two fluorescently labeled antibodies. One antibody (labeled with a green dye)
detects mouse plasma membrane proteins, the other antibody (labeled with a
red dye) detects human plasma membrane proteins. When cells were stained
immediately after fusion, mouse and human plasma membrane proteins were still
found in the membrane domains originating from the mouse and human cell,
respectively. After a short time, however, the plasma membrane proteins
diffused over the entire cell surface and completely intermixed. (From L.D.
Frye and M. Edidin, J. Cell Sci. 7:319-335, 1970. With permission from The
Company of Biologists.)

Figure 10-33 Measuring the rate of lateral diffusion of a membrane protein by
fluorescence recovery after photobleaching. A specific protein of interest
can be expressed as a fusion protein with green fluorescent protein (GFP),
which is intrinsically fluorescent. The fluorescent molecules are bleached in
a small area using a laser beam. The fluorescence intensity recovers as the
bleached molecules diffuse away and unbleached molecules diffuse into the
irradiated area (shown here in side and top views). The diffusion coefficient
is calculated from a graph of the rate of recovery: the greater the diffusion
coefficient of the membrane protein, the faster the recovery (Movie 10.6).
Cells Can Confine Proteins and Lipids to Specific Domains Within
a Membrane 


The recognition that biological membranes are two-dimensional fluids was a 
major advance in understanding membrane structure and function. It has become 
clear, however, that the picture of a membrane as a lipid sea in which all proteins
float freely is greatly oversimplified. Most cells confine membrane proteins to spe- 
cific regions in a continuous lipid bilayer. We have already discussed how bac- 
teriorhodopsin molecules in the purple membrane of Halobacterium assemble 
into large two-dimensional crystals, in which individual protein molecules are 
relatively fixed in relationship to one another (see Figure 10-30). ATP synthase 
complexes in the inner mitochondrial membrane also associate into long double 
rows, as we discuss in Chapter 14 (see Figure 14-32). Large aggregates of this kind 
diffuse very slowly. 

In epithelial cells, such as those that line the gut or the tubules of the kidney, 
certain plasma membrane enzymes and transport proteins are confined to the 
apical surface of the cells, whereas others are confined to the basal and lateral 
surfaces (Figure 10-34). This asymmetric distribution of membrane proteins is 
often essential for the function of the epithelium, as we discuss in Chapter 11 (see 
Figure 11-11). The lipid compositions of these two membrane domains are also 
different, demonstrating that epithelial cells can prevent the diffusion of lipid as 
well as protein molecules between the domains. The barriers set up by a specific 
type of intercellular junction (called a tight junction, discussed in Chapter 19; 
see Figure 19-18) maintain the separation of both protein and lipid molecules. 
Clearly, the membrane proteins that form these intercellular junctions cannot be 
allowed to diffuse laterally in the interacting membranes. 

A cell can also create membrane domains without using intercellular junc- 
tions. As we already discussed, regulated protein-protein interactions in mem- 
branes are thought to create nanoscale raft domains that function in signaling 
and membrane trafficking. A more extreme example is seen in the mammalian 
spermatozoon, a single cell that consists of several structurally and functionally 
distinct parts covered by a continuous plasma membrane. When a sperm cell is 
examined by immunofluorescence microscopy with a variety of antibodies, each 
of which reacts with a specific cell-surface molecule, the plasma membrane is 
found to consist of at least three distinct domains (Figure 10-35). Some of the 
membrane molecules are able to diffuse freely within the confines of their own 
domain. The molecular nature of the “fence” that prevents the molecules from
leaving their domain is not known. Many other cells have similar membrane
fences that confine membrane protein diffusion to certain membrane domains.
The plasma membrane of nerve cells, for example, contains a domain enclosing
the cell body and dendrites, and another enclosing the axon; it is thought that a
belt of actin filaments tightly associated with the plasma membrane at the
cell-body-axon junction forms part of the barrier.

Figure 10-36 shows four common ways of immobilizing specific membrane
proteins through protein-protein interactions.

[Figure 10-34 labels: protein A, tight junction, apical plasma membrane,
protein B, lateral plasma membrane, basal plasma membrane, basal lamina]

Figure 10-34 How membrane molecules can be restricted to a particular membrane domain.
In this drawing of an epithelial cell, protein A (in the apical domain of the plasma membrane)
and protein B (in the basal and lateral domains) can diffuse laterally in their own domains but are
prevented from entering the other domain, at least partly by the specialized cell-cell junction called
a tight junction. Lipid molecules in the outer (extracellular) monolayer of the plasma membrane
are likewise unable to diffuse between the two domains; lipids in the inner (cytosolic) monolayer,
however, are able to do so (not shown). The basal lamina is a thin mat of extracellular matrix that
separates epithelial sheets from other tissues (discussed in Chapter 19).


The Cortical Cytoskeleton Gives Membranes Mechanical Strength 
and Restricts Membrane Protein Diffusion 


As shown in Figure 10-36B and C, a common way in which a cell restricts the lat- 
eral mobility of specific membrane proteins is to tether them to macromolecular 
assemblies on either side of the membrane. The characteristic biconcave shape of 
a red blood cell (Figure 10-37), for example, results from interactions of its plasma 
membrane proteins with an underlying cytoskeleton, which consists mainly of a 
meshwork of the filamentous protein spectrin. Spectrin is a long, thin, flexible 
rod about 100 nm in length. As the principal component of the red cell cytoskel- 
eton, it maintains the structural integrity and shape of the plasma membrane, 
which is the red cell’s only membrane, as the cell has no nucleus or other organ- 
elles. The spectrin cytoskeleton is riveted to the membrane through various mem- 
brane proteins. The final result is a deformable, netlike meshwork that covers 
the entire cytosolic surface of the red cell membrane (Figure 10-38). This spec- 
trin-based cytoskeleton enables the red cell to withstand the stress on its mem- 
brane as it is forced through narrow capillaries. Mice and humans with genetic 
abnormalities in spectrin are anemic and have red cells that are spherical (instead 
of concave) and fragile; the severity of the anemia increases with the degree of 
spectrin deficiency. 
Figure 10-35 Three domains in the plasma membrane of a guinea pig sperm.
(A) A drawing of a guinea pig sperm. (B-D) In the three pairs of micrographs,
phase-contrast micrographs are on the left, and the same cell is shown with
cell-surface immunofluorescence staining on the right. Different monoclonal
antibodies selectively label cell-surface molecules on (B) the anterior head,
(C) the posterior head, and (D) the tail. (Micrographs courtesy of Selena
Carroll and Diana Myles.)

Figure 10-36 Four ways of restricting the lateral mobility of specific plasma
membrane proteins. (A) The proteins can self-assemble into large aggregates
(as seen for bacteriorhodopsin in the purple membrane of Halobacterium
salinarum); they can be tethered by interactions with assemblies of
macromolecules (B) outside or (C) inside the cell; or (D) they can interact
with proteins on the surface of another cell.

Figure 10-37 A scanning electron micrograph of human red blood cells. The
cells have a biconcave shape and lack a nucleus and other organelles
(Movie 10.7). (Courtesy of Bernadette Chailley.)





An analogous but much more elaborate and highly dynamic cytoskeletal
network exists beneath the plasma membrane of most other cells in our body.
This network, which constitutes the cortex of the cell, is rich in actin
filaments, which are attached to the plasma membrane in numerous ways. The
dynamic remodeling of the cortical actin network provides a driving force for
many essential cell functions, including cell movement, endocytosis, and the
formation of transient, mobile plasma membrane structures such as filopodia
and lamellipodia, discussed in Chapter 16. The cortex of nucleated cells also
contains proteins that are structurally homologous to spectrin and the other
components of the red cell cytoskeleton. We discuss the cortical cytoskeleton
in nucleated cells and its interactions with the plasma membrane in
Chapter 16.

Figure 10-38 The spectrin-based cytoskeleton on the cytosolic side of the human
red blood cell plasma membrane. (A) The arrangement shown in the drawing has been
deduced mainly from studies on the interactions of purified proteins in vitro. Spectrin
heterodimers (enlarged in the drawing on the right) are linked together into a netlike
meshwork by “junctional complexes” (enlarged in the drawing on the left). Each spectrin
heterodimer consists of two antiparallel, loosely intertwined, flexible polypeptide chains called
α and β. The two spectrin chains are attached noncovalently to each other at multiple points,
including at both ends. Both the α and β chains are composed largely of repeating domains.
Two spectrin heterodimers join end-to-end to form tetramers.
The junctional complexes are composed of short actin filaments (containing 13 actin
monomers) and these proteins: band 4.1, adducin, and a tropomyosin molecule that
probably determines the length of the actin filaments. The cytoskeleton is linked to the
membrane through two transmembrane proteins: a multipass protein called band 3 and a
single-pass protein called glycophorin. The spectrin tetramers bind to some band 3 proteins
via ankyrin molecules, and to glycophorin and band 3 (not shown) via band 4.1 proteins.
(B) The electron micrograph shows the cytoskeleton on the cytosolic side of a red
blood cell membrane after fixation and negative staining. The spectrin meshwork has been
purposely stretched out to allow the details of its structure to be seen. In a normal cell, the
meshwork shown would be much more crowded and occupy only about one-tenth of this
area. (B, courtesy of T. Byers and D. Branton, Proc. Natl Acad. Sci. USA 82:6153-6157,
1985. With permission from The National Academy of Sciences.)

The cortical cytoskeletal network restricts the diffusion of more than just the
plasma membrane proteins that are directly anchored to it. Because the cytoskeletal
filaments are often closely apposed to the cytosolic surface of the plasma membrane,
they can form mechanical barriers that obstruct the free diffusion of proteins in 
the membrane. These barriers partition the membrane into small domains, or 
corrals (Figure 10-39A), which can be either permanent, as in the sperm (see 
Figure 10-35), or transient. The barriers can be detected when the diffusion of 
individual membrane proteins is followed by high-speed, single-particle tracking. 
The proteins diffuse rapidly but are confined within an individual corral (Figure 
10-39B); occasionally, however, thermal motions cause a few cortical filaments 
to detach transiently from the membrane, allowing the protein to escape into an 
adjacent corral. 
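
The effect of corralling on long-range diffusion can be illustrated with a deliberately simple lattice model, sketched below in Python. Each walker takes unit steps on a square grid that is divided into square corrals; a step that would cross a corral boundary succeeds only with a small probability, standing in for the occasional transient detachment of a cortical filament. The corral size, the hop probability, and the numbers of walkers and steps are all hypothetical, and the model is a toy rather than a description of any real membrane.

```python
import random

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def corralled_msd(n_walkers, n_steps, corral_size, p_hop, seed=0):
    """Mean squared displacement of lattice walkers whose moves across a
    corral boundary succeed only with probability p_hop."""
    rng = random.Random(seed)
    start = corral_size // 2  # begin in the middle of a corral
    total = 0.0
    for _walker in range(n_walkers):
        x = y = start
        for _step in range(n_steps):
            dx, dy = rng.choice(STEPS)
            nx, ny = x + dx, y + dy
            crosses = (nx // corral_size != x // corral_size
                       or ny // corral_size != y // corral_size)
            if not crosses or rng.random() < p_hop:
                x, y = nx, ny  # otherwise the walker stays where it is
        total += (x - start) ** 2 + (y - start) ** 2
    return total / n_walkers


# p_hop = 1.0 removes the barriers entirely (free diffusion).
free = corralled_msd(200, 1000, corral_size=10, p_hop=1.0)
hop = corralled_msd(200, 1000, corral_size=10, p_hop=0.01)
print(f"MSD free: {free:.0f}, MSD with corrals: {hop:.0f} (lattice units squared)")
```

In this model the walkers move just as quickly within a corral as they do in the absence of barriers, yet their net displacement over long times is much smaller, which mirrors the behavior seen in high-speed single-particle tracking.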

The extent to which a transmembrane protein is confined within a corral depends 
on its association with other proteins and the size of its cytoplasmic domain; pro- 
teins with a large cytosolic domain will have a harder time passing through cyto- 
skeletal barriers. When a cell-surface receptor binds its extracellular signal mole- 
cules, for example, large protein complexes build up on the cytosolic domain of 
the receptor, making it more difficult for the receptor to escape from its corral. It is 
thought that corralling helps concentrate such signaling complexes, increasing the 
speed and efficiency of the signaling process (discussed in Chapter 15). 


Membrane-bending Proteins Deform Bilayers 


Cell membranes assume many different shapes, as illustrated by the elaborate 
and varied structures of cell-surface protrusions and membrane-enclosed organ- 
elles in eukaryotic cells. Flat sheets, narrow tubules, round vesicles, fenestrated 
sheets, and pitta bread-shaped cisternae are all part of the repertoire: often, a vari- 
ety of shapes will be present in different regions of the same continuous bilayer. 
Membrane shape is controlled dynamically, as many essential cell processes— 
including vesicle budding, cell movement, and cell division—require elaborate 
transient membrane deformations. In many cases, membrane shape is influenced
by dynamic pushing and pulling forces exerted by cytoskeletal or extracellular
structures, as we discuss in Chapters 13 and 16. A crucial part in producing
these deformations is played by membrane-bending proteins, which control 
local membrane curvature. Often, cytoskeletal dynamics and membrane-bend- 
ing-protein forces work together. Membrane-bending proteins attach to specific 
membrane regions as needed and act by one or more of three principal mecha- 
nisms: 


1. Some insert hydrophobic protein domains or attached lipid anchors into
one of the leaflets of a lipid bilayer. Increasing the area of only one leaflet
causes the membrane to bend (Figure 10-40B). The proteins that shape the
convoluted network of narrow ER tubules are thought to work in this way.


2. Some membrane-bending proteins form rigid scaffolds that deform the
membrane or stabilize an already bent membrane (Figure 10-40C). The
coat proteins that shape the budding vesicles in intracellular transport fall
into this class.


3. Some membrane-bending proteins cause particular membrane lipids
to cluster together, thereby inducing membrane curvature. The ability of
a lipid to induce positive or negative membrane curvature is determined
by the relative cross-sectional areas of its head group and its hydrocarbon
tails. For example, the large head group of phosphoinositides makes these
lipid molecules wedge-shaped, and their accumulation in a domain of one
leaflet of a bilayer therefore induces positive curvature (Figure 10-40D). By
contrast, phospholipases that remove lipid head groups produce inversely
shaped lipid molecules that induce negative curvature.


Figure 10-39 Corralling plasma membrane proteins by cortical cytoskeletal
filaments. (A) The filaments are thought to provide diffusion barriers that
divide the membrane into small domains, or corrals. (B) High-speed,
single-particle tracking was used to follow the paths of single fluorescently
labeled membrane proteins of one type over time. The trace shows that the
individual protein molecules (the movement of each shown in a different color)
diffuse within a tightly delimited membrane domain and only infrequently
escape into a neighboring domain. (Adapted from A. Kusumi et al., Annu. Rev.
Biophys. Biomol. Struct. 34:351-378, 2005. With permission from Annual
Reviews.)

Figure 10-40 Three ways in which membrane-bending proteins shape membranes. Lipid bilayers are blue and
proteins are green. (A) Bilayer without protein bound. (B) A hydrophobic region of the protein can insert as a wedge into
one monolayer to pry lipid head groups apart. Such regions can either be amphiphilic helices as shown or hydrophobic
hairpins. (C) The curved surface of the protein can bind to lipid head groups and deform the membrane or stabilize its
curvature. (D) A protein can bind to and cluster lipids that have large head groups and thereby bend the membrane.
(Adapted from W.A. Prinz and J.E. Hinshaw, Crit. Rev. Biochem. Mol. Biol. 44:278-291, 2009.)


Often, different membrane-bending proteins collaborate to achieve a particular 
curvature, as in shaping a budding transport vesicle, as we discuss in Chapter 13. 
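
The reasoning behind the third mechanism above, in which the shape of a lipid determines the sign of the curvature it favors, can be summarized in a small Python sketch. The function simply compares the cross-sectional area of the head group with that of the hydrocarbon tails. The function name, the tolerance, and the area values are hypothetical and were chosen only to produce the three cases; they are not measurements.

```python
def predicted_curvature(head_area_nm2, tail_area_nm2, tolerance=0.05):
    """Classify a lipid by comparing head-group and tail cross-sections."""
    ratio = head_area_nm2 / tail_area_nm2
    if ratio > 1.0 + tolerance:
        return "positive"  # wedge-shaped: head group wider than the tails
    if ratio < 1.0 - tolerance:
        return "negative"  # inversely shaped: tails wider than the head group
    return "none"          # roughly cylindrical: favors a flat leaflet


# Hypothetical cross-sectional areas (nm^2), for illustration only.
examples = {
    "large head group (such as a phosphoinositide)": (0.90, 0.60),
    "head group and tails matched": (0.60, 0.60),
    "head group removed by a phospholipase": (0.30, 0.60),
}
for name, (head, tail) in examples.items():
    print(f"{name}: {predicted_curvature(head, tail)} curvature")
```

In a cell, the outcome also depends on how many such lipids accumulate in one leaflet and on the proteins that cluster them, so the classification should be read as a qualitative guide only.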


Summary 


Whereas the lipid bilayer determines the basic structure of biological membranes, 
proteins are responsible for most membrane functions, serving as specific recep- 
tors, enzymes, transporters, and so on. Transmembrane proteins extend across the 
lipid bilayer. Some of these membrane proteins are single-pass proteins, in which 
the polypeptide chain crosses the bilayer as a single α helix. Others are multipass
proteins, in which the polypeptide chain crosses the bilayer multiple times—either 
as a series of α helices or as a β sheet rolled up into the shape of a barrel. All
proteins responsible for the transport of ions and other small water-soluble molecules
through the membrane are multipass proteins. Some membrane proteins do not 
span the bilayer but instead are attached to either side of the membrane: some are 
attached to the cytosolic side by an amphipathic α helix on the protein surface or by
the covalent attachment of one or more lipid chains, others are attached to the non- 
cytosolic side by a GPI anchor. Some membrane-associated proteins are bound by 
noncovalent interactions with transmembrane proteins. In the plasma membrane 
of all eukaryotic cells, most of the proteins exposed on the cell surface and some of 
the lipid molecules in the outer lipid monolayer have oligosaccharide chains cova- 
lently attached to them. Like the lipid molecules in the bilayer, many membrane 
proteins are able to diffuse rapidly in the plane of the membrane. However, cells 
have ways of immobilizing specific membrane proteins, as well as ways of confining 
both membrane protein and lipid molecules to particular domains in a continuous 
lipid bilayer. The dynamic association of membrane-bending proteins confers on 
membranes their characteristic three-dimensional shapes. 


WHAT WE DON’T KNOW 


• Given the highly complex lipid
composition of cell membranes,
what are the variations within
different organelle membranes in an
animal cell? What are the functional
consequences of these differences,
and what are the roles of the minor
lipid species?


• Is the biophysical tendency of lipids
to partition into separate phases within
a lipid bilayer functionally utilized in cell
membranes? If so, how is it regulated
and what membrane functions does it
control?


• How commonly do specific lipid
molecules associate with membrane
proteins to regulate their function?


• Given that the structure of only a tiny
fraction of all membrane proteins has
been determined, what new principles
of membrane protein structure remain
to be discovered?


CHAPTER 10 END-OF-CHAPTER PROBLEMS 


PROBLEMS 


Which statements are true? Explain why or why not. 


10-1 Although lipid molecules are free to diffuse in the 
plane of the bilayer, they cannot flip-flop across the bilayer 
unless enzyme catalysts called phospholipid translocators 
are present in the membrane. 


10-2 Whereas all the carbohydrate in the plasma mem- 
brane faces outward on the external surface of the cell, all 
the carbohydrate on internal membranes faces toward the 
cytosol. 


10-3 Although membrane domains with different pro- 
tein compositions are well known, there are at present no 
examples of membrane domains that differ in lipid com- 
position. 


Discuss the following problems. 


10-4 When a lipid bilayer is torn, why does it not seal
itself by forming a “hemi-micelle” cap at the edges, as
shown in Figure Q10-1?

Figure Q10-1 A torn lipid bilayer sealed with a hypothetical “hemi-micelle”
cap (Problem 10-4).


10-5 Margarine is made from vegetable oil by a chem- 
ical process. Do you suppose this process converts satu- 
rated fatty acids to unsaturated ones, or vice versa? Explain 
your answer. 


10-6 If a lipid raft is typically 70 nm in diameter and
each lipid molecule has a diameter of 0.5 nm, about how
many lipid molecules would there be in a lipid raft
composed entirely of lipid? At a ratio of 50 lipid molecules
per protein molecule (50% protein by mass), how many
proteins would be in a typical raft? (Neglect the loss of lipid
from the raft that would be required to accommodate the
protein.)


10-7 Monomeric single-pass transmembrane proteins
span a membrane with a single α helix that has characteristic
chemical properties in the region of the bilayer. Which
of the three 20-amino-acid sequences listed below is the
most likely candidate for such a transmembrane segment?
Explain the reasons for your choice. (See back of book for
one-letter amino acid code; FAMILY VW is a convenient
mnemonic for hydrophobic amino acids.)


A. ITLIYFGVMAGVIGTILLIS
B. ITPIYFGPMAGVIGTPLLIS
C. ITEIYFGRMAGVIGTDLLIS


10-8 You are studying the binding of proteins to the 
cytoplasmic face of cultured neuroblastoma cells and 
have found a method that gives a good yield of inside-out 
vesicles from the plasma membrane. Unfortunately, your 
preparations are contaminated with variable amounts of 
right-side-out vesicles. Nothing you have tried avoids this 
problem. A friend suggests that you pass your vesicles over 
an affinity column made of lectin coupled to solid beads. 
What is the point of your friend’s suggestion? 


10-9 Glycophorin, a protein in the plasma membrane 
of the red blood cell, normally exists as a homodimer that 
is held together entirely by interactions between its trans- 
membrane domains. Since transmembrane domains are 
hydrophobic, how is it that they can associate with one 
another so specifically? 


10-10 Three mechanisms by which membrane-bind- 
ing proteins bend a membrane are illustrated in Figure 
Q10-2A, B, and C. As shown, each of these cytosolic mem- 
brane-bending proteins would induce an invagination of 
the plasma membrane. Could similar kinds of cytosolic 
proteins induce a protrusion of the plasma membrane 
(Figure Q10-2D)? Which ones? Explain how they might 
work. 


Figure Q10-2 Bending of the plasma membrane by cytosolic proteins
(Problem 10-10). (A) Insertion of a protein “finger” into the cytosolic
leaflet of the membrane. (B) Binding of lipids to the curved surface of
a membrane-binding protein. (C) Binding of membrane proteins to
membrane lipids with large head groups. (D) A segment of the plasma
membrane showing a protrusion.


CHAPTER 11

Membrane Transport of Small
Molecules and the Electrical
Properties of Membranes


IN THIS CHAPTER

PRINCIPLES OF MEMBRANE TRANSPORT

TRANSPORTERS AND ACTIVE MEMBRANE TRANSPORT

CHANNELS AND THE ELECTRICAL PROPERTIES OF MEMBRANES


Because of its hydrophobic interior, the lipid bilayer of cell membranes restricts
the passage of most polar molecules. This barrier function allows the cell to maintain
concentrations of solutes in its cytosol that differ from those in the extracellular
fluid and in each of the intracellular membrane-enclosed compartments.
To benefit from this barrier, however, cells have had to evolve ways of transferring
specific water-soluble molecules and ions across their membranes in order to
ingest essential nutrients, excrete metabolic waste products, and regulate
intracellular ion concentrations. Cells use specialized membrane transport proteins to
accomplish this goal. The importance of such small molecule transport is reflected
in the large number of genes in all organisms that code for the transmembrane
transport proteins involved, which make up 15–30% of the membrane proteins in
all cells. Some mammalian cells, such as nerve and kidney cells, devote up to
two-thirds of their total metabolic energy consumption to such transport processes.
Cells can also transfer macromolecules and even large particles across their 
membranes, but the mechanisms involved in most of these cases differ from 
those used for transferring small molecules, and they are discussed in Chapters 
12 and 13. 
We begin this chapter by describing some general principles of how small 
water-soluble molecules traverse cell membranes. We then consider, in turn, the 
two main classes of membrane proteins that mediate this transmembrane traffic: 
transporters, which undergo sequential conformational changes to transport spe- 
cific small molecules across membranes, and channels, which form narrow pores, 
allowing passive transmembrane movement, primarily of water and small inor- 
ganic ions. Transporters can be coupled to a source of energy to catalyze active 
transport, which, together with selective passive permeability, creates large
differences in the composition of the cytosol compared with that of either the
extracellular fluid (Table 11-1) or the fluid within membrane-enclosed organelles. By
generating inorganic ion-concentration differences across the lipid bilayer, cell 
membranes can store potential energy in the form of electrochemical gradients, 
which drive various transport processes, convey electrical signals in electrically 
excitable cells, and (in mitochondria, chloroplasts, and bacteria) make most of 
the cell’s ATP. We focus our discussion mainly on transport across the plasma 
membrane, but similar mechanisms operate across the other membranes of the 
eukaryotic cell, as discussed in later chapters. 
In the last part of the chapter, we concentrate mainly on the functions of ion 
channels in neurons (nerve cells). In these cells, channel proteins perform at their 
highest level of sophistication, enabling networks of neurons to carry out all the 
astonishing feats your brain is capable of. 


PRINCIPLES OF MEMBRANE TRANSPORT 


We begin this section by describing the permeability properties of protein-free, 
synthetic lipid bilayers. We then introduce some of the terms used to describe the 
various forms of membrane transport and some strategies for characterizing the 
proteins and processes involved.

TABLE 11-1

*The cell must contain equal quantities of positive and negative charges (that is, it must be
electrically neutral). Thus, in addition to Cl⁻, the cell contains many other anions not listed in
this table; in fact, most cell constituents are negatively charged (HCO₃⁻, PO₄³⁻, nucleic acids,
metabolites carrying phosphate and carboxyl groups, etc.). The concentrations of
Ca²⁺ and Mg²⁺ given are for the free ions: although there is a total of about 20 mM Mg²⁺ and
1–2 mM Ca²⁺ in cells, both ions are mostly bound to other substances (such as proteins, free
nucleotides, RNA, etc.) and, for Ca²⁺, stored within various organelles.





Protein-Free Lipid Bilayers Are Impermeable to Ions


Given enough time, virtually any molecule will diffuse across a protein-free lipid 
bilayer down its concentration gradient. The rate of diffusion, however, varies 
enormously, depending partly on the size of the molecule but mostly on its rela- 
tive hydrophobicity (solubility in oil). In general, the smaller the molecule and the 
more hydrophobic, or nonpolar, it is, the more easily it will diffuse across a lipid 
bilayer. Small nonpolar molecules, such as O₂ and CO₂, readily dissolve in lipid
bilayers and therefore diffuse rapidly across them. Small uncharged polar mole- 
cules, such as water or urea, also diffuse across a bilayer, albeit much more slowly 
(Figure 11-1 and see Movie 10.3). By contrast, lipid bilayers are essentially imper- 
meable to charged molecules (ions), no matter how small: the charge and high 
degree of hydration of such molecules prevent them from entering the hydrocarbon
phase of the bilayer (Figure 11-2).


There Are Two Main Classes of Membrane Transport Proteins: 
Transporters and Channels 


Like synthetic lipid bilayers, cell membranes allow small nonpolar molecules to 
permeate by diffusion. Cell membranes, however, also have to allow the passage 
of various polar molecules, such as ions, sugars, amino acids, nucleotides, water, 
and many cell metabolites that cross synthetic lipid bilayers only very slowly. Spe- 
cial membrane transport proteins transfer such solutes across cell membranes. 
These proteins occur in many forms and in all types of biological membranes. Each 
protein often transports only a specific molecular species or sometimes a class of 
molecules (such as ions, sugars, or amino acids). Studies in the 1950s found that 
bacteria with a single-gene mutation were unable to transport sugars across their 
plasma membrane, thereby demonstrating the specificity of membrane transport 
proteins. We now know that humans with similar mutations suffer from various 
inherited diseases that hinder the transport of a specific solute or solute class in 
the kidney, intestine, or other cell type. Individuals with the inherited disease
cystinuria, for example, cannot transport certain amino acids (including cystine, the
disulfide-linked dimer of cysteine) from either the urine or the intestine into the
blood; the resulting accumulation of cystine in the urine leads to the formation of
cystine stones in the kidneys.

Figure 11-1 The relative permeability of a synthetic lipid bilayer to different
classes of molecules. The smaller the molecule and, more importantly, the less
strongly it associates with water, the more rapidly the molecule diffuses across the
bilayer.

Figure 11-2 Permeability coefficients for the passage of various
molecules through synthetic lipid bilayers. The rate of flow of a solute
across the bilayer is directly proportional to the difference in its concentration
on the two sides of the membrane. Multiplying this concentration difference (in
mol/cm³) by the permeability coefficient (in cm/sec) gives the flow of solute in
moles per second per square centimeter of bilayer. A concentration difference
of tryptophan of 10⁻⁴ mol/cm³ (10⁻⁴ mol/10⁻³ L = 0.1 M), for example, would
cause a flow of 10⁻⁴ mol/cm³ × 10⁻⁷ cm/sec = 10⁻¹¹ mol/sec through 1 cm²
of bilayer, or 6 × 10⁴ molecules/sec through 1 μm² of bilayer.
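
The flow calculation described in the caption of Figure 11-2 can be reproduced in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the inputs are the tryptophan values quoted in that caption (a permeability coefficient of 10⁻⁷ cm/sec and a concentration difference of 0.1 M).

```python
AVOGADRO = 6.022e23  # molecules per mole


def solute_flux(permeability_cm_per_s, conc_diff_molar):
    """Return the solute flow in mol per second per cm^2 of bilayer.

    Flow = permeability coefficient (cm/sec) x concentration difference (mol/cm^3).
    """
    conc_diff_mol_per_cm3 = conc_diff_molar * 1e-3  # 1 M = 1 mol/L = 1e-3 mol/cm^3
    return permeability_cm_per_s * conc_diff_mol_per_cm3


def molecules_per_um2_per_s(flux_mol_per_cm2_s):
    """Convert mol/(cm^2 sec) to molecules/(um^2 sec); 1 um^2 = 1e-8 cm^2."""
    return flux_mol_per_cm2_s * AVOGADRO * 1e-8


# Tryptophan values quoted in the Figure 11-2 caption.
flux = solute_flux(permeability_cm_per_s=1e-7, conc_diff_molar=0.1)
print(f"{flux:.1e} mol/sec through 1 cm^2 of bilayer")
print(f"{molecules_per_um2_per_s(flux):.1e} molecules/sec through 1 um^2 of bilayer")
```

Because the flow is simply proportional to both inputs, halving the concentration difference halves the flow, which is the linear behavior expected of simple diffusion.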

All membrane transport proteins that have been studied in detail are multi- 
pass transmembrane proteins—that is, their polypeptide chains traverse the lipid 
bilayer multiple times. By forming a protein-lined pathway across the membrane, 
these proteins enable specific hydrophilic solutes to cross the membrane without 
coming into direct contact with the hydrophobic interior of the lipid bilayer. 

Transporters and channels are the two major classes of membrane transport 
proteins (Figure 11-3). Transporters (also called carriers, or permeases) bind the 
specific solute to be transported and undergo a series of conformational changes 
that alternately expose solute-binding sites on one side of the membrane and 
then on the other to transfer the solute across it. Channels, by contrast, interact 
with the solute to be transported much more weakly. They form continuous pores 
that extend across the lipid bilayer. When open, these pores allow specific solutes 
(such as inorganic ions of appropriate size and charge and in some cases small 
molecules, including water, glycerol, and ammonia) to pass through them and 
thereby cross the membrane. Not surprisingly, transport through channels occurs 
at a much faster rate than transport mediated by transporters. Although water can 
slowly diffuse across synthetic lipid bilayers, cells use dedicated channel proteins 
(called water channels, or aquaporins) that greatly increase the permeability of 
their membranes to water, as we discuss later. 


Active Transport Is Mediated by Transporters Coupled to an 
Energy Source 


All channels and many transporters allow solutes to cross the membrane only 
passively (“downhill”), a process called passive transport. In the case of transport 
of a single uncharged molecule, the difference in the concentration on the two 
sides of the membrane—its concentration gradient—drives passive transport and 
determines its direction (Figure 11-4A). If the solute carries a net charge, how- 
ever, both its concentration gradient and the electrical potential difference across 
the membrane, the membrane potential, influence its transport. The concentra- 
tion gradient and the electrical gradient combine to form a net driving force, the 
electrochemical gradient, for each charged solute (Figure 11-4B). We discuss 
electrochemical gradients in more detail later and in Chapter 14. In fact, almost all 
plasma membranes have an electrical potential (i.e., a voltage) across them, with 
the inside usually negative with respect to the outside. This potential favors the 
entry of positively charged ions into the cell but opposes the entry of negatively 
charged ions (see Figure 11-4B); it also opposes the efflux of positively charged
ions.

Figure 11-3 Transporters and channel proteins. (A) A transporter alternates
between two conformations, so that the solute-binding site is sequentially
accessible on one side of the bilayer and then on the other. (B) In contrast, a
channel protein forms a pore across the bilayer through which specific solutes can
passively diffuse.

As shown in Figure 11-4A, in addition to passive transport, cells need to be
able to actively pump certain solutes across the membrane “uphill,” against their 
electrochemical gradients. Such active transport is mediated by transporters 
whose pumping activity is directional because it is tightly coupled to a source of 
metabolic energy, such as an ion gradient or ATP hydrolysis, as discussed later. 
Transmembrane movement of small molecules mediated by transporters can be 
either active or passive, whereas that mediated by channels is always passive (see 
Figure 11-4A). 
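
The way a concentration gradient and a membrane potential combine into a single driving force can be made concrete with a short calculation. The sketch below is not taken from this chapter: it assumes the standard thermodynamic expression ΔG = RT ln(c_in/c_out) + zFV for moving one mole of an ion into the cell, and the ion concentrations and the −70 mV membrane potential are illustrative values chosen for the example. The function name is hypothetical.

```python
import math

R = 8.314    # gas constant, J/(mol K)
F = 96485.0  # Faraday constant, C/mol


def inward_free_energy(c_in, c_out, z, v_membrane, temp_k=310.15):
    """Free-energy change (J/mol) for moving one mole of an ion INTO the cell.

    c_in, c_out : concentrations inside and outside (same units)
    z           : charge on the ion (+1 for Na+ or K+)
    v_membrane  : membrane potential in volts, inside relative to outside
    A negative result means inward movement is energetically favorable.
    """
    chemical = R * temp_k * math.log(c_in / c_out)  # concentration-gradient term
    electrical = z * F * v_membrane                 # membrane-potential term
    return chemical + electrical


# Illustrative (assumed) values: concentrations in mM, potential of -70 mV.
dg_na = inward_free_energy(c_in=10.0, c_out=145.0, z=+1, v_membrane=-0.070)
dg_k = inward_free_energy(c_in=140.0, c_out=5.0, z=+1, v_membrane=-0.070)

print(f"Na+ entry: {dg_na / 1000:.1f} kJ/mol")
print(f"K+ entry:  {dg_k / 1000:.1f} kJ/mol")
```

With these assumed numbers the two terms add for Na⁺, so the ion is strongly driven inward, whereas for K⁺ they oppose each other and nearly cancel, matching the two situations described for Figure 11-4B.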


Summary 


Lipid bilayers are virtually impermeable to most polar molecules. To transport 
small water-soluble molecules into or out of cells or intracellular membrane-en- 
closed compartments, cell membranes contain various membrane transport pro- 
teins, each of which is responsible for transferring a particular solute or class of 
solutes across the membrane. There are two classes of membrane transport pro- 
teins—transporters and channels. Both form protein pathways across the lipid 
bilayer. Whereas transmembrane movement mediated by transporters can be 
either active or passive, solute flow through channel proteins is always passive. Both 
active and passive ion transport is influenced by the ion's concentration gradient 
and the membrane potential—that is, its electrochemical gradient. 


TRANSPORTERS AND ACTIVE MEMBRANE 
TRANSPORT 


The process by which a transporter transfers a solute molecule across the lipid 
bilayer resembles an enzyme-substrate reaction, and in many ways transporters 
behave like enzymes. By contrast to ordinary enzyme-substrate reactions, how- 
ever, the transporter does not modify the transported solute but instead delivers it 
unchanged to the other side of the membrane. 

Figure 11-4 … gradient. (B) The electrochemical gradient
of a charged solute (an ion) affects its
transport. This gradient combines the
membrane potential and the concentration
gradient of the solute. The electrical and
chemical gradients can work additively to
increase the driving force on an ion across
the membrane (middle) or can work against
each other (right).

Each type of transporter has one or more specific binding sites for its solute
(substrate). It transfers the solute across the lipid bilayer by undergoing reversible
conformational changes that alternately expose the solute-binding site first on
one side of the membrane and then on the other—but never on both sides at the 
same time. The transition occurs through an intermediate state in which the sol- 
ute is inaccessible, or occluded, from either side of the membrane (Figure 11-5). 
When the transporter is saturated (that is, when all solute-binding sites are occu- 
pied), the rate of transport is maximal. This rate, referred to as Vmax (V for veloc- 
ity), is characteristic of the specific carrier. Vmax measures the rate at which the 
carrier can flip between its conformational states. In addition, each transporter 
has a characteristic affinity for its solute, reflected in the Km of the reaction, which 
is equal to the concentration of solute when the transport rate is half its maximum 
value (Figure 11-6). As with enzymes, the binding of solute can be blocked by 
either competitive inhibitors (which compete for the same binding site and may 
or may not be transported) or noncompetitive inhibitors (which bind elsewhere 
and alter the structure of the transporter). 
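
The saturation behavior described above can be sketched numerically. The example assumes the simple hyperbolic (Michaelis–Menten-type) rate law that is consistent with the definitions of Vmax and Km given in the text; the function names and all numerical values are hypothetical and serve only to show the shape of the two curves.

```python
def transporter_rate(conc, vmax, km):
    """Saturable transport: the rate approaches vmax as conc increases."""
    return vmax * conc / (km + conc)


def simple_diffusion_rate(conc, k):
    """Non-saturable transport: the rate is directly proportional to conc."""
    return k * conc


VMAX = 100.0  # hypothetical maximal transport rate, arbitrary units
KM = 2.0      # hypothetical solute concentration giving half-maximal rate, mM
K_DIFF = 5.0  # hypothetical proportionality constant for simple diffusion

for conc in (0.5, 2.0, 20.0, 200.0):
    print(conc,
          round(transporter_rate(conc, VMAX, KM), 1),
          simple_diffusion_rate(conc, K_DIFF))
```

At a solute concentration equal to Km the transporter runs at exactly half of Vmax, and further increases in concentration bring it ever closer to Vmax without exceeding it, while the diffusion rate keeps rising in proportion.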

As we discuss shortly, it requires only a relatively minor modification of the 
model shown in Figure 11-5 to link a transporter to a source of energy in order 
to pump a solute uphill against its electrochemical gradient. Cells carry out such 
active transport in three main ways (Figure 11-7): 


1. Coupled transporters harness the energy stored in concentration gradients 
to couple the uphill transport of one solute across the membrane to the 
downhill transport of another. 


2. ATP-driven pumps couple uphill transport to the hydrolysis of ATP. 


3. Light- or redox-driven pumps, which are known in bacteria, archaea, mito- 
chondria, and chloroplasts, couple uphill transport to an input of energy 
from light, as with bacteriorhodopsin (discussed in Chapter 10), or from a 
redox reaction, as with cytochrome c oxidase (discussed in Chapter 14). 


Amino acid sequence and three-dimensional structure comparisons suggest 
that, in many cases, there are strong similarities in structure between transport- 
ers that mediate active transport and those that mediate passive transport. Some 
bacterial transporters, for example, that use the energy stored in the H⁺ gradient
across the plasma membrane to drive the active uptake of various sugars are
structurally similar to the transporters that mediate passive glucose transport 
into most animal cells. This suggests an evolutionary relationship between vari- 
ous transporters. Given the importance of small metabolites and sugars as energy 
sources, it is not surprising that the superfamily of transporters is an ancient one. 

We begin our discussion of active membrane transport by considering a class 
of coupled transporters that are driven by ion concentration gradients. These pro- 
teins have a crucial role in the transport of small metabolites across membranes 
in all cells. We then discuss ATP-driven pumps, including the Na⁺-K⁺ pump that is
found in the plasma membrane of most animal cells. Examples of the third class 
of active transport—light- or redox-driven pumps—are discussed in Chapter 14. 


Active Transport Can Be Driven by Ion-Concentration Gradients


Some transporters simply passively mediate the movement of a single solute from
one side of the membrane to the other at a rate determined by their Vmax and
Km; they are called uniporters. Others function as coupled transporters, in which
the transfer of one solute strictly depends on the transport of a second. Coupled
transport involves either the simultaneous transfer of a second solute in the same
direction, performed by symporters (also called co-transporters), or the transfer
of a second solute in the opposite direction, performed by antiporters (also called
exchangers) (Figure 11-8).

Figure 11-5 A model of how a conformational change in a transporter
mediates the passive movement of a solute. The transporter is shown in three
conformational states: in the outward-open state, the binding sites for solute are
exposed on the outside; in the occluded state, the same sites are not accessible
from either side; and in the inward-open state, the sites are exposed on the inside.
The transitions between the states occur randomly. They are completely reversible
and do not depend on whether the solute-binding site is occupied. Therefore, if
the solute concentration is higher on the outside of the bilayer, more solute binds
to the transporter in the outward-open conformation than in the inward-open
conformation, and there is a net transport of solute down its concentration gradient
(or, if the solute is an ion, down its electrochemical gradient).

Figure 11-6 The kinetics of simple diffusion compared with transporter-
mediated diffusion. Whereas the rate of diffusion and channel-mediated
transport is directly proportional to the solute concentration (within the physical
limits imposed by total surface area or total channels available), the rate of
transporter-mediated diffusion reaches a maximum (Vmax) when the transporter is
saturated. The solute concentration when the transport rate is at half its maximal
value approximates the binding constant (Km) of the transporter for the solute and
is analogous to the Km of an enzyme for its substrate. The graph applies to a
transporter moving a single solute; the kinetics of coupled transport of two or
more solutes is more complex and exhibits cooperative behavior.

The tight coupling between the transfer of two solutes allows the coupled 
transporters to harvest the energy stored in the electrochemical gradient of one 
solute, typically an inorganic ion, to transport the other. In this way, the free 
energy released during the movement of an inorganic ion down an electrochem- 
ical gradient is used as the driving force to pump other solutes uphill, against 
their electrochemical gradient. This strategy can work in either direction; some 
coupled transporters function as symporters, others as antiporters. In the plasma 
membrane of animal cells, Na⁺ is the usual co-transported ion because its
electrochemical gradient provides a large driving force for the active transport of a
second molecule. The Na⁺ that enters the cell during coupled transport is
subsequently pumped out by an ATP-driven Na⁺-K⁺ pump in the plasma membrane
(as we discuss later), which, by maintaining the Na⁺ gradient, indirectly drives
the coupled transport. Such ion-driven coupled transporters as just described 
are said to mediate secondary active transport. In contrast, ATP-driven pumps are 
said to mediate primary active transport because in these the free energy of ATP 
hydrolysis is used to directly drive the transport of a solute against its concentra- 
tion gradient. 
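
A rough sense of how much "uphill" work an ion gradient can pay for comes from asking how far a symporter could concentrate its solute before the two free-energy terms balance. The sketch below is a thermodynamic upper bound under stated assumptions, not a description of any particular transporter: it assumes an uncharged solute, perfect coupling, equilibrium, the standard expression for the free energy of Na⁺ entry, and illustrative Na⁺ concentrations and membrane potential. The function name and the `n_na` stoichiometry parameter are hypothetical.

```python
import math

R = 8.314    # gas constant, J/(mol K)
F = 96485.0  # Faraday constant, C/mol


def max_accumulation_ratio(na_out, na_in, v_membrane, n_na=1, temp_k=310.15):
    """Upper limit on [solute]in/[solute]out for a Na+-coupled symporter.

    At the limit, the free energy released by n_na Na+ ions entering the cell
    exactly balances the free energy needed to move one uncharged solute uphill.
    v_membrane is in volts, inside relative to outside.
    """
    rt = R * temp_k
    dg_na_entry = rt * math.log(na_in / na_out) + F * v_membrane  # z = +1 for Na+
    return math.exp(-n_na * dg_na_entry / rt)


# Illustrative (assumed) values: Na+ in mM, membrane potential of -70 mV.
ratio_one_na = max_accumulation_ratio(145.0, 10.0, -0.070, n_na=1)
ratio_two_na = max_accumulation_ratio(145.0, 10.0, -0.070, n_na=2)

print(f"1 Na+ per solute: up to about {ratio_one_na:.0f}-fold")
print(f"2 Na+ per solute: up to about {ratio_two_na:.0f}-fold")
```

Under these assumptions, coupling two Na⁺ ions to each solute molecule squares the achievable concentration ratio, which illustrates why a steeper Na⁺ electrochemical gradient allows more solute to be pumped into the cell.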

Intestinal and kidney epithelial cells contain a variety of symporters that are
driven by the Na⁺ gradient across the plasma membrane. Each Na⁺-driven
symporter is specific for importing a small group of related sugars or amino acids
into the cell. Because the Na⁺ tends to move into the cell down its electrochemical
gradient, the sugar or amino acid is, in a sense, “dragged” into the cell with
it. The greater the electrochemical gradient for Na⁺, the more solute is pumped
into the cell (Figure 11-9). Neurotransmitters (released by nerve cells to signal at
synapses, as we discuss later) are taken up again by Na⁺ symporters after their
release. These neurotransmitter transporters are important drug targets:
stimulants such as cocaine, and also antidepressants, inhibit them and thereby prolong
signaling by the neurotransmitters, which are not cleared efficiently.

Figure 11-7 Three ways of driving active transport. The actively transported
molecule is shown in orange, and the energy source is shown in red.
Redox-driven active transport is discussed in Chapter 14 (see Figures 14-18 and 14-19).

Figure 11-8 This schematic diagram shows transporters functioning as
uniporters, symporters, and antiporters (Movie 11.1).

Figure 11-9 Mechanism of glucose transport fueled by a Na⁺ gradient. As in the model shown in Figure 11-5,
the transporter alternates between inward-open and outward-open states via an occluded intermediate state. Binding of
Na⁺ and glucose is cooperative; that is, the binding of either solute increases the protein’s affinity for the other. Since the
Na⁺ concentration is much higher in the extracellular space than in the cytosol, glucose is more likely to bind to the transporter
in the outward-facing state. The transition to the occluded state occurs only when both Na⁺ and glucose are bound; their
precise interactions in the solute-binding sites slightly stabilize the occluded state and thereby make this transition energetically
favorable. Stochastic fluctuations caused by thermal energy drive the transporter randomly into the inward-open or
outward-open conformation. If it opens outwardly, nothing is achieved, and the process starts all over. However, whenever it opens
inwardly, Na⁺ dissociates quickly in the low-Na⁺-concentration environment of the cytosol. Glucose dissociation is likewise
enhanced when Na⁺ is lost, because of cooperativity in binding of the two solutes. The overall result is the net transport of
both Na⁺ and glucose into the cell. Because the occluded state is not formed when only one of the solutes is bound, the
transporter switches conformation only when it is fully occupied or fully empty, thereby assuring strict coupling of the transport
of Na⁺ and glucose.

Despite their great variety, transporters share structural features that can 
explain how they function and how they evolved. Transporters are typically 
built from bundles of 10 or more α helices that span the membrane. Solute- and
ion-binding sites are located midway through the membrane, where some helices 
are broken or distorted and amino acid side chains and polypeptide backbone 
atoms form ion- and solute-binding sites. In the inward-open and outward-open 
conformations, these binding sites are accessible by passageways from one side of 
the membrane but not the other. In switching between the two conformations, the 
transporter protein transiently adopts an occluded conformation, in which both 
passageways are closed; this prevents the driving ion and the transported solute 
from crossing the membrane unaccompanied, which would deplete the cell’s 
energy store to no purpose. Because only transporters with both types of binding 
sites appropriately filled change their conformation, tight coupling between ion 
and solute transport is assured. 

Like enzymes, transporters can work in the reverse direction if ion and solute 
gradients are appropriately adjusted experimentally. This chemical symmetry is 
mirrored in their physical structure. Crystallographic analyses have revealed that 
transporters are built from inverted repeats: the packing of the transmembrane α
helices in one half of the helix bundle is structurally similar to the packing in the 
other half, but the two halves are inverted in the membrane relative to each other. 
Transporters are therefore said to be pseudosymmetric, and the passageways 
that open and close on either side of the membrane have closely similar geome- 
tries, allowing alternating access to the ion- and solute-binding sites in the center 
(Figure 11-10). It is thought that the two halves evolved by gene duplication of a
smaller ancestor protein.

Some other types of important membrane transport proteins are also built from
inverted repeats. Examples even include channel proteins such as the aquaporin 
water channel (discussed later) and the Sec61 channel through which nascent 
polypeptides move into the endoplasmic reticulum (discussed in Chapter 12). It 
is thought that these channels evolved from coupled transporters, in which the 
gating functions were lost, allowing them to open toward both sides of the mem- 
brane simultaneously to provide a continuous path across the membrane. 

In bacteria, yeasts, and plants, as well as in many membrane-enclosed
organelles of animal cells, most ion-driven active transport systems depend on H⁺
rather than Na⁺ gradients, reflecting the predominance of H⁺ pumps in these
membranes. An electrochemical H⁺ gradient across the bacterial plasma
membrane, for example, drives the inward active transport of many sugars and amino
acids. 


Transporters in the Plasma Membrane Regulate Cytosolic pH 


Most proteins operate optimally at a particular pH. Lysosomal enzymes, for 
example, function best at the low pH (~5) found in lysosomes, whereas cytosolic 
enzymes function best at the close-to-neutral pH (~7.2) found in the cytosol. It 
is therefore crucial that cells control the pH of their intracellular compartments. 

Most cells have one or more types of Na⁺-driven antiporters in their plasma
membrane that help to maintain the cytosolic pH at about 7.2. These transporters
use the energy stored in the Na⁺ gradient to pump out excess H⁺, which either
leaks in or is produced in the cell by acid-forming reactions. Two mechanisms are
used: either H⁺ is directly transported out of the cell or HCO₃⁻ is brought into the
cell to neutralize H⁺ in the cytosol (according to the reaction HCO₃⁻ + H⁺ → H₂O +
CO₂). One of the antiporters that uses the first mechanism is a Na⁺-H⁺ exchanger,
which couples an influx of Na⁺ to an efflux of H⁺. Another, which uses a combination
of the two mechanisms, is a Na⁺-driven Cl⁻-HCO₃⁻ exchanger that couples
an influx of Na⁺ and HCO₃⁻ to an efflux of Cl⁻ and H⁺ (so that NaHCO₃ comes
in and HCl goes out). The Na⁺-driven Cl⁻-HCO₃⁻ exchanger is twice as effective
as the Na⁺-H⁺ exchanger: it pumps out one H⁺ and neutralizes another for each
Na⁺ that enters the cell. If HCO₃⁻ is available, as is usually the case, this antiporter
is the most important transporter regulating the cytosolic pH. The pH inside the
cell regulates both exchangers; when the pH in the cytosol falls, both exchangers
increase their activity.
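
The comparison between the two Na⁺-driven exchangers is simple bookkeeping, sketched below in Python. The dictionary keys and the function name are illustrative only; the numbers are the per-Na⁺ stoichiometries stated in the preceding paragraph.

```python
# Acid equivalents removed from the cytosol for each Na+ that enters the cell.
ACID_REMOVED_PER_NA = {
    "Na+-H+ exchanger": 1,                # one H+ exported
    "Na+-driven Cl--HCO3- exchanger": 2,  # one H+ exported, one neutralized by HCO3-
}


def acid_removed(exchanger: str, na_ions_in: int) -> int:
    """Total H+ exported or neutralized when na_ions_in Na+ ions enter."""
    return ACID_REMOVED_PER_NA[exchanger] * na_ions_in


for name in ACID_REMOVED_PER_NA:
    print(f"{name}: {acid_removed(name, 100)} acid equivalents per 100 Na+")
```

For the same Na⁺ influx, the second exchanger disposes of twice as much acid, which is the sense in which it is twice as effective.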

A Na⁺-independent Cl⁻-HCO₃⁻ exchanger adjusts the cytosolic pH in the
reverse direction. Like the Na⁺-dependent transporters, the Na⁺-independent
Cl⁻-HCO₃⁻ exchanger is regulated by pH, but its activity increases as the
cytosol becomes too alkaline. The movement of HCO₃⁻ in this case is normally
out of the cell, down its electrochemical gradient, which decreases the pH of the


Figure 11-10 Transporters are built from
inverted repeats. (A) LeuT, a bacterial
leucine/Na⁺ symporter related to human
neurotransmitter transporters, such as the
serotonin transporter, is shown. The core
of the transporter is built from two bundles,
each composed of five α helices (blue
and yellow). The helices shown in gray
differ among members of this transporter
family and are thought to play regulatory
roles, which are specific to a particular
transporter. (B) Both core helix bundles are
packed in a similar arrangement (shown
as a hand, with the broken helix as the
thumb), but the second bundle is inverted
with respect to the first. The transporter’s
structural pseudosymmetry reflects its
functional symmetry: the transporter can
work in either direction, depending on the
direction of the ion gradient. (Adapted from
K.R. Vinothkumar and R. Henderson,
Q. Rev. Biophys. 43:65-158, 2010. With
permission from Cambridge University
Press. PDB code: 3F3E.)
cytosol. A Na⁺-independent Cl⁻-HCO₃⁻ exchanger in the membrane of red blood
cells (called band 3 protein; see Figure 10-38) facilitates the quick discharge of
CO₂ (as HCO₃⁻) as the cells pass through capillaries in the lung.

The intracellular pH is not entirely regulated by transporters in the plasma
membrane: ATP-driven H⁺ pumps are used to control the pH of many intracellular
compartments. As discussed in Chapter 13, H⁺ pumps maintain the low pH in
lysosomes, as well as in endosomes and secretory vesicles. These H⁺ pumps use
the energy of ATP hydrolysis to pump H⁺ into these organelles from the cytosol.


An Asymmetric Distribution of Transporters in Epithelial Cells 
Underlies the Transcellular Transport of Solutes 


In epithelial cells, such as those that absorb nutrients from the gut, transporters 
are distributed nonuniformly in the plasma membrane and thereby contribute 
to the transcellular transport of absorbed solutes. By the actions of the trans- 
porters in these cells, solutes are moved across the epithelial cell layer into the 
extracellular fluid from where they pass into the blood. As shown in Figure 11-11, 
Na⁺-linked symporters located in the apical (absorptive) domain of the plasma
membrane actively transport nutrients into the cell, building up substantial con- 
centration gradients for these solutes across the plasma membrane. Uniporters 
in the basal and lateral (basolateral) domains allow the nutrients to leave the cell 
passively down these concentration gradients. 

In many of these epithelial cells, the plasma membrane area is greatly increased
by the formation of thousands of microvilli, which extend as thin, fingerlike pro- 
jections from the apical surface of each cell. Such microvilli can increase the total 
absorptive area of a cell as much as 25-fold, thereby enhancing its transport capa- 
bilities. 

As we have seen, ion gradients have a crucial role in driving many essential 
transport processes in cells. Ion pumps that use the energy of ATP hydrolysis
establish and maintain these gradients, as we discuss next. 
Figure 11-11 Transcellular transport. The
transcellular transport of glucose across
an intestinal epithelial cell depends on the
nonuniform distribution of transporters in
the cell’s plasma membrane. The process
shown here results in the transport of
glucose from the intestinal lumen to the
extracellular fluid (from where it passes
into the blood). Glucose is pumped into
the cell through the apical domain of the
membrane by a Na⁺-powered glucose
symporter. Glucose passes out of the
cell (down its concentration gradient) by
passive movement through a glucose
uniporter in the basal and lateral membrane
domains. The Na⁺ gradient driving the
glucose symport is maintained by the
Na⁺-K⁺ pump in the basal and lateral
plasma membrane domains, which keeps
the internal concentration of Na⁺ low
(Movie 11.2). Adjacent cells are connected
by impermeable tight junctions, which
have a dual function in the transport
process illustrated: they prevent solutes
from crossing the epithelium between
cells, allowing a concentration gradient
of glucose to be maintained across the
cell sheet (see Figure 19-18). They also
serve as diffusion barriers (fences) within
the plasma membrane, which help confine
the various transporters to their respective
membrane domains (see Figure 10-34).
There Are Three Classes of ATP-Driven Pumps


ATP-driven pumps are often called transport ATPases because they hydrolyze ATP 
to ADP and phosphate and use the energy released to pump ions or other solutes 
across a membrane. There are three principal classes of ATP-driven pumps (Fig- 
ure 11-12), and representatives of each are found in all prokaryotic and eukary- 
otic cells. 


1. P-type pumps are structurally and functionally related multipass trans- 
membrane proteins. They are called “P-type” because they phosphorylate 
themselves during the pumping cycle. This class includes many of the ion 
pumps that are responsible for setting up and maintaining gradients of 
Na⁺, K⁺, H⁺, and Ca²⁺ across cell membranes.


2. ABC transporters (ATP-Binding Cassette transporters) differ structur- 
ally from P-type ATPases and primarily pump small molecules across cell 
membranes. 


3. V-type pumps are turbine-like protein machines, constructed from multiple
different subunits. The V-type proton pump transfers H⁺ into organelles
such as lysosomes, synaptic vesicles, and plant or yeast vacuoles (V = vacu- 
olar), to acidify the interior of these organelles (see Figure 13-37). 

Structurally related to the V-type pumps is a distinct family of F-type ATPases,
more commonly called ATP synthases because they normally work in reverse:
instead of using ATP hydrolysis to drive H⁺ transport, they use the H⁺ gradient
across the membrane to drive the synthesis of ATP from ADP and phosphate (see
Figure 14-30). ATP synthases are found in the plasma membrane of bacteria, the
inner membrane of mitochondria, and the thylakoid membrane of chloroplasts.
The H⁺ gradient is generated either during the electron-transport steps of oxidative
phosphorylation (in aerobic bacteria and mitochondria), during photosynthesis
(in chloroplasts), or by the light-driven H⁺ pump (bacteriorhodopsin) in
Halobacterium. We discuss some of these proteins in detail in Chapter 14.

For the remainder of this section, we focus on P-type pumps and ABC transporters.


A P-type ATPase Pumps Ca²⁺ into the Sarcoplasmic Reticulum
in Muscle Cells 


Eukaryotic cells maintain very low concentrations of free Ca²⁺ in their cytosol
(~10⁻⁷ M) in the face of a very much higher extracellular Ca²⁺ concentration (~10⁻³
M). Therefore, even a small influx of Ca²⁺ significantly increases the concentration
of free Ca²⁺ in the cytosol, and the flow of Ca²⁺ down its steep concentration
gradient in response to extracellular signals is one means of transmitting these
signals rapidly across the plasma membrane (discussed in Chapter 15). It is thus


Figure 11-12 Three types of ATP-driven 
pumps. Like any enzyme, all ATP-driven 
pumps can work in either direction, 
depending on the electrochemical 
gradients of their solutes and the ATP/ADP 
ratio. When the ATP/ADP ratio is high, they 
hydrolyze ATP; when the ATP/ADP ratio is 
low, they can synthesize ATP. The F-type 
ATPase in mitochondria normally works in 
this “reverse” mode to make most of the 
cell’s ATP. 
important that the cell maintains a steep Ca²⁺ gradient across its plasma membrane.
Ca²⁺ transporters that actively pump Ca²⁺ out of the cell help maintain the
gradient. One of these is a P-type Ca²⁺ ATPase; the other is an antiporter (called
a Na⁺-Ca²⁺ exchanger) that is driven by the Na⁺ electrochemical gradient (discussed
in Chapter 15).
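
As a rough illustration of how steep this gradient is, the following Python sketch takes approximate concentrations of 10⁻⁷ M in the cytosol and 10⁻³ M outside the cell, and computes both the concentration ratio and the concentration-dependent part of the free-energy change for moving Ca²⁺ out of the cytosol, RT ln(outside/inside). The temperature of 310 K is our assumption, and the contribution of the membrane potential is deliberately left out, so this is not a full electrochemical calculation.

```python
import math

R = 8.314  # gas constant, J/(mol*K)
T = 310.0  # assumed temperature in kelvin (about 37 degrees C)

ca_cytosol = 1e-7        # approximate free Ca2+ in the cytosol, mol/L
ca_extracellular = 1e-3  # approximate extracellular Ca2+, mol/L

# Fold difference in concentration across the plasma membrane
ratio = ca_extracellular / ca_cytosol

# Concentration-only free-energy cost of exporting 1 mol of Ca2+ (J/mol);
# the electrical term for the charged ion is omitted here.
delta_g_chem = R * T * math.log(ratio)

print(f"ratio: {ratio:.0f}-fold, chemical term: {delta_g_chem / 1000:.1f} kJ/mol")
```

Under these assumptions the gradient is about 10,000-fold, and the chemical term alone comes to a little under 24 kJ per mole of Ca²⁺ exported.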

The Ca²⁺ pump, or Ca²⁺ ATPase, in the sarcoplasmic reticulum (SR) membrane
of skeletal muscle cells is a well-understood P-type transport ATPase. The
SR is a specialized type of endoplasmic reticulum that forms a network of tubular
sacs in the muscle cell cytoplasm, and it serves as an intracellular store of Ca²⁺.
When an action potential depolarizes the muscle cell plasma membrane, Ca²⁺ is
released into the cytosol from the SR through Ca²⁺-release channels, stimulating
the muscle to contract (discussed in Chapters 15 and 16). The Ca²⁺ pump, which
accounts for about 90% of the membrane protein of the SR, moves Ca²⁺ from the
cytosol back into the SR. The endoplasmic reticulum of nonmuscle cells contains
a similar Ca²⁺ pump, but in smaller quantities.

Enzymatic studies and analyses of the three-dimensional structures of transport
intermediates of the SR Ca²⁺ pump and related pumps have revealed the
molecular mechanism of P-type transport ATPases in great detail. They all have
similar structures, containing 10 transmembrane α helices connected to three
cytosolic domains (Figure 11-13). In the Ca²⁺ pump, amino acid side chains protruding
from the transmembrane helices form two centrally positioned binding
sites for Ca²⁺. As shown in Figure 11-14, in the pump’s ATP-bound nonphosphorylated
state, these binding sites are accessible only from the cytosolic side of the
SR membrane. Ca²⁺ binding triggers a series of conformational changes that close
the passageway to the cytosol and activate a phosphotransfer reaction in which
the terminal phosphate of the ATP is transferred to an aspartate that is highly conserved
among all P-type ATPases. The ADP then dissociates and is replaced with
a fresh ATP, causing another conformational change that opens a passageway to
the SR lumen through which the two Ca²⁺ ions exit. They are replaced by two H⁺
ions and a water molecule that stabilize the empty Ca²⁺-binding sites and close
the passageway to the SR lumen. Hydrolysis of the labile phosphoryl-aspartate
bond returns the pump to the initial conformation, and the cycle starts again. The
transient self-phosphorylation of the pump during its cycle is an essential characteristic
of all P-type pumps.
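
The ordered steps of this cycle can be written down as a small table and stepped through in sequence, as in the Python sketch below. The step descriptions paraphrase the paragraph above; the names `PUMP_CYCLE` and `run_pump` are ours, and the count of one ATP per cycle is our reading of the single phosphotransfer described, not a figure stated explicitly in the text.

```python
from itertools import cycle, islice

# Steps of the SR Ca2+ pump cycle as (event, resulting state), in order.
PUMP_CYCLE = (
    ("2 Ca2+ bind from the cytosol", "passageway to the cytosol closes"),
    ("phosphate is transferred from ATP to the conserved aspartate", "pump is phosphorylated"),
    ("ADP leaves and a fresh ATP binds", "passageway to the SR lumen opens; 2 Ca2+ exit"),
    ("2 H+ and a water molecule bind", "passageway to the SR lumen closes"),
    ("phosphoryl-aspartate bond is hydrolyzed", "pump returns to its initial conformation"),
)

CA_PER_CYCLE = 2   # two Ca2+ ions enter the SR lumen per cycle
ATP_PER_CYCLE = 1  # assumption: one phosphotransfer from ATP per cycle


def run_pump(n_cycles: int) -> tuple[int, int]:
    """Return (Ca2+ ions delivered to the SR lumen, ATP molecules used)."""
    return CA_PER_CYCLE * n_cycles, ATP_PER_CYCLE * n_cycles


# Print seven steps, which is one full cycle plus the start of the next,
# to show that the sequence wraps around.
for event, state in islice(cycle(PUMP_CYCLE), 7):
    print(f"{event} -> {state}")

print(run_pump(1000))
```

Representing the cycle as a fixed, ordered sequence reflects the point made in the text: the conformational changes occur in a defined order, and the pump returns to its starting state after each round.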


The Plasma Membrane Na⁺-K⁺ Pump Establishes Na⁺ and K⁺
Gradients Across the Plasma Membrane


The concentration of K⁺ is typically 10-30 times higher inside cells than outside,
whereas the reverse is true of Na⁺ (see Table 11-1, p. 598). A Na⁺-K⁺ pump, or
Na⁺-K⁺ ATPase, found in the plasma membrane of virtually all animal cells maintains
Figure 11-13 The structure of the
sarcoplasmic reticulum Ca²⁺ pump.
The ribbon model (left), derived from x-ray
crystallographic analyses, shows the pump
in its phosphorylated, ATP-bound state.
The three globular cytosolic domains of
the pump, namely the nucleotide-binding domain
(dark green), the activator domain (blue),
and the phosphorylation domain (red), also
shown schematically on the right, change
conformation dramatically during the
pumping cycle. These changes in turn alter
the arrangement of the transmembrane
helices, which allows the Ca²⁺ to be
released from its binding cavity into the SR
lumen (Movie 11.3). (PDB code: 3B9B.)
these concentration differences. Like the Ca²⁺ pump, the Na⁺-K⁺ pump belongs
to the family of P-type ATPases and operates as an ATP-driven antiporter, actively
pumping Na⁺ out of the cell against its steep electrochemical gradient and pumping
K⁺ in (Figure 11-15).

We mentioned earlier that the Na⁺ gradient produced by the Na⁺-K⁺ pump
drives the transport of most nutrients into animal cells and also has a crucial role 
in regulating cytosolic pH. A typical animal cell devotes almost one-third of its 
energy to fueling this pump, and the pump consumes even more energy in nerve 
cells and in cells that are dedicated to transport processes, such as those forming 
kidney tubules. 

Since the Na⁺-K⁺ pump drives three positively charged ions out of the cell for
every two it pumps in, it is electrogenic: it drives a net electric current across the
membrane, tending to create an electrical potential, with the cell’s inside being
negative relative to the outside. This electrogenic effect of the pump, however, seldom
directly contributes more than 10% to the membrane potential. The remaining
90%, as we discuss later, depends only indirectly on the Na⁺-K⁺ pump.
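
The charge bookkeeping behind the term electrogenic can be sketched in a few lines of Python. The stoichiometry of three Na⁺ out and two K⁺ in per cycle comes from the text; the number of pumps per cell and the turnover rate used in the example are hypothetical values chosen only to show the arithmetic, and the function names are ours.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # charge of one monovalent ion, in coulombs

NA_OUT_PER_CYCLE = 3  # Na+ pumped out per cycle
K_IN_PER_CYCLE = 2    # K+ pumped in per cycle


def net_positive_charges_out(cycles: int) -> int:
    """Net positive charges leaving the cell after a number of pump cycles."""
    return (NA_OUT_PER_CYCLE - K_IN_PER_CYCLE) * cycles


def pump_current_amperes(n_pumps: int, cycles_per_second: float) -> float:
    """Net outward current for n_pumps, each turning over cycles_per_second times."""
    charges_per_second = net_positive_charges_out(1) * n_pumps * cycles_per_second
    return charges_per_second * ELEMENTARY_CHARGE_C


# Hypothetical cell: one million pumps, each completing 100 cycles per second.
current = pump_current_amperes(1_000_000, 100.0)
print(f"net outward current: {current * 1e12:.1f} pA")
```

Each cycle exports one net positive charge, so the current scales directly with the number of pumps and their turnover rate; with the made-up values above it comes to about 16 pA.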
Figure 11-14 The pumping cycle of the
sarcoplasmic reticulum Ca²⁺ pump.
Ion pumping proceeds by a series of
stepwise conformational changes in
which movements of the pump’s three
cytosolic domains [the nucleotide-binding
domain (N), the phosphorylation domain
(P), and the activator domain (A)] are
mechanically coupled to movements of the
transmembrane α helices. Helix movement
opens and closes passageways through
which Ca²⁺ enters from the cytosol and
binds to the two centrally located Ca²⁺-binding
sites. The two Ca²⁺ then exit
into the SR lumen and are replaced by
two H⁺, which are transported in the
opposite direction. The Ca²⁺-dependent
phosphorylation and H⁺-dependent
dephosphorylation of aspartic acid are
universally conserved steps in the reaction
cycle of all P-type pumps: they cause the
conformational transitions to occur in an
orderly manner, enabling the proteins to do
useful work. (Adapted from C. Toyoshima
et al., Nature 432:361-368, 2004 and
J.V. Moller et al., Q. Rev. Biophys. 43:501-566,
2010.)


Figure 11-15 The function of the
Na⁺-K⁺ pump. This P-type ATPase
actively pumps Na⁺ out of and K⁺ into a
cell against their electrochemical gradients.
It is structurally closely related to the
Ca²⁺ ATPase but differs in its selectivity for
ions: for every molecule of ATP hydrolyzed
by the pump, three Na⁺ are pumped out
and two K⁺ are pumped in. As in the
Ca²⁺ pump, an aspartate is phosphorylated
and dephosphorylated during the
pumping cycle (Movie 11.4).
ABC Transporters Constitute the Largest Family of Membrane
Transport Proteins


The last type of transport ATPase that we discuss is the family of the ABC trans- 
porters, so named because each member contains two highly conserved ATPase 
domains, or ATP-Binding “Cassettes,” on the cytosolic side of the membrane.
ATP binding brings together the two ATPase domains, and ATP hydrolysis leads 
to their dissociation (Figure 11-16). These movements of the cytosolic domains 
are transmitted to the transmembrane segments, driving cycles of conformational 
changes that alternately expose solute-binding sites on one side of the membrane 
and then on the other, as we have seen for other transporters. In this way, ABC 
transporters harvest the energy released upon ATP binding and hydrolysis to 
drive transport of solutes across the bilayer. The transport is directional toward 
inside or toward outside, depending on the particular conformational change in 
the solute binding site that is linked to ATP hydrolysis (see Figure 11-16). 

ABC transporters constitute the largest family of membrane transport proteins 
and are of great clinical importance. The first of these proteins to be characterized 
was found in bacteria. We have already mentioned that the plasma membranes 
of all bacteria contain transporters that use the H⁺ gradient across the membrane
to actively transport a variety of nutrients into the cell. In addition, bacteria use 
ABC transporters to import certain small molecules. In bacteria such as E. coli 
that have double membranes (Figure 11-17), the ABC transporters are located in 
the inner membrane, and an auxiliary mechanism operates to capture the nutri- 
ents and deliver them to the transporters (Figure 11-18). 

In E. coli, 78 genes (an amazing 5% of the bacterium’s genes) encode ABC 
transporters, and animal genomes encode an even larger number. Although each 
transporter is thought to be specific for a particular molecule or class of molecules, 
the variety of substrates transported by this superfamily is great and includes inor- 
ganic ions, amino acids, mono- and polysaccharides, peptides, lipids, drugs, and, 
in some cases, even proteins that can be larger than the transporter itself. 


Figure 11-16 Small-molecule transport
by typical ABC transporters. ABC
transporters consist of multiple domains.
Typically, two hydrophobic domains,
each built of six membrane-spanning α
helices, together form the translocation
pathway and provide substrate specificity.
Two ATPase domains protrude into the
cytosol. In some cases, the two halves
of the transporter are formed by a single
polypeptide, whereas in other cases they
are formed by two or more separate
polypeptides that assemble into a similar
structure. Without ATP bound, the
transporter exposes a substrate-binding
site on one side of the membrane. ATP
binding induces a conformational change
that exposes the substrate-binding site
on the opposite side; ATP hydrolysis
followed by ADP dissociation returns the
transporter to its original conformation.
Most individual ABC transporters are
unidirectional. (A) Both importing and
exporting ABC transporters are found in
bacteria; an ABC importer is shown in this
cartoon. The crystal structure of a bacterial
ABC transporter is shown in Figure 3-76.
(B) In eukaryotes, most ABC transporters
export substances, either from the cytosol
Figure 11-17 A small section of
the double membrane of an E. coli
The first eukaryotic ABC transporters identified were discovered because of
their ability to pump hydrophobic drugs out of the cytosol. One of these transport- 
ers is the multidrug resistance (MDR) protein, also called P-glycoprotein. It is 
present at elevated levels in many human cancer cells and makes the cells simul- 
taneously resistant to a variety of chemically unrelated cytotoxic drugs that are 
widely used in cancer chemotherapy. Treatment with any one of these drugs can 
result in the selective survival and overgrowth of those cancer cells that express an
especially large amount of the MDR transporter. These cells pump drugs out of the 
cell very efficiently and are therefore relatively resistant to the drugs’ toxic effects 
(Movie 11.5). Selection for cancer cells with resistance to one drug can thereby 
lead to resistance to a wide variety of anticancer drugs. Some studies indicate that 
up to 40% of human cancers develop multidrug resistance, making it a major hur- 
dle in the battle against cancer. 

A related and equally sinister phenomenon occurs in the protist Plasmodium 
falciparum, which causes malaria. More than 200 million people are infected 
worldwide with this parasite, which remains a major cause of human death, 
killing almost a million people every year. The development of resistance to the 
antimalarial drug chloroquine has hampered the control of malaria. The resistant 
P. falciparum have amplified a gene encoding an ABC transporter that pumps out 
the chloroquine. 


[Figure labels: CELL EXTERIOR; solute; porin; periplasmic substrate-binding protein with bound solute; solute-free periplasmic substrate-binding protein; ABC transporter; CYTOSOL]

layer is shown). This space also contains
a variety of soluble protein molecules. The
dashed threads (shown in green) at the
top represent the polysaccharide chains of
the special lipopolysaccharide molecules
that form the external monolayer of the
outer membrane; for clarity, only a few
of these chains are shown. Bacteria with
double membranes are called Gram-negative
because they do not retain the
dark blue dye used in Gram staining.
Bacteria with single membranes (but
thicker peptidoglycan cell walls), such as
staphylococci and streptococci, retain
the blue dye and are therefore called
Gram-positive; their single membrane is
analogous to the inner (plasma) membrane
of Gram-negative bacteria.


Figure 11-18 The auxiliary transport 
system associated with transport 
ATPases in bacteria with double 
membranes. The solute diffuses through 
channel proteins (porins) in the outer 
membrane and binds to a periplasmic 
substrate-binding protein that delivers it to 
the ABC transporter, which pumps it across 
the plasma membrane. The peptidoglycan 
is omitted for simplicity; its porous structure 
allows the substrate-binding proteins and 
water-soluble solutes to move through it by 
diffusion. 
In most vertebrate cells, an ABC transporter in the endoplasmic reticulum
(ER) membrane (named transporter associated with antigen processing, or TAP 
transporter) actively pumps a wide variety of peptides from the cytosol into the 
ER lumen. These peptides are produced by protein degradation in proteasomes 
(discussed in Chapter 6). They are carried from the ER to the cell surface, where 
they are displayed for scrutiny by cytotoxic T lymphocytes, which kill the cell if the 
peptides are derived from a virus or other microorganism lurking in the cytosol of 
an infected cell (discussed in Chapter 24). 

Yet another member of the ABC transporter family is the cystic fibrosis trans- 
membrane conductance regulator protein (CFTR), which was discovered through 
studies of the common genetic disease cystic fibrosis. This disease is caused by 
a mutation in the gene encoding CFTR, a Cl⁻ transport protein in the plasma
membrane of epithelial cells. CFTR regulates ion concentrations in the extracel- 
lular fluid, especially in the lung. One in 27 Caucasians carries a gene encoding 
a mutant form of this protein; in 1 in 2900, both copies of the gene are mutated, 
causing the disease. In contrast to other ABC transporters, ATP binding and 
hydrolysis in the CFTR protein do not drive the transport process. Instead, they 
control the opening and closing of a continuous channel, which provides a passive
conduit for Cl⁻ to move down its electrochemical gradient. Thus, some ABC
proteins can function as transporters and others as gated channels. 
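
The two frequencies quoted above, 1 in 27 for carriers and 1 in 2900 for affected individuals, are consistent with each other under simple assumptions, as the Python sketch below shows. It assumes that parents pair at random with respect to this gene and that each carrier parent passes on the mutant copy with probability one-half; these assumptions are ours and are not stated in the text.

```python
from fractions import Fraction

carrier = Fraction(1, 27)  # fraction of individuals carrying one mutant copy

# Probability that both parents are carriers, assuming random pairing
both_parents_carriers = carrier * carrier

# Each carrier parent transmits the mutant copy half the time,
# so a child of two carriers inherits two mutant copies with probability 1/4.
affected = both_parents_carriers * Fraction(1, 4)

print(f"expected frequency of two mutant copies: {affected}")
```

The result, 1 in 2916, rounds to the figure of about 1 in 2900 given in the text.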


Summary 


Transporters bind specific solutes and transfer them across the lipid bilayer by 
undergoing conformational changes that alternately expose the solute-binding site 
on one side of the membrane and then on the other. Some transporters move a sin- 
gle solute “downhill,” whereas others can act as pumps to move a solute “uphill” 
against its electrochemical gradient, using energy provided by ATP hydrolysis, by a 
downhill flow of another solute (such as Na⁺ or H⁺), or by light to drive the requisite
series of conformational changes in an orderly manner. Transporters belong to a 
small number of protein families. Each family evolved from a common ancestral 
protein, and its members all operate by a similar mechanism. The family of P-type 
transport ATPases, which includes Ca²⁺ and Na⁺-K⁺ pumps, is an important example;
each of these ATPases sequentially phosphorylates and dephosphorylates itself
during the pumping cycle. The superfamily of ABC transporters is the largest family 
of membrane transport proteins and is especially important clinically. It includes 
proteins that are responsible for cystic fibrosis, for drug resistance in both cancer 
cells and malaria-causing parasites, and for pumping pathogen-derived peptides 
into the ER for cytotoxic lymphocytes to recognize on the surface of infected cells.


CHANNELS AND THE ELECTRICAL PROPERTIES 
OF MEMBRANES 


Unlike transporters, channels form pores across membranes. One class of chan- 
nel proteins found in virtually all animals forms gap junctions between adjacent 
cells; each plasma membrane contributes equally to the formation of the chan- 
nel, which connects the cytoplasm of the two cells. These channels are discussed 
in Chapter 19 and will not be considered further here. Both gap junctions and 
porins, the channels in the outer membranes of bacteria, mitochondria, and chlo- 
roplasts (discussed in Chapter 10), have relatively large and permissive pores, and 
it would be disastrous if they directly connected the inside of a cell to an extracel- 
lular space. Indeed, many bacterial toxins do exactly that to kill other cells (dis- 
cussed in Chapter 24). 

In contrast, most channels in the plasma membrane of animal and plant cells 
that connect the cytosol to the cell exterior necessarily have narrow, highly selec- 
tive pores that can open and close rapidly. Because these proteins are concerned 
specifically with inorganic ion transport, they are referred to as ion channels. For 
transport efficiency, ion channels have an advantage over transporters, in that 
they can pass up to 100 million ions through one open channel each second, a
rate 10⁵ times greater than the fastest rate of transport mediated by any known
transporter. As discussed earlier, however, channels cannot be coupled to an
energy source to perform active transport, so the transport they mediate is always
passive (downhill). Thus, the function of ion channels is to allow specific inorganic
ions (primarily Na⁺, K⁺, Ca²⁺, or Cl⁻) to diffuse rapidly down their electrochemical
gradients across the lipid bilayer. In this section, we will see that the ability
to control ion fluxes through these channels is essential for many cell functions. 
Nerve cells (neurons), in particular, have made a specialty of using ion channels, 
and we will consider how they use many different ion channels to receive, con- 
duct, and transmit signals. Before we discuss ion channels, however, we briefly 
consider the aquaporin water channels that we mentioned earlier. 


Aquaporins Are Permeable to Water But Impermeable to Ions


Because cells are mostly water (typically ~70% by weight), water movement across 
cell membranes is fundamentally important for life. Cells also contain a high con- 
centration of solutes, including numerous negatively charged organic molecules 
that are confined inside the cell (the so-called fixed anions) and their accompa- 
nying cations that are required for charge balance. This creates an osmotic gra- 
dient, which mostly is balanced by an opposite osmotic gradient due to a high 
concentration of inorganic ions (chiefly Na⁺ and Cl⁻) in the extracellular fluid.
The small remaining osmotic force tends to “pull” water into the cell, causing it to 
swell until the forces are balanced. Because all biological membranes are moder- 
ately permeable to water (see Figure 11-2), cell volume equilibrates in minutes or 
less in response to an osmotic gradient. For most animal cells, however, osmosis 
has only a minor role in regulating cell volume. This is because most of the cyto- 
plasm is in a gel-like state and resists large changes in its volume in response to 
changes in osmolarity. 

In addition to the direct diffusion of water across the lipid bilayer, some pro- 
karyotic and eukaryotic cells have water channels, or aquaporins, embedded in 
their plasma membrane to allow water to move more rapidly. Aquaporins are par- 
ticularly abundant in animal cells that must transport water at high rates, such as 
the epithelial cells of the kidney or exocrine cells that must transport or secrete 
large volumes of fluids, respectively (Figure 11-19). 

Aquaporins must solve a problem that is opposite to that facing ion channels. 
To avoid disrupting ion gradients across membranes, they have to allow the rapid 
passage of water molecules while completely blocking the passage of ions. The 
three-dimensional structure of an aquaporin reveals how it achieves this remark- 
able selectivity. The channels have a narrow pore that allows water molecules to 
traverse the membrane in single file, following the path of carbonyl oxygens that 
line one side of the pore (Figure 11-20A and B). Hydrophobic amino acids line 
the other side of the pore. The pore is too narrow for any hydrated ion to enter, and 
the energy cost of dehydrating an ion would be enormous because the hydropho- 
bic wall of the pore cannot interact with a dehydrated ion to compensate for the 
loss of water. This design readily explains why the aquaporins cannot conduct K⁺,







Figure 11-19 The role of aquaporins in 
fluid secretion. Cells lining the ducts of 
exocrine glands (as found, for example, in 
the pancreas and liver, and in mammary, 
sweat, and salivary glands) secrete large 
volumes of body fluids. These cells are 
organized into epithelial sheets in which 
their apical plasma membrane faces the 
lumen of the duct. Ion pumps and channels
situated in the basolateral and apical
plasma membrane move ions (mostly
Na⁺ and Cl⁻) into the ductal lumen,
creating an osmotic gradient between the 
surrounding tissue and the duct. Water 
molecules rapidly follow the osmotic 
gradient through aquaporins that are 
present in high concentrations in both the 
apical and basolateral membranes. 
Na⁺, Ca²⁺, or Cl⁻ ions. These channels are also impermeable to H⁺, which is mainly
present in cells as H₃O⁺. These hydronium ions diffuse through water extremely
rapidly, using a molecular relay mechanism that requires the making and breaking
of hydrogen bonds between adjacent water molecules (Figure 11-20C). Aquaporins
contain two strategically placed asparagines, which bind to the oxygen atom
of the central water molecule in the line of water molecules traversing the pore,
imposing a bipolarity on the entire column of water molecules (Figure 11-20C and
D). This makes it impossible for the “making and breaking” sequence of hydrogen
bonds (shown in Figure 11-20C) to get past the central asparagine-bonded water
molecule. Because both valences of this central oxygen are unavailable for hydrogen
bonding, the central water molecule cannot participate in an H⁺ relay, and
the pore is therefore impermeable to H⁺.

We now turn to ion channels, the subject of the rest of the chapter.


Ion Channels Are Ion-Selective and Fluctuate Between Open and
Closed States 


Two important properties distinguish ion channels from aqueous pores. First, 
they show ion selectivity, permitting some inorganic ions to pass, but not others. 
This suggests that their pores must be narrow enough in places to force perme- 
ating ions into intimate contact with the walls of the channel so that only ions of 
appropriate size and charge can pass. The permeating ions have to shed most or 
all of their associated water molecules to pass, often in single file, through the nar- 
rowest part of the channel, which is called the selectivity filter; this limits their rate 
of passage (Figure 11-21). Thus, as the ion concentration increases, the flux of the 
ion through a channel increases proportionally but then levels off (saturates) at a 
maximum rate. 
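
The saturation described above can be illustrated with a simple hyperbolic curve. This is only a sketch under stated assumptions: the single-site, Michaelis-Menten-like form and the values of `j_max` and `k_half` are hypothetical and chosen for illustration; they are not given in the text.

```python
def channel_flux(conc_mM, j_max=1.0e7, k_half=50.0):
    """Ion flux (ions per second) through one open channel.

    Assumed hyperbolic form: the flux rises almost in proportion to the
    ion concentration when the concentration is low and approaches j_max
    when it is high. j_max and k_half are hypothetical values.
    """
    return j_max * conc_mM / (k_half + conc_mM)


if __name__ == "__main__":
    # Print the flux over a wide range of concentrations.
    for conc in (1, 2, 10, 100, 1000, 10000):
        print(f"{conc:>6} mM -> {channel_flux(conc):.3e} ions/s")
```

At concentrations well below `k_half`, doubling the concentration nearly doubles the flux; well above it, the flux barely changes, which is the leveling off that the selectivity filter imposes.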
Figure 11-20 The structure of
aquaporins. (A) A ribbon diagram of an 
aquaporin monomer. In the membrane, 
aquaporins form tetramers, with each 
monomer containing an aqueous pore 

in its center (not shown). Each individual 
aquaporin channel passes about 10^9
water molecules per second. (B) A 
longitudinal cross section through one 
aquaporin monomer, in the plane of the 
central pore. One face of the pore is 

lined with hydrophilic amino acids, which 
provide transient hydrogen bonds to water 
molecules; these bonds help line up the 
transiting water molecules in a single 

row and orient them as they traverse the 
pore. (C and D) A model explaining why 
aquaporins are impermeable to H+.

(C) In water, H+ diffuses extremely rapidly
by being relayed from one water molecule 
to the next. (D) Carbonyl groups (C=O) 
lining the hydrophilic face of the pore align 
water molecules, and two strategically 
placed asparagines in the center help 
tether a central water molecule such that 
both valences on its oxygen are occupied. 
This arrangement bipolarizes the entire 
line of water molecules, with each water 
molecule acting as a hydrogen-bond 
acceptor from its inner neighbor (Movie 
11.6). (A and B, adapted from R.M. Stroud 
et al., Curr Opin. Struct. Biol. 13:424-431, 
2003. With permission from Elsevier.) 


Figure 11-21 A typical ion channel, 
which fluctuates between closed and 
open conformations. The ion channel 
shown here in cross section forms a pore 
across the lipid bilayer only in the “open” 
conformational state. The pore narrows 

to atomic dimensions in one region (the 
selectivity filter), where the ion selectivity of 
the channel is largely determined. Another 
region of the channel forms the gate. 
Figure 11-22 The gating of ion channels. This schematic drawing shows several kinds of stimuli
that open ion channels. Mechanically gated channels often have cytoplasmic extensions (not 
shown) that link the channel to the cytoskeleton. 


The second important distinction between ion channels and aqueous pores is 
that ion channels are not continuously open. Instead, they are gated, which allows 
them to open briefly and then close again. Moreover, with prolonged (chemical or 
electrical) stimulation, most ion channels go into a closed “desensitized,” or “inac- 
tivated,” state, in which they are refractory to further opening until the stimulus 
has been removed, as we discuss later. In most cases, the gate opens in response 
to a specific stimulus. As shown in Figure 11-22, the main types of stimuli that are 
known to cause ion channels to open are a change in the voltage across the mem- 
brane (voltage-gated channels), a mechanical stress (mechanically gated chan- 
nels), or the binding of a ligand (ligand-gated channels). The ligand can be either 
an extracellular mediator—specifically, a neurotransmitter (transmitter-gated 
channels)—or an intracellular mediator such as an ion (ion-gated channels) or 
a nucleotide (nucleotide-gated channels). In addition, protein phosphorylation
and dephosphorylation regulate the activity of many ion channels; this type of
channel regulation is discussed, together with nucleotide-gated ion channels, in 
Chapter 15. 

More than 100 types of ion channels have been identified thus far, and new 
ones are still being discovered, each characterized by the ions it conducts, the 
mechanism by which it is gated, and its abundance and localization in the cell and 
in specific cells. Ion channels are responsible for the electrical excitability of
muscle cells, and they mediate most forms of electrical signaling in the nervous
system. A single neuron typically contains 10 or more kinds of ion channels, located
in different domains of its plasma membrane. But ion channels are not restricted 
to electrically excitable cells. They are present in all animal cells and are found in 
plant cells and microorganisms: they propagate the leaf-closing response of the 
mimosa plant, for example (Movie 11.7), and allow the single-celled Paramecium 
to reverse direction after a collision. 

Ion channels that are permeable mainly to K+ are found in the plasma membrane
of almost all cells. An important subset of K+ channels opens even in an
unstimulated or “resting” cell, and hence these are called K+ leak channels.
Although this term applies to many different K+ channels, depending on the cell
type, they serve a common purpose: by making the plasma membrane much
more permeable to K+ than to other ions, they have a crucial role in maintaining
the membrane potential across all plasma membranes, as we discuss next.
The Membrane Potential in Animal Cells Depends Mainly on K+
Leak Channels and the K+ Gradient Across the Plasma Membrane


A membrane potential arises when there is a difference in the electrical charge 
on the two sides of a membrane, due to a slight excess of positive ions over neg- 
ative ones on one side and a slight deficit on the other. Such charge differences 
can result both from active electrogenic pumping (see p. 608) and from passive 
ion diffusion. As we discuss in Chapter 14, electrogenic H+ pumps in the
mitochondrial inner membrane generate most of the membrane potential across this
membrane. Electrogenic pumps also generate most of the electrical potential 
across the plasma membrane in plants and fungi. In typical animal cells, however, 
passive ion movements make the largest contribution to the electrical potential 
across the plasma membrane. 

As explained earlier, due to the action of the Na+-K+ pump, there is little Na+
inside the cell, and other intracellular inorganic cations have to be plentiful
enough to balance the charge carried by the cell’s fixed anions, the negatively
charged organic molecules that are confined inside the cell. This balancing role
is performed largely by K+, which is actively pumped into the cell by the Na+-K+
pump and can also move freely in or out through the K+ leak channels in the
plasma membrane. Because of the presence of these channels, K+ comes almost
to equilibrium, where an electrical force exerted by an excess of negative charges
attracting K+ into the cell balances the tendency of K+ to leak out down its
concentration gradient. The membrane potential (of the plasma membrane) is the
manifestation of this electrical force, and we can calculate its equilibrium value
from the steepness of the K+ concentration gradient. The following argument may
help to make this clear.

Suppose that initially there is no voltage gradient across the plasma membrane
(the membrane potential is zero) but the concentration of K+ is high inside the cell
and low outside. K+ will tend to leave the cell through the K+ leak channels, driven
by its concentration gradient. As K+ begins to move out, each ion leaves behind
an unbalanced negative charge, thereby creating an electrical field, or membrane
potential, which will tend to oppose the further efflux of K+. The net efflux of K+
halts when the membrane potential reaches a value at which this electrical driving
force on K+ exactly balances the effect of its concentration gradient; that is,
when the electrochemical gradient for K+ is zero. Although Cl- ions also
equilibrate across the membrane, the membrane potential keeps most of these ions out
of the cell because their charge is negative.

The equilibrium condition, in which there is no net flow of ions across the 
plasma membrane, defines the resting membrane potential for this idealized cell. 
A simple but very important formula, the Nernst equation, quantifies the equilib- 
rium condition and, as explained in Panel 11-1, makes it possible to calculate 
the theoretical resting membrane potential if we know the ratio of internal and 
external ion concentrations. As the plasma membrane of a real cell is not
exclusively permeable to K+ and Cl-, however, the actual resting membrane potential
is usually not exactly equal to that predicted by the Nernst equation for K+ or Cl-.


The Resting Potential Decays Only Slowly When the 
Na+-K+ Pump Is Stopped


Movement of only a minute number of inorganic ions across the plasma mem- 
brane through ion channels suffices to set up the membrane potential. Thus, we 
can think of the membrane potential as arising from movements of charge that 
leave ion concentrations practically unaffected and result in only a very slight 
discrepancy in the number of positive and negative ions on the two sides of the 
membrane (Figure 11-23). Moreover, these movements of charge are generally 
rapid, taking only a few milliseconds or less. 

Consider the change in the membrane potential in a real cell after the sudden 
inactivation of the Na+-K+ pump. A slight drop in the membrane potential occurs
immediately. This is because the pump is electrogenic and, when active, makes a 
PANEL 11-1: The Derivation of the Nernst Equation


THE NERNST EQUATION AND ION FLOW 


The flow of any inorganic ion through a membrane 
channel is driven by the electrochemical gradient for that 
ion. This gradient represents the combination of two 
influences: the voltage gradient and the concentration 
gradient of the ion across the membrane. When these 
two influences just balance each other, the 
electrochemical gradient for the ion is zero, and there is 
no net flow of the ion through the channel. The voltage 
gradient (membrane potential) at which this equilibrium 
is reached is called the equilibrium potential for the ion. 
It can be calculated from an equation that will be derived 
below, called the Nernst equation. 


The Nernst equation is

$$V = \frac{RT}{zF} \ln \frac{C_o}{C_i}$$

where

$V$ = the equilibrium potential in volts (internal
potential minus external potential)

$C_o$ and $C_i$ = outside and inside concentrations of the
ion, respectively

$R$ = the gas constant ($8.3\ \mathrm{J\,mol^{-1}\,K^{-1}}$)

$T$ = the absolute temperature (K)

$F$ = Faraday’s constant ($9.6 \times 10^{4}\ \mathrm{J\,V^{-1}\,mol^{-1}}$)

$z$ = the valence (charge) of the ion

$\ln$ = logarithm to the base e


The Nernst equation is derived as follows: 


A molecule in solution (a solute) tends to move from a 
region of high concentration to a region of low 
concentration simply due to the random movement of 
molecules, which results in their equilibrium. 
Consequently, movement down a concentration gradient 
is accompanied by a favorable free-energy change 

(ΔG < 0), whereas movement up a concentration gradient
is accompanied by an unfavorable free-energy change
(ΔG > 0). (Free energy is introduced in Chapter 2 and
discussed in the context of redox reactions in 

Panel 14-1, p. 765.) 


The free-energy change per mole of solute moved across
the plasma membrane ($\Delta G_{\text{conc}}$) is equal to $-RT \ln (C_o / C_i)$.


If the solute is an ion, moving it into a cell across a
membrane whose inside is at a voltage $V$ relative to the
outside will cause an additional free-energy change (per
mole of solute moved) of $\Delta G_{\text{volt}} = zFV$.


At the point where the concentration and voltage
gradients just balance,

$$\Delta G_{\text{conc}} + \Delta G_{\text{volt}} = 0$$

and the ion distribution is at equilibrium across the
membrane.


Thus,

$$zFV - RT \ln \frac{C_o}{C_i} = 0$$

and, therefore,

$$V = \frac{RT}{zF} \ln \frac{C_o}{C_i}$$

or, using the constant that converts natural logarithms to
base 10,

$$V = 2.3\, \frac{RT}{zF} \log_{10} \frac{C_o}{C_i}$$

For a univalent cation,

$$2.3\, \frac{RT}{F} = 58\ \text{mV at } 20^{\circ}\text{C and } 61.5\ \text{mV at } 37^{\circ}\text{C}.$$


Thus, for such an ion at 37°C,


$V = +61.5$ mV for $C_o / C_i = 10$,
whereas
$V = 0$ for $C_o / C_i = 1$.


The K+ equilibrium potential ($V_K$), for example, is
$61.5 \log_{10}([\mathrm{K}^+]_o / [\mathrm{K}^+]_i)$ millivolts


(-89 mV for a typical cell, where $[\mathrm{K}^+]_o = 5$ mM
and $[\mathrm{K}^+]_i = 140$ mM).


At $V_K$, there is no net flow of K+ across the membrane.


Similarly, when the membrane potential has a value of
$61.5 \log_{10}([\mathrm{Na}^+]_o / [\mathrm{Na}^+]_i)$,
the Na+ equilibrium potential ($V_{Na}$),
there is no net flow of Na+.


For any particular membrane potential, $V_M$, the net
force tending to drive a particular type of ion out of the
cell is proportional to the difference between $V_M$ and the
equilibrium potential for the ion: hence,


for K+ it is $V_M - V_K$
and for Na+ it is $V_M - V_{Na}$.
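
The relations in this panel can be checked numerically. The following is a minimal sketch, not part of the original panel: the function names are hypothetical, the constants are given to more digits than the rounded values quoted above, and the resting potential of -70 mV and Na+ equilibrium potential of +50 mV are the approximate figures quoted elsewhere in this chapter.

```python
import math

R = 8.314     # gas constant, J mol^-1 K^-1
F = 96485.0   # Faraday's constant, J V^-1 mol^-1


def nernst_mV(c_out, c_in, z=1, temp_c=37.0):
    """Equilibrium potential in millivolts (inside minus outside)."""
    t_kelvin = temp_c + 273.15
    return 1000.0 * (R * t_kelvin) / (z * F) * math.log(c_out / c_in)


def driving_force_mV(v_m, v_eq):
    """Quantity proportional to the net outward force on a cation: V_M - V_eq."""
    return v_m - v_eq


if __name__ == "__main__":
    # A tenfold concentration ratio gives the factor 2.3 RT/F.
    print(f"2.3 RT/F at 20 C: {nernst_mV(10.0, 1.0, temp_c=20.0):.1f} mV")
    print(f"2.3 RT/F at 37 C: {nernst_mV(10.0, 1.0, temp_c=37.0):.1f} mV")

    # K+ concentrations quoted in the panel: 5 mM outside, 140 mM inside.
    v_k = nernst_mV(5.0, 140.0)
    print(f"V_K: {v_k:.1f} mV")

    v_m = -70.0   # approximate resting potential (illustrative)
    v_na = 50.0   # approximate Na+ equilibrium potential
    print(f"K+ driving force:  {driving_force_mV(v_m, v_k):+.1f} mV (outward)")
    print(f"Na+ driving force: {driving_force_mV(v_m, v_na):+.1f} mV (inward)")
```

With these numbers, the force on K+ is small and directed outward, whereas the force on Na+ is large and directed inward, which is why opening Na+ channels has such a strong effect on the membrane potential.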


When there is a voltage gradient across the membrane, 
the ions responsible for it—the positive ions on one side 
and the negative ions on the other—are concentrated in 
thin layers on either side of the membrane because of the 
attraction between positive and negative electric charges. 
The number of ions that go to form the layer of charge 
adjacent to the membrane is minute compared with the 
total number inside the cell. For example, the movement 
of 6000 Na+ ions across 1 µm² of membrane will carry
sufficient charge to shift the membrane potential by 
about 100 mV. 


Because there are about 3 × 10^7 Na+ ions in a typical cell
(1 µm³ of bulk cytoplasm), such a movement of charge
will generally have a negligible effect on the ion 
concentration gradients across the membrane. 
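
The figure of about 6000 ions per square micrometer can be checked by treating the membrane as a capacitor, for which charge = capacitance × voltage. This is a rough sketch under one assumption that the panel does not state: a specific membrane capacitance of about 1 µF per cm², a commonly quoted typical value. The function name is hypothetical.

```python
ELEMENTARY_CHARGE = 1.602e-19   # coulombs carried by one univalent ion
SPECIFIC_CAPACITANCE = 1.0e-6   # F per cm^2; assumed typical value, not from the text
CM2_PER_UM2 = 1.0e-8            # one square micrometer expressed in cm^2


def ions_to_shift_potential(delta_v_volts, area_um2=1.0):
    """Number of univalent ions that must cross to change the potential."""
    capacitance = SPECIFIC_CAPACITANCE * CM2_PER_UM2 * area_um2  # farads
    charge = capacitance * delta_v_volts                          # coulombs
    return charge / ELEMENTARY_CHARGE


if __name__ == "__main__":
    n_ions = ions_to_shift_potential(0.1)   # a 100 mV shift across 1 um^2
    print(f"ions needed: {n_ions:.0f}")
    # Compare with the 3 x 10^7 Na+ ions quoted for the bulk cytoplasm.
    print(f"fraction of bulk Na+: {n_ions / 3.0e7:.1e}")
```

Under this assumption the estimate comes out close to the 6000 ions quoted in the panel, a tiny fraction of the ions present in the bulk cytoplasm.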
Figure 11-23 The ionic basis of a [...] means, for example, that in a spherical cell
of diameter 10 µm, the number of K+ ions that have to flow out to alter the membrane
potential by 100 mV is only about 1/100,000 of the total number of K+ ions in
the cytosol. This amount is so minute that the intracellular K+ concentration remains
virtually unchanged.


small direct contribution to the membrane potential by pumping out three Na+
for every two K+ that it pumps in (see Figure 11-15). However, switching off the
pump does not abolish the major component of the resting potential, which is
generated by the K+ equilibrium mechanism just described. This component of
the membrane potential persists as long as the Na+ concentration inside the cell
stays low and the K+ ion concentration high, typically for many minutes. But the
plasma membrane is somewhat permeable to all small ions, including Na+.
Therefore, without the Na+-K+ pump, the ion gradients set up by the pump will
eventually run down, and the membrane potential established by diffusion through the
K+ leak channels will fall as well. As Na+ enters, the cell eventually comes to a new
resting state where Na+, K+, and Cl- are all at equilibrium across the membrane.
The membrane potential in this state is much less than it was in the normal cell
with an active Na+-K+ pump.
The resting potential of an animal cell varies between -20 mV and -120 mV, 
depending on the organism and cell type. Although the K+ gradient always has a
major influence on this potential, the gradients of other ions (and the disequili- 
brating effects of ion pumps) also have a significant effect: the more permeable 
the membrane for a given ion, the more strongly the membrane potential tends to 
be driven toward the equilibrium value for that ion. Consequently, changes in a 
membrane’s permeability to ions can cause significant changes in the membrane 
potential. This is one of the key principles relating the electrical excitability of cells 
to the activities of ion channels. 
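
One standard way to express this dependence on permeability is the Goldman-Hodgkin-Katz voltage equation, which is not derived in this chapter and is introduced here only as an illustration. The sketch below considers K+ and Na+ alone. The K+ concentrations are those of Panel 11-1; the Na+ concentrations (145 mM outside, 15 mM inside) and the relative permeabilities are assumed illustrative values.

```python
import math

R = 8.314     # gas constant, J mol^-1 K^-1
F = 96485.0   # Faraday's constant, J V^-1 mol^-1


def ghk_potential_mV(p_k, p_na, k_out=5.0, k_in=140.0,
                     na_out=145.0, na_in=15.0, temp_c=37.0):
    """Membrane potential (mV) for a membrane permeable to K+ and Na+.

    p_k and p_na are relative permeabilities. The Na+ concentrations
    are assumed values for illustration only.
    """
    t_kelvin = temp_c + 273.15
    numerator = p_k * k_out + p_na * na_out
    denominator = p_k * k_in + p_na * na_in
    return 1000.0 * (R * t_kelvin / F) * math.log(numerator / denominator)


if __name__ == "__main__":
    # Raising the Na+ permeability moves the potential from near the K+
    # equilibrium value toward the Na+ equilibrium value.
    for p_na in (0.0, 0.01, 0.05, 1.0, 20.0):
        v = ghk_potential_mV(p_k=1.0, p_na=p_na)
        print(f"P_Na/P_K = {p_na:>5}: V = {v:+.1f} mV")
```

When only K+ is permeant the result reduces to the Nernst potential for K+; as the relative Na+ permeability rises, the potential moves toward the Na+ equilibrium value.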
To understand how ion channels select their ions and how they open and 
close, we need to know their atomic structure. The first ion channel to be
crystallized and studied by x-ray diffraction was a bacterial K+ channel. The details of its
structure revolutionized our understanding of ion channels. 


The Three-Dimensional Structure of a Bacterial K+ Channel Shows
How an Ion Channel Can Work


Scientists were puzzled by the remarkable ability of ion channels to combine
exquisite ion selectivity with a high conductance. K+ leak channels, for example,
conduct K+ 10,000-fold faster than Na+, yet the two ions are both featureless
spheres and have similar radii (0.133 nm and 0.095 nm, respectively). A single
amino acid substitution in the pore of an animal cell K+ channel can result in
a loss of ion selectivity and cell death. We cannot explain the normal K+ selectivity
by pore size, because Na+ is smaller than K+. Moreover, the high conductance rate
is incompatible with the channel’s having selective, high-affinity K+-binding sites,
as the binding of K+ ions to such sites would greatly slow their passage.

The puzzle was solved when the structure of a bacterial K+ channel was
determined by x-ray crystallography. The channel is made from four identical
transmembrane subunits, which together form a central pore through the membrane.
Each subunit contributes two transmembrane α helices, which are tilted outward
in the membrane and together form a cone, with its wide end facing the outside of
Figure 11-24 The structure of a bacterial K+ channel. (A) The transmembrane α helices from only two of the four identical
subunits are shown. From the cytosolic side, the pore (schematically shaded in blue) opens up into a vestibule in the middle of
the membrane. The pore vestibule facilitates transport by allowing the K+ ions to remain hydrated even though they are more
than halfway across the membrane. The narrow selectivity filter of the pore links the vestibule to the outside of the cell. Carbonyl
oxygens line the walls of the selectivity filter and form transient binding sites for dehydrated K+ ions. Two K+ ions occupy different
sites in the selectivity filter, while a third K+ ion is located in the center of the vestibule, where it is stabilized by electrical interactions
with the more negatively charged ends of the pore helices. The ends of the four short “pore helices” (only two of which are shown)
point precisely toward the center of the vestibule, thereby guiding K+ ions into the selectivity filter (Movie 11.8). (B) Peptide bonds
have an electric dipole, with more negative charge accumulated at the oxygen of the C=O bond and at the nitrogen of the
N-H bond. In an α helix, hydrogen bonds (red) align the dipoles. As a consequence, every α helix has an electric dipole along its
axis, resulting from summation of the dipoles of the individual peptide bonds, with a more negatively charged C-terminal end (δ-)
and a more positively charged N-terminal end (δ+). (A, adapted from D.A. Doyle et al., Science 280:69-77, 1998.)


the cell where K+ ions exit from the channel (Figure 11-24). The polypeptide chain
that connects the two transmembrane helices forms a short α helix (the pore helix)
and a crucial loop that protrudes into the wide section of the cone to form the
selectivity filter. The selectivity loops from the four subunits form a short, rigid,
narrow pore, which is lined by the carbonyl oxygen atoms of their polypeptide
backbones. Because the selectivity loops of all known K+ channels have similar
amino acid sequences, it is likely that they form a closely similar structure.

The structure of the selectivity filter explains the ion selectivity of the channel.
A K+ ion must lose almost all of its bound water molecules to enter the filter, where
it interacts instead with the carbonyl oxygens lining the filter; the oxygens are
rigidly spaced at the exact distance to accommodate a K+ ion. A Na+ ion, in contrast,
cannot enter the filter because the carbonyl oxygens are too far away from the
smaller Na+ ion to compensate for the energy expense associated with the loss of
water molecules required for entry (Figure 11-25).

Structural studies of K* channels and other ion channels have also indicated 
some general principles of how these channels open and close. The gating involves 
movement of the helices in the membrane so that they either obstruct or open the 
path for ion movement. Depending on the particular type of channel, helices tilt, 
rotate, or bend during gating. The structure of a closed K+ channel shows that by
tilting the inner helices, the pore constricts like a diaphragm at its cytosolic end 
(Figure 11-26). Bulky hydrophobic amino acid side chains block the small open- 
ing that remains, preventing the entry of ions. 

Many other ion channels operate on similar principles: the channel’s gating 
helices are allosterically coupled to domains that form the ion-conducting path- 
way; and a conformational change in the gate—in response, say, to ligand binding 
or altered membrane potential—brings about conformational change in the con- 
ducting pathway, either opening it or blocking it off. 
vestibule, the ions are hydrated. In the
selectivity filter, they have lost their water,
and the carbonyl oxygens are placed to
accommodate a dehydrated K+ ion. The
dehydration of the K+ ion requires energy,
which is precisely balanced by the energy
regained by the interaction of the ion with
all of the carbonyl oxygens that serve as
surrogate water molecules. Because the
Na+ ion is too small to interact with the
oxygens, it can enter the selectivity filter
only at a great energetic expense. The
filter therefore selects K+ ions with high
specificity. (A, adapted from Y. Zhou et al.,
Nature 414:43-48, 2001. With permission
Mechanosensitive Channels Protect Bacterial Cells Against
Extreme Osmotic Pressures 


All organisms, from single-cell bacteria to multicellular animals and plants, must 
sense and respond to mechanical forces in their external environment (such as 
sound, touch, pressure, shear forces, and gravity) and in their internal environ- 
ment (such as osmotic pressure and membrane bending). Numerous proteins are 
known to be capable of responding to such mechanical forces, and a large subset 
of those proteins has been identified as possible mechanosensitive channels, but 
very few of the candidate proteins have been shown directly to be mechanically 
activated ion channels. One reason for this dearth in our knowledge is that most 
such channels are extremely rare. Auditory hair cells in the human cochlea, for 
example, contain extraordinarily sensitive mechanically gated ion channels, but 
each of the approximately 15,000 individual hair cells is thought to have a total of 
only 50-100 of them (Movie 11.9). Additional difficulties arise because the gating 
mechanisms of many mechanosensitive channel types require the channels to be 
embedded in complex architectures that require attachment to the extracellular 
matrix or to the cytoskeleton and are difficult to reconstitute in the test tube. The 
study of mechanosensitive receptors is a field of active investigation. 

A well-studied class of mechanosensitive channels is found in the bacterial 
plasma membrane. These channels open in response to mechanical stretch- 
ing of the lipid bilayer in which they are embedded. When a bacterium experi- 
ences a low-ionic-strength external environment (hypotonic conditions), such as 


inner helix ion pore 


Figure 11-26 A model for the gating 

of a bacterial K+ channel. The channel
is viewed in cross section. To adopt 

the closed conformation, the four inner 
transmembrane helices that line the pore 
on the cytosolic side of the selectivity filter 
(see Figure 11-24) rearrange to close the 
cytosolic entrance to the channel. 
(Adapted from E. Perozo et al., Science 
285:73-78, 1999.) 
rainwater, the cell swells as water seeps in due to an increase in the osmotic
pressure. If the pressure rises to dangerous levels, the cell opens mechanosensitive
channels that allow small molecules to leak out. Bacteria that are experimentally 
placed in fresh water can rapidly lose more than 95% of their small molecules in 
this manner, including amino acids, sugars, and potassium ions. However, they 
keep their macromolecules safely inside and thus can recover quickly after envi- 
ronmental conditions return to normal. 

Mechanical gating has been demonstrated using biophysical techniques in 
which force is exerted on pure lipid bilayers containing the bacterial mechano- 
sensitive channels; for example, by applying suction with a micropipette. Such 
measurements demonstrate that the cell has several different channels that open 
at different levels of pressure. The mechanosensitive channel of small conduc- 
tance, called the MscS channel, opens at low and moderate pressures (Figure 
11-27). It is composed of seven identical subunits, which in the open state form a 
pore about 1.3 nm in diameter—just big enough to pass ions and small molecules. 
Large cytoplasmic domains limit the size of molecules that can reach the pore. 
The mechanosensitive channel of large conductance, called the MscL channel, 
opens to over 3 nm in diameter when the pressure gets so high that the cell might 
burst. 


The Function of a Neuron Depends on Its Elongated Structure 


The cells that make the most sophisticated use of channels are neurons. Before
discussing how they do so, we digress briefly to describe how a typical neuron is
organized. 

The fundamental task of a neuron, or nerve cell, is to receive, conduct, and 
transmit signals. To perform these functions, neurons are often extremely elon- 
gated. In humans, for example, a single neuron extending from the spinal cord to 
a muscle in the foot may be as long as 1 meter. Every neuron consists of a cell body 
(containing the nucleus) with a number of thin processes radiating outward from 
it. Usually one long axon conducts signals away from the cell body toward distant 
Figure 11-27 The structure of
mechanosensitive channels. The crystal 
structures of MscS in its (A) closed and 
(B) open conformation are shown. The 
side views (lower panels) show the entire 
protein, including the large intracellular 
domain. The face views (upper panels) 
show the transmembrane domains only. 
The open structure occupies more area in 
the lipid bilayer and is energetically favored 
when a membrane is stretched. This may 
explain why the MscS channel opens as 
pressure builds up inside the cell. (PDB 
codes: 2OAU, 2VV5.) 
targets, and several shorter, branching dendrites extend from the cell body like
antennae, providing an enlarged surface area to receive signals from the axons 
of other neurons (Figure 11-28), although the cell body itself also receives such 
signals. A typical axon divides at its far end into many branches, passing on its 
message to many target cells simultaneously. Likewise, the extent of branching 
of the dendrites can be very great—in some cases sufficient to receive as many as 
100,000 inputs on a single neuron. 

Despite the varied significance of the signals carried by different classes of 
neurons, the form of the signal is always the same, consisting of changes in the 
electrical potential across the neuron’s plasma membrane. The signal spreads 
because an electrical disturbance produced in one part of the membrane spreads 
to other parts, although the disturbance becomes weaker with increasing distance 
from its source, unless the neuron expends energy to amplify it as it travels. Over 
short distances, this attenuation is unimportant; in fact, many small neurons con- 
duct their signals passively, without amplification. For long-distance communi- 
cation, however, such passive spread is inadequate. Thus, larger neurons employ 
an active signaling mechanism, which is one of their most striking features. An 
electrical stimulus that exceeds a certain threshold strength triggers an explosion 
of electrical activity that propagates rapidly along the neuron’s plasma membrane 
and is sustained by automatic amplification all along the way. This traveling wave 
of electrical excitation, known as an action potential, or nerve impulse, can carry 
a message without attenuation from one end of a neuron to the other at speeds of 
100 meters per second or more. Action potentials are the direct consequence of 
the properties of voltage-gated cation channels, as we now discuss. 


Voltage-Gated Cation Channels Generate Action Potentials in 
Electrically Excitable Cells 


The plasma membrane of all electrically excitable cells—not only neurons, but 
also muscle, endocrine, and egg cells—contains voltage-gated cation channels, 
which are responsible for generating the action potentials. An action potential 
is triggered by a depolarization of the plasma membrane—that is, by a shift in 
the membrane potential to a less negative value inside. (We shall see later how 
the action of a neurotransmitter causes depolarization.) In nerve and skeletal 
muscle cells, a stimulus that causes sufficient depolarization promptly opens the 
voltage-gated Na+ channels, allowing a small amount of Na+ to enter the cell down
its electrochemical gradient. The influx of positive charge depolarizes the membrane
further, thereby opening more Na+ channels, which admit more Na+ ions,
causing still further depolarization. This self-amplification process (an example of
positive feedback, discussed in Chapters 8 and 15) continues until, within a fraction
of a millisecond, the electrical potential in the local region of membrane has
shifted from its resting value of about -70 mV (in squid giant axon; about -40 mV
in human) to almost as far as the Na+ equilibrium potential of about +50 mV (see
Figure 11-28 A typical vertebrate
neuron. The arrows indicate the 
direction in which signals are conveyed. 
The single axon conducts signals 

away from the cell body, while the 
multiple dendrites (and the cell body) 
receive signals from the axons of other 
neurons. The axon terminals end on the 
dendrites or cell body of other neurons 
or on other cell types, such as muscle 
or gland cells. 
Panel 11-1, p. 616). At this point, when the net electrochemical driving force for
the flow of Na+ is almost zero, the cell would come to a new resting state, with all of
its Na+ channels permanently open, if the open conformation of the channel were
stable. Two mechanisms act in concert to save the cell from such a permanent
electrical spasm: the Na+ channels automatically inactivate and voltage-gated K+
channels open to restore the membrane potential to its initial negative value.
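
The threshold behavior and self-amplification described above can be illustrated with a deliberately simplified toy model. This is a sketch under strong assumptions, not a model of a real neuron: it includes only a leak conductance and a voltage-dependent Na+ conductance, it omits Na+ channel inactivation and voltage-gated K+ channels altogether, and every parameter value is hypothetical, chosen only so that the qualitative behavior is visible.

```python
import math

# All values are hypothetical illustration parameters, not measurements.
V_REST = -70.0    # mV, potential favored by the leak conductance
V_NA = 50.0       # mV, Na+ equilibrium potential
G_LEAK = 1.0      # relative leak conductance
G_NA_MAX = 10.0   # relative conductance when all Na+ channels are open
V_HALF = -35.0    # mV at which half of the Na+ channels are open
SLOPE = 3.0       # mV, steepness of the voltage dependence


def open_fraction(v):
    """Fraction of Na+ channels open at membrane potential v (mV)."""
    return 1.0 / (1.0 + math.exp(-(v - V_HALF) / SLOPE))


def dv_dt(v):
    """Rate of change of the potential (arbitrary time units)."""
    leak = -G_LEAK * (v - V_REST)
    sodium = -G_NA_MAX * open_fraction(v) * (v - V_NA)
    return leak + sodium


def simulate(v_start, dt=0.01, steps=2000):
    """Integrate with the Euler method and return the final potential."""
    v = v_start
    for _ in range(steps):
        v += dt * dv_dt(v)
    return v


if __name__ == "__main__":
    for v_start in (-60.0, -50.0, -42.0, -30.0):
        print(f"start {v_start:+.0f} mV -> final {simulate(v_start):+.1f} mV")
```

A depolarization that stays below the model's threshold decays back toward the resting value, whereas one that exceeds it runs away toward the Na+ equilibrium potential. Because inactivation and K+ channels are left out, the model stays depolarized, which is the "permanent electrical spasm" that the two mechanisms described above prevent.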

The Na+ channel is built from a single polypeptide chain that contains four
structurally very similar domains. It is thought that these domains evolved by
gene duplication followed by fusion into a single large gene (Figure 11-29A). In
bacteria, in fact, the Na+ channel is a tetramer of four identical polypeptide chains,
supporting this evolutionary idea. 

Each domain contributes to the central channel, which is very similar to the K+
channel. Each domain also contains a voltage sensor that is characterized by an 
unusual transmembrane helix, S4, that contains many positively charged amino 
acids. As the membrane depolarizes, the S4 helices experience an electrostatic 
pulling force that attracts them to the now negatively charged extracellular side of 
the plasma membrane. The resulting conformational change opens the channel. 
The structure of a bacterial voltage-gated Na* channel provides insights how the 
structural elements are arranged in the membrane (Figure 11-29B and C). 

The Na+ channels also have an automatic inactivating mechanism, which
causes the channels to reclose rapidly even though the membrane is still depolarized
(see Figure 11-30). The Na+ channels remain in this inactivated state, unable
to reopen, until after the membrane potential has returned to its initial negative
value. The time necessary for a sufficient number of Na+ channels to recover from
inactivation to support a new action potential, termed the refractory period, limits


Figure 11-29 Structural models of voltage-gated Na+ channels. (A) The channel in
animal cells is built from a single polypeptide chain that contains four homologous
domains. Each domain contains two transmembrane α helices (green) that surround
the central ion-conducting pore. They are separated by sequences (blue) that form
the selectivity filter. Four additional α helices (gray and red) in each domain
constitute the voltage sensor. The S4 helices (red) are unique in that they contain
an abundance of positively charged arginines. An inactivation gate that is part of
a flexible loop connecting the third and fourth domains acts as a plug that
obstructs the pore in the channel’s inactivated state, as shown in Figure 11-30.
(B) Side and top views of a homologous bacterial channel protein showing its
arrangement within the membrane. (C) A cross section of the pore domain of the
channel shown in (B) shows lateral portals, through which the central cavity is
accessible from the hydrophobic core of the lipid bilayer. In the crystals, lipid
acyl chains were found to intrude into the pore. These lateral portals are large
enough to allow entry of small, hydrophobic, pore-blocking drugs that are commonly
used as anesthetics and block ion conductance. (PDB code: 3RVZ.)
Figure 11-30 Na+ channels and an action potential. (A) An action potential is triggered by a brief pulse of current, which
(B) partially depolarizes the membrane, as shown in the plot of membrane potential versus time. The green curve shows how
the membrane potential would have simply relaxed back to the resting value after the initial depolarizing stimulus if there had
been no voltage-gated Na+ channels in the membrane. The red curve shows the course of the action potential that is caused
by the opening and subsequent inactivation of voltage-gated Na+ channels. The states of the Na+ channels are indicated
in (B). The membrane cannot fire a second action potential until the Na+ channels have returned from the inactivated to the
closed conformation; until then, the membrane is refractory to stimulation. (C) The three states of the Na+ channel. When the
membrane is at rest (highly polarized), the closed conformation of the channel has the lowest free energy and is therefore most
stable; when the membrane is depolarized, the energy of the open conformation is lower, so the channel has a high probability
of opening. But the free energy of the inactivated conformation is lower still; therefore, after a randomly variable period spent in
the open state, the channel becomes inactivated. Thus, the open conformation corresponds to a metastable state that can exist
only transiently when the membrane depolarizes (Movie 11.10).


the repetitive firing rate of a neuron. The cycle from initial stimulus to the return
to the original resting state takes a few milliseconds or less. The Na+ channel can
therefore exist in three distinct states (closed, open, and inactivated), which
contribute to the rise and fall of the action potential (Figure 11-30).
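The three-state cycle described above can be sketched as a small state machine. This is only an illustrative simplification, not a biophysical model: time is divided into arbitrary steps, the open state is assumed to last exactly one step, and the names `State`, `next_state`, and `run` are hypothetical.

```python
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    INACTIVATED = "inactivated"


def next_state(state, depolarized):
    """Advance one (arbitrary) time step for a single Na+ channel."""
    if state is State.CLOSED:
        # A closed channel opens when the membrane is depolarized.
        return State.OPEN if depolarized else State.CLOSED
    if state is State.OPEN:
        # The open state is transient: the channel inactivates
        # even if the membrane is still depolarized.
        return State.INACTIVATED
    # An inactivated channel cannot reopen; it recovers to the closed
    # state only after the membrane has repolarized.
    return State.INACTIVATED if depolarized else State.CLOSED


def run(trace, state=State.CLOSED):
    """Return the channel state after each step of a depolarization trace."""
    history = []
    for depolarized in trace:
        state = next_state(state, depolarized)
        history.append(state)
    return history


if __name__ == "__main__":
    # Sustained depolarization, one step of repolarization, then a new stimulus.
    for s in run([True, True, True, False, True]):
        print(s.value)
```

In this sketch the channel stays inactivated for as long as the membrane remains depolarized, which is the behavior that underlies the refractory period.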

This description of an action potential applies only to a small patch of plasma 
membrane. The self-amplifying depolarization of the patch, however, is sufficient 
to depolarize neighboring regions of membrane, which then go through the same 
cycle. In this way, the action potential sweeps like a wave from the initial site of 
depolarization over the entire plasma membrane, as shown in Figure 11-31. 


The Use of Channelrhodopsins Has Revolutionized the Study of 
Neural Circuits 


Channelrhodopsins are photosensitive ion channels that open in response to 
light. They evolved as sensory receptors in photosynthetic green algae to allow 
the algae to swim toward light. The structure of channelrhodopsin closely resem- 
bles that of bacteriorhodopsin (see Figure 10-31). It contains a covalently bound 
retinal group that absorbs light and undergoes an isomerization reaction, which 
triggers a conformational change in the protein, opening an ion channel in the 
plasma membrane. In contrast to bacteriorhodopsin, which is a light-driven proton
pump, channelrhodopsin is a light-gated cation channel.

Using genetic engineering techniques, channelrhodopsin can be expressed 
in virtually any cell type in vertebrates and invertebrates. Researchers first 
introduced the gene into cultured neurons and showed that flashing light could 
now activate the channelrhodopsin and induce the neurons to fire action poten- 
tials. Because the frequency of the light flashes determined the frequency of the 
action potentials, one can control the frequency of neuronal firing with milli- 
second precision. 

Next, neurobiologists used the approach to activate specific neurons in the 
brain of experimental animals. Using a tiny fiber optic cable implanted near the 
Figure 11-31 The propagation of an action potential along an axon. (A) The
voltages that would be recorded from a set of intracellular electrodes placed at
intervals along the axon. (B) The changes in the Na+ channels and the current flows
(curved red arrows) that give rise to a traveling action potential. The region of the
axon with a depolarized membrane is shaded in blue. Note that once an action
potential has started to progress, it has to continue in the same direction, traveling
only away from the site of depolarization, because Na+ channel inactivation
prevents the depolarization from spreading backward.
Figure 11-32 Optogenetic control of aggression neurons in a living mouse.
A gene encoding channelrhodopsin was introduced into a subpopulation of neurons
in the hypothalamus of a mouse. When the neurons were exposed to flashing blue light
using a tiny, implanted fiber optic cable, the channelrhodopsin channels opened,
depolarizing and activating the cells. When the light was switched on, the
mouse immediately became aggressive and attacked the inflated rubber glove;
when the light was switched off, its behavior immediately returned to normal
(Movie 11.11). (From D. Lin et al., Nature 470:221-226, 2011. With permission from
Macmillan Publishers Ltd.)

relevant brain region, they could flash light to specifically activate the
channelrhodopsin-containing neurons to fire action potentials. One group of researchers
expressed channelrhodopsin in a subset of mouse neurons thought to be involved
in aggression: when these cells were activated by light, the mouse immediately
attacked anything in its environment, including other mice or even an inflated
rubber glove (Figure 11-32); when the light was switched off, the neurons fell
silent and the mouse’s behavior returned to normal.
Since these pioneering studies, researchers have engineered additional
light-responsive ion channels and transporters, including some that can rapidly
inactivate specific neurons. It is therefore now possible to transiently activate or 
inhibit specific neurons in the brains of awake animals with remarkable spatial 
and temporal precision. In this way, the rapidly expanding new field of optoge- 
netics is revolutionizing neurobiology, allowing neuroscientists to analyze the 
neurons and circuits underlying even the most complex behaviors in experimen- 
tal animals, including nonhuman primates. 


Myelination Increases the Speed and Efficiency of Action Potential 
Propagation in Nerve Cells 


The axons of many vertebrate neurons are insulated by a myelin sheath, which 
greatly increases the rate at which an axon can conduct an action potential. The 
importance of myelination is dramatically demonstrated by the demyelinating 
disease multiple sclerosis, in which the immune system destroys myelin sheaths in 
some regions of the central nervous system; in the affected regions, nerve impulse 
propagation greatly slows or even fails, often with devastating neurological con- 
sequences. 

Myelin is formed by specialized non-neuronal supporting cells called glial 
cells. Schwann cells are the glial cells that myelinate axons in peripheral nerves, 
and oligodendrocytes do so in the central nervous system. These myelinating 
glial cells wrap layer upon layer of their own plasma membrane in a tight spiral 
around the axon (Figure 11-33A and B), thereby insulating the axonal membrane 
so that little current can leak across it. The myelin sheath is interrupted at regularly 
spaced nodes of Ranvier, where almost all the Na+ channels in the axon are concentrated
(Figure 11-33C). This arrangement allows an action potential to propagate
along a myelinated axon by jumping from node to node, a process called
saltatory conduction. This type of conduction has two main advantages: action 
potentials travel very much faster, and metabolic energy is conserved because the 
active excitation is confined to the small regions of axonal plasma membrane at 
nodes of Ranvier. 
Figure 11-33 Myelination. (A) A myelinated axon from a peripheral nerve. Each
Schwann cell wraps its plasma membrane concentrically around the axon to form a
segment of myelin sheath about 1 mm long. For clarity, the membrane layers of the
myelin are shown less compacted than they are in reality (see part B). (B) An
electron micrograph of a nerve in the leg of a young rat. Two Schwann cells can
be seen: one near the bottom is just beginning to myelinate its axon; the one
above it has formed an almost mature myelin sheath. (C) Fluorescence micrograph
and diagram of individual myelinated axons teased apart in a rat optic nerve,
showing the confinement of the voltage-gated Na+ channels (green) in the axonal
membrane at the node of Ranvier. A protein called Caspr (red) marks the junctions
where the myelinating glial cell plasma membrane tightly abuts the axon on either
side of the node. Voltage-gated K+ channels (blue) localize to regions in the axon
plasma membrane well away from the node. (B, from Cedric S. Raine, in Myelin
[P. Morell, ed.]. New York: Plenum, 1976; C, from M.N. Rasband and P. Shrager,
J. Physiol. 525:63-73, 2000. With permission from Blackwell Publishing.)




Patch-Clamp Recording Indicates That Individual Ion Channels
Open in an All-or-Nothing Fashion


Neuron and skeletal muscle cell plasma membranes contain many thousands of 
voltage-gated Na+ channels, and the current crossing the membrane is the sum
of the currents flowing through all of these. An intracellular microelectrode can 
record this aggregate current, as shown in Figure 11-31A. Remarkably, however, 
it is also possible to record current flowing through individual channels. Patch- 
clamp recording, developed in the 1970s and 1980s, revolutionized the study of 
ion channels and made it possible to examine transport through a single chan- 
nel in a small patch of membrane covering the mouth of a micropipette (Figure 
11-34). With this simple but powerful technique, one can study the detailed prop- 
erties of ion channels in all sorts of cell types. This work led to the discovery that 
even cells that are not electrically excitable usually have a variety of ion channels 
in their plasma membrane. Many of these cells, such as yeasts, are too small to be 
investigated by the traditional electrophysiologist’s method of impalement with 
an intracellular microelectrode. 

Patch-clamp recording indicates that individual ion channels open in an
all-or-nothing fashion. For example, a voltage-gated Na+ channel opens and closes
at random, but when open, the channel always has the same large conductance, 
allowing more than 1000 ions to pass per millisecond (Figure 11-35). Therefore, 
the aggregate current crossing the membrane of an entire cell does not indicate 
the degree to which a typical individual channel is open but rather the total num- 
ber of channels in its membrane that are open at any one time. 
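A toy simulation can illustrate how all-or-nothing single-channel events add up to a smooth aggregate current. The sketch below is a simplification under stated assumptions: each channel opens once after a random latency and then inactivates, the latency and open-time distributions are arbitrary exponentials, and the unitary current of 1 pA is an illustrative value rather than a measurement. The function names are hypothetical.

```python
import random

UNITARY_PA = 1.0  # assumed current through one open channel (illustrative)
N_STEPS = 100     # time steps after the depolarizing step (arbitrary units)


def single_channel_record(rng):
    """One all-or-nothing record: current is either 0 or UNITARY_PA."""
    latency = int(rng.expovariate(1 / 10))      # steps before the channel opens
    duration = 1 + int(rng.expovariate(1 / 8))  # steps spent in the open state
    record = []
    for t in range(N_STEPS):
        is_open = latency <= t < latency + duration
        record.append(UNITARY_PA if is_open else 0.0)
    return record


def aggregate_current(n_records, seed=0):
    """Sum many single-channel records, time step by time step."""
    rng = random.Random(seed)
    total = [0.0] * N_STEPS
    for _ in range(n_records):
        for t, current in enumerate(single_channel_record(rng)):
            total[t] += current
    return total


if __name__ == "__main__":
    total = aggregate_current(144)
    # With a 1 pA unitary current, each value equals the number of open channels.
    print([round(v) for v in total[:20]])
```

At every time step the aggregate value is simply the unitary current multiplied by the number of channels that happen to be open, which is the point made in the paragraph above.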

Some simple physical principles allow us to refine our understanding of
voltage-gating from the perspective of a single Na+ channel. The interior of the resting
neuron or muscle cell is at an electrical potential about 40-100 mV more negative
than the external medium. Although this potential difference seems small, it exists
across a plasma membrane only about 5 nm thick, so that the resulting voltage
gradient is about 100,000 V/cm. Charged proteins in the membrane such as Na+
channels are thus subjected to a very large electrical field that can profoundly
affect their conformation. Each conformation can “flip” to another conformation if
given a sufficient jolt by the random thermal movements of the surroundings, and
it is the relative stability of the closed, open, and inactivated conformations against
flipping that is altered by changes in the membrane potential (see Figure 11-30C).
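The size of the voltage gradient quoted above follows from simple unit conversion. The helper below (a hypothetical name, written only for this illustration) divides the potential difference by the membrane thickness; a 50 mV potential across 5 nm gives about 100,000 V/cm, and the 40-100 mV range corresponds to roughly 80,000 to 200,000 V/cm.

```python
def field_v_per_cm(potential_mv, thickness_nm):
    """Voltage gradient across a membrane, in volts per centimeter."""
    volts = potential_mv * 1e-3       # mV -> V
    centimeters = thickness_nm * 1e-7  # nm -> cm
    return volts / centimeters


if __name__ == "__main__":
    for mv in (40, 50, 100):
        print(mv, "mV across 5 nm:", round(field_v_per_cm(mv, 5)), "V/cm")
```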


Voltage-Gated Cation Channels Are Evolutionarily and Structurally 
Related 


Na+ channels are not the only kind of voltage-gated cation channel that can generate
an action potential. The action potentials in some muscle, egg, and endocrine
cells, for example, depend on voltage-gated Ca2+ channels rather than on
Na+ channels.
Figure 11-34 The technique of patch-clamp recording. Because of the extremely
tight seal between the micropipette and the membrane, current can enter or leave
the micropipette only by passing through the ion channels in the patch of membrane
covering its tip. The term clamp is used because an electronic device is employed
to maintain, or “clamp,” the membrane potential at a set value while recording the
ionic current through individual channels. The current through these channels can
be recorded with the patch still attached to the rest of the cell, as in (A), or
detached, as in (B). The advantage of the detached patch is that it is easy to
alter the composition of the solution on either side of the membrane to test the
effect of various solutes on channel behavior. A detached patch can also be
produced with the opposite orientation, so that the cytoplasmic surface of the
membrane faces the inside of the pipette.
There is a surprising amount of structural and functional diversity within each
of the different classes of voltage-gated cation channels, generated both by multiple
genes and by the alternative splicing of RNA transcripts produced from the
same gene. Nonetheless, the amino acid sequences of the known voltage-gated
Na+, K+, and Ca2+ channels show striking similarities, demonstrating that they all
belong to a large superfamily of evolutionarily and structurally related proteins
and share many of the design principles. Whereas the single-celled yeast S. cerevisiae
contains a single gene that codes for a voltage-gated K+ channel, the genome
of the worm C. elegans contains 68 genes that encode different but related K+
channels. This complexity indicates that even a simple nervous system made up
of only 302 neurons uses a large number of different ion channels to compute its
responses.

Humans who inherit mutant genes encoding ion channels can suffer from a
variety of nerve, muscle, brain, or heart diseases, depending on which cells the
channel encoded by the mutant gene normally functions in. Mutations in genes
that encode voltage-gated Na+ channels in skeletal muscle cells, for example, can
cause myotonia, a condition in which there is a delay in muscle relaxation after
voluntary contraction, causing painful muscle spasms. In some cases, this occurs
because the abnormal channels fail to inactivate normally; as a result, Na+ entry
persists after an action potential finishes and repeatedly reinitiates membrane
depolarization and muscle contraction. Similarly, mutations that affect Na+ or K+
channels in the brain can cause epilepsy, in which excessive synchronized firing of
large groups of neurons causes epileptic seizures (convulsions, or fits).

The particular combination of ion channels conducting Na+, K+, and Ca2+
that are expressed in a neuron largely determines how the cell fires repetitive 
sequences of action potentials. Some nerve cells can repeat action potentials up 
to 300 times per second; other neurons fire short bursts of action potentials sepa- 
rated by periods of silence; while others rarely fire more than one action potential 
at a time. There is a remarkable diversity of neurons in the brain. 


Different Neuron Types Display Characteristic Stable Firing 
Properties 


It is estimated that the human brain contains about 10^11 neurons and 10^14 synaptic
connections. To make matters more complex, neural circuitry is continuously
sculpted in response to experience, modified as we learn and store memories, 
and irreversibly altered by the gradual loss of neurons and their connections as 
we age. How can a system so complex be subject to such change and yet con- 
tinue to function stably? One emerging theory suggests that individual neurons 
are self-tuning devices, constantly adjusting the expression of ion channels and 
neurotransmitter receptors in order to maintain a stable function. How might this 
work? 

Neurons can be categorized into functionally different types, based in part on 
their propensity to fire action potentials and their pattern of firing. For example, 
some neurons fire action potentials at high frequencies, while others fire rarely. 
The firing properties of each neuron type are determined to a large extent by the 
ion channels that the cell expresses. The number of ion channels in a neuron’s 
membrane is not fixed: as conditions change, a neuron can modify the numbers
of depolarizing (Na+ and Ca2+) and hyperpolarizing (K+) channels and keep
their proportions adjusted so as to maintain its characteristic firing behavior—a 
remarkable example of homeostatic control. The molecular mechanisms involved 
remain an important mystery. 


Transmitter-Gated Ion Channels Convert Chemical Signals into
Electrical Ones at Chemical Synapses


Neuronal signals are transmitted from cell to cell at specialized sites of contact 
known as synapses. The usual mechanism of transmission is indirect. The cells are 
Figure 11-35 Patch-clamp measurements for a single voltage-gated Na+ channel.
A tiny patch of plasma membrane was detached from an embryonic rat muscle cell,
as in Figure 11-34. (A) The membrane was depolarized by an abrupt shift of
potential from -90 to about -40 mV. (B) Three current records from three
experiments performed on the same patch of membrane. Each major current step in
(B) represents the opening and closing of a single channel. A comparison of the
three records shows that, whereas the durations of channel opening and closing
vary greatly, the rate at which current flows through an open channel (its
conductance) is practically constant. The minor fluctuations in the current
records arise largely from electrical noise in the recording apparatus. Current
flowing into the cell, measured in picoamperes (pA), is shown as a downward
deflection of the curve. By convention, the electrical potential on the outside
of the cell is defined as zero. (C) The sum of the currents measured in 144
repetitions of the same experiment. This aggregate current is equivalent to the
usual Na+ current that would be observed flowing through a relatively large region
of membrane containing 144 channels. A comparison of (B) and (C) reveals that
the time course of the aggregate current reflects the probability that any
individual channel will be in the open state; this probability decreases with
time as the channels in the depolarized membrane adopt their inactivated
conformation. (Data from J. Patlak and R. Horn, J. Gen. Physiol. 79:333-351,
1982. With permission from The Rockefeller University Press.)
Figure 11-36 A chemical synapse. (A) When an action potential reaches
the nerve terminal in a presynaptic cell, it stimulates the terminal to release
its neurotransmitter. The neurotransmitter molecules are contained in
synaptic vesicles and are released to the cell exterior when the vesicles
fuse with the plasma membrane of the nerve terminal. The released
neurotransmitter binds to and opens the transmitter-gated ion channels
concentrated in the plasma membrane of the postsynaptic target cell at
the synapse. The resulting ion flows alter the membrane potential of the
postsynaptic membrane, thereby transmitting a signal from the excited nerve
(Movie 11.12). (B) A thin-section electron micrograph of two nerve terminal
synapses on a dendrite of a postsynaptic cell. (B, courtesy of Cedric Raine.)


electrically isolated from one another, the presynaptic cell being separated from 
the postsynaptic cell by a narrow synaptic cleft. When an action potential arrives 
at the presynaptic site, the depolarization of the membrane opens voltage-gated 
Ca2+ channels that are clustered in the presynaptic membrane. Ca2+ influx triggers
the release into the cleft of small signal molecules known as neurotransmitters,
which are stored in membrane-enclosed synaptic vesicles and released by
exocytosis (discussed in Chapter 13). The neurotransmitter diffuses rapidly across 
the synaptic cleft and provokes an electrical change in the postsynaptic cell by 
binding to and opening transmitter-gated ion channels (Figure 11-36). After the 
neurotransmitter has been secreted, it is rapidly removed: it is either destroyed 
by specific enzymes in the synaptic cleft or taken up by the presynaptic nerve terminal
or by surrounding glial cells. Reuptake is mediated by a variety of Na+-dependent
neurotransmitter symporters (see Figure 11-8); in this way, neurotransmitters
are recycled, allowing cells to keep up with high rates of release. Rapid
removal ensures both spatial and temporal precision of signaling at a synapse. 
It decreases the chances that the neurotransmitter will influence neighboring 
cells, and it clears the synaptic cleft before the next pulse of neurotransmitter is 
released, so that the timing of repeated, rapid signaling events can be accurately 
communicated to the postsynaptic cell. As we shall see, signaling via such chemi- 
cal synapses is far more versatile and adaptable than direct electrical coupling via 
gap junctions at electrical synapses (discussed in Chapter 19), which are also used 
by neurons but to a much smaller extent. 

Transmitter-gated ion channels, also called ionotropic receptors, are built 
for rapidly converting extracellular chemical signals into electrical signals at 
chemical synapses. The channels are concentrated in a specialized region of the
postsynaptic plasma membrane at the synapse and open transiently in response 
to the binding of neurotransmitter molecules, thereby producing a brief perme- 
ability change in the membrane (see Figure 11-36A). Unlike the voltage-gated 
channels responsible for action potentials, transmitter-gated channels are rela- 
tively insensitive to the membrane potential and therefore cannot by themselves 
produce a self-amplifying excitation. Instead, they produce local permeability 
increases, and hence changes of membrane potential, that are graded according 
to the amount of neurotransmitter released at the synapse and how long it persists 
there. Only if the summation of small depolarizations at this site opens sufficient 
numbers of nearby voltage-gated cation channels can an action potential be trig- 
gered. This may require the opening of transmitter-gated ion channels at numer- 
ous synapses in close proximity on the target nerve cell. 
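The idea that several small, graded depolarizations must add up before an action potential can be triggered can be sketched as follows. This is a deliberately crude illustration: it assumes the depolarizations coincide in time and sum linearly, and the resting potential (-70 mV), threshold (-55 mV), and 4 mV inputs are assumed values chosen for the example, not figures from the text.

```python
def peak_potential_mv(depolarizations_mv, resting_mv=-70.0):
    """Membrane potential if all local depolarizations summed linearly."""
    return resting_mv + sum(depolarizations_mv)


def reaches_threshold(depolarizations_mv, resting_mv=-70.0, threshold_mv=-55.0):
    """True if the summed depolarization reaches the assumed threshold."""
    return peak_potential_mv(depolarizations_mv, resting_mv) >= threshold_mv


if __name__ == "__main__":
    for n_synapses in (1, 3, 4):
        inputs = [4.0] * n_synapses  # assumed 4 mV contribution per synapse
        print(n_synapses, "active synapses ->", reaches_threshold(inputs))
```

With these assumed numbers a single active synapse is insufficient, whereas four nearby synapses acting together reach threshold.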


Chemical Synapses Can Be Excitatory or Inhibitory 


Transmitter-gated ion channels differ from one another in several important ways. 
First, as receptors, they have highly selective binding sites for the neurotransmit- 
ter that is released from the presynaptic nerve terminal. Second, as channels, they 
are selective in the type of ions that they let pass across the plasma membrane; 
this determines the nature of the postsynaptic response. Excitatory neurotransmitters
open cation channels, causing an influx of Na+, and in many cases Ca2+,
that depolarizes the postsynaptic membrane toward the threshold potential for
firing an action potential. Inhibitory neurotransmitters, by contrast, open either
Cl- channels or K+ channels, and this suppresses firing by making it harder for
excitatory neurotransmitters to depolarize the postsynaptic membrane. Many 
transmitters can be either excitatory or inhibitory, depending on where they are 
released, what receptors they bind to, and the ionic conditions that they encoun- 
ter. Acetylcholine, for example, can either excite or inhibit, depending on the type 
of acetylcholine receptors it binds to. Usually, however, acetylcholine, glutamate, 
and serotonin are used as excitatory transmitters, and γ-aminobutyric acid (GABA)
and glycine are used as inhibitory transmitters. Glutamate, for instance, mediates 
most of the excitatory signaling in the vertebrate brain. 

We have already discussed how the opening of Na+ or Ca2+ channels depolarizes
a membrane. The opening of K+ channels has the opposite effect because
the K+ concentration gradient is in the opposite direction: high concentration
inside the cell, low outside. Opening K+ channels tends to keep the cell close to
the equilibrium potential for K+, which, as we discussed earlier, is normally close
to the resting membrane potential because at rest K+ channels are the main type
of channel that is open. When additional K+ channels open, it becomes harder to
drive the cell away from the resting state. We can understand the effect of opening
Cl- channels similarly. The concentration of Cl- is much higher outside the
cell than inside (see Table 11-1, p. 598), but the membrane potential opposes its
influx. In fact, for many neurons, the equilibrium potential for Cl- is close to the
resting potential, or even more negative. For this reason, opening Cl- channels
tends to buffer the membrane potential; as the membrane starts to depolarize,
more negatively charged Cl- ions enter the cell and counteract the depolarization.
Thus, the opening of Cl- channels makes it more difficult to depolarize the
membrane and hence to excite the cell. Some powerful toxins act by blocking the
action of inhibitory neurotransmitters: strychnine, for example, binds to glycine
receptors and prevents their inhibitory action, causing muscle spasms, convulsions,
and death.
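The equilibrium potentials referred to above can be estimated with the Nernst equation, E = (RT/zF) ln(C_out/C_in). The sketch below is illustrative only: the ion concentrations are assumed round numbers typical of textbook examples, not values taken from Table 11-1, and the function name is hypothetical.

```python
import math

R = 8.314462618   # gas constant, J/(mol K)
F = 96485.33212   # Faraday constant, C/mol


def nernst_mv(conc_out_mm, conc_in_mm, valence, temp_c=37.0):
    """Equilibrium potential (inside relative to outside), in millivolts."""
    temp_k = temp_c + 273.15
    volts = (R * temp_k) / (valence * F) * math.log(conc_out_mm / conc_in_mm)
    return volts * 1000.0


if __name__ == "__main__":
    # Assumed illustrative concentrations (mM), not taken from Table 11-1.
    e_k = nernst_mv(conc_out_mm=5, conc_in_mm=140, valence=+1)
    e_cl = nernst_mv(conc_out_mm=110, conc_in_mm=10, valence=-1)
    print("E_K  (mV):", round(e_k, 1))
    print("E_Cl (mV):", round(e_cl, 1))
```

With these assumed concentrations both potentials come out negative, consistent with the argument that opening K+ or Cl- channels holds the membrane near or below its resting value.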

However, not all chemical signaling in the nervous system operates through 
these ionotropic ligand-gated ion channels. In fact, most neurotransmitter mol- 
ecules that are secreted by nerve terminals, including a large variety of neuro- 
peptides, bind to metabotropic receptors, which regulate ion channels only 
indirectly through the action of small intracellular signal molecules (discussed 
in Chapter 15). All neurotransmitter receptors fall into one or other of these two 
major classes (ionotropic or metabotropic) on the basis of their signaling
mechanisms:


1. Ionotropic receptors are ion channels and feature at fast chemical syn- 
apses. Acetylcholine, glycine, glutamate, and GABA all act on transmit- 
ter-gated ion channels, mediating excitatory or inhibitory signaling that is 
generally immediate, simple, and brief. 


2. Metabotropic receptors are G-protein-coupled receptors (discussed in 
Chapter 15) that bind to all other neurotransmitters (and, confusingly, also 
acetylcholine, glutamate, and GABA). Signaling mediated by ligand-bind- 
ing to metabotropic receptors tends to be far slower and more complex 
than that at ionotropic receptors, and longer-lasting in its consequences. 


The Acetylcholine Receptors at the Neuromuscular Junction Are 
Excitatory Transmitter-Gated Cation Channels 


A well-studied example of a transmitter-gated ion channel is the acetylcholine 
receptor of skeletal muscle cells. This channel is opened transiently by acetylcho- 
line released from the nerve terminal at a neuromuscular junction—the special- 
ized chemical synapse between a motor neuron and a skeletal muscle cell (Figure 
11-37). This synapse has been intensively investigated because it is readily accessible
to electrophysiological study, unlike most of the synapses in the central nervous
system, that is, the brain and spinal cord in vertebrates. Moreover, the acetylcholine
receptors are densely packed in the muscle cell plasma membrane at
a neuromuscular junction (about 20,000 such receptors per μm^2), with relatively
few receptors elsewhere in the same membrane. 

The receptors are composed of five transmembrane polypeptides, two of one 
kind and three others, encoded by four separate genes (Figure 11-38A). The four 
genes are strikingly similar in sequence, implying that they evolved from a single 
ancestral gene. The two identical polypeptides in the pentamer each contribute 
one acetylcholine-binding site. When two acetylcholine molecules bind to the 
pentameric complex, they induce a conformational change that opens the chan- 
nel. With ligand bound, the channel still flickers between open and closed states, 
but now it has a 90% probability of being open. This state continues—with ace- 
tylcholine binding and unbinding—until hydrolysis of the free acetylcholine by 
the enzyme acetylcholinesterase lowers its concentration at the neuromuscular 
junction sufficiently. Once freed of its bound neurotransmitter, the acetylcholine 
receptor reverts to its initial resting state. If the presence of acetylcholine persists 
for a prolonged time as a result of excessive nerve stimulation, the channel inac- 
tivates. Normally, the acetylcholine is rapidly hydrolyzed and the channel closes 
within about 1 millisecond, well before significant desensitization occurs. Desen- 
sitization would occur after about 20 milliseconds in the continued presence of 
acetylcholine. 

Figure 11-37 A low-magnification scanning electron micrograph of a neuromuscular
junction in a frog. The termination of a single axon on a skeletal muscle cell is
shown. (From J. Desaki and Y. Uehara, J. Neurocytol. 10:101-110, 1981. With
permission from Kluwer Academic Publishers.)

The five subunits of the acetylcholine receptor are arranged in a ring, forming
a water-filled transmembrane channel that consists of a narrow pore through
the lipid bilayer, which widens into vestibules at both ends. Acetylcholine binding
opens the channel by causing the helices that line the pore to rotate outward, thus
disrupting a ring of hydrophobic amino acids that blocks ion flow in the closed
state. Clusters of negatively charged amino acids at either end of the pore help to
exclude negative ions and encourage any positive ion of diameter less than 0.65
nm to pass through (Figure 11-38B). The normal through-traffic consists chiefly
of Na⁺ and K⁺, together with some Ca²⁺. Thus, unlike voltage-gated cation
channels, such as the K⁺ channel discussed earlier, there is little selectivity among
cations, and the relative contributions of the different cations to the current through
the channel depend chiefly on their concentrations and on the electrochemical
driving forces. When the muscle cell membrane is at its resting potential, the net
driving force for K⁺ is near zero, since the voltage gradient nearly balances the K⁺
concentration gradient across the membrane (see Panel 11-1, p. 616). For Na⁺,
in contrast, the voltage gradient and the concentration gradient both act in the
same direction to drive the ion into the cell. (The same is true for Ca²⁺, but the
extracellular concentration of Ca²⁺ is so much lower than that of Na⁺ that Ca²⁺
makes only a small contribution to the total inward current.) Therefore, the
opening of the acetylcholine-receptor channels leads to a large net influx of Na⁺ (a
peak rate of about 30,000 ions per channel each millisecond). This influx causes a
membrane depolarization that signals the muscle to contract, as discussed below.
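The driving-force argument above can be made concrete with a short calculation. The sketch below uses the Nernst equation to estimate the equilibrium potentials of K⁺ and Na⁺ and then the net driving force on each ion at rest. The ion concentrations, the resting potential, and the temperature are assumed, typical illustrative values and are not taken from this section; the helper names are hypothetical. It also converts the quoted peak flux of 30,000 ions per millisecond into an approximate single-channel current.

```python
import math

R = 8.314      # gas constant, J/(mol*K)
F = 96485.0    # Faraday constant, C/mol
T = 310.0      # assumed temperature in kelvin (37 degrees C)
ELEMENTARY_CHARGE = 1.602e-19  # coulombs per monovalent ion

# Assumed, illustrative values (not from the text):
# valence, extracellular mM, intracellular mM.
IONS = {
    "K+": (1, 5.0, 140.0),
    "Na+": (1, 145.0, 10.0),
}
V_REST_MV = -85.0  # assumed resting potential of a muscle cell, in mV


def nernst_mv(z, conc_out_mm, conc_in_mm):
    """Equilibrium (Nernst) potential in mV for an ion of valence z."""
    return 1000.0 * (R * T) / (z * F) * math.log(conc_out_mm / conc_in_mm)


def driving_force_mv(ion, v_mv=V_REST_MV):
    """Membrane potential minus equilibrium potential.

    A negative value means a net inward force on a positive ion.
    """
    z, conc_out, conc_in = IONS[ion]
    return v_mv - nernst_mv(z, conc_out, conc_in)


def ions_per_ms_to_pa(ions_per_ms):
    """Convert a flux of monovalent ions per millisecond to picoamperes."""
    amperes = ions_per_ms * 1000.0 * ELEMENTARY_CHARGE
    return amperes * 1e12


if __name__ == "__main__":
    for ion in IONS:
        print(ion, "driving force (mV):", round(driving_force_mv(ion), 1))
    print("30,000 ions/ms in pA:", round(ions_per_ms_to_pa(30000), 2))
```

Under these assumed values the driving force on K⁺ is only a few millivolts, whereas that on Na⁺ is large and inward, which is why Na⁺ dominates the current through the open channel.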


Neurons Contain Many Types of Transmitter-Gated Channels 


The ion channels that open directly in response to the neurotransmitters acetyl- 
choline, serotonin, GABA, and glycine contain subunits that are structurally sim- 
ilar and probably form transmembrane pores in the same way as the ionotropic 
acetylcholine receptor, even though they have distinct neurotransmitter-binding 
specificities and ion selectivities. These channels are all built from homologous 
polypeptide subunits, which assemble as a pentamer. Glutamate-gated ion channels
are an exception, in that they are constructed from a distinct family of subunits
and form tetramers resembling the K⁺ channels discussed earlier (see Figure
11-24A).

For each class of transmitter-gated ion channel, there are alternative forms of 
each type of subunit, which may be encoded by distinct genes or else generated 
by alternative RNA splicing of a single gene product. The subunits assemble in 
different combinations to form an extremely diverse set of distinct channel sub- 
types, with different ligand affinities, different channel conductances, different 
rates of opening and closing, and different sensitivities to drugs and toxins. Some 
vertebrate neurons, for example, have acetylcholine-gated ion channels that dif- 
fer from those of muscle cells in that they are formed from two subunits of one 
type and three of another; but there are at least nine genes coding for different 
versions of the first type of subunit and at least three coding for different versions 
of the second. Subsets of such neurons performing different functions in the brain 
express different combinations of the genes for these subunits. In principle, and 
already to some extent in practice, it is possible to design drugs targeted against 
these narrowly defined subsets, thereby specifically influencing particular brain 
functions. 
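To give a feel for how quickly subunit diversity multiplies, the sketch below counts the possible subunit compositions of a pentamer built from two subunits of a first type and three of a second, using the gene numbers quoted above (at least nine and at least three). It rests on two loud assumptions that the text does not establish: that any mixture of gene versions can co-assemble, and that the order of subunits around the ring is ignored. The function names are hypothetical, and the result is a rough illustration rather than a count of channels that actually exist.

```python
from math import comb


def multisets(n_kinds, k):
    """Number of ways to choose k items from n_kinds, repeats allowed."""
    return comb(n_kinds + k - 1, k)


def mixed_compositions(n_first=9, n_second=3):
    """Compositions when the 2 + 3 subunits may each be any gene version."""
    return multisets(n_first, 2) * multisets(n_second, 3)


def uniform_compositions(n_first=9, n_second=3):
    """Compositions when each type is represented by a single gene version."""
    return n_first * n_second


if __name__ == "__main__":
    print("one version per type:", uniform_compositions())
    print("any mixture of versions:", mixed_compositions())
```

Even the more restrictive count gives dozens of possible channel subtypes, which is consistent with the idea that subsets of neurons can be distinguished by the subunit genes they express.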


Many Psychoactive Drugs Act at Synapses 


Transmitter-gated ion channels have for a long time been important drug targets.
A surgeon, for example, can relax muscles for the duration of an operation
by blocking the acetylcholine receptors on skeletal muscle cells with curare, a
plant-derived drug that was originally used by South American Indians to make
poison arrows. Most drugs used to treat insomnia, anxiety, depression, and
schizophrenia exert their effects at chemical synapses, and many of these act
by binding to transmitter-gated channels. Barbiturates, tranquilizers such as
Valium, and sleeping pills such as Ambien, for example, bind to GABA receptors,
potentiating the inhibitory action of GABA by allowing lower concentrations of
this neurotransmitter to open Cl⁻ channels. Our increasing understanding of the
molecular biology of ion channels should allow us to design a new generation of
psychoactive drugs that will act still more selectively to alleviate the miseries of
mental illness.

Figure 11-38 A model for the structure of the skeletal muscle acetylcholine
receptor. (A) Five homologous subunits (α, α, β, γ, δ) combine to form a
transmembrane pore. Both of the α subunits contribute an acetylcholine-binding
site nestled between adjoining subunits. (B) The pore is lined by a ring of five
transmembrane α helices, one contributed by each subunit (just the two α subunits
are shown). In its closed conformation, the pore is occluded by the hydrophobic
side chains of five leucines (green), one from each α helix, which form a gate near
the middle of the lipid bilayer. When acetylcholine binds to both α subunits, the
channel undergoes a conformational change that opens the gate by an outward
rotation of the helices containing the occluding leucines. Negatively charged side
chains (indicated by the “−” signs) at either end of the pore ensure that only
positively charged ions pass through the channel. (PDB code: 2BGQ.)

In addition to ion channels, many other components of the synaptic signaling 
machinery are potential targets for psychoactive drugs. As mentioned earlier, after 
release into the synaptic cleft, many neurotransmitters are cleared by reuptake 
mechanisms mediated by Na⁺-driven symports. Inhibiting such transporters prolongs
the effect of the neurotransmitter, thereby strengthening synaptic transmission.
Many antidepressant drugs, including Prozac, inhibit the reuptake of serotonin;
others inhibit the reuptake of both serotonin and norepinephrine.

Ion channels are the basic molecular units from which neuronal devices for 
signaling and computation are built. To provide a glimpse of how sophisticated 
these devices can be, we consider several examples that demonstrate how the 
coordinated activities of groups of ion channels allow you to move, feel, and 
remember. 


Neuromuscular Transmission Involves the Sequential Activation of 
Five Different Sets of Ion Channels


The following process, in which a nerve impulse stimulates a muscle cell to con- 
tract, illustrates the importance of ion channels to electrically excitable cells. This 
apparently simple response requires the sequential activation of at least five dif- 
ferent sets of ion channels, all within a few milliseconds (Figure 11-39). 


Figure 11-39 The system of ion channels at a neuromuscular junction. These gated
ion channels are essential for the stimulation of muscle contraction by a nerve
impulse. The various channels are numbered in the sequence in which they are
activated, as described in the text.

1. The process is initiated when a nerve impulse reaches the nerve terminal
and depolarizes the plasma membrane of the terminal. The depolarization
transiently opens voltage-gated Ca²⁺ channels in this presynaptic membrane.
As the Ca²⁺ concentration outside cells is more than 1000 times
greater than the free Ca²⁺ concentration inside, Ca²⁺ flows into the nerve
terminal. The increase in Ca²⁺ concentration in the cytosol of the nerve
terminal triggers the local release of acetylcholine by exocytosis into the
synaptic cleft.


2. The released acetylcholine binds to acetylcholine receptors in the muscle
cell plasma membrane, transiently opening the cation channels associated
with them. The resulting influx of Na⁺ causes a local membrane depolarization.


3. The local depolarization opens voltage-gated Na⁺ channels in this membrane,
allowing more Na⁺ to enter, which further depolarizes the membrane.
This, in turn, opens neighboring voltage-gated Na⁺ channels and
results in a self-propagating depolarization (an action potential) that
spreads to involve the entire plasma membrane (see Figure 11-31).


4. The generalized depolarization of the muscle cell plasma membrane activates
voltage-gated Ca²⁺ channels in the transverse tubules (T tubules,
discussed in Chapter 16) of this membrane.


5. This in turn causes Ca²⁺-release channels in an adjacent region of the
sarcoplasmic reticulum (SR) membrane to open transiently and release Ca²⁺
stored in the SR into the cytosol. The T-tubule and SR membranes are
closely apposed with the two types of channel joined together in a specialized
structure, in which activation of the voltage-sensitive Ca²⁺ channel in
the T-tubule plasma membrane causes a channel conformational change
that is mechanically transmitted to the Ca²⁺-release channel in the SR
membrane, opening it and allowing Ca²⁺ to flow from the SR lumen into
the cytoplasm (see Figure 16-35). The sudden increase in the cytosolic Ca²⁺
concentration causes the myofibrils in the muscle cell to contract.


Whereas the initiation of muscle contraction by a motor neuron is complex, an 
even more sophisticated interplay of ion channels is required for a neuron to inte- 
grate a large number of input signals at its synapses and compute an appropriate 
output, as we now discuss. 


Single Neurons Are Complex Computation Devices 


In the central nervous system, a single neuron can receive inputs from thousands 
of other neurons, and it can in turn form synapses with many thousands of other 
cells. Several thousand nerve terminals, for example, make synapses on an aver- 
age motor neuron in the spinal cord, almost completely covering its cell body and 
dendrites (Figure 11-40). Some of these synapses transmit signals from the brain 
or spinal cord; others bring sensory information from muscles or from the skin. 
The motor neuron must combine the information received from all these sources 
and react, either by firing action potentials along its axon or by remaining quiet. 
Of the many synapses on a neuron, some tend to excite it, while others inhibit 
it. Neurotransmitter released at an excitatory synapse causes a small depolariza- 
tion in the postsynaptic membrane called an excitatory postsynaptic potential 
(excitatory PSP), whereas neurotransmitter released at an inhibitory synapse 
generally causes a small hyperpolarization called an inhibitory PSP. The plasma 
membrane of the dendrites and cell body of most neurons contains a relatively 
low density of voltage-gated Na⁺ channels, and so an individual excitatory PSP is
generally too small to trigger an action potential. Instead, each incoming signal 
initiates a local PSP, which decreases with distance from the site of the synapse. If 
signals arrive simultaneously at several synapses in the same region of the den- 
dritic tree, the total PSP in that neighborhood will be roughly the sum of the indi- 
vidual PSPs, with inhibitory PSPs making a negative contribution to the total. The 
PSPs from each neighborhood spread passively and converge on the cell body. 
For long-distance transmission, the combined magnitude of the PSP is then trans- 
lated, or encoded, into the frequency of firing of action potentials: the greater the 
stimulation (depolarization), the higher the frequency of action potentials. 
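The summation and encoding just described can be sketched in a few lines. The example below is a minimal illustration, not a model of any real neuron: it assumes that each PSP decays exponentially with distance from its synapse (a common simplification for passive spread), that the PSPs add linearly, and that the firing frequency rises linearly above a threshold. The length constant, threshold, and gain are hypothetical values chosen only for illustration.

```python
import math

LENGTH_CONSTANT_UM = 200.0  # hypothetical decay length for passive spread


def psp_at_cell_body(amplitude_mv, distance_um):
    """PSP amplitude remaining after passive spread over distance_um.

    Excitatory PSPs have positive amplitudes, inhibitory PSPs negative ones.
    """
    return amplitude_mv * math.exp(-distance_um / LENGTH_CONSTANT_UM)


def combined_psp(inputs):
    """Sum the decayed PSPs; inputs is a list of (amplitude_mv, distance_um)."""
    return sum(psp_at_cell_body(a, d) for a, d in inputs)


def firing_rate_hz(psp_mv, threshold_mv=10.0, gain_hz_per_mv=5.0):
    """Encode the combined PSP as a firing frequency (zero below threshold)."""
    return max(0.0, gain_hz_per_mv * (psp_mv - threshold_mv))


if __name__ == "__main__":
    excitatory = [(6.0, 50.0), (6.0, 100.0), (6.0, 150.0)]
    inhibitory = [(-4.0, 20.0)]
    for label, inputs in (("excitatory only", excitatory),
                          ("with inhibition", excitatory + inhibitory)):
        total = combined_psp(inputs)
        print(label, round(total, 2), "mV ->", round(firing_rate_hz(total), 1), "Hz")
```

Adding an inhibitory input lowers the combined PSP and therefore the firing frequency, which is the sense in which inhibitory PSPs make a negative contribution to the total.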
Neuronal Computation Requires a Combination of at Least Three
Kinds of K⁺ Channels


The intensity of stimulation that a neuron receives is encoded by that neuron into 
action potential frequency for long-distance transmission. The encoding takes 
place at a specialized region of the axonal membrane known as the initial seg- 
ment, or axon hillock, at the junction of the axon and the cell body (see Figure 
11-40). This membrane is rich in voltage-gated Na⁺ channels; but it also contains
at least four other classes of ion channels (three selective for K⁺ and one selective
for Ca²⁺), all of which contribute to the axon hillock’s encoding function. The
three varieties of K⁺ channels have different properties; we shall refer to them as
delayed, rapidly inactivating, and Ca²⁺-activated K⁺ channels.

To understand the need for multiple types of channels, consider first what 
would happen if the only voltage-gated ion channels present in the nerve cell 
were the Na⁺ channels. Below a certain threshold level of synaptic stimulation,
the depolarization of the initial-segment membrane would be insufficient to
trigger an action potential. With gradually increasing stimulation, the threshold
would be crossed, the Na⁺ channels would open, and an action potential would
fire. The action potential would be terminated by inactivation of the Na⁺ channels.
Before another action potential could fire, these channels would have to recover 
from their inactivation. But that would require a return of the membrane voltage 
to a very negative value, which would not occur as long as the strong depolariz- 
ing stimulus (from PSPs) was maintained. An additional channel type is needed, 
therefore, to repolarize the membrane after each action potential to prepare the 
cell to fire again. 

The delayed K⁺ channels perform this task, as discussed previously in relation
to the propagation of the action potential (see Figure 11-31). They are
voltage-gated, but because of their slower kinetics they open only during the falling
phase of the action potential, when the Na⁺ channels are inactive. Their opening
permits an efflux of K⁺ that drives the membrane back toward the K⁺ equilibrium
potential, which is so negative that the Na⁺ channels rapidly recover from their
inactivated state. Repolarization of the membrane also closes the delayed K⁺
channels. The initial segment is now reset so that the depolarizing stimulus from
synaptic inputs can fire another action potential. In this way, sustained stimulation
of the dendrites and cell body leads to repetitive firing of the axon.

Figure 11-40 A motor neuron in the spinal cord. (A) Many thousands of nerve
terminals synapse on the cell body and dendrites. These deliver signals from other
parts of the organism to control the firing of action potentials along the single axon
of this large cell. (B) Fluorescence micrograph showing a nerve cell body and its
dendrites stained with a fluorescent antibody that recognizes a cytoskeletal protein
(green) that is not present in axons. Thousands of axon terminals (red) from other
nerve cells (not visible) make synapses on the cell body and dendrites; the terminals
are stained with a fluorescent antibody that recognizes a protein in synaptic vesicles.
(B, courtesy of Olaf Mundigl and Pietro De Camilli.)

Repetitive firing in itself, however, is not enough. The frequency of firing has 
to reflect the intensity of stimulation, and a simple system of Na⁺ channels and
delayed K⁺ channels is inadequate for this purpose. Below a certain threshold
level of steady stimulation, the cell will not fire at all; above that threshold level,
it will abruptly begin to fire at a relatively rapid rate. The rapidly inactivating K⁺
channels solve the problem. These, too, are voltage-gated and open when the 
membrane is depolarized, but their specific voltage sensitivity and kinetics of 
inactivation are such that they act to reduce the rate of firing at levels of stimula- 
tion that are only just above the threshold required for firing. Thus, they remove 
the discontinuity in the relationship between the firing rate and the intensity of 
stimulation. The result is a firing rate that is proportional to the strength of the 
depolarizing stimulus over a very broad range (Figure 11-41). 

The process of encoding is usually further modulated by the two other types 
of ion channels in the initial segment that were mentioned earlier—voltage-gated 
Ca²⁺ channels and Ca²⁺-activated K⁺ channels. They act together to decrease the
response of the cell to an unchanging, prolonged stimulation, a process called
adaptation. These Ca²⁺ channels are similar to the Ca²⁺ channels that mediate the
release of neurotransmitter from presynaptic axon terminals; they open when an
action potential fires, transiently allowing Ca²⁺ into the axon cytosol at the initial
segment. 

The Ca²⁺-activated K⁺ channel opens in response to a raised concentration
of Ca²⁺ at the channel’s cytoplasmic face (Figure 11-42). Prolonged, strong
depolarizing stimuli will trigger a long train of action potentials, each of which permits
a brief influx of Ca²⁺ through the voltage-gated Ca²⁺ channels, so that local
cytosolic Ca²⁺ concentration gradually builds up to a level high enough to open the
Ca²⁺-activated K⁺ channels. Because the resulting increased permeability of the
membrane to K⁺ makes the membrane harder to depolarize, the delay between
one action potential and the next is increased. In this way, a neuron that is stim- 
ulated continuously for a prolonged period becomes gradually less responsive to 
the constant stimulus. 
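The lengthening delay between action potentials can be illustrated with a deliberately simplified simulation. The sketch below is a leaky integrate-and-fire model, which is a standard abstraction and not the mechanism described in the text in molecular detail: the Na⁺ and delayed K⁺ channels are collapsed into a threshold and a reset, and the Ca²⁺-activated K⁺ channels are represented by a single adaptation variable that grows with each spike and decays slowly. All parameter values and names are hypothetical. The rapidly inactivating K⁺ channels are not modeled.

```python
def simulate(stimulus_mv, t_end_ms=500.0, dt=0.1, adaptation_step=0.0):
    """Return spike times (ms) for a constant depolarizing stimulus.

    adaptation_step is the increment (in mV of opposing drive) added after
    each spike, standing in for Ca2+ entry opening Ca2+-activated K+ channels.
    """
    tau_m = 10.0     # membrane time constant, ms (hypothetical)
    tau_ca = 200.0   # decay time of the adaptation variable, ms (hypothetical)
    v_rest, v_thresh, v_reset = -70.0, -50.0, -75.0  # mV (hypothetical)

    v = v_rest       # membrane potential
    w = 0.0          # adaptation variable (hyperpolarizing drive)
    spikes = []
    for i in range(int(t_end_ms / dt)):
        # Leak toward rest, pushed up by the stimulus and down by adaptation.
        v += dt * (-(v - v_rest) + stimulus_mv - w) / tau_m
        w -= dt * w / tau_ca
        if v >= v_thresh:
            spikes.append(i * dt)
            v = v_reset           # repolarization by the delayed K+ channels
            w += adaptation_step  # each spike adds to the K+ permeability
    return spikes


def intervals(spikes):
    """Delays between successive action potentials."""
    return [b - a for a, b in zip(spikes, spikes[1:])]


if __name__ == "__main__":
    plain = intervals(simulate(30.0))
    adapting = intervals(simulate(30.0, adaptation_step=2.0))
    print("no adaptation, first and last interval:", plain[0], plain[-1])
    print("adaptation, first and last interval:", adapting[0], adapting[-1])
```

With the adaptation term switched on, the same constant stimulus produces progressively longer intervals, which mirrors the gradual loss of responsiveness described above.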

Such adaptation, which can also occur by other mechanisms, allows a neu- 
ron—indeed, the nervous system generally—to react sensitively to change, even 
against a high background level of steady stimulation. It is one of the computa- 
tional strategies that help us, for example, to feel a light touch on the shoulder 
and yet ignore the constant pressure of our clothing. We discuss adaptation as a 
general feature in cell signaling processes in more detail in Chapter 15. 

Other neurons do different computations, reacting to their synaptic inputs in
myriad ways, reflecting the different assortments of ion channels in their membrane.
There are several hundred genes that code for ion channels in the human
genome, with over 150 encoding voltage-gated channels alone. Further complexity
is introduced by alternative splicing of RNA transcripts and assembling channel
subunits in different combinations. Moreover, ion channels are selectively
localized to different sites in the plasma membrane of a neuron. Some K⁺ and
Ca²⁺ channels are concentrated in the dendrites and participate in processing the
input that a neuron receives. As we have seen, other ion channels are located at
the axon’s initial segment, where they control action potential firing; and some
ligand-gated channels are distributed over the cell body and, depending on their
ligand occupancy, modulate the cell’s general sensitivity to synaptic inputs. The
multiplicity of ion channels and their locations evidently allows each of the many
types of neurons to tune the electrical behavior to the particular tasks they perform.

Figure 11-41 The magnitude of the combined postsynaptic potential (PSP) is
reflected in the frequency of firing of action potentials. The mix of excitatory
and inhibitory PSPs produces a combined PSP at the initial segment. A comparison of
(A) and (B) shows how the firing frequency of an axon increases with an increase in
the combined PSP, while (C) summarizes the general relationship.

One of the crucial properties of the nervous system is its ability to learn and 
remember. This property depends in part on the ability of individual synapses to 
strengthen or weaken depending on their use—a process called synaptic plas- 
ticity. We end this chapter by considering a remarkable type of ion channel that 
has a special role in some forms of synaptic plasticity. It is located at many excit- 
atory synapses in the central nervous system, where it is gated by both voltage 
and the excitatory neurotransmitter glutamate. It is also the site of action of the 
psychoactive drug phencyclidine, or angel dust. 


Long-Term Potentiation (LTP) in the Mammalian Hippocampus 
Depends on Ca²⁺ Entry Through NMDA-Receptor Channels


Practically all animals can learn, but mammals seem to learn exceptionally well 
(or so we like to think). In a mammal’s brain, the region called the hippocampus 
has a special role in learning. When it is destroyed on both sides of the brain, the 
ability to form new memories is largely lost, although previous long-established 
memories remain. Some synapses in the hippocampus show a striking form of 
synaptic plasticity with repeated use: whereas occasional single action poten- 
tials in the presynaptic cells leave no lasting trace, a short burst of repetitive firing 
causes long-term potentiation (LTP), such that subsequent single action poten- 
tials in the presynaptic cells evoke a greatly enhanced response in the postsynaptic 
cells. The effect lasts hours, days, or weeks, according to the number and intensity 
of the bursts of repetitive firing. Only the synapses that were activated exhibit LTP; 
synapses that have remained quiet on the same postsynaptic cell are not affected. 
However, while the cell is receiving a burst of repetitive stimulation via one set of 
synapses, if a single action potential is delivered at another synapse on its surface, 
that latter synapse also will undergo LTP, even though a single action potential 
delivered there at another time would leave no such lasting trace. 

Figure 11-42 Structure of a Ca²⁺-activated K⁺ channel. The channel contains four
identical subunits (which are shown in different colors for clarity). It is both
voltage- and Ca²⁺-gated. The structure shown is a composite of the cytosolic and
membrane portions of the channel that were separately crystallized. (PDB codes:
2R99, 1LNQ.)

Figure 11-43 The structure of the AMPA receptor. This ionotropic glutamate
receptor (named after the glutamate analog α-Amino 3-hydroxy 5-Methyl 4-isoxazole
Propionic Acid) is the most common mediator of fast, excitatory synaptic
transmission in the central nervous system (CNS). (PDB code: 3KG2.)

Figure 11-44 (panel text) Glutamate released by an activated presynaptic nerve
terminal opens AMPA-receptor channels, allowing Na⁺ influx that depolarizes the
postsynaptic membrane. Depolarization removes the Mg²⁺ block from the
NMDA-receptor channel, which (with glutamate bound) allows Ca²⁺ to enter the cell.

The underlying rule in such events seems to be that LTP occurs on any occasion
when a presynaptic cell fires (once or more) at a time when the postsynaptic
membrane is strongly depolarized (either through recent repetitive firing of the
same presynaptic cell or by other means). This rule reflects the behavior of a
particular class of ion channels in the postsynaptic membrane. Glutamate is the main
excitatory neurotransmitter in the mammalian central nervous system, and
glutamate-gated ion channels are the most common of all transmitter-gated channels
in the brain. In the hippocampus, as elsewhere, most of the depolarizing current
responsible for excitatory PSPs is carried by glutamate-gated ion channels called
AMPA receptors, which operate in the standard way (Figure 11-43). But the current
has, in addition, a second and more intriguing component, which is mediated
by a separate subclass of glutamate-gated ion channels known as NMDA receptors,
so named because they are selectively activated by the artificial glutamate
analog N-methyl-D-aspartate. The NMDA-receptor channels are doubly gated,
opening only when two conditions are satisfied simultaneously: glutamate must
be bound to the receptor, and the membrane must be strongly depolarized. The
second condition is required for releasing the Mg²⁺ that normally blocks the resting
channel. This means that NMDA receptors are normally activated only when
AMPA receptors are activated as well and depolarize the membrane. The NMDA
receptors are critical for LTP. When they are selectively blocked with a specific
inhibitor or inactivated genetically, LTP does not occur, even though ordinary
synaptic transmission continues, indicating the importance of NMDA receptors
for LTP induction. Such animals exhibit specific deficits in their learning abilities
but behave almost normally otherwise.
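The double gating of the NMDA-receptor channel amounts to a logical AND of two conditions, which is what lets it detect coincident presynaptic firing and postsynaptic depolarization. The sketch below is a minimal illustration of that logic only. The voltage at which the Mg²⁺ block is relieved is a hypothetical cutoff chosen for the example (in reality the relief is graded with voltage), and the function names are not from the text.

```python
MG_UNBLOCK_MV = -40.0  # hypothetical potential above which Mg2+ no longer blocks


def ampa_open(glutamate_bound):
    """AMPA receptors open whenever glutamate is bound."""
    return glutamate_bound


def nmda_open(glutamate_bound, membrane_mv):
    """NMDA receptors open only if glutamate is bound AND Mg2+ is expelled."""
    mg_block_removed = membrane_mv >= MG_UNBLOCK_MV
    return glutamate_bound and mg_block_removed


def ltp_triggered(presynaptic_fired, membrane_mv):
    """LTP rule: presynaptic firing while the postsynaptic cell is depolarized."""
    return nmda_open(presynaptic_fired, membrane_mv)


if __name__ == "__main__":
    cases = [
        ("single input at rest", True, -70.0),
        ("input during strong depolarization", True, -20.0),
        ("depolarization without input", False, -20.0),
    ]
    for label, fired, v_mv in cases:
        print(label, "-> LTP" if ltp_triggered(fired, v_mv) else "-> no LTP")
```

The second case corresponds to the observation that a single action potential at an otherwise quiet synapse can produce LTP if it arrives while the cell is being strongly depolarized through other synapses.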

How do NMDA receptors mediate LTP? The answer is that these channels, 
when open, are highly permeable to Ca²⁺, which acts as an intracellular signal in
the postsynaptic cell, triggering a cascade of changes that are responsible for LTP.
Thus, LTP is prevented when Ca²⁺ levels are held artificially low in the postsynaptic
cell by injecting the Ca²⁺ chelator EGTA into it, and LTP can be induced by
artificially raising intracellular Ca²⁺ levels in the cell. Among the long-term changes
that increase the sensitivity of the postsynaptic cell to glutamate is the insertion of 
new AMPA receptors into the plasma membrane (Figure 11-44). In some forms 
of LTP, changes occur in the presynaptic cell as well, so that it releases more gluta- 
mate than normal when it is activated subsequently. 

If synapses were capable only of LTP they would quickly become saturated, 
and thus be of limited value as an information-storage device. In fact, they also 
exhibit long-term depression (LTD), with the long-term effect of reducing the 
number of AMPA receptors in the postsynaptic membrane. This feat is accomplished
by degrading AMPA receptors after their selective endocytosis. Surprisingly,
LTD also requires NMDA receptor activation and a rise in Ca²⁺. How does
Ca²⁺ trigger opposite effects at the same synapse? It turns out that this bidirectional
control of synaptic strength depends on the magnitude of the rise in Ca²⁺:
high Ca²⁺ levels activate protein kinases and LTP, whereas modest Ca²⁺ levels
activate protein phosphatases and LTD.
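The bidirectional rule can be written as a simple two-threshold function. The sketch below only illustrates the logic stated above: the numerical thresholds, the receptor counts, and the step size are hypothetical placeholders, since the text gives no quantitative values.

```python
LTD_THRESHOLD_UM = 0.2  # hypothetical: modest Ca2+ rise that engages phosphatases
LTP_THRESHOLD_UM = 1.0  # hypothetical: high Ca2+ rise that engages kinases


def plasticity_outcome(ca_rise_um):
    """Classify the effect of a postsynaptic Ca2+ rise on synaptic strength."""
    if ca_rise_um >= LTP_THRESHOLD_UM:
        return "LTP"        # protein kinases dominate
    if ca_rise_um >= LTD_THRESHOLD_UM:
        return "LTD"        # protein phosphatases dominate
    return "no change"


def update_ampa_receptors(count, ca_rise_um, step=10):
    """Insert or remove AMPA receptors according to the outcome."""
    outcome = plasticity_outcome(ca_rise_um)
    if outcome == "LTP":
        return count + step
    if outcome == "LTD":
        return max(0, count - step)
    return count


if __name__ == "__main__":
    for rise in (0.05, 0.5, 2.0):
        print(rise, plasticity_outcome(rise), update_ampa_receptors(100, rise))
```

Because the same messenger produces opposite outcomes depending on its level, a synapse can be both strengthened and weakened and so avoids the saturation problem raised at the start of the paragraph.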

There is evidence that NMDA receptors have an important role in synaptic 
plasticity and learning in other parts of the brain, as well as in the hippocampus. 
Moreover, they have a crucial role in adjusting the anatomical pattern of synap- 
tic connections in the light of experience during the development of the nervous 
system. 

Thus, neurotransmitters released at synapses, besides relaying transient elec- 
trical signals, can also alter concentrations of intracellular mediators that bring 
about lasting changes in the efficacy of synaptic transmission. However, it is still 
uncertain how these changes endure for weeks, months, or a lifetime in the face 
of the normal turnover of cell constituents. 


Summary 


Ion channels form aqueous pores across the lipid bilayer and allow inorganic ions 
of appropriate size and charge to cross the membrane down their electrochemi- 
cal gradients at rates about 1000 times greater than those achieved by any known 
transporter. The channels are “gated” and usually open transiently in response to 
a specific perturbation in the membrane, such as a change in membrane poten- 
tial (voltage-gated channels), or the binding of a neurotransmitter to the channel 
(transmitter-gated channels). 

K⁺-selective leak channels have an important role in determining the resting
membrane potential across the plasma membrane in most animal cells.
Voltage-gated cation channels are responsible for the amplification and propagation
of action potentials in electrically excitable cells, such as neurons and skeletal
muscle cells. Transmitter-gated ion channels convert chemical signals to electrical
signals at chemical synapses. Excitatory neurotransmitters, such as acetylcholine
and glutamate, open transmitter-gated cation channels and thereby depolarize the
postsynaptic membrane toward the threshold level for firing an action potential.
Inhibitory neurotransmitters, such as GABA and glycine, open transmitter-gated
Cl⁻ or K⁺ channels and thereby suppress firing by keeping the postsynaptic membrane
polarized. A subclass of glutamate-gated ion channels, called NMDA-receptor
channels, is highly permeable to Ca²⁺, which can trigger the long-term changes
in synapse efficacy (synaptic plasticity) such as LTP and LTD that are thought to be
involved in some forms of learning and memory.

Figure 11-44 The signaling events in long-term potentiation. Although not shown,
transmission-enhancing changes can also occur in the presynaptic nerve terminals in
LTP, which may be induced by retrograde signals from the postsynaptic cell.

Ion channels work together in complex ways to control the behavior of electri- 
cally excitable cells. A typical neuron, for example, receives thousands of excitatory 
and inhibitory inputs, which combine by spatial and temporal summation to pro- 
duce a combined postsynaptic potential (PSP) at the initial segment of its axon. The 
magnitude of the PSP is translated into the rate of firing of action potentials by a 
mixture of cation channels in the initial segment membrane. 


WHAT WE DON’T KNOW 


• How do individual neurons establish
and maintain their characteristic
intrinsic firing properties?


• Even organisms with very simple
nervous systems have dozens of
different K⁺ channels. Why is it
important to have so many?


• Why do cells that are not electrically
active contain voltage-gated ion
channels?


• How are memories stored for so
many years in the human brain?


PROBLEMS 


Which statements are true? Explain why or why not. 


11-1 Transport by transporters can be either active or 
passive, whereas transport by channels is always passive. 


11-2 Transporters saturate at high concentrations of 
the transported molecule when all their binding sites are 
occupied; channels, on the other hand, do not bind the 
ions they transport and thus the flux of ions through a 
channel does not saturate. 


11-3 The membrane potential arises from movements 
of charge that leave ion concentrations practically unaf- 
fected, causing only a very slight discrepancy in the num- 
ber of positive and negative ions on the two sides of the 
membrane. 


Discuss the following problems. 


11-4 Order Ca²⁺, CO₂, ethanol, glucose, RNA, and H₂O
according to their ability to diffuse through a lipid bilayer, 
beginning with the one that crosses the bilayer most read- 
ily. Explain your order. 


11-5 How is it possible for some molecules to be at 
equilibrium across a biological membrane and yet not be 
at the same concentration on both sides? 


11-6 Ion transporters are “linked” together—not physi- 
cally, but as a consequence of their actions. For example, 
cells can raise their intracellular pH, when it becomes too 
acidic, by exchanging external Na⁺ for internal H⁺, using
a Na⁺-H⁺ antiporter. The change in internal Na⁺ is then
redressed using the Na⁺-K⁺ pump.

A. Can these two transporters, operating together,
normalize both the H⁺ and the Na⁺ concentrations inside
the cell?


B. Does the linked action of these two pumps cause
imbalances in either the K⁺ concentration or the membrane
potential? Why or why not?


11-7 Microvilli increase the surface area of intestinal 
cells, providing more efficient absorption of nutrients. 
Microvilli are shown in profile and cross section in Figure 
Q11-1. From the dimensions given in the figure, estimate 
the increase in surface area that microvilli provide (for 
the portion of the plasma membrane in contact with the 
lumen of the gut) relative to the corresponding surface of a 
cell with a “flat” plasma membrane. 





Figure Q11-1 Microvilli of intestinal epithelial cells in profile and cross 
section (Problem 11-7). (Left panel, from Rippel Electron Microscope 
Facility, Dartmouth College; right panel, from David Burgess.) 


11-8 According to Newton’s laws of motion, an ion 
exposed to an electric field in a vacuum would experience 
a constant acceleration from the electric driving force, just 
as a falling body in a vacuum constantly accelerates due to 
gravity. In water, however, an ion moves at constant veloc- 
ity in an electric field. Why do you suppose that is? 




Figure Q11-2 A “ball” tethered by a “chain” to a voltage-gated K+ 
channel (Problem 11-9). The chain is labeled with a length of 21.4 nm. 


11-9 In a subset of voltage-gated K+ channels, the 
N-terminus of each subunit acts like a tethered ball that 
occludes the cytoplasmic end of the pore soon after it 
opens, thereby inactivating the channel. This “ball-and- 
chain” model for the rapid inactivation of voltage-gated 
K+ channels has been elegantly supported for the shaker 
K+ channel from Drosophila melanogaster. (The shaker 
K+ channel in Drosophila is named after a mutant form 
that causes excitable behavior—even anesthetized flies 
keep twitching.) Deletion of the N-terminal amino acids 
from the normal shaker channel gives rise to a channel 
that opens in response to membrane depolarization, but 
stays open instead of rapidly closing as the normal chan- 
nel does. A peptide (MAAVAGLYGLGEDRQHRKKQ) that 
corresponds to the deleted N-terminus can inactivate the 
open channel at 100 µM. 

Is the concentration of free peptide (100 µM) that 
is required to inactivate the defective K+ channel anywhere 
near the local concentration of the tethered ball on a nor- 
mal channel? Assume that the tethered ball can explore a 
hemisphere [volume = (2/3)πr³] with a radius of 21.4 nm, 
which is the length of the polypeptide “chain” (Figure 
Q11-2). Calculate the concentration for one ball in this 
hemisphere. How does that value compare with the con- 
centration of free peptide needed to inactivate the chan- 
nel? 
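
The calculation asked for in Problem 11-9 can be set up as a short numerical sketch. Only the radius (21.4 nm) and the hemisphere formula come from the problem; the function name, the rounded value of Avogadro's number, and the unit conversions are our own illustrative choices.

```python
import math

AVOGADRO = 6.022e23  # molecules per mole (rounded)


def tethered_ball_concentration_uM(radius_nm: float) -> float:
    """Concentration (in µM) of one molecule confined to a hemisphere of the given radius."""
    volume_nm3 = (2.0 / 3.0) * math.pi * radius_nm ** 3  # hemisphere volume, (2/3)πr³
    volume_liters = volume_nm3 * 1e-24  # 1 nm³ = 1e-27 m³ = 1e-24 L
    moles = 1.0 / AVOGADRO              # one molecule, expressed in moles
    return moles / volume_liters * 1e6  # mol/L -> µmol/L


if __name__ == "__main__":
    print(f"{tethered_ball_concentration_uM(21.4):.0f} µM")
```

The printed value can be compared directly with the 100 µM of free peptide quoted in the problem.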


11-10 The giant axon of the squid (Figure Q11-3) occu- 
pies a unique position in the history of our understanding 
of cell membrane potentials and nerve action. When an 
electrode is stuck into an intact giant axon, the membrane 
potential registers -70 mV. When the axon, suspended in a 
bath of seawater, is stimulated to conduct a nerve impulse, 
the membrane potential changes transiently from -70 mV 
to +40 mV. 





Figure Q11-3 The squid Loligo (Problem 11-10). This squid is about 
15 cm in length. 
TABLE Q11-1 





For univalent ions and at 20°C (293 K), the Nernst equation 
reduces to 


V = 58 mV × log (C_o/C_i) 


where C_o and C_i are the concentrations outside and inside, 
respectively. 

Using this equation, calculate the potential across 
the resting membrane (1) assuming that it is due solely to 
K+ and (2) assuming that it is due solely to Na+. (The Na+ 
and K+ concentrations in the axon cytosol and in seawa- 
ter are given in Table Q11-1.) Which calculation is closer 
to the measured resting potential? Which calculation is 
closer to the measured action potential? Explain why these 
assumptions approximate the measured resting and action 
potentials. 
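
The simplified Nernst equation quoted in this problem is easy to evaluate in a few lines. The sketch below assumes a univalent cation at 20°C, as the problem does. The concentrations in the example are placeholders chosen only to show the call; they are not the values from Table Q11-1, which should be substituted to answer the problem.

```python
import math


def nernst_mv(c_out: float, c_in: float) -> float:
    """Equilibrium potential in mV for a univalent cation at 20°C: 58 mV x log10(Co/Ci)."""
    return 58.0 * math.log10(c_out / c_in)


if __name__ == "__main__":
    # Placeholder concentrations in mM, given as (outside, inside).
    # These are NOT the Table Q11-1 values.
    example_ions = {"K+": (20.0, 400.0), "Na+": (440.0, 50.0)}
    for ion, (c_out, c_in) in example_ions.items():
        print(f"{ion}: {nernst_mv(c_out, c_in):+.1f} mV")
```

An ion that is more concentrated inside the cell gives a negative potential, and one that is more concentrated outside gives a positive potential. That is the pattern to look for when comparing the two calculations with the measured resting and action potentials.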


11-11 Acetylcholine-gated cation channels at the neu- 
romuscular junction open in response to acetylcholine 
released by the nerve terminal and allow Na+ ions to enter 
the muscle cell, which causes membrane depolarization 
and ultimately leads to muscle contraction. 

A.  Patch-clamp measurements show that young rat 
muscles have cation channels that respond to acetylcho- 
line (Figure Q11-4). How many kinds of channel are there? 
How can you tell? 

B. For each kind of channel, calculate the number of 
ions that enter in one millisecond. (One ampere is a cur- 
rent of one coulomb per second; one pA equals 10⁻¹² 
ampere. An ion with a single charge such as Na+ carries a 
charge of 1.6 × 10⁻¹⁹ coulomb.) 
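
The unit conversion in part B can be sketched as follows. The conversion factors are the ones given in the problem; the function name and the 4 pA example current are hypothetical, and the real single-channel amplitudes have to be read from Figure Q11-4.

```python
ELEMENTARY_CHARGE_C = 1.6e-19  # coulombs per singly charged ion (value given in the problem)


def ions_per_interval(current_pa: float, interval_ms: float = 1.0) -> float:
    """Number of singly charged ions carried by current_pa (pA) during interval_ms (ms)."""
    current_a = current_pa * 1e-12   # pA -> A, i.e. coulombs per second
    interval_s = interval_ms * 1e-3  # ms -> s
    return current_a * interval_s / ELEMENTARY_CHARGE_C


if __name__ == "__main__":
    # Hypothetical single-channel current of 4 pA flowing for one millisecond.
    print(round(ions_per_interval(4.0)))
```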


Figure Q11-4 Patch-clamp measurements of acetylcholine-gated 
cation channels in young rat muscle (Problem 11-11). Scale bars: 2 pA 
(current) and 40 msec (time). 
Intracellular Compartments 
and Protein Sorting 


Unlike a bacterium, which generally consists of a single intracellular compart- 
ment surrounded by a plasma membrane, a eukaryotic cell is elaborately sub- 
divided into functionally distinct, membrane-enclosed compartments. Each 
compartment, or organelle, contains its own characteristic set of enzymes and 
other specialized molecules, and complex distribution systems transport specific 
products from one compartment to another. To understand the eukaryotic cell, it 
is essential to know how the cell creates and maintains these compartments, what 
occurs in each of them, and how molecules move between them. 

Proteins confer upon each compartment its characteristic structural and 
functional properties. They catalyze the reactions that occur there and selectively 
transport small molecules into and out of the compartment. For membrane-en- 
closed organelles in the cytoplasm, proteins also serve as organelle-specific sur- 
face markers that direct new deliveries of proteins and lipids to the appropriate 
organelle. 

An animal cell contains about 10 billion (10¹⁰) protein molecules of perhaps 
10,000 kinds, and the synthesis of almost all of them begins in the cytosol, the 
space of the cytoplasm outside the membrane-enclosed organelles. Each newly 
synthesized protein is then delivered specifically to the organelle that requires it. 
The intracellular transport of proteins is the central theme of both this chapter 
and the next. By tracing the protein traffic from one compartment to another, one 
can begin to make sense of the otherwise bewildering maze of intracellular mem- 
branes. 


THE COMPARTMENTALIZATION OF CELLS 


In this brief overview of the compartments of the cell and the relationships 
between them, we organize the organelles conceptually into a small number of 
discrete families, discuss how proteins are directed to specific organelles, and 
explain how proteins cross organelle membranes. 


All Eukaryotic Cells Have the Same Basic Set of Membrane- 
enclosed Organelles 


Many vital biochemical processes take place in membranes or on their surfaces. 
Membrane-bound enzymes, for example, catalyze lipid metabolism; and oxida- 
tive phosphorylation and photosynthesis both require a membrane to couple the 
transport of H+ to the synthesis of ATP. In addition to providing increased mem- 
brane area to host biochemical reactions, intracellular membrane systems form 
enclosed compartments that are separate from the cytosol, thus creating function- 
ally specialized aqueous spaces within the cell. In these spaces, subsets of mol- 
ecules (proteins, reactants, ions) are concentrated to optimize the biochemical 
reactions in which they participate. Because the lipid bilayer of cell membranes is 
impermeable to most hydrophilic molecules, the membrane of an organelle must 
contain membrane transport proteins to import and export specific metabolites. 
Each organelle membrane must also have a mechanism for importing, and incor- 
porating into the organelle, the specific proteins that make the organelle unique. 

Figure 12-1 illustrates the major intracellular compartments common to 
eukaryotic cells. The nucleus contains the genome (aside from mitochondrial and 
chloroplast DNA), and it is the principal site of DNA and RNA synthesis. The sur- 
rounding cytoplasm consists of the cytosol and the cytoplasmic organelles sus- 
pended in it. The cytosol constitutes a little more than half the total volume of 
the cell, and it is the main site of protein synthesis and degradation. It also per- 
forms most of the cell’s intermediary metabolism—that is, the many reactions 
that degrade some small molecules and synthesize others to provide the building 
blocks for macromolecules (discussed in Chapter 2). 

About half the total area of membrane in a eukaryotic cell encloses the laby- 
rinthine spaces of the endoplasmic reticulum (ER). The rough ER has many ribo- 
somes bound to its cytosolic surface. Ribosomes are organelles that are not mem- 
brane-enclosed; they synthesize both soluble and integral membrane proteins, 
most of which are destined either for secretion to the cell exterior or for other 
organelles. We shall see that, whereas proteins are transported into other mem- 
brane-enclosed organelles only after their synthesis is complete, they are trans- 
ported into the ER as they are synthesized. This explains why the ER membrane is 
unique in having ribosomes tethered to it. The ER also produces most of the lipid 
for the rest of the cell and functions as a store for Ca2+ ions. Regions of the ER that 
lack bound ribosomes are called smooth ER. The ER sends many of its proteins 
and lipids to the Golgi apparatus, which often consists of organized stacks of disc- 
like compartments called Golgi cisternae. The Golgi apparatus receives lipids and 
proteins from the ER and dispatches them to various destinations, usually cova- 
lently modifying them en route. 

Mitochondria and chloroplasts generate most of the ATP that cells use to drive 
reactions requiring an input of free energy; chloroplasts are a specialized version 
of plastids (present in plants, algae, and some protozoa), which can also have 
other functions, such as the storage of food or pigment molecules. Lysosomes con- 
tain digestive enzymes that degrade defunct intracellular organelles, as well as 
macromolecules and particles taken in from outside the cell by endocytosis. On 
the way to lysosomes, endocytosed material must first pass through a series of 
organelles called endosomes. Finally, peroxisomes are small vesicular compart- 
ments that contain enzymes used in various oxidative reactions. 

In general, each membrane-enclosed organelle performs the same set of basic 
functions in all cell types. But to serve the specialized functions of cells, these 
organelles vary in abundance and can have additional properties that differ from 
cell type to cell type. 

On average, the membrane-enclosed compartments together occupy nearly 
half the volume of a cell (Table 12-1), and a large amount of intracellular mem- 
brane is required to make them. In liver and pancreatic cells, for example, the 


Figure 12-1 The major intracellular 
compartments of an animal cell. The 
cytosol (gray), endoplasmic reticulum, 
Golgi apparatus, nucleus, mitochondrion, 
endosome, lysosome, and peroxisome are 
distinct compartments isolated from the 
rest of the cell by at least one selectively 
permeable membrane (see Movie 9.2). 
endoplasmic reticulum has a total membrane surface area that is, respectively, 
25 times and 12 times that of the plasma membrane (Table 12-2). The mem- 
brane-enclosed organelles are packed tightly in the cytoplasm, and, in terms of 
area and mass, the plasma membrane is only a minor membrane in most eukary- 
otic cells (Figure 12-2). 

The abundance and shape of membrane-enclosed organelles are regulated to 
meet the needs of the cell. This is particularly apparent in cells that are highly spe- 
cialized and therefore disproportionately rely on specific organelles. Plasma cells, 
for example, which secrete their own weight every day in antibody molecules into 
the bloodstream, contain vastly amplified amounts of rough ER, which is found 
in large, flat sheets. Cells that specialize in lipid synthesis also expand their ER, 
but in this case the organelle forms a network of convoluted tubules. Moreover, 
membrane-enclosed organelles are often found in characteristic positions in the 
cytoplasm. In most cells, for example, the Golgi apparatus is located close to the 
nucleus, whereas the network of ER tubules extends from the nucleus throughout 
the entire cytosol. These characteristic distributions depend on interactions of the 
organelles with the cytoskeleton. The localization of both the ER and the Golgi 
apparatus, for instance, depends on an intact microtubule array; if the microtu- 
bules are experimentally depolymerized with a drug, the Golgi apparatus frag- 
ments and disperses throughout the cell, and the ER network collapses toward the 
cell center (discussed in Chapter 16). The size, shape, composition, and location 
are all important and regulated features of these organelles that ultimately con- 
tribute to the organelle’s function. 


Evolutionary Origins May Help Explain the Topological 
Relationships of Organelles 


To understand the relationships between the compartments of the cell, it is help- 
ful to consider how they might have evolved. The precursors of the first eukaryotic 
cells are thought to have been relatively simple cells that—like most bacterial and 


TABLE 12-2 


Mitochondria 
  Outer membrane 
  Inner membrane 


*These two cells are of very different sizes: the average hepatocyte has a volume of about 
5000 µm³ compared with 1000 µm³ for the pancreatic exocrine cell. Total cell membrane areas 
are estimated at about 110,000 µm² and 13,000 µm², respectively. 
archaeal cells—have a plasma membrane but no internal membranes. The plasma 
membrane in such cells provides all membrane-dependent functions, including 
the pumping of ions, ATP synthesis, protein secretion, and lipid synthesis. Typ- 
ical present-day eukaryotic cells are 10-30 times larger in linear dimension and 
1000-10,000 times greater in volume than a typical bacterium such as E. coli. The 
profusion of internal membranes can be regarded, in part, as an adaptation to 
this increase in size: the eukaryotic cell has a much smaller ratio of surface area to 
volume, and its plasma membrane therefore presumably has too small an area to 
sustain the many vital functions that membranes perform. The extensive internal 
membrane systems of a eukaryotic cell alleviate this problem. 
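
The surface-area-to-volume argument can be made concrete with a small sketch. It assumes, purely for illustration, that cells are spheres, so that the ratio equals 3/r; the 0.5 µm starting radius is a hypothetical value, and only the 10-fold and 30-fold increases in linear dimension come from the text.

```python
import math


def sphere_surface_to_volume(radius_um: float) -> float:
    """Surface-area-to-volume ratio (per µm) of a sphere; algebraically equal to 3 / radius."""
    area = 4.0 * math.pi * radius_um ** 2
    volume = (4.0 / 3.0) * math.pi * radius_um ** 3
    return area / volume


if __name__ == "__main__":
    small_radius_um = 0.5  # hypothetical radius of a small cell
    for scale in (10, 30):  # increase in linear dimension quoted in the text
        fold = sphere_surface_to_volume(small_radius_um) / sphere_surface_to_volume(
            small_radius_um * scale
        )
        print(f"{scale}x larger radius: surface/volume ratio falls about {fold:.0f}-fold")
```

Under this assumption the ratio falls in direct proportion to the increase in linear dimension, which is why the plasma membrane alone offers less membrane area per unit of cell volume as the cell grows.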

The evolution of internal membranes evidently went hand-in-hand with the 
specialization of membrane function. A hypothetical scheme for how the first 
eukaryotic cells, with a nucleus and ER, might have evolved by the invagination 
and pinching off of the plasma membrane of an ancestral cell is illustrated in 
Figure 12-3. This process would create membrane-enclosed organelles with an 
interior or lumen that is topologically equivalent to the exterior of the cell. We 
shall see that this topological relationship holds for all of the organelles involved 
in the secretory and endocytic pathways, including the ER, Golgi apparatus, 
endosomes, lysosomes, and peroxisomes. We can therefore think of all of these 
organelles as members of the same topologically equivalent compartment. As we 
discuss in detail in the next chapter, their interiors communicate extensively with 
one another and with the outside of the cell via transport vesicles, which bud off 
from one organelle and fuse with another (Figure 12-4). 

As described in Chapter 14, mitochondria and plastids differ from the other 
membrane-enclosed organelles because they contain their own genomes. The 
nature of these genomes, and the close resemblance of the proteins in these 
organelles to those in some present-day bacteria, strongly suggest that mito- 
chondria and plastids evolved from bacteria that were engulfed by other cells 
with which they initially lived in symbiosis (see Figures 1-29 and 1-31): the inner 
membrane of mitochondria and plastids presumably corresponds to the original 
plasma membrane of the bacterium, while the lumen of these organelles evolved 
from the bacterial cytosol. Like the bacteria from which they were derived, 
both mitochondria and plastids are enclosed by a double membrane and they 
remain isolated from the extensive vesicular traffic that connects the interiors of 
most of the other membrane-enclosed organelles to each other and to the outside 
of the cell. 





Figure 12-2 An electron micrograph 
of part of a liver cell seen in cross 
section. Examples of most of the major 
intracellular organelles are indicated. 
(Courtesy of Daniel S. Friend.) 
The evolutionary schemes just described group the intracellular compart- 
ments in eukaryotic cells into four distinct families: (1) the nucleus and the cyto- 
sol, which communicate with each other through nuclear pore complexes and are 
thus topologically continuous (although functionally distinct); (2) all organelles 
that function in the secretory and endocytic pathways—including the ER, Golgi 
apparatus, endosomes, and lysosomes, the numerous classes of transport inter- 
mediates such as transport vesicles that move between them, and peroxisomes; 
(3) the mitochondria; and (4) the plastids (in plants only). 


Proteins Can Move Between Compartments in Different Ways 


The synthesis of all proteins begins on ribosomes in the cytosol, except for the few 
that are synthesized on the ribosomes of mitochondria and plastids. Their sub- 
sequent fate depends on their amino acid sequence, which can contain sorting 
signals that direct their delivery to locations outside the cytosol or to organelle 
surfaces. Some proteins do not have a sorting signal and consequently remain in 
the cytosol as permanent residents. Many others, however, have specific sorting 
signals that direct their transport from the cytosol into the nucleus, the ER, mito- 
chondria, plastids, or peroxisomes; sorting signals can also direct the transport of 
proteins from the ER to other destinations in the cell. 
Figure 12-3 One suggested pathway 
for the evolution of the eukaryotic 
cell and its internal membranes. As 
discussed in Chapter 1, there is evidence 
that the nuclear genome of a eukaryotic 
cell evolved from an ancient archaeon. 
For example, clear homologs of actin, 
tubulin, histones, and the nuclear DNA 
replication system are found in archaea, 
but not in bacteria. Thus, it is now thought 
that the first eukaryotic cells arose when 
an ancient anaerobic archaeon joined 
forces with an aerobic bacterium roughly 
1.6 billion years ago. As indicated, the 
nuclear envelope may have originated from 
an invagination of the plasma membrane 
of this ancient archaeon—an invagination 
that protected its chromosome while still 
allowing access of the DNA to the cytosol 
(as required for DNA to direct protein 
synthesis). This envelope may have later 
pinched off completely from the plasma 
membrane, so as to produce a separate 
nuclear compartment surrounded by a 
double membrane. Because this double 
membrane is penetrated by nuclear pore 
complexes, the nuclear compartment is 
topologically equivalent to the cytosol. In 
contrast, the lumen of the ER is continuous 
with the space between the inner and outer 
nuclear membranes, and it is topologically 
equivalent to the extracellular space (see 
Figure 12-4). (Adapted from J. Martijn and 
T.J.G. Ettema, Biochem. Soc. Trans. 41: 
451-457, 2013.) 
To understand the general principles by which sorting signals operate, it is 
important to distinguish three fundamentally different ways by which proteins 
move from one compartment to another. These three mechanisms are described 
below, and the transport steps at which they operate are outlined in Figure 12-5. 
We discuss the first two mechanisms (gated transport and transmembrane trans- 
port) in this chapter, and the third (vesicular transport, green arrows in Figure 
12-5) in Chapter 13. 


1. In gated transport, proteins and RNA molecules move between the cytosol 
and the nucleus through nuclear pore complexes in the nuclear envelope. 
The nuclear pore complexes function as selective gates that support the 
active transport of specific macromolecules and macromolecular assem- 
blies between the two topologically equivalent spaces, although they also 
allow free diffusion of smaller molecules. 


2. In protein translocation, transmembrane protein translocators directly 
transport specific proteins across a membrane from the cytosol into a 
space that is topologically distinct. The transported protein molecule usu- 
ally must unfold to snake through the translocator. The initial transport 
of selected proteins from the cytosol into the ER lumen or mitochondria, 
for example, occurs in this way. Integral membrane proteins often use the 
same translocators but translocate only partially across the membrane, so 
that the protein becomes embedded in the lipid bilayer. 


3. In vesicular transport, membrane-enclosed transport intermediates— 
which may be small, spherical transport vesicles or larger, irregularly 
shaped organelle fragments—ferry proteins from one topologically equiva- 
lent compartment to another. The transport vesicles and fragments become 
loaded with a cargo of molecules derived from the lumen of one compart- 
ment as they bud and pinch off from its membrane; they discharge their 
cargo into a second compartment by fusing with the membrane enclos- 
ing that compartment (Figure 12-6). The transfer of soluble proteins from 
the ER to the Golgi apparatus, for example, occurs in this way. Because the 


Figure 12-5 A simplified “roadmap” of protein traffic within a 
eukaryotic cell. Proteins can move from one compartment to another 
by gated transport (red), protein translocation (blue), or vesicular transport 
(green). The sorting signals that direct a given protein’s movement through 
the system, and thereby determine its eventual location in the cell, are 
contained in each protein’s amino acid sequence. The journey begins with the 
synthesis of a protein on a ribosome in the cytosol and, for many proteins, 
terminates when the protein reaches its final destination. Other proteins 
shuttle back and forth between the nucleus and cytosol. At each intermediate 
station (boxes), a decision is made as to whether the protein is to be retained 
in that compartment or transported further. A sorting signal may direct either 
retention in or exit from a compartment. 

We shall refer to this figure often as a guide in this chapter and the next, 
highlighting in color the particular pathway being discussed. 


Figure 12-4 Topologically equivalent 
compartments in the secretory and 
endocytic pathways in a eukaryotic 
cell. Compartments are said to be 
topologically equivalent if they can 
communicate with one another, in the 
sense that molecules can get from one 
to the other without having to cross a 
membrane. Topologically equivalent spaces 
are shown in red. (A) Molecules can be 
carried from one compartment to another 
topologically equivalent compartment 
by vesicles that bud from one and fuse 
with the other. (B) In principle, cycles of 
membrane budding and fusion permit 
the lumen of any of the organelles shown 
to communicate with any other and with 
the cell exterior by means of transport 
vesicles. Blue arrows indicate the 
extensive outbound and inbound vesicular 
traffic (discussed in Chapter 13). Some 
organelles, most notably mitochondria and 
(in plant cells) plastids, do not take part in 
this communication and are isolated from 
the vesicular traffic between organelles 
shown here. 
Figure 12-6 Vesicle budding and fusion during vesicular transport. 
Transport vesicles bud from one compartment (donor) and fuse with another 
topologically equivalent (target) compartment. In the process, soluble 
components (red dots) are transferred from lumen to lumen. Note that 
membrane is also transferred and that the original orientation of both proteins 
and lipids in the donor compartment membrane is preserved in the target 
compartment membrane. Thus, membrane proteins retain their asymmetric 
orientation, with the same domains always facing the cytosol. 


transported proteins do not cross a membrane, vesicular transport can 
move proteins only between compartments that are topologically equiva- 
lent (see Figure 12-4). 
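
The three mechanisms just listed can be summarized as a small lookup table. This is only an illustrative sketch: it encodes the transport steps that the list names explicitly and is not the complete roadmap of Figure 12-5. The compartment labels and the function name are our own.

```python
from typing import Optional

# Transport steps named explicitly in the list above (not the full roadmap of Figure 12-5).
TRANSPORT_MODE = {
    ("cytosol", "nucleus"): "gated transport",
    ("nucleus", "cytosol"): "gated transport",
    ("cytosol", "ER"): "protein translocation",
    ("cytosol", "mitochondria"): "protein translocation",
    ("ER", "Golgi apparatus"): "vesicular transport",
}


def transport_mode(source: str, destination: str) -> Optional[str]:
    """Return the mechanism for one transport step, or None if the step is not listed."""
    return TRANSPORT_MODE.get((source, destination))


if __name__ == "__main__":
    for step in [("cytosol", "nucleus"), ("cytosol", "ER"), ("ER", "Golgi apparatus")]:
        print(step, "->", transport_mode(*step))
```

Gated and vesicular transport connect topologically equivalent spaces, whereas protein translocation carries a protein into a topologically distinct space.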

Each mode of protein transfer is usually guided by sorting signals in the trans- 
ported protein, which are recognized by complementary sorting receptors. If a 
large protein is to be imported into the nucleus, for example, it must possess a 
sorting signal that receptor proteins recognize to guide it through the nuclear pore 
complex. If a protein is to be transferred directly across a membrane, it must pos- 
sess a sorting signal that the translocator recognizes. Likewise, if a protein is to be 
loaded into a certain type of vesicle or retained in certain organelles, a comple- 
mentary receptor in the appropriate membrane must recognize its sorting signal. 


Signal Sequences and Sorting Receptors Direct Proteins to the 
Correct Cell Address 


Most protein sorting signals involved in transmembrane transport reside in 
a stretch of amino acid sequence, typically 15-60 residues long. Such signal 
sequences are often found at the N-terminus of the polypeptide chain, and in 
many cases specialized signal peptidases remove the signal sequence from the 
finished protein once the sorting process is complete. Signal sequences can also 
be internal stretches of amino acids, which remain part of the protein. Such sig- 
nals are used in gated transport into the nucleus. Sorting signals can also be com- 
posed of multiple internal amino acid sequences that form a specific three-di- 
mensional arrangement of atoms on the protein’s surface; such signal patches are 
sometimes used for nuclear import and in vesicular transport. 

Each signal sequence specifies a particular destination in the cell. Proteins 
destined for initial transfer to the ER usually have a signal sequence at their N- 
terminus that characteristically includes a sequence composed of about 5-10 
hydrophobic amino acids. Many of these proteins will in turn pass from the ER to 
the Golgi apparatus, but those with a specific signal sequence of four amino acids 
at their C-terminus are recognized as ER residents and are returned to the ER. 
Proteins destined for mitochondria have signal sequences of yet another type, in 
which positively charged amino acids alternate with hydrophobic ones. Finally, 
many proteins destined for peroxisomes have a signal sequence of three charac- 
teristic amino acids at their C-terminus. 
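
The signal types described above can be illustrated with a toy classifier. It is a deliberately crude sketch, not a real targeting predictor. The two C-terminal motifs are the one-letter forms of the peroxisomal and ER-return examples in Table 12-3. The hydrophobic residue set, the 30-residue window, and the function names are assumptions of ours.

```python
from typing import Optional

HYDROPHOBIC = set("AVLIMFW")  # assumed hydrophobic set; published scales differ

C_TERMINAL_SIGNALS = {
    "KDEL": "return to ER",            # -Lys-Asp-Glu-Leu-COO-
    "SKL": "import into peroxisomes",  # -Ser-Lys-Leu-COO-
}


def c_terminal_signal(sequence: str) -> Optional[str]:
    """Return the destination implied by a known C-terminal motif, or None."""
    for motif, destination in C_TERMINAL_SIGNALS.items():
        if sequence.upper().endswith(motif):
            return destination
    return None


def longest_hydrophobic_run(sequence: str, n_terminal_window: int = 30) -> int:
    """Length of the longest unbroken hydrophobic stretch near the N-terminus."""
    longest = current = 0
    for residue in sequence.upper()[:n_terminal_window]:
        current = current + 1 if residue in HYDROPHOBIC else 0
        longest = max(longest, current)
    return longest


if __name__ == "__main__":
    # One-letter transcription of the "Import into ER" example in Table 12-3.
    er_signal = "MMSFVSLLLVGILFWATEAEQLTKCEVFQ"
    print(longest_hydrophobic_run(er_signal), c_terminal_signal(er_signal))
```

As the text notes, real signal recognition depends more on physical properties such as hydrophobicity than on an exact sequence, so exact motif matching of this kind is at best a first approximation.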

Table 12-3 presents some specific signal sequences. Experiments in which the 
peptide is transferred from one protein to another by genetic engineering tech- 
niques have demonstrated the importance of each of these signal sequences for 
protein targeting. Placing the N-terminal ER signal sequence at the beginning 
of a cytosolic protein, for example, redirects the protein to the ER; removing or 
mutating the signal sequence of an ER protein causes its retention in the cytosol. 
Signal sequences are therefore both necessary and sufficient for protein targeting. 
Even though their amino acid sequences can vary greatly, the signal sequences of 
proteins having the same destination are functionally interchangeable; physical 
properties, such as hydrophobicity, often seem to be more important in the sig- 
nal-recognition process than the exact amino acid sequence. 

Signal sequences are recognized by complementary sorting receptors that 
guide proteins to their appropriate destination, where the receptors unload their 
cargo. The receptors function catalytically: after completing one round of tar- 
geting, they return to their point of origin to be reused. Most sorting receptors 
TABLE 12-3 


Import into nucleus         -Pro-Pro-Lys-Lys-Lys-Arg-Lys-Val- 
Export from nucleus         -Met-Glu-Glu-Leu-Ser-Gln-Ala-Leu-Ala-Ser-Ser-Phe- 
Import into mitochondria    +H3N-Met-Leu-Ser-Leu-Arg-Gln-Ser-Ile-Arg-Phe-Phe-Lys-Pro-Ala-Thr-Arg-Thr-Leu-Cys-Ser-Ser- 
                            Arg-Tyr-Leu-Leu- 
Import into plastid         +H3N-Met-Val-Ala-Met-Ala-Met-Ala-Ser-Leu-Gln-Ser-Ser-Met-Ser-Ser-Leu-Ser-Leu-Ser-Ser-Asn- 
                            Ser-Phe-Leu-Gly-Gln-Pro-Leu-Ser-Pro-Ile-Thr-Leu-Ser-Pro-Phe-Leu-Gln-Gly- 
Import into peroxisomes     -Ser-Lys-Leu-COO- 
Import into ER              +H3N-Met-Met-Ser-Phe-Val-Ser-Leu-Leu-Leu-Val-Gly-Ile-Leu-Phe-Trp-Ala-Thr-Glu-Ala-Glu-Gln- 
                            Leu-Thr-Lys-Cys-Glu-Val-Phe-Gln- 
Return to ER                -Lys-Asp-Glu-Leu-COO- 


Some characteristic features of the different classes of signal sequences are highlighted in color. Where they are known to be important for 
the function of the signal sequence, positively charged amino acids are shown in red and negatively charged amino acids are shown in green. 
Similarly, important hydrophobic amino acids are shown in orange and important hydroxylated amino acids are shown in blue. +H3N indicates the 
N-terminus of a protein; COO- indicates the C-terminus. 





recognize classes of proteins rather than an individual protein species. They can 
therefore be viewed as public transportation systems, dedicated to delivering 
numerous different components to their correct location in the cell. 


Most Organelles Cannot Be Constructed De Novo: They Require 
Information in the Organelle Itself 


When a cell reproduces by division, it has to duplicate its organelles, in addition to 
its chromosomes. In general, cells do this by incorporating new molecules into the 
existing organelles, thereby enlarging them; the enlarged organelles then divide 
and are distributed to the two daughter cells. Thus, each daughter cell inherits a 
complete set of specialized cell membranes from its mother. This inheritance is 
essential because a cell could not make such membranes from scratch. If the ER 
were completely removed from a cell, for example, how could the cell reconstruct 
it? As we discuss later, the membrane proteins that define the ER and perform 
many of its functions are themselves products of the ER. A new ER could not be 
made without an existing ER or, at least, a membrane that specifically contains 
the protein translocators required to import selected proteins into the ER from the 
cytosol (including the ER-specific translocators themselves). The same is true for 
mitochondria and plastids. 

Thus, it seems that the information required to construct an organelle does not 
reside exclusively in the DNA that specifies the organelle’s proteins. Information 
in the form of at least one distinct protein that preexists in the organelle mem- 
brane is also required, and this information is passed from parent cell to daughter 
cells in the form of the organelle itself. Presumably, such information is essential 
for the propagation of the cell’s compartmental organization, just as the informa- 
tion in DNA is essential for the propagation of the cell’s nucleotide and amino acid 
sequences. 

As we discuss in more detail in Chapter 13, however, the ER buds off a con- 
stant stream of transport vesicles that incorporate only a subset of ER proteins and 
therefore have a composition different from the ER itself. Similarly, the plasma 
membrane constantly buds off various types of specialized endocytic vesicles. 
Thus, some organelles can form from other organelles and do not have to be 
inherited at cell division. 
Summary 


Eukaryotic cells contain intracellular membrane-enclosed organelles that make up 
nearly half the cell’s total volume. The main ones present in all eukaryotic cells are 
the endoplasmic reticulum, Golgi apparatus, nucleus, mitochondria, lysosomes, 
endosomes, and peroxisomes; plant cells also contain plastids such as chloroplasts. 
These organelles contain distinct sets of proteins, which mediate each organelle’s 
unique function. 

Each newly synthesized organelle protein must find its way from a ribosome in 
the cytosol, where the protein is made, to the organelle where it functions. It does 
so by following a specific pathway, guided by sorting signals in its amino acid 
sequence that function as either signal sequences or signal patches. Sorting signals 
are recognized by complementary sorting receptors, which deliver the protein to the 
appropriate target organelle. Proteins that function in the cytosol do not contain 
sorting signals and therefore remain there after they are synthesized. 

During cell division, organelles such as the ER and mitochondria are distributed 
to each daughter cell. These organelles contain information that is required for their 
construction, and so they cannot be made de novo. 


THE TRANSPORT OF MOLECULES BETWEEN THE 
NUCLEUS AND THE CYTOSOL 


The nuclear envelope encloses the DNA and defines the nuclear compartment.
This envelope consists of two concentric membranes, which are penetrated by
nuclear pore complexes (Figure 12-7). Although the inner and outer nuclear
membranes are continuous, they maintain distinct protein compositions. The
inner nuclear membrane contains proteins that act as binding sites for
chromosomes and for the nuclear lamina, a protein meshwork that provides
structural support for the nuclear envelope; the lamina also acts as an
anchoring site for chromosomes and the cytoplasmic cytoskeleton (via protein
complexes that span the nuclear envelope). The inner membrane is surrounded by
the outer nuclear membrane, which is continuous with the membrane of the ER.
Like the ER membrane (discussed later), the outer nuclear membrane is studded
with ribosomes engaged in protein synthesis. The proteins made on these
ribosomes are transported into the space between the inner and outer nuclear
membranes (the perinuclear space), which is continuous with the ER lumen (see
Figure 12-7).

Bidirectional traffic occurs continuously between the cytosol and the nucleus.
The many proteins that function in the nucleus (including histones, DNA
polymerases, RNA polymerases, transcriptional regulators, and RNA-processing
proteins) are selectively imported into the nuclear compartment from the
cytosol, where they are made. At the same time, almost all RNAs, including
mRNAs, are synthesized in the nuclear compartment and then exported to the
cytosol. Like the import process, the export process is selective; mRNAs, for
example, are exported only after they have been properly modified by
RNA-processing reactions in the nucleus. In some cases, the transport process
is complex. Ribosomal proteins, for instance, are made in the cytosol and
imported into the nucleus, where they assemble with newly made ribosomal RNA
into particles. The particles are then exported to the cytosol, where they
assemble into ribosomes. Each of these steps requires selective transport
across the nuclear envelope.


Nuclear Pore Complexes Perforate the Nuclear Envelope 


Large and elaborate nuclear pore complexes (NPCs) perforate the nuclear enve- 
lope in all eukaryotes. Each NPC is composed of a set of approximately 30 differ- 
ent proteins, or nucleoporins. Reflecting the high degree of internal symmetry, 
each nucleoporin is present in multiple copies, resulting in 500-1000 protein mol- 
ecules in the fully assembled NPC, with an estimated mass of 66 million daltons in 
yeast and 125 million daltons in vertebrates (Figure 12-8). Most nucleoporins are
composed of repetitive protein domains of only a few different types, which have
evolved through extensive gene duplication. Some of the scaffold nucleoporins 
(see Figure 12-8) are structurally related to vesicle coat protein complexes, such 
as clathrin and COPII coatomer (discussed in Chapter 13), which shape transport 
vesicles; and one protein is used as a common building block in both NPCs and 
vesicle coats. These similarities suggest a common evolutionary origin for NPCs 
and vesicle coats: they may derive from an early membrane-bending protein 
module that helped shape the elaborate membrane systems of eukaryotic cells, 
and in present-day cells stabilize the sharp membrane bends required to form a 
nuclear pore. 

The nuclear envelope of a typical mammalian cell contains 3000-4000 NPCs, 
although that number varies widely, from a few hundred in glial cells to almost 
20,000 in Purkinje neurons. The total traffic that passes through each NPC is enor- 
mous: each NPC can transport up to 1000 macromolecules per second and can 
transport in both directions at the same time. How it coordinates the bidirectional 
flow of macromolecules to avoid congestion and head-on collisions is not known. 
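
To get a feel for the scale of this traffic, the quoted NPC counts can simply
be multiplied by the quoted peak rate per pore. The Python sketch below does
this arithmetic; the function name is hypothetical, and the result is only an
upper bound, because 1000 macromolecules per second is given as a maximum for
each NPC and not as a typical rate.

```python
# Upper-bound estimate of nuclear traffic from the figures quoted in the text.
# The function name is illustrative; the rate of 1000 per second is the
# per-NPC maximum, so the products below are ceilings, not measurements.

def max_nuclear_traffic(npc_count, per_npc_rate=1000):
    """Return the maximum number of macromolecules crossing per second."""
    if npc_count < 0 or per_npc_rate < 0:
        raise ValueError("counts and rates must be non-negative")
    return npc_count * per_npc_rate


if __name__ == "__main__":
    for label, count in [
        ("typical mammalian cell, low estimate", 3000),
        ("typical mammalian cell, high estimate", 4000),
        ("Purkinje neuron, approximate", 20000),
    ]:
        print(f"{label}: up to {max_nuclear_traffic(count):,} per second")
```

On these assumptions, a typical mammalian nucleus could exchange up to three
to four million macromolecules with the cytosol every second.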

Each NPC contains aqueous passages, through which small water-soluble 
molecules can diffuse passively. Researchers have determined the effective size 
of these passages by injecting labeled water-soluble molecules of different sizes 
into the cytosol and then measuring their rate of diffusion into the nucleus. Small 
molecules (5000 daltons or less) diffuse in so fast that we can consider the nuclear 
envelope freely permeable to them. Large proteins, however, diffuse in much 
more slowly, and the larger a protein, the more slowly it passes through the NPC. 
Proteins larger than 60,000 daltons cannot enter by passive diffusion. This size 
cut-off to free diffusion is thought to result from the NPC structure (see Figure 
12-8). The channel nucleoporins with extensive unstructured regions form a dis- 
ordered tangle (much like a kelp bed in the ocean) that restricts the diffusion of 
large macromolecules while allowing smaller molecules to pass. 
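
The size dependence described above can be summarized as a simple
three-way classification. The Python sketch below is only an illustration:
the two thresholds are the values quoted in the text, but the function name,
the category labels, and the sharp boundaries are assumptions, since in
reality the rate of diffusion falls off gradually with increasing size.

```python
# Rough classification of passive entry into the nucleus by molecular mass.
# Thresholds come from the text; the sharp cut-offs are a simplification.

FREE_DIFFUSION_MAX_DA = 5_000    # at or below this, entry is effectively free
PASSIVE_ENTRY_MAX_DA = 60_000    # above this, passive diffusion does not occur


def passive_entry_class(mass_da):
    """Classify a water-soluble molecule by how it can cross the NPC passively."""
    if mass_da <= 0:
        raise ValueError("molecular mass must be positive")
    if mass_da <= FREE_DIFFUSION_MAX_DA:
        return "free"        # nuclear envelope is freely permeable
    if mass_da <= PASSIVE_ENTRY_MAX_DA:
        return "slow"        # enters, but more slowly as size increases
    return "excluded"        # requires receptor-mediated transport


if __name__ == "__main__":
    for mass in (500, 30_000, 150_000):
        print(mass, passive_entry_class(mass))
```

A polymerase subunit of 150,000 daltons falls in the "excluded" class, which
is why such proteins depend on the active transport mechanisms discussed in
the following sections.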

Because many cell proteins are too large to diffuse passively through the NPCs, 
the nuclear compartment and the cytosol can maintain different protein compo- 
sitions. Mature cytosolic ribosomes, for example, are about 30 nm in diameter 
and thus cannot diffuse through the NPC, confining protein synthesis to the cyto- 
sol. But how does the nucleus export newly made ribosomal subunits or import 
large molecules, such as DNA polymerases and RNA polymerases, which have 
subunit molecular masses of 100,000-200,000 daltons? As we discuss next, these 
and most other transported protein and RNA molecules bind to specific receptor 
proteins that actively ferry large molecules through NPCs. Even small proteins like 
histones frequently use receptor-mediated mechanisms to cross the NPC, thereby 
increasing transport efficiency. 


Nuclear Localization Signals Direct Nuclear Proteins to the 
Nucleus 


When proteins are experimentally extracted from the nucleus and reintroduced 
into the cytosol, even the very large ones reaccumulate efficiently in the nucleus. 
Sorting signals called nuclear localization signals (NLSs) are responsible for the 
selectivity of this active nuclear import process. The signals have been precisely 
defined by using recombinant DNA technology for numerous nuclear proteins, as 
well as for proteins that enter the nucleus only transiently (Figure 12-9). In many 
nuclear proteins, the signals consist of one or two short sequences that are rich 
in the positively charged amino acids lysine and arginine (see Table 12-3, p. 648), 
with the precise sequence varying for different proteins. Other nuclear proteins 
contain different signals, some of which are not yet characterized. 

Nuclear localization signals can be located almost anywhere in the amino acid 
sequence and are thought to form loops or patches on the protein surface. Many 
function even when linked as short peptides to lysine side chains on the surface 
of a cytosolic protein, suggesting that the precise location of the signal within the 
amino acid sequence of a nuclear protein is not important. Moreover, as long 
as one of the protein subunits of a multicomponent complex displays a nuclear 
localization signal, the entire complex will be imported into the nucleus.

Figure 12-7 The nuclear envelope. The double-membrane envelope is penetrated
by pores in which nuclear pore complexes (not shown) are positioned. The outer
nuclear membrane is continuous with the endoplasmic reticulum (ER). The
ribosomes that are normally bound to the cytosolic surface of the ER membrane
and outer nuclear membrane are not shown. The nuclear lamina is a fibrous
protein meshwork underlying the inner membrane.

One can visualize the transport of nuclear proteins through NPCs by coating
gold particles with a nuclear localization signal, injecting the particles into the 
cytosol, and then following their fate by electron microscopy (Figure 12-10). The 
particles bind to the tentaclelike fibrils that extend from the scaffold nucleoporins 
at the rim of the NPC into the cytosol, and then proceed through the center of the 
NPC. Presumably, the unstructured regions of the nucleoporins that form a diffu- 
sion barrier for large molecules (mentioned earlier) are pushed away to allow the 
coated gold particles to squeeze through. 

Macromolecular transport across NPCs differs fundamentally from the transport
of proteins across the membranes of other organelles, in that it occurs
through a large, expandable, aqueous pore, rather than through a protein
transporter spanning one or more lipid bilayers. For this reason, fully folded
nuclear proteins can be transported into the nucleus through an NPC, and newly
formed ribosomal subunits are transported out of the nucleus as an assembled
particle. By contrast, proteins have to be extensively unfolded to be
transported into most other organelles, as we discuss later.

Figure 12-8 The arrangement of NPCs in the nuclear envelope. (A) In a
vertebrate NPC, nucleoporins are arranged with striking eightfold rotational
symmetry. In addition, immunoelectron microscopic studies show that the
proteins that make up the central portion of the NPC are oriented
symmetrically across the nuclear envelope, so that the nuclear and cytosolic
sides look identical. The eightfold rotational and twofold transverse symmetry
explains how such a huge structure can be formed from only about 30 different
proteins: many of the nucleoporins are present in 8, 16, or 32 copies. Based
on their approximate localization in the central portion of the NPC,
nucleoporins can be classified into (1) transmembrane ring proteins that span
the nuclear envelope and anchor the NPC to the envelope; (2) scaffold
nucleoporins that form layered ring structures (some scaffold nucleoporins are
membrane-bending proteins that stabilize the sharp membrane curvature where
the nuclear envelope is penetrated); and (3) channel nucleoporins that line a
central pore. In addition to folded domains that anchor the proteins in
specific places, many channel nucleoporins contain extensive unstructured
regions, where the polypeptide chains are intrinsically disordered. The
central pore is filled with a tangled mesh of these disordered domains that
blocks the passive diffusion of large macromolecules. The disordered regions
contain a large number of phenylalanine-glycine (FG) repeats. Fibrils protrude
from both the cytosolic and the nuclear sides of the NPC. By contrast to the
twofold transverse symmetry of the NPC core, the fibrils facing the cytosol
and nucleus are different: on the nuclear side, the fibrils converge at their
distal end to form a basketlike structure. The precise arrangement of
individual nucleoporins in the assembled NPC is still a matter of intense
debate, because atomic resolution analyses have been hindered by the sheer
size and flexible nature of the NPC, and by difficulties in purifying
sufficient amounts of homogeneous material. A combination of electron
microscopy, computational analyses, and crystal structures of nucleoporin
subcomplexes has been used to develop the current models of the NPC
architecture. (B) A scanning electron micrograph of the nuclear side of the
nuclear envelope of an oocyte (see also Figure 9-52). (C) An electron
micrograph showing a side view of two NPCs (brackets); note that the inner and
outer nuclear membranes are continuous at the edges of the pore. (D) An
electron micrograph showing face-on views of negatively stained NPCs. The
membrane has been removed by detergent extraction. Note that some of the NPCs
contain material in their center, which is thought to be trapped
macromolecules in transit through these NPCs. (A, adapted from A. Hoelz, E.W.
Debler and G. Blobel, Annu. Rev. Biochem. 80:613-643, 2011. With permission
from Annual Reviews; B, from M.W. Goldberg and T.D. Allen, J. Cell Biol.
119:1429-1440, 1992. With permission from The Rockefeller University Press; C,
courtesy of Werner Franke and Ulrich Scheer; D, courtesy of Ron Milligan.)


Nuclear Import Receptors Bind to Both Nuclear Localization 
Signals and NPC Proteins 


To initiate nuclear import, most nuclear localization signals must be recognized 
by nuclear import receptors, sometimes called importins, most of which are 
encoded by a family of related genes. Each family member encodes a receptor 
protein that can bind and transport the subset of cargo proteins containing the 
appropriate nuclear localization signal (Figure 12-11A). Nuclear import recep- 
tors do not always bind to nuclear proteins directly. Additional adaptor proteins 
can form a bridge between the import receptors and the nuclear localization sig- 
nals on the proteins to be transported (Figure 12-11B). Some adaptor proteins 
are structurally related to nuclear import receptors, suggesting a common evolu- 
tionary origin. By using a variety of import receptors and adaptors, cells are able 
to recognize the broad repertoire of nuclear localization signals that are displayed 
on nuclear proteins. 

The import receptors are soluble cytosolic proteins that bind both to the 
nuclear localization signal on the cargo protein and to the phenylalanine-gly- 
cine (FG) repeats in the unstructured domains of the channel nucleoporins that 
line the central pore. FG-repeats are also found in the cytoplasmic and nuclear 
fibrils. FG-repeats in the unstructured tangle of the pore are thought to do double 
duty. They interact weakly, which gives the protein tangle gel-like properties that 
impose a permeability barrier to large macromolecules, and they serve as dock- 
ing sites for nuclear import receptors. FG-repeats line the path through the NPCs 
taken by the import receptors and their bound cargo proteins. According to one 
model of nuclear transport, the receptor-cargo complexes move along the trans- 
port path by repeatedly binding, dissociating, and then re-binding to adjacent 
FG-repeat sequences. In this way, the complexes may hop from one nucleoporin 
to another to traverse the tangled interior of the NPC in a random walk. As import 
receptors bind to FG-repeats during this journey, they would disrupt interaction 
between the repeats and locally dissolve the gel phase of the protein tangle that 
fills the pore, allowing the passage of the receptor-cargo complex. Once inside the 
nucleus, the import receptors dissociate from their cargo and return to the cyto- 
sol. As we will see, this dissociation only occurs on the nuclear side of the NPC and 
thereby confers directionality to the import process. 
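
The hopping model just described can be caricatured as a one-dimensional
random walk. In the Python sketch below, which is a toy illustration and not a
quantitative model of the NPC, the pore is reduced to a row of FG-repeat
sites, a receptor-cargo complex starts at the site nearest the cytosol, and
each hop moves it one site in either direction with equal probability. The
function names, the number of sites, and the use of a single lane are all
assumptions made for the example. The walk ends when the complex either falls
back into the cytosol or reaches the nuclear side, where cargo release makes
the outcome irreversible.

```python
import random

# Toy one-dimensional random walk through a row of FG-repeat sites.
# Sites are numbered 0 (cytosolic end) to n_sites - 1 (nuclear end).


def traverse_pore(n_sites, rng):
    """Walk one receptor-cargo complex; return (where it exits, hops taken)."""
    position = 0              # complex docks at the cytosolic end of the pore
    steps = 0
    while 0 <= position < n_sites:
        position += rng.choice((-1, 1))   # unbiased hop to a neighboring site
        steps += 1
    outcome = "nucleus" if position >= n_sites else "cytosol"
    return outcome, steps


def import_fraction(n_sites, trials, seed=0):
    """Estimate the fraction of docking events that end in the nucleus."""
    rng = random.Random(seed)
    hits = sum(traverse_pore(n_sites, rng)[0] == "nucleus" for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    print(import_fraction(n_sites=4, trials=20_000))
```

For an unbiased walk of this kind, the chance that a single docking event ends
in the nucleus is 1/(n_sites + 1), so most attempts fail. The sketch makes the
point that the hopping itself has no preferred direction: net import arises
only because complexes that fall back into the cytosol can try again, whereas
those that reach the nucleus are dismantled there.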


Nuclear Export Works Like Nuclear Import, But in Reverse 


The nuclear export of large molecules, such as new ribosomal subunits and RNA
molecules, occurs through NPCs and also depends on a selective transport
system. The transport system relies on nuclear export signals on the
macromolecules to be exported, as well as on complementary nuclear export
receptors, or exportins. These receptors bind to both the export signal and
NPC proteins to guide their cargo through the NPC to the cytosol.

Figure 12-9 The function of a nuclear localization signal. Immunofluorescence
micrographs showing the cell location of SV40 virus T-antigen containing or
lacking a short sequence that serves as a nuclear localization signal. (A) The
normal T-antigen protein contains the lysine-rich sequence indicated and is
imported to its site of action in the nucleus, as indicated by
immunofluorescence staining with antibodies against the T-antigen. (B)
T-antigen with an altered nuclear localization signal (a threonine replacing a
lysine) remains in the cytosol. (From D. Kalderon, B. Roberts, W. Richardson
and A. Smith, Cell 39:499-509, 1984. With permission from Elsevier.)

Figure 12-10 Visualizing active import through NPCs. This series of electron
micrographs shows colloidal gold spheres (arrowheads) coated with peptides
containing nuclear localization signals entering the nucleus through NPCs. The
gold particles were injected into the cytosol of living cells, which then were
fixed and prepared for electron microscopy at various times after injection.
(A) Gold particles are first seen in proximity to the cytosolic fibrils of the
NPCs. (B, C) They are then seen at the center of the NPCs, exclusively on the
cytosolic face. (D) They then appear on the nuclear face. These gold particles
have much larger diameters than the diffusion channels in the NPC and are
imported by active transport. (From N. Panté and U. Aebi, Science
273:1729-1732, 1996. With permission from AAAS.)

Many nuclear export receptors are structurally related to nuclear import 
receptors, and they are encoded by the same gene family of nuclear transport 
receptors, or karyopherins. In yeast, there are 14 genes encoding karyopherins; in 
animal cells, the number is significantly larger. It is often not possible to tell from 
their amino acid sequence alone whether a particular family member works as a 
nuclear import or nuclear export receptor. As might be expected, therefore, the 
import and export transport systems work in similar ways but in opposite direc- 
tions: the import receptors bind their cargo molecules in the cytosol, release them 
in the nucleus, and are then exported to the cytosol for reuse, while the export 
receptors function in the opposite fashion. 


The Ran GTPase Imposes Directionality on Transport Through 
NPCs 


The import of nuclear proteins through NPCs concentrates specific proteins in the 
nucleus and thereby increases order in the cell. The cell fuels this ordering process 
by harnessing energy stored in concentration gradients of the GTP-bound form 
of the monomeric GTPase Ran, which is required for both nuclear import and 
export. 

Like other GTPases, Ran is a molecular switch that can exist in two conforma- 
tional states, depending on whether GDP or GTP is bound (discussed in Chapter 
3). Two Ran-specific regulatory proteins trigger the conversion between the two 
states: a cytosolic GTPase-activating protein (GAP) triggers GTP hydrolysis and 
thus converts Ran-GTP to Ran-GDP, and a nuclear guanine nucleotide exchange
factor (GEF) promotes the exchange of GDP for GTP and thus converts Ran-GDP to
Ran-GTP.
Because Ran-GAP is located in the cytosol and Ran-GEF is located in the nucleus 
where it is anchored to chromatin, the cytosol contains mainly Ran-GDP, and the 
nucleus contains mainly Ran-GTP (Figure 12-12).

Figure 12-11 Nuclear import receptors (importins). (A) Different nuclear
import receptors bind different nuclear localization signals and thereby
different cargo proteins. (B) Cargo protein 4 requires an adaptor protein to
bind to its nuclear import receptor. The adaptors are structurally related to
nuclear import receptors and recognize nuclear localization signals on cargo
proteins. They also contain a nuclear localization signal that binds them to
an import receptor, but this signal only becomes exposed when they are loaded
with a cargo protein.


Figure 12-12 The compartmentalization 
of Ran-GDP and Ran-GTP. Localization 
of Ran-GDP in the cytosol and Ran-GTP in 
the nucleus results from the localization of 
two Ran regulatory proteins: Ran GTPase- 
activating protein (Ran-GAP) is located in 
the cytosol, and Ran guanine nucleotide 
exchange factor (Ran-GEF) binds to 
chromatin and is therefore located in the 
nucleus. 

Ran-GDP is imported into the nucleus by 
its own import receptor, which is specific
for the GDP-bound conformation of Ran. 
The Ran-GDP receptor is structurally 
unrelated to the main family of nuclear 
transport receptors. However, it also 
binds to FG-repeats in NPC channel 
nucleoporins.

This gradient of the two conformational forms of Ran drives nuclear transport
in the appropriate direction. Docking of nuclear import receptors to FG-repeats 
on the cytosolic side of the NPC, for example, occurs whether or not these recep- 
tors are loaded with appropriate cargo. Import receptors, facilitated by FG-repeat 
binding, then enter the channel. If they reach the nuclear side of the pore com- 
plex, Ran-GTP binds to them, and, if the receptors arrive loaded with cargo mol- 
ecules, the Ran-GTP binding causes the receptors to release their cargo (Figure 
12-13A). Because the Ran-GDP in the cytosol does not bind to import (or export) 
receptors, unloading occurs only on the nuclear side of the NPC. In this way, the 
nuclear localization of Ran-GTP creates the directionality of the import process. 

Having discharged its cargo in the nucleus, the empty import receptor with 
Ran-GTP bound is transported back through the pore complex to the cytosol. 
There, Ran-GAP triggers Ran-GTP to hydrolyze its bound GTP, thereby converting 
it to Ran-GDP, which dissociates from the receptor. The receptor is then ready for 
another cycle of nuclear import. 

Nuclear export occurs by a similar mechanism, except that Ran-GTP in the 
nucleus promotes cargo binding to the export receptor, rather than promoting 
cargo dissociation. Once the export receptor moves through the pore to the
cytosol, it encounters Ran-GAP, which induces the receptor-bound Ran to
hydrolyze its GTP to GDP. As a result, the export receptor releases both its
cargo and Ran-GDP in the
cytosol. Free export receptors are then returned to the nucleus to complete the 
cycle (Figure 12-13B). 
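
The logic of the two cycles can be condensed into a pair of rules. The Python
sketch below is a deliberately simplified illustration: it assumes that a
receptor in the nucleus always has Ran-GTP bound and that a receptor in the
cytosol never does, which ignores the brief period before Ran-GAP acts on a
newly arrived complex. The names used are hypothetical.

```python
# Simplified rules for cargo binding by nuclear transport receptors.
# Assumption: the predominant form of Ran in each compartment (Figure 12-12)
# decides whether Ran-GTP is bound to a receptor located there.

PREDOMINANT_RAN = {"cytosol": "GDP", "nucleus": "GTP"}


def holds_cargo(receptor_kind, compartment):
    """Return True if a receptor of this kind carries cargo in this compartment."""
    ran_gtp_bound = PREDOMINANT_RAN[compartment] == "GTP"
    if receptor_kind == "import":
        # Ran-GTP binding displaces cargo from import receptors.
        return not ran_gtp_bound
    if receptor_kind == "export":
        # Ran-GTP binding promotes cargo binding to export receptors.
        return ran_gtp_bound
    raise ValueError(f"unknown receptor kind: {receptor_kind!r}")


if __name__ == "__main__":
    for kind in ("import", "export"):
        for place in ("cytosol", "nucleus"):
            state = "loaded" if holds_cargo(kind, place) else "empty"
            print(f"{kind} receptor in {place}: {state}")
```

Both kinds of receptor respond to the same signal, Ran-GTP, but in opposite
ways, so a single gradient is enough to drive cargo in both directions.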


Transport Through NPCs Can Be Regulated by Controlling Access 
to the Transport Machinery 


Some proteins contain both nuclear localization signals and nuclear export sig- 
nals. These proteins continually shuttle back and forth between the nucleus and 
the cytosol. The relative rates of their import and export determine the steady- 
state localization of such shuttling proteins: if the rate of import exceeds the rate 
of export, a protein will be located mainly in the nucleus; conversely, if the rate of 
export exceeds the rate of import, a protein will be located mainly in the cytosol. 
Thus, changing the rate of import, export, or both, can change the location of a 
protein. 
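
This balance of rates can be made concrete with a minimal two-compartment
model. The Python sketch below assumes, purely for illustration, that import
and export each follow first-order kinetics with rate constants k_import and
k_export; neither the rate law nor the names come from the text. Under that
assumption the nuclear pool stops changing when k_import times the cytosolic
fraction equals k_export times the nuclear fraction, which gives a nuclear
fraction of k_import / (k_import + k_export).

```python
# Steady-state distribution of a shuttling protein between nucleus and
# cytosol, assuming first-order import and export (an illustrative
# simplification; real transport kinetics can be more complex).


def nuclear_fraction(k_import, k_export):
    """Fraction of the protein found in the nucleus at steady state."""
    if k_import < 0 or k_export < 0:
        raise ValueError("rate constants must be non-negative")
    total = k_import + k_export
    if total == 0:
        raise ValueError("at least one rate constant must be positive")
    return k_import / total


if __name__ == "__main__":
    for k_in, k_out in [(3.0, 1.0), (1.0, 1.0), (1.0, 3.0)]:
        frac = nuclear_fraction(k_in, k_out)
        print(f"k_import={k_in}, k_export={k_out}: nuclear fraction {frac:.2f}")
```

In this picture, a modification that triples the import rate relative to the
export rate shifts the protein from evenly distributed to three-quarters
nuclear without any change in the total amount of protein.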






Figure 12-13 How GTP hydrolysis by Ran in the cytosol provides directionality
to nuclear transport. Movement through the NPC of loaded nuclear transport
receptors occurs along the FG-repeats displayed by certain NPC proteins. The
differential localization of Ran-GTP in the nucleus and Ran-GDP in the cytosol
provides directionality (red arrows) to both nuclear import (A) and nuclear
export (B). Ran-GAP stimulates the hydrolysis of GTP to produce Ran-GDP on the
cytosolic side of the NPC (see Figure 12-12).

Some shuttling proteins move continuously into and out of the nucleus. In
other cases, however, the transport is stringently controlled. As discussed in Chap- 
ter 7, cells control the activity of some transcription regulators by keeping them 
out of the nucleus until they are needed there (Figure 12-14). In many cases, cells 
control transport by regulating nuclear localization and export signals—turn- 
ing them on or off, often by phosphorylation of amino acids close to the signal 
sequences (Figure 12-15). 

Other transcription regulators are bound to inhibitory cytosolic proteins that 
either anchor them in the cytosol (through interactions with the cytoskeleton or 
specific organelles) or mask their nuclear localization signals so that they cannot 
interact with nuclear import receptors. An appropriate stimulus releases the gene 
regulatory protein from its cytosolic anchor or mask, and it is then transported 
into the nucleus. One important example is the latent gene regulatory protein that 
controls the expression of proteins involved in cholesterol metabolism. The pro- 
tein is made and stored in an inactive form as a transmembrane protein in the ER. 
When a cell is deprived of cholesterol, the protein is transported from the ER to 
the Golgi apparatus where it encounters specific proteases that cleave off the cyto- 
solic domain, releasing it into the cytosol. This domain is then imported into the 
nucleus, where it activates the transcription of genes required for both cholesterol 
uptake and synthesis (Figure 12-16). 

As we discuss in detail in Chapter 6, cells control the export of RNAs from the 
nucleus in a similar way. snRNAs, miRNAs, and tRNAs bind to the same family 
of nuclear export receptors just discussed, and they use the same Ran-GTP gra- 
dient to fuel the transport process. By contrast, the export of mRNAs out of the 
nucleus uses a different mechanism. mRNAs are exported as large assemblies, 
which can be as large as 100 million daltons (see Figure 6-37) and can contain 
hundreds of proteins of a few dozen different types. These mRNA ribonucleo- 
protein complexes (mRNPs) first dock at the nuclear side of the NPC, where they 
are extensively remodeled. Although Ran-GTP is indirectly involved in the export 
(because it imports the proteins that bind to the mRNA molecules), the transloca- 
tion across the NPC is thought to be driven by ATP hydrolysis. How export direc- 
tionality is assured is unclear. It is likely that the many accessory proteins tethered 
to the NPC’s nuclear and cytoplasmic fibrils have important roles in remodeling 
the mRNPs as they pass through the pores, in particular stripping away nuclear 
proteins as the mRNPs exit on the cytosolic side of the NPC, thereby ensuring 
that transport is unidirectional. Upon entry into the cytosol, these nuclear mRNP 
proteins are rapidly returned to the nucleus.

Figure 12-14 The control of nuclear transport in the early Drosophila embryo.
The embryo at this stage is a syncytium, shown here in cross section, with
many nuclei in a common cytoplasm, arranged around the periphery, just beneath
the plasma membrane. The transcription regulatory protein Dorsal is produced
uniformly throughout the peripheral cytoplasm, but it can act only when inside
the nuclei. The Dorsal protein has been stained with an enzyme-coupled
antibody that yields a brown product, revealing that Dorsal is excluded from
the nuclei at the dorsal side (top) of the embryo but is concentrated in the
nuclei toward the ventral side (bottom) of the embryo. The regulated traffic
of Dorsal into the nuclei controls the differential development between the
back and belly of the animal. (Courtesy of Siegfried Roth.)


Figure 12-15 The control of nuclear import during T cell activation. The
nuclear factor of activated T cells (NF-AT) is a transcription regulatory
protein that, in the resting T cell, is found in the cytosol in a
phosphorylated state. When T cells are activated by foreign antigen (discussed
in Chapter 24), the intracellular Ca²⁺ concentration increases. In high Ca²⁺,
the protein phosphatase calcineurin binds to NF-AT and dephosphorylates it.
The dephosphorylation exposes nuclear import signals and blocks a nuclear
export signal. The complex of NF-AT and calcineurin is therefore imported into
the nucleus, where NF-AT activates the transcription of numerous genes
required for T cell activation.

The response shuts off when Ca²⁺ levels decrease, releasing NF-AT from
calcineurin. Rephosphorylation of NF-AT inactivates the nuclear import signals
and re-exposes the nuclear export signal, causing NF-AT to relocate to the
cytosol. Some of the most potent immunosuppressive drugs, including
cyclosporin A and FK506, inhibit the ability of calcineurin to dephosphorylate
NF-AT and thereby block the nuclear accumulation of NF-AT and T cell
activation (Movie 12.1).


During Mitosis the Nuclear Envelope Disassembles


The nuclear lamina, located on the nuclear side of the inner nuclear membrane, 
is a meshwork of interconnected protein subunits called nuclear lamins. The 
lamins are a special class of intermediate filament proteins (discussed in Chapter 
16) that polymerize into a two-dimensional lattice (Figure 12-17). The nuclear 
lamina gives shape and stability to the nuclear envelope, to which it is anchored 
by attachment to both the NPCs and transmembrane proteins of the inner nuclear 
membrane. The lamina also interacts directly with chromatin, which itself inter- 
acts with transmembrane proteins of the inner nuclear membrane. Together with 
the lamina, these inner membrane proteins provide structural links between the 
DNA and the nuclear envelope. 

When a nucleus is dismantled during mitosis, the NPCs and nuclear lamina 
disassemble and the nuclear envelope fragments. The dismantling process is at 
least partly a consequence of direct phosphorylation of nucleoporins and lamins 
by the cyclin-dependent protein kinase (Cdk) that is activated at the onset of mito- 
sis (discussed in Chapter 17). During this process, some NPC proteins become 
bound to nuclear import receptors, which play an important part in the reassem- 
bly of NPCs at the end of mitosis. Nuclear envelope membrane proteins—no lon- 
ger tethered to the pore complexes, lamina, or chromatin—disperse throughout 
the ER membrane. The dynein motor protein, which moves along microtubules 
(discussed in Chapter 16), actively participates in tearing the nuclear envelope off 
the chromatin. Together, these processes break down the barriers that normally 
separate the nucleus and cytosol, and the nuclear proteins that are not bound to 
membranes or chromosomes intermix completely with the proteins of the cytosol 
(Figure 12-18). 

Later in mitosis, the nuclear envelope reassembles on the surface of the 
daughter chromosomes. In addition to its crucial role in nuclear transport, the 
Ran GTPase also acts as a positional marker for chromatin during cell division, 
when the nuclear and cytosolic components intermix. Because Ran-GEF remains 
bound to chromatin when the nuclear envelope breaks down, Ran molecules 
close to chromatin are mainly in their GTP-bound conformation. By contrast, Ran 
molecules further away have a high likelihood of encountering Ran-GAP, which 
is distributed throughout the cytosol; these Ran molecules are mainly in their 
GDP-bound conformation. As a result, the chromosomes in mitotic cells are sur- 
rounded by a cloud of Ran-GTP. Ran-GTP releases the NPC proteins in proximity 
to the chromosomes from nuclear import receptors. The free NPC proteins attach
to the chromosome surface, where they assemble into new NPCs. At the same
time, inner nuclear membrane proteins and dephosphorylated lamins bind again
to chromatin. ER membranes wrap around groups of chromosomes until they
form a sealed nuclear envelope (Movie 12.2). During this process, the NPCs start
actively re-importing proteins that contain nuclear localization signals. Because
the nuclear envelope is initially closely applied to the surface of the chromosomes,
the newly formed nucleus excludes all proteins except those initially bound to the
mitotic chromosomes and those that are selectively imported through NPCs. In
this way, all other large proteins, including ribosomes, are kept out of the newly
assembled nucleus.

Figure 12-16 Feedback regulation of cholesterol biosynthesis. SREBP
(sterol response element binding protein), a latent transcription regulator that
controls expression of cholesterol biosynthetic enzymes, is initially synthesized
as an ER membrane protein. It is anchored in the ER if there is sufficient
cholesterol in the membrane by interaction with another ER membrane protein,
called SCAP (SREBP cleavage activation protein), which binds cholesterol. If the
cholesterol binding site on SCAP is empty (at low cholesterol concentrations),
SCAP changes conformation and is packaged together with SREBP into transport
vesicles, which deliver their cargo to the Golgi apparatus, where two
Golgi-resident proteases cleave SREBP to free its cytosolic domain from the
membrane. The cytosolic domain then moves into the nucleus, where it binds to
the promoters of genes that encode proteins involved in cholesterol biosynthesis
and activates their transcription. In this way, more cholesterol is made when its
concentration falls below a threshold.

Figure 12-17 The nuclear lamina. An electron micrograph of a portion of the
nuclear lamina in a Xenopus oocyte prepared by freeze-drying and metal
shadowing. The lamina is formed by a regular lattice of specialized intermediate
filaments. Lamins are only present in metazoan cells. Other, yet-unknown proteins
may serve similar functions in species that lack lamins. (Courtesy of Ueli Aebi.)

As we discuss in Chapter 17, the cloud of Ran-GTP surrounding chromatin is 
also important in assembling the mitotic spindle in a dividing cell. 
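
The Ran-GTP cloud described above can be illustrated with a minimal reaction-diffusion sketch. It assumes, purely for illustration, that Ran-GTP is generated only at the chromatin surface (by bound Ran-GEF) and is converted to Ran-GDP at a uniform rate in the cytosol (by Ran-GAP), so that its steady-state level falls off exponentially with distance. The function name, the one-dimensional geometry, and every parameter value are assumptions rather than measurements.

```python
import math

def ran_gtp_fraction(distance_um, diffusion_um2_s=10.0,
                     hydrolysis_per_s=1.0, surface_fraction=0.9):
    """Steady-state Ran-GTP fraction at a distance from the chromatin surface.

    One-dimensional sketch: production at x = 0 and uniform first-order
    conversion to Ran-GDP give an exponential profile with decay length
    sqrt(D / k). All default values are illustrative assumptions.
    """
    if distance_um < 0:
        raise ValueError("distance must be non-negative")
    decay_length_um = math.sqrt(diffusion_um2_s / hydrolysis_per_s)
    return surface_fraction * math.exp(-distance_um / decay_length_um)

if __name__ == "__main__":
    for d in (0.0, 1.0, 5.0, 10.0):
        print(f"{d:5.1f} um from chromatin: Ran-GTP fraction {ran_gtp_fraction(d):.3f}")
```

In this toy model, faster conversion by Ran-GAP or slower diffusion shortens the decay length and so confines the Ran-GTP cloud more tightly to the chromosomes.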


Summary 


The nuclear envelope consists of an inner and an outer nuclear membrane that are 
continuous with each other and with the ER membrane, and the space between the 
inner and outer nuclear membrane is continuous with the ER lumen. RNA mole- 
cules, which are made in the nucleus, and ribosomal subunits, which are assembled 
there, are exported to the cytosol; in contrast, all the proteins that function in the 
nucleus are synthesized in the cytosol and are then imported. The extensive traffic of 
materials between the nucleus and cytosol occurs through nuclear pore complexes 
(NPCs), which provide a direct passageway across the nuclear envelope. Small 
molecules diffuse passively through the NPCs, but large macromolecules have to be 
actively transported. 

Proteins containing nuclear localization signals are actively transported into 
the nucleus through NPCs, while proteins containing nuclear export signals are 
transported out of the nucleus to the cytosol. Some proteins, including the nuclear
import and export receptors, continually shuttle between the cytosol and nucleus.
The monomeric GTPase Ran provides both the free energy and the directionality for
nuclear transport. Cells regulate the transport of nuclear proteins and RNA
molecules through the NPCs by controlling the access of these molecules to the
transport machinery. Newly transcribed messenger RNA and ribosomal RNA are exported
from the nucleus as parts of large ribonucleoprotein complexes. Because nuclear
localization signals are not removed, nuclear proteins can be imported repeatedly,
as is required each time that the nucleus reassembles after mitosis.

Figure 12-18 The breakdown and re-formation of the nuclear envelope and
lamina during mitosis. Phosphorylation of the lamins triggers the disassembly
of the nuclear lamina, which causes the nuclear envelope to break up.
Dephosphorylation of the lamins reverses the process. An analogous
phosphorylation and dephosphorylation cycle occurs for some nucleoporins and
proteins of the inner nuclear membrane, and some of these dephosphorylations
are also involved in the reassembly process. As indicated, the nuclear envelope
initially re-forms around individual decondensing daughter chromosomes.
Eventually, as decondensation progresses, these structures fuse to form a single
complete nucleus.

Mitotic breakdown of the nuclear envelope occurs in all metazoan cells.
However, in many other species, such as yeasts, the nuclear envelope remains
intact during mitosis, and the nucleus divides by fission.


THE TRANSPORT OF PROTEINS INTO 
MITOCHONDRIA AND CHLOROPLASTS 


Mitochondria and chloroplasts (a specialized form of plastids in green algae
and plant cells) are double-membrane-enclosed organelles. They specialize in
ATP synthesis, using energy derived from electron transport and oxidative
phosphorylation in mitochondria and from photosynthesis in chloroplasts (discussed
in Chapter 14). Although both organelles contain their own DNA, ribosomes,
and other components required for protein synthesis, most of their proteins are
encoded in the cell nucleus and imported from the cytosol. Each imported
protein must reach the particular organelle subcompartment in which it functions.

There are different subcompartments in mitochondria (Figure 12-19A): the
internal matrix space and the intermembrane space, which is continuous with
the cristae space. These compartments are formed by the two concentric
mitochondrial membranes: the inner membrane, which encloses the matrix space
and forms extensive invaginations called cristae, and the outer membrane,
which is in contact with the cytosol. Protein complexes provide boundaries at the
junctions where the cristae invaginate and divide the inner membrane into two
domains: one inner membrane domain surrounds the cristae space, and the other
domain abuts the outer membrane. Chloroplasts also have an outer and inner
membrane, which enclose an intermembrane space, and the stroma, which is the
chloroplast equivalent of the mitochondrial matrix space (Figure 12-19B). They
have an additional subcompartment, the thylakoid space, which is surrounded by
the thylakoid membrane. The thylakoid membrane derives from the inner
membrane during plastid development and is pinched off to become discontinuous
with it. Each of the subcompartments in mitochondria and chloroplasts contains
a distinct set of proteins.

New mitochondria and chloroplasts are produced by the growth of preexist- 
ing organelles, followed by fission (discussed in Chapter 14). The growth depends 
mainly on the import of proteins from the cytosol. The imported proteins must 
be transported across a number of membranes in succession and end up in the 
appropriate place. The process of protein movement across membranes is called 
protein translocation. This section explains how it occurs.


Translocation into Mitochondria Depends on Signal Sequences
and Protein Translocators 


Proteins imported into mitochondria are usually taken up from the cytosol within 
seconds or minutes of their release from ribosomes. Thus, in contrast to protein 
translocation into the ER, which often takes place simultaneously with translation 
by a ribosome docked on the rough ER membrane (described later), mitochon- 
drial proteins are first fully synthesized as mitochondrial precursor proteins in 
the cytosol and then translocated into mitochondria by a post-translational mech- 
anism. One or more signal sequences direct all mitochondrial precursor proteins 
to their appropriate mitochondrial subcompartment. Many proteins entering the 
matrix space contain a signal sequence at their N-terminus that a signal pepti- 
dase rapidly removes after import. Other imported proteins, including all outer 
membrane and many inner membrane and intermembrane space proteins, have 
internal signal sequences that are not removed. The signal sequences are both 
necessary and sufficient for the import and correct localization of the proteins: 
when genetic engineering techniques are used to link these signals to a cytosolic 
protein, the signals direct the protein to the correct mitochondrial subcompart- 
ment. 

The signal sequences that direct precursor proteins into the mitochondrial 
matrix space are best understood. They all form an amphiphilic α helix, in which 
positively charged residues cluster on one side of the helix, while uncharged 
hydrophobic residues cluster on the opposite side. Specific receptor proteins that 
initiate protein translocation recognize this configuration rather than the precise 
amino acid sequence of the signal sequence (Figure 12-20). 
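
One way to make the idea of an amphiphilic helix concrete is to compute a hydrophobic moment: each residue is placed on an ideal α helix (about 100° of rotation per residue) and the residue scores are summed as vectors. The sketch below is illustrative only. The three-level scoring scheme and the test peptide are assumptions made for this example, not a published hydrophobicity scale or a sequence taken from the text.

```python
import math

# Crude three-level scale, assumed here for illustration only.
HYDROPHOBIC = set("AVLIMFWC")   # scored +1
CHARGED = set("KRDE")           # scored -1; all other residues score 0

def residue_score(aa):
    if aa in HYDROPHOBIC:
        return 1.0
    if aa in CHARGED:
        return -1.0
    return 0.0

def helical_moment(sequence, degrees_per_residue=100.0):
    """Mean hydrophobic moment of a peptide modeled as an ideal alpha helix.

    A large value means hydrophobic and charged residues sit on opposite
    faces of the helix; a value near zero means no such segregation.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    x = y = 0.0
    for i, aa in enumerate(sequence):
        angle = math.radians(i * degrees_per_residue)
        score = residue_score(aa)
        x += score * math.cos(angle)
        y += score * math.sin(angle)
    return math.hypot(x, y) / len(sequence)

if __name__ == "__main__":
    designed = "KLLKKLLKKLKKLLKKLL"   # hypothetical peptide, lysines on one face
    uniform = "L" * 18                # no segregation possible
    print(f"designed amphiphilic peptide: {helical_moment(designed):.2f}")
    print(f"uniform leucine peptide:      {helical_moment(uniform):.2f}")
```

Because the score depends on where residues fall around the helix rather than on their exact order, very different sequences can give similarly high values, which is consistent with receptors recognizing the configuration rather than a precise amino acid sequence.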

Multisubunit protein complexes that function as protein translocators medi- 
ate protein movement across mitochondrial membranes. The TOM complex 
transfers proteins across the outer membrane, and two TIM complexes (TIM23 
and TIM22) transfer proteins across the inner membrane (Figure 12-21). These 
complexes contain some components that act as receptors for mitochondrial pre- 
cursor proteins, and other components that form the translocation channels. 

The TOM complex is required for the import of all nucleus-encoded mitochon- 
drial proteins. It initially transports their signal sequences into the intermembrane 
space and helps to insert transmembrane proteins into the outer membrane. 
β-barrel proteins, which are particularly abundant in the outer membrane, are 
then passed on to an additional translocator, the SAM complex, which helps them 
to fold properly in the outer membrane. The TIM23 complex transports some sol- 
uble proteins into the matrix space and helps to insert transmembrane proteins 
into the inner membrane. The TIM22 complex mediates the insertion of a sub- 
class of inner membrane proteins, including the transporter that moves ADP, ATP, 
and phosphate in and out of mitochondria. Yet another protein translocator in 
the inner mitochondrial membrane, the OXA complex, mediates the insertion of
those inner membrane proteins that are synthesized within mitochondria. It also
helps to insert some imported inner membrane proteins that are initially
transported into the matrix space by the other complexes.

Figure 12-20 A signal sequence for mitochondrial protein import.
Cytochrome oxidase is a large multiprotein complex located in the inner
mitochondrial membrane, where it functions as the terminal enzyme in the
electron-transport chain (discussed in Chapter 14). (A) The first 18 amino acids
of the precursor to subunit IV of this enzyme serve as a signal sequence for
import of the subunit into the mitochondrion. (B) When the signal sequence is
folded as an α helix, the positively charged amino acids (red) are clustered on
one face of the helix, while the nonpolar ones (green) are clustered primarily on
the opposite face. Uncharged polar amino acids are shaded orange; nitrogen atoms
on the side chains of Arg and Gln are colored blue. Signal sequences that direct
proteins into the matrix space always have the potential to form such an
amphiphilic α helix, which is recognized by specific receptor proteins on the
mitochondrial surface. (C) The structure of a signal sequence (of alcohol
dehydrogenase, another mitochondrial matrix enzyme), bound to an import
receptor (gray), as determined by nuclear magnetic resonance. The amphiphilic
α helix binds with its hydrophobic face to a hydrophobic groove in the receptor
(PDB code: 1OM2).


Mitochondrial Precursor Proteins Are Imported as Unfolded 
Polypeptide Chains 


We have learned almost everything we know about the molecular mechanism of 
protein import into mitochondria from analyses of cell-free, reconstituted trans- 
location systems, in which purified mitochondria in a test tube import radiola- 
beled mitochondrial precursor proteins. By changing the conditions in the test 
tube, it is possible to establish the biochemical requirements for the import. 

Mitochondrial precursor proteins do not fold into their native structures after 
they are synthesized; instead, they remain unfolded in the cytosol through inter- 
actions with other proteins. Some of these interacting proteins are general chap- 
erone proteins of the hsp70 family (discussed in Chapter 6), whereas others are 
dedicated to mitochondrial precursor proteins and bind directly to their signal 
sequences. All the interacting proteins help to prevent the precursor proteins 
from aggregating or folding up spontaneously before they engage with the TOM 
complex in the outer mitochondrial membrane. As a first step in the import pro- 
cess, the import receptors of the TOM complex bind the signal sequence of the 
mitochondrial precursor protein. The interacting proteins are then stripped off, 
and the unfolded polypeptide chain is fed—signal sequence first—into the trans- 
location channel. 

In principle, a protein could reach the mitochondrial matrix space by either 
crossing the two membranes all at once or crossing one at a time. One can dis- 
tinguish between these possibilities by cooling a cell-free mitochondrial import 
system to arrest the proteins at an intermediate step in the translocation process. 
The result is that the arrested proteins no longer contain their N-terminal signal 
sequence, indicating that the N-terminus must be in the matrix space where the 
signal peptidase is located, but the bulk of the protein can still be attacked from 
outside the mitochondria by externally added proteolytic enzymes. Clearly, the 
precursor proteins can pass through both mitochondrial membranes at once to 
enter the matrix space (Figure 12-22). The TOM complex first transports the sig- 
nal sequence across the outer membrane to the intermembrane space, where it 
binds to a TIM complex, opening the channel in the complex. The polypeptide 
chain is then either translocated into the matrix space or inserted into the inner 
membrane. 


Figure 12-21 The protein translocators in the mitochondrial membranes. The
TOM, TIM, SAM, and OXA complexes are multimeric membrane protein assemblies
that catalyze protein translocation across mitochondrial membranes. The protein
components of the TIM22 and TIM23 complexes that line the import channel are
structurally related, suggesting a common evolutionary origin of both TIM
complexes. On the matrix side, the TIM23 complex is bound to a multimeric
protein complex containing mitochondrial hsp70, which acts as an import ATPase,
using ATP hydrolysis to pull proteins through the pore. In animal cells, subtle
variations exist in the subunit composition of the translocator complexes to
adapt the mitochondrial import machinery to the particular needs of specialized
cell types. SAM = Sorting and Assembly Machinery; OXA = cytochrome Oxidase
Activity; TIM = Translocator of the Inner Mitochondrial membrane;
TOM = Translocator of the Outer Membrane.

Figure 12-22 Protein import by mitochondria. (Diagram labels: TOM complex,
inner mitochondrial membrane.)

Although the TOM and TIM complexes usually work together to translocate
precursor proteins across both membranes at the same time, they can work inde- 
pendently. In isolated outer membranes, for example, the TOM complex can 
translocate the signal sequence of precursor proteins across the membrane. Sim- 
ilarly, if the outer membrane is experimentally disrupted in isolated mitochon- 
dria, the exposed TIM23 complex can efficiently import precursor proteins into 
the matrix space. 


ATP Hydrolysis and a Membrane Potential Drive Protein Import 
Into the Matrix Space 


Directional transport requires energy, which in most biological systems is sup- 
plied by ATP hydrolysis. ATP hydrolysis fuels mitochondrial protein import at 
two discrete sites, one outside the mitochondria and one in the matrix space. In 
addition, protein import requires another energy source, which is the membrane 
potential across the inner mitochondrial membrane (Figure 12-23). 

The first requirement for energy occurs at the initial stage of the translocation 
process, when the unfolded precursor protein, associated with chaperone pro- 
teins, interacts with the import receptors of the TOM complex. As discussed in 
Chapter 6, the binding and release of newly synthesized polypeptides from the 
chaperone proteins requires ATP hydrolysis.

Figure 12-23 The role of energy in protein import into the mitochondrial matrix space. (1) Bound cytosolic hsp70
chaperone is released from the precursor protein in a step that depends on ATP hydrolysis. After initial insertion of the signal
sequence and of adjacent portions of the polypeptide chain into the TOM complex translocation channel, the signal sequence
interacts with a TIM complex. (2) The signal sequence is then translocated into the matrix space in a process that requires the
energy in the membrane potential across the inner membrane. (3) Mitochondrial hsp70, which is part of an import ATPase
complex, binds to regions of the polypeptide chain as they become exposed in the matrix space, pulling the protein through the
translocation channel, using the energy of ATP hydrolysis.

Once the signal sequence has passed through the TOM complex and is bound
to a TIM complex, further translocation through the TIM translocation channel 
requires the membrane potential, which is the electrical component of the
electrochemical H⁺ gradient across the inner membrane (see Figure 11-4). Pumping
of H⁺ from the matrix space to the intermembrane space, driven by electron
transport processes in the inner membrane (discussed in Chapter 14), maintains the
electrochemical gradient. The energy in the electrochemical H⁺ gradient across
the inner membrane therefore not only powers most of the cell’s ATP synthesis, 
but it also drives the translocation of the positively charged signal sequences 
through the TIM complexes by electrophoresis. 

Mitochondrial hsp70 also plays a crucial part in the import process. Mito- 
chondria containing mutant forms of the protein fail to import precursor proteins. 
The mitochondrial hsp70 is part of a multisubunit protein assembly that is bound 
to the matrix side of the TIM23 complex and acts as a motor to pull the precursor 
protein into the matrix space. Like its cytosolic cousin, mitochondrial hsp70 has a 
high affinity for unfolded polypeptide chains, and it binds tightly to an imported 
protein chain as soon as the chain emerges from the TIM translocator in the 
matrix space. The hsp70 then undergoes a conformational change and releases 
the protein chain in an ATP-dependent step, exerting a ratcheting/pulling force 
on the protein being imported. This energy-driven cycle of binding and subse- 
quent release provides the final driving force needed to complete protein import 
after a protein has initially inserted into the TIM23 complex (see Figure 12-23). 
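
The three energy requirements can be summarized as a small decision sketch: given which energy sources are available, it reports the furthest stage a matrix-targeted precursor would be expected to reach. The stage names and the strictly sequential logic are simplifying assumptions made for illustration; they follow the numbered steps of Figure 12-23 but are not a quantitative model.

```python
from enum import Enum

class ImportStage(Enum):
    BOUND_TO_CYTOSOLIC_HSP70 = 0  # precursor still held by cytosolic chaperones
    SIGNAL_AT_TIM = 1             # signal sequence through TOM, bound to a TIM complex
    SIGNAL_IN_MATRIX = 2          # signal sequence moved across the inner membrane
    FULLY_IMPORTED = 3            # whole chain pulled into the matrix space

def furthest_stage(cytosolic_atp, membrane_potential, matrix_atp):
    """Return the furthest import stage reachable with the given energy sources."""
    if not cytosolic_atp:
        # Step 1: release from cytosolic hsp70 depends on ATP hydrolysis.
        return ImportStage.BOUND_TO_CYTOSOLIC_HSP70
    if not membrane_potential:
        # Step 2: moving the signal sequence through TIM needs the membrane potential.
        return ImportStage.SIGNAL_AT_TIM
    if not matrix_atp:
        # Step 3: mitochondrial hsp70 needs ATP to pull the rest of the chain in.
        return ImportStage.SIGNAL_IN_MATRIX
    return ImportStage.FULLY_IMPORTED

if __name__ == "__main__":
    print(furthest_stage(True, True, True).name)
    print(furthest_stage(True, False, True).name)
```

Laid out this way, the sketch mirrors the logic of the cell-free experiments: removing one energy source at a time shows which step depends on it.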

After the initial interaction with mitochondrial hsp70, many imported matrix 
proteins are passed on to another chaperone protein, mitochondrial hsp60. As 
discussed in Chapter 6, hsp60 helps the unfolded polypeptide chain to fold by 
binding and releasing it through cycles of ATP hydrolysis. 


Bacteria and Mitochondria Use Similar Mechanisms to Insert 
Porins into their Outer Membrane 


The outer mitochondrial membrane, like the outer membrane of Gram-negative 
bacteria (see Figure 11-17), contains abundant pore-forming β-barrel proteins
called porins, and it is thus freely permeable to inorganic ions and metabolites
(but not to most proteins). Unlike other outer membrane proteins, which
are anchored in the membrane through transmembrane α-helical regions, porins
cannot be integrated into the lipid bilayer by the TOM complex. Instead, porins are
first transported unfolded into the intermembrane space, where they transiently 
bind specialized chaperone proteins, which keep the porins from aggregating 
(Figure 12-24A). They then bind to the SAM complex in the outer membrane, 
which both inserts them into the outer membrane and helps them fold properly. 
One of the central subunits of the SAM complex is homologous to a bacterial 
outer membrane protein that helps insert β-barrel proteins into the bacterial outer
membrane from the periplasmic space (the equivalent of the intermembrane
space in mitochondria) (Figure 12-24B). This conserved pathway for inserting
β-barrel proteins further underscores the endosymbiotic origin of mitochondria.

Figure 12-24 Integration of porins into the outer mitochondrial and bacterial
membranes. (A) After translocation through the TOM complex in the outer
mitochondrial membrane, β-barrel proteins bind to chaperones in the
intermembrane space. The SAM complex then inserts the unfolded polypeptide
chain into the outer membrane and helps the chain fold. (B) A structurally
related BAM complex in the outer membrane of Gram-negative bacteria catalyzes
β-barrel protein insertion and folding (see Figure 11-17).


Transport Into the Inner Mitochondrial Membrane and 
Intermembrane Space Occurs Via Several Routes 


The same mechanism that transports proteins into the matrix space using the 
TOM and TIM23 translocators (see Figure 12-22) also mediates the initial translo- 
cation of many proteins that are destined for the inner mitochondrial membrane 
or the intermembrane space. In the most common translocation route, only the 
N-terminal signal sequence of the transported protein actually enters the matrix 
space (Figure 12-25A). A hydrophobic amino acid sequence, strategically placed 
after the N-terminal signal sequence, acts as a stop-transfer sequence, preventing
further translocation across the inner membrane. The remainder of the protein
then crosses the outer membrane through the TOM complex into the
intermembrane space; the signal sequence is cleaved off in the matrix, and the
hydrophobic sequence, released from TIM23, remains anchored in the inner membrane.

Figure 12-25 Protein import from the cytosol into the inner mitochondrial membrane and intermembrane space. (A) The N-terminal
signal sequence (red) initiates import into the matrix space (see Figure 12-22). A hydrophobic sequence (blue) that follows the matrix-targeting
signal sequence binds to the TIM23 translocator (orange) in the inner membrane and stops translocation. The remainder of the protein is then
pulled into the intermembrane space through the TOM translocator in the outer membrane, and the hydrophobic sequence is released into the
inner membrane, anchoring the protein there. (B) A second route for protein integration into the inner membrane first delivers the protein completely
into the matrix space. Cleavage of the signal sequence (red) used for the initial translocation unmasks an adjacent hydrophobic signal sequence
(blue) at the new N-terminus. This signal then directs the protein into the inner membrane, using the same OXA-dependent pathway that inserts
proteins that are encoded by the mitochondrial genome and translated in the matrix space. (C) Some soluble proteins of the intermembrane space
also use the pathways shown in (A) and (B) before they are released into the intermembrane space by a second signal peptidase, which has its
active site in the intermembrane space and removes the hydrophobic signal sequence. (D) Some soluble intermembrane-space proteins become
oxidized by the Mia40 protein (Mia = mitochondrial intermembrane space assembly) during import. Mia40 forms a covalent intermediate through an
intermolecular disulfide bond, which helps pull the transported protein through the TOM complex. Mia40 becomes reduced in the process, and then
is reoxidized by the electron transport chain, so that it can catalyze the next round of import. (E) Multipass inner membrane proteins that function
as metabolite transporters contain internal signal sequences and snake through the TOM complex as a loop. They then bind to the chaperones in
the intermembrane space, which guide the proteins to the TIM22 complex. The TIM22 complex is specialized for the insertion of multipass inner
membrane proteins.

In another transport route to the inner membrane or intermembrane space, 
the TIM23 complex initially translocates the entire protein into the matrix space 
(Figure 12-25B). A matrix signal peptidase then removes the N-terminal signal 
sequence, exposing a hydrophobic sequence at the new N-terminus. This signal 
sequence guides the protein to the OXA complex, which inserts the protein into 
the inner membrane. As mentioned earlier, the OXA complex is primarily used 
to insert proteins that are encoded and translated in the mitochondrion into the 
inner membrane, and only a few imported proteins use this pathway. Transloca- 
tors that are closely related to the OXA complex are found in the plasma mem- 
brane of bacteria and in the thylakoid membrane of chloroplasts, where they 
insert membrane proteins by a similar mechanism. 

Many proteins that use these pathways to the inner membrane remain 
anchored there through their hydrophobic signal sequence (see Figure 12-25A,B). 
Others, however, are released into the intermembrane space by a protease that 
removes the membrane anchor (Figure 12-25C). Many of these cleaved proteins 
remain attached to the outer surface of the inner membrane as peripheral sub- 
units of protein complexes that also contain transmembrane proteins. 

Certain intermembrane-space proteins that contain cysteine motifs are 
imported by yet another route. These proteins form a transient covalent
disulfide bond to the Mia40 protein (Figure 12-25D). The imported proteins are
then released in an oxidized form containing intrachain disulfide bonds. Mia40 
becomes reduced in the process, and is then reoxidized by passing electrons to 
the electron transport chain in the inner mitochondrial membrane. In this way, 
the energy stored in the redox potential in the mitochondrial electron transport 
chain is tapped to drive protein import. 

Mitochondria are the principal sites of ATP synthesis in the cell, but they also 
contain many metabolic enzymes, such as those of the citric acid cycle. Thus, in 
addition to proteins, mitochondria must also transport small metabolites across 
their membranes. While the outer membrane contains porins, which make the 
membrane freely permeable to such small molecules, the inner membrane does 
not. Instead, a family of metabolite-specific transporters transfers a vast number 
of small molecules across the inner membrane. In yeast cells, these transporters 
comprise a family of 35 different proteins, the most abundant of which transport 
ATP, ADP, and phosphate. These are multipass transmembrane proteins, which 
do not have cleavable signal sequences at their N-termini but instead contain 
internal signal sequences. They cross the TOM complex in the outer membrane, 
and intermembrane-space chaperones guide them to the TIM22 complex, which 
inserts them into the inner membrane by a process that requires the membrane 
potential, but not mitochondrial hsp70 or ATP (Figure 12-25E). An energetically 
favorable partitioning of the hydrophobic transmembrane regions into the inner 
membrane is likely to drive this process. 
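
The routes described in this section can be collected into a small lookup sketch that maps the targeting features of a precursor to the translocators it would be expected to use. The feature names and route labels are hypothetical, chosen for this illustration. The mapping is a simplification of the text: it ignores, for example, the later release of some proteins into the intermembrane space by a protease.

```python
from dataclasses import dataclass

@dataclass
class Precursor:
    # All field names are hypothetical labels for features described in the text.
    n_terminal_matrix_signal: bool = False
    stop_transfer_after_signal: bool = False         # hydrophobic stretch after the signal
    hydrophobic_signal_after_cleavage: bool = False  # unmasked by the matrix signal peptidase
    cysteine_motifs: bool = False
    internal_signals_multipass: bool = False

def import_route(p):
    """Return a short label for the import route implied by the precursor's features."""
    if p.n_terminal_matrix_signal:
        if p.stop_transfer_after_signal:
            return "TOM/TIM23, stop-transfer into inner membrane"
        if p.hydrophobic_signal_after_cleavage:
            return "TOM/TIM23 into matrix, then OXA into inner membrane"
        return "TOM/TIM23 into matrix"
    if p.cysteine_motifs:
        return "TOM, then Mia40 in intermembrane space"
    if p.internal_signals_multipass:
        return "TOM, chaperones, then TIM22 into inner membrane"
    return "unclassified"

if __name__ == "__main__":
    carrier = Precursor(internal_signals_multipass=True)
    print(import_route(carrier))
```

A metabolite transporter such as the ADP/ATP carrier would correspond to the `internal_signals_multipass` case in this scheme.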


Two Signal Sequences Direct Proteins to the Thylakoid Membrane 
in Chloroplasts 


Protein transport into chloroplasts resembles transport into mitochondria. 
Both processes occur post-translationally, use separate translocation complexes 
in each membrane, require energy, and use amphiphilic N-terminal signal 
sequences that are removed after use. With the exception of some of the chap- 
erone molecules, however, the protein components that form the translocation 
complexes differ. Moreover, whereas mitochondria harness the electrochemical 
H⁺ gradient across their inner membrane to drive transport, chloroplasts, which
have an electrochemical H⁺ gradient across their thylakoid membrane but not
their inner membrane, use GTP and ATP hydrolysis to power import across their
double membrane. The functional similarities may thus result from convergent
evolution, reflecting the common requirements for translocation across a double 
membrane. 

Although the signal sequences for import into chloroplasts superficially 
resemble those for import into mitochondria, the same plant cells have both 
mitochondria and chloroplasts, so proteins must partition appropriately between 
the two organelles. In plants, for example, a bacterial enzyme can be directed 
specifically to mitochondria if it is experimentally joined to an N-terminal signal 
sequence of a mitochondrial protein; the same enzyme joined to an N-terminal 
signal sequence of a chloroplast protein ends up in chloroplasts. Thus, the import 
receptors on each organelle distinguish between the different signal sequences. 

Chloroplasts have an extra membrane-enclosed compartment, the thylakoid. 
Many chloroplast proteins, including the protein subunits of the photosynthetic 
system and of the ATP synthase (discussed in Chapter 14), are located in the thyla- 
koid membrane. Like the precursors of some mitochondrial proteins, the precur- 
sors of these proteins are translocated from the cytosol to their final destination 
in two steps. First, they pass across the double membrane into the matrix space 
(called the stroma in chloroplasts), and then they either integrate into the thyla- 
koid membrane or translocate into the thylakoid space (Figure 12-26A). The pre- 
cursors of these proteins have a hydrophobic thylakoid signal sequence following 
the N-terminal chloroplast signal sequence. After the N-terminal signal sequence 
has been used to import the protein into the stroma, a stromal signal peptidase 
removes it, unmasking the thylakoid signal sequence that initiates transport 
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Figure 12-26 Translocation of 
chloroplast precursor proteins into 
the thylakoid space. (A) The precursor 
protein contains an N-terminal chloroplast 
signal sequence (red), followed immediately 
by a thylakoid signal sequence (brown). 
The chloroplast signal sequence initiates 
translocation into the stroma by a 
mechanism similar to that used for the 
translocation of mitochondrial precursor 
proteins into the matrix space, although 
the translocator complexes, TOC and TIC, 
are different. The signal sequence is then 
cleaved off, unmasking the thylakoid signal 
sequence, which initiates translocation 
across the thylakoid membrane. 

(B) Translocation into the thylakoid space or 
thylakoid membrane can occur by any one 
of at least four routes: (1) a Sec pathway, 
so called because it uses components 
that are homologs of Sec proteins, which 
mediate protein translocation across the 
bacterial plasma membrane (discussed 
later); (2) an SRP-like pathway, so called 
because it uses a chloroplast homolog 
of the signal-recognition particle, or 
SRP (discussed later); (3) a TAT (twin 
arginine translocation) pathway, so called 
because two arginines are critical in the 
signal sequences that direct proteins into 
this pathway, which depends on the H⁺ 
gradient across the thylakoid membrane; 
and (4) a spontaneous insertion pathway 
that seems not to require any protein 
translocator. 
across the thylakoid membrane. There are at least four routes by which proteins 
cross or become integrated into the thylakoid membrane, distinguished by their 
need for different stromal chaperones and energy sources (Figure 12-26B). 
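
The two-step targeting logic described above can be illustrated with a toy model. The sketch below is only an illustration, not a simulation of the real import machinery: the function names and the representation of a precursor as a list of signal labels are assumptions introduced here.

```python
# Toy model of two-step targeting to the thylakoid. A precursor is modeled as
# a list of signal labels followed by the mature protein (labels are
# illustrative, not real sequences).

def import_into_stroma(precursor):
    """Step 1: the N-terminal chloroplast signal directs import and is cleaved."""
    if not precursor or precursor[0] != "chloroplast_signal":
        raise ValueError("no N-terminal chloroplast signal")
    return precursor[1:]  # the stromal signal peptidase removes the first signal


def locate(precursor):
    """Return the compartment in which the modeled protein ends up."""
    try:
        in_stroma = import_into_stroma(precursor)
    except ValueError:
        return "cytosol"
    # Step 2: a thylakoid signal, if present, is now exposed at the N-terminus.
    if in_stroma and in_stroma[0] == "thylakoid_signal":
        return "thylakoid"
    return "stroma"


print(locate(["chloroplast_signal", "thylakoid_signal", "mature_protein"]))
print(locate(["chloroplast_signal", "mature_protein"]))
print(locate(["thylakoid_signal", "mature_protein"]))
```

The third case makes the point of the masking: a thylakoid signal that is not preceded by a chloroplast signal is never presented to the thylakoid machinery, because the protein does not reach the stroma.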


Summary 


Although mitochondria and chloroplasts have their own genetic systems, they 
produce only a small proportion of their own proteins. Instead, the two organelles 
import most of their proteins from the cytosol, using similar mechanisms. In both 
cases, proteins are transported in an unfolded state across both outer and inner 
membranes simultaneously into the matrix space or stroma. Both ATP hydrolysis 
and a membrane potential across the inner membrane drive translocation into 
mitochondria, whereas GTP and ATP hydrolysis drive translocation into chloro- 
plasts. Chaperone proteins of the cytosolic hsp70 family maintain the precursor 
proteins in an unfolded state, and a second set of hsp70 proteins in the matrix space 
or stroma pulls the polypeptide chain into the organelle. Only proteins that con- 
tain a specific signal sequence are translocated. The signal sequence can either be 
located at the N-terminus and cleaved off after import or be internal and retained. 
Transport into the inner membrane sometimes uses a second, hydrophobic signal 
sequence that is unmasked when the first signal sequence is removed. In chloro- 
plasts, import from the stroma into the thylakoid can occur by several routes, dis- 
tinguished by the chaperones and energy source used. 


PEROXISOMES 


Peroxisomes differ from mitochondria and chloroplasts in many ways. Most 
notably, they are surrounded by only a single membrane, and they do not contain 
DNA or ribosomes. Thus, because peroxisomes lack a genome, all of their proteins 
are encoded in the nucleus. Peroxisomes acquire most of these proteins by 
selective import from the cytosol, although some of them enter the peroxisome 
membrane via the ER. 














Because we do not discuss peroxisomes elsewhere, we shall digress to consider 
some of the functions of this diverse family of organelles, before discussing their 
biosynthesis. Virtually all eukaryotic cells have peroxisomes. They contain oxidative 
enzymes, such as catalase and urate oxidase, at such high concentrations that, 
in some cells, the peroxisomes stand out in electron micrographs because of the 
presence of a crystalloid protein core (Figure 12-27). 

Like mitochondria, peroxisomes are major sites of oxygen utilization. One 
hypothesis is that peroxisomes are a vestige of an ancient organelle that performed 
all the oxygen metabolism in the primitive ancestors of eukaryotic cells. 
When the oxygen produced by photosynthetic bacteria first accumulated in the 
atmosphere, it would have been highly toxic to most cells. Peroxisomes might 
have lowered the intracellular concentration of oxygen, while also exploiting its 
chemical reactivity to perform useful oxidation reactions. According to this view, 
the later development of mitochondria rendered peroxisomes largely obsolete 
because many of the same biochemical reactions—which had formerly been carried 
out in peroxisomes without producing energy—were now coupled to ATP 











formation by means of oxidative phosphorylation. The oxidation reactions per- 
formed by peroxisomes in present-day cells could therefore partly be those whose 
functions were not taken over by mitochondria. 


Peroxisomes Use Molecular Oxygen and Hydrogen Peroxide to 
Perform Oxidation Reactions 
Peroxisomes are so named because they usually contain one or more enzymes 
that use molecular oxygen to remove hydrogen atoms from specific organic sub- 
strates (designated here as R) in an oxidation reaction that produces hydrogen 
peroxide (H₂O₂): 

RH₂ + O₂ → R + H₂O₂ 




Catalase uses the H₂O₂ generated by other enzymes in the organelle to oxidize 
a variety of other substrates—including formic acid, formaldehyde, and alcohol— 
by the “peroxidation” reaction: H₂O₂ + R′H₂ → R′ + 2H₂O. This type of oxidation 
reaction is particularly important in liver and kidney cells, where the peroxisomes 
detoxify various harmful molecules that enter the bloodstream. About 25% of the 
ethanol we drink is oxidized to acetaldehyde in this way. In addition, when excess 
H₂O₂ accumulates in the cell, catalase converts it to H₂O through the reaction 


2H₂O₂ → 2H₂O + O₂ 


A major function of the oxidation reactions performed in peroxisomes is the 
breakdown of fatty acid molecules. The process, called β oxidation, shortens the 
alkyl chains of fatty acids sequentially in blocks of two carbon atoms at a time, 
thereby converting the fatty acids to acetyl CoA. The peroxisomes then export the 
acetyl CoA to the cytosol for use in biosynthetic reactions. In mammalian cells, β 
oxidation occurs in both mitochondria and peroxisomes; in yeast and plant cells, 
however, this essential reaction occurs exclusively in peroxisomes. 
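
The bookkeeping behind "blocks of two carbon atoms at a time" can be made concrete with a short calculation. The sketch below is a simplification introduced here, not taken from the text: it assumes a saturated fatty acid with an even number of carbons that is degraded completely, and the function name is illustrative.

```python
def beta_oxidation(chain_length):
    """Return (cycles, acetyl_coa) for a saturated, even-numbered fatty acid.

    Each cycle removes one two-carbon block as acetyl CoA; the two carbons
    left after the last cycle are themselves an acetyl CoA.
    """
    if chain_length < 4 or chain_length % 2 != 0:
        raise ValueError("sketch handles only even chains of four or more carbons")
    cycles = 0
    acetyl_coa = 0
    remaining = chain_length
    while remaining > 2:
        remaining -= 2   # shorten the alkyl chain by one two-carbon block
        acetyl_coa += 1
        cycles += 1
    acetyl_coa += 1      # the final two-carbon fragment
    return cycles, acetyl_coa


print(beta_oxidation(16))  # (7, 8): a 16-carbon chain gives 8 acetyl CoA in 7 cycles
```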

An essential biosynthetic function of animal peroxisomes is to catalyze the 
first reactions in the formation of plasmalogens, which are the most abundant 
class of phospholipids in myelin (Figure 12-28). Plasmalogen deficiencies cause 
profound abnormalities in the myelination of nerve-cell axons, which is one rea- 
son why many peroxisomal disorders lead to neurological disease. 

Peroxisomes are unusually diverse organelles, and even in the various cell 
types of a single organism they may contain different sets of enzymes. They also 
adapt remarkably to changing conditions. Yeasts grown on sugar, for example, 
have few small peroxisomes. But when some yeasts are grown on methanol, 
numerous large peroxisomes are formed that oxidize methanol; and when grown 
on fatty acids, they develop numerous large peroxisomes that break down fatty 
acids to acetyl CoA by β oxidation. 

Peroxisomes are also important in plants. Two types of plant peroxisomes have 
been studied extensively. One is present in leaves, where it participates in photo- 
respiration (discussed in Chapter 14) (Figure 12-29A). The other type of peroxi- 
some is present in germinating seeds, where it converts the fatty acids stored in 
seed lipids into the sugars needed for the growth of the young plant. Because this 
conversion of fats to sugars is accomplished by a series of reactions known as the 
glyoxylate cycle, these peroxisomes are also called glyoxysomes (Figure 12-29B). 
In the glyoxylate cycle, two molecules of acetyl CoA produced by fatty acid break- 
down in the peroxisome are used to make succinic acid, which then leaves the 
peroxisome and is converted into glucose in the cytosol. The glyoxylate cycle does 
not occur in animal cells, and animals are therefore unable to convert the fatty 
acids in fats into carbohydrates. 


A Short Signal Sequence Directs the Import of Proteins into 
Peroxisomes 


A specific sequence of three amino acids (Ser-Lys-Leu) located at the C-termi- 
nus of many peroxisomal proteins functions as an import signal (see Table 12-3, 
p. 648). Other peroxisomal proteins contain a signal sequence near the N-termi- 
nus. If either sequence is attached to a cytosolic protein, the protein is imported 
into peroxisomes. The import signals are first recognized by soluble receptor pro- 
teins in the cytosol. Numerous distinct proteins, called peroxins, participate in 
the import process, which is driven by ATP hydrolysis. A complex of at least six dif- 
ferent peroxins forms a protein translocator in the peroxisome membrane. Even 
oligomeric proteins do not have to unfold to be imported. To allow the passage 
of such compactly folded cargo molecules, the pore formed by the transporter is 
thought to be dynamic in its dimensions, adapting in size to the particular cargo 
molecules to be transported. In this respect, the mechanism differs from that used 
by mitochondria and chloroplasts. One soluble import receptor, the peroxin Pex5, 
recognizes the C-terminal peroxisomal import signal. It accompanies its cargo all 
the way into peroxisomes and, after cargo release, cycles back to the cytosol. After 





Figure 12-27 An electron micrograph 
of three peroxisomes in a rat liver 
cell. The paracrystalline, electron-dense 
inclusions are composed primarily of the 
enzyme urate oxidase. (Courtesy of Daniel 
S. Friend.) 
Figure 12-28 The structure of a 
plasmalogen. Plasmalogens are very 
abundant in the myelin sheaths that 
insulate the axons of nerve cells. They 
make up some 80-90% of the myelin 
membrane phospholipids. In addition to 
an ethanolamine head group and a long- 
chain fatty acid attached to the same 
glycerol phosphate backbone used for 
phospholipids, plasmalogens contain 
an unusual fatty alcohol that is attached 
through an ether linkage (bottom left). 
Figure 12-29 Electron micrographs of two types of peroxisomes found in plant cells. (A) A peroxisome with a 
paracrystalline core in a tobacco leaf mesophyll cell. Its close association with chloroplasts is thought to facilitate the 
exchange of materials between these organelles during photorespiration. The vacuole in plant cells is equivalent to the 
lysosome in animal cells. (B) Peroxisomes in a fat-storing cotyledon cell of a tomato seed 4 days after germination. Here the 
peroxisomes (glyoxysomes) are associated with the lipid droplets that store fat, reflecting their central role in fat mobilization and 
gluconeogenesis during seed germination. (A, from S.E. Frederick and E.H. Newcomb, J. Cell Biol. 43:343-353, 1969. With 
permission from The Rockefeller Press; B, from W.P. Wergin, P.J. Gruber and E.H. Newcomb, J. Ultrastruct. Res. 30:533-557, 
1970. With permission from Academic Press.) 


delivering its cargo to the peroxisome lumen, Pex5 undergoes ubiquitylation. This 
modification is required to release Pex5 back into the cytosol, where the ubiquitin 
is removed. An ATPase composed of Pex1 and Pex6 harnesses the energy of ATP 
hydrolysis to help release Pex5 from peroxisomes. 
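
As a minimal illustration of the C-terminal signal, the sketch below checks whether a protein sequence, written in one-letter amino acid code, ends in Ser-Lys-Leu (SKL), and mimics the experiment of attaching the signal to a cytosolic protein. The function names and the example sequence are hypothetical, and the N-terminal signal is not modeled.

```python
C_TERMINAL_SIGNAL = "SKL"  # Ser-Lys-Leu in one-letter amino acid code


def has_c_terminal_skl(sequence):
    """True if the sequence ends with the Ser-Lys-Leu import signal."""
    return sequence.upper().endswith(C_TERMINAL_SIGNAL)


def attach_skl(sequence):
    """Model the fusion experiment: append the signal to a cytosolic protein."""
    return sequence + C_TERMINAL_SIGNAL


cytosolic = "MSTAVQEGRP"  # made-up sequence standing in for a cytosolic protein
print(has_c_terminal_skl(cytosolic))              # False
print(has_c_terminal_skl(attach_skl(cytosolic)))  # True
```

Position matters in this check: the same three residues in the middle of a sequence would not match, in keeping with the signal being located at the C-terminus.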

The importance of this import process and of peroxisomes is demonstrated 
by the inherited human disease Zellweger syndrome, in which a defect in import- 
ing proteins into peroxisomes leads to a profound peroxisomal deficiency. These 
individuals, whose cells contain “empty” peroxisomes, have severe abnormalities 
in their brain, liver, and kidneys, and they die soon after birth. A mutation in the 
gene encoding peroxin Pex5 causes one form of the disease. A defect in Pex7, the 
receptor for the N-terminal import signal, causes a milder peroxisomal disease. 

It has long been debated whether new peroxisomes arise from preexisting ones 
by organelle growth and fission—as mentioned earlier for mitochondria and plas- 
tids—or whether they derive as a specialized compartment from the endoplasmic 
reticulum (ER). Aspects of both views are true (Figure 12-30). Most peroxisomal 
membrane proteins are made in the cytosol and insert into the membrane of 
Figure 12-30 A model that explains 
how peroxisomes proliferate and how 
new peroxisomes arise. Peroxisomal 
precursor vesicles bud from the ER. 
At least two peroxisomal membrane 
proteins, Pex3 and Pex15, follow this 
route. The machinery that drives the 
budding reaction and that selects only 
peroxisomal proteins for packaging into 
these vesicles depends on Pex19 and 
other cytosolic proteins that are still 
unknown. Peroxisomal precursor vesicles 
may then fuse with one another or with 
preexisting peroxisomes. The peroxisome 
membrane contains import receptors and 
protein translocators that are required 
for the import of peroxisomal proteins 
made on cytosolic ribosomes, including 
new copies of the import receptors and 
translocator components. Presumably, 
the lipids required for growth are also 
imported, although some may derive 
directly from the ER in the membrane of 
peroxisomal precursor vesicles. 
preexisting peroxisomes, but others are first integrated into the ER membrane, 
where they are packaged into specialized peroxisomal precursor vesicles. New 
precursor vesicles may then fuse with one another and begin importing addi- 
tional peroxisomal proteins, using their own protein import machinery to grow 
into mature peroxisomes, which can undergo cycles of growth and fission. 


Summary 


Peroxisomes are specialized for carrying out oxidation reactions using molecular 
oxygen. They generate hydrogen peroxide, which they employ for oxidative pur- 
poses—and contain catalase to destroy the excess. Like mitochondria and plastids, 
peroxisomes are self-replicating organelles. Because they do not contain DNA or 
ribosomes, however, all of their proteins are encoded in the cell nucleus. Some of 
these proteins are conveyed to peroxisomes via peroxisomal precursor vesicles that 
bud from the ER, but most are synthesized in the cytosol and directly imported. A 
specific sequence of three amino acids near the C-terminus of many of the latter pro- 
teins functions as a peroxisomal import signal. The mechanism of protein import 
differs from that of mitochondria and chloroplasts, in that even oligomeric proteins 
are imported from the cytosol without unfolding. 


THE ENDOPLASMIC RETICULUM 


All eukaryotic cells have an endoplasmic reticulum (ER). Its membrane typi- 
cally constitutes more than half of the total membrane of an average animal cell 
(see Table 12-2, p. 643). The ER is organized into a netlike labyrinth of branching 
tubules and flattened sacs that extends throughout the cytosol (Figure 12-31 and 
Movie 12.4). The tubules and sacs interconnect, and their membrane is continu- 
ous with the outer nuclear membrane; the compartment that they enclose there- 
fore is also continuous with the space between the inner and outer nuclear mem- 
branes. Thus, the ER and nuclear membranes form a continuous sheet enclosing 
a single internal space, called the ER lumen or the ER cisternal space, which often 
occupies more than 10% of the total cell volume (see Table 12-1, p. 643). 

As mentioned at the beginning of this chapter, the ER has a central role in both 
lipid and protein biosynthesis, and it also serves as an intracellular Ca²⁺ store 
that is used in many cell signaling responses (discussed in Chapter 15). The ER 
membrane is the site of production of all the transmembrane proteins and lip- 
ids for most of the cell’s organelles, including the ER itself, the Golgi apparatus, 
lysosomes, endosomes, secretory vesicles, and the plasma membrane. The ER 
membrane is also the site at which most of the lipids for mitochondrial and per- 
oxisomal membranes are made. In addition, almost all of the proteins that will be 
secreted to the cell exterior—plus those destined for the lumen of the ER, Golgi 
apparatus, or lysosomes—are initially delivered to the ER lumen. 
Figure 12-31 Fluorescent micrographs of 
the endoplasmic reticulum. (A) An animal 
cell in tissue culture that was genetically 
engineered to express an ER membrane 
protein fused to a fluorescent protein. The 
ER extends as a network of tubules and 
sheets throughout the entire cytosol, so that 
all regions of the cytosol are close to some 
portion of the ER membrane. The outer 
nuclear membrane, which is continuous 
with the ER, is also stained. (B) Part of 
an ER network in a living plant cell that 
was genetically engineered to express a 
fluorescent protein in the ER. (A, courtesy of 
Patrick Chitwood and Gia Voeltz; B, courtesy 
of Petra Boevink and Chris Hawes.) 
The ER Is Structurally and Functionally Diverse 


While the various functions of the ER are essential to every cell, their relative 
importance varies greatly between individual cell types. To meet different func- 
tional demands, distinct regions of the ER become highly specialized. We observe 
such functional specialization as dramatic changes in ER structure, and different 
cell types can therefore possess characteristically different types of ER membrane. 
One of the most remarkable ER specializations is the rough ER. 

Mammalian cells begin to import most proteins into the ER before complete 
synthesis of the polypeptide chain—that is, import is a co-translational process 
(Figure 12-32A). In contrast, the import of proteins into mitochondria, chloro- 
plasts, nuclei, and peroxisomes is a post-translational process (Figure 12-32B). In 
co-translational transport, the ribosome that is synthesizing the protein is attached 
directly to the ER membrane, enabling one end of the protein to be translocated 
into the ER while the rest of the polypeptide chain is being synthesized. These 
membrane-bound ribosomes coat the surface of the ER, creating regions termed 
rough endoplasmic reticulum, or rough ER; regions of ER that lack bound ribo- 
somes are called smooth endoplasmic reticulum, or smooth ER (Figure 12-33). 

Most cells have scanty regions of smooth ER, and the ER is often partly smooth 
and partly rough. Areas of smooth ER from which transport vesicles carrying 
newly synthesized proteins and lipids bud off for transport to the Golgi apparatus 
are called transitional ER. In certain specialized cells, the smooth ER is abundant 
and has additional functions. It is prominent, for example, in cells that specialize 
in lipid metabolism, such as cells that synthesize steroid hormones from choles- 
terol; the expanded smooth ER accommodates the enzymes that make choles- 
terol and modify it to form the hormones (see Figure 12-33B). 

The main cell type in the liver, the hepatocyte, also has a substantial amount 
of smooth ER. It is the principal site of production of lipoprotein particles, which 
carry lipids via the bloodstream to other parts of the body. The enzymes that 
synthesize the lipid components of the particles are located in the membrane of 
the smooth ER, which also contains enzymes that catalyze a series of reactions 
to detoxify both lipid-soluble drugs and various harmful compounds produced 
by metabolism. The most extensively studied of these detoxification reactions are 
carried out by the cytochrome P450 family of enzymes, which catalyze a series of 
reactions in which water-insoluble drugs or metabolites that would otherwise 
accumulate to toxic levels in cell membranes are rendered sufficiently water-solu- 
ble to leave the cell and be excreted in the urine. Because the rough ER alone can- 
not house enough of these and other necessary enzymes, a substantial portion of 
the membrane in a hepatocyte normally consists of smooth ER (see Table 12-2). 

Another crucially important function of the ER in most eukaryotic cells is to 
sequester Ca²⁺ from the cytosol. The release of Ca²⁺ into the cytosol from the ER, 
and its subsequent reuptake, occurs in many rapid responses to extracellular 
Figure 12-32 Co-translational and post-translational protein translocation. 
(A) Ribosomes bind to the ER membrane 
during co-translational translocation. (B) By 
contrast, cytosolic ribosomes complete the 
synthesis of a protein and release it prior 
to post-translational translocation. In both 
cases, the protein is directed to the ER by 
an ER signal sequence (red and orange). 
Figure 12-33 The rough and smooth ER. (A) An electron micrograph of the rough ER in a pancreatic exocrine 
cell that makes and secretes large amounts of digestive enzymes every day. The cytosol is filled with closely packed sheets 
of ER membrane that is studded with ribosomes. At the top left is a portion of the nucleus and its nuclear envelope; note that 
the outer nuclear membrane, which is continuous with the ER, is also studded with ribosomes. (B) Abundant smooth ER in a 
steroid-hormone-secreting cell. This electron micrograph is of a testosterone-secreting Leydig cell in the human testis. 
(C) A three-dimensional reconstruction of a region of smooth ER and rough ER in a liver cell. The rough ER forms oriented 
stacks of flattened cisternae, each having a lumenal space 20-30 nm wide. The smooth ER membrane is connected to 
these cisternae and forms a fine network of tubules 30-60 nm in diameter. The ER lumen is colored green. (D) A tomographic 
reconstruction of a portion of the ER network in a yeast cell. Membrane-bound ribosomes (tiny dark spheres) are seen in both 
flat sheets and tubular regions of irregular diameter, demonstrating that the ribosomes bind to ER membranes of different 
curvature in these cells. (A, courtesy of Lelio Orci; B, courtesy of Daniel S. Friend; C, after R.V. Krstić, Ultrastructure of the 
Mammalian Cell. New York: Springer-Verlag, 1979; D, from M. West et al., J. Cell Biol. 193:333-346, 2011. With permission 
from Rockefeller University Press.) 


signals, as discussed in Chapter 15. A Ca²⁺ pump transports Ca²⁺ from the cytosol 
into the ER lumen. A high concentration of Ca²⁺-binding proteins in the ER facilitates 
Ca²⁺ storage. In some cell types, and perhaps in most, specific regions of 
the ER are specialized for Ca²⁺ storage. Muscle cells have an abundant, modified 
smooth ER called the sarcoplasmic reticulum. The release and reuptake of Ca²⁺ by 
the sarcoplasmic reticulum trigger myofibril contraction and relaxation, respec- 
tively, during each round of muscle contraction (discussed in Chapter 16). 

To study the functions and biochemistry of the ER, it is necessary to isolate 
it. This may seem to be a hopeless task because the ER is intricately interleaved 
with other components of the cytoplasm. Fortunately, when tissues or cells are 
disrupted by homogenization, the ER breaks into fragments, which reseal to 
form small (~100-200 nm in diameter) closed vesicles called microsomes. Mic- 
rosomes are relatively easy to purify. To the biochemist, microsomes represent 
small authentic versions of the ER, still capable of protein translocation, protein 
glycosylation (discussed later), Ca²⁺ uptake and release, and lipid synthesis. Microsomes 
derived from rough ER are studded with ribosomes and are called rough 


672 Chapter 12: Intracellular Compartments and Protein Sorting 








[Figure 12-34B diagram labels: rough ER; homogenization; rough and smooth microsomes; centrifugation; tube with gradient of increasing sucrose concentration; smooth microsomes have a low density and stop sedimenting and float at low sucrose concentration; rough microsomes have a high density and stop sedimenting and float at high sucrose concentration.] 



microsomes. The ribosomes are always found on the outside surface, so the inte- 
rior of the microsome is biochemically equivalent to the lumen of the ER (Figure 
12-34A). 

Many vesicles similar in size to rough microsomes, but lacking attached ribo- 
somes, are also found in cell homogenates. Such smooth microsomes are derived 
in part from smooth portions of the ER and in part from vesiculated fragments of 
the plasma membrane, Golgi apparatus, endosomes, and mitochondria (the ratio 
depending on the tissue). Thus, whereas rough microsomes are clearly derived 
from rough portions of ER, it is not easy to separate smooth microsomes derived 
from different organelles. The smooth microsomes prepared from liver or muscle 
cells are an exception. Because of the unusually large quantities of smooth ER or 
sarcoplasmic reticulum, respectively, most of the smooth microsomes in homog- 
enates of these tissues are derived from the smooth ER or sarcoplasmic reticu- 
lum. The ribosomes attached to rough microsomes make them more dense than 
smooth microsomes. As a result, we can use equilibrium centrifugation to sepa- 
rate the rough and smooth microsomes (Figure 12-34B). Microsomes have been 
invaluable in elucidating the molecular aspects of ER function, as we discuss next. 
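
The principle behind the separation can be sketched numerically: in equilibrium centrifugation a particle stops where the density of the surrounding sucrose solution equals its own buoyant density. The example below assumes a linear gradient, and every number in it (gradient limits, particle densities, tube length) is a hypothetical value chosen for illustration, not a measurement from the text.

```python
def equilibrium_position(particle_density, top_density, bottom_density, tube_length_cm):
    """Depth (cm from the top) at which a particle stops in a linear gradient."""
    if not top_density < bottom_density:
        raise ValueError("gradient density must increase toward the bottom")
    if particle_density <= top_density:
        return 0.0              # lighter than the whole gradient: stays at the top
    if particle_density >= bottom_density:
        return tube_length_cm   # denser than the whole gradient: reaches the bottom
    fraction = (particle_density - top_density) / (bottom_density - top_density)
    return fraction * tube_length_cm


# Hypothetical densities in g/mL; rough microsomes are denser because of
# their attached ribosomes.
smooth = equilibrium_position(1.10, 1.05, 1.25, 10.0)
rough = equilibrium_position(1.20, 1.05, 1.25, 10.0)
print(f"smooth microsomes band about {smooth:.1f} cm from the top")
print(f"rough microsomes band about {rough:.1f} cm from the top")
```

With these assumed values the two populations band at different depths, smooth above rough, which is what allows them to be collected as separate fractions.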


Signal Sequences Were First Discovered in Proteins Imported into 
the Rough ER 


The ER captures selected proteins from the cytosol as they are being synthesized. 
These proteins are of two types: transmembrane proteins, which are only partly 
translocated across the ER membrane and become embedded in it, and water-sol- 
uble proteins, which are fully translocated across the ER membrane and are 
released into the ER lumen. Some of the transmembrane proteins function in the 
ER, but many are destined to reside in the plasma membrane or the membrane of 
another organelle. The water-soluble proteins are destined either for secretion or 
for residence in the lumen of the ER or of another organelle. All of these proteins, 
regardless of their subsequent fate, are directed to the ER membrane by an ER 
signal sequence, which initiates their translocation by a common mechanism. 
Signal sequences (and the signal sequence strategy of protein sorting) were 
first discovered in the early 1970s in secreted proteins that are translocated across 
the ER membrane as a first step toward their eventual discharge from the cell. 
In the key experiment, the mRNA encoding a secreted protein was translated by 
ribosomes in vitro. When microsomes were omitted from this cell-free system, 
the protein synthesized was slightly larger than the normal secreted protein. In 
the presence of microsomes derived from the rough ER, however, a protein of the 
correct size was produced. According to the signal hypothesis, the size difference 
reflects the initial presence of a signal sequence that directs the secreted protein 


Figure 12-34 The isolation of purified 
rough and smooth microsomes 
from the ER. (A) A thin section electron 
micrograph of the purified rough ER 
fraction shows an abundance of ribosome- 
studded vesicles. (B) When sedimented to 
equilibrium through a gradient of sucrose, 
the two types of microsomes separate from 
each other on the basis of their different 
densities. Note that the smooth fraction will 
also contain non-ER-derived material. 
(A, courtesy of George Palade.) 
to the ER membrane and is then cleaved off by a signal peptidase in the ER mem- 
brane before the polypeptide chain has been completed (Figure 12-35). Cell-free 
systems in which proteins are imported into microsomes have provided powerful 
procedures for identifying, purifying, and studying the various components of the 
molecular machinery responsible for the ER import process. 


A Signal-Recognition Particle (SRP) Directs the ER Signal 
Sequence to a Specific Receptor in the Rough ER Membrane 


The ER signal sequence is guided to the ER membrane by at least two compo- 
nents: a signal-recognition particle (SRP), which cycles between the ER mem- 
brane and the cytosol and binds to the signal sequence, and an SRP receptor in 
the ER membrane. The SRP is a large complex; in animal cells, it consists of six 
different polypeptide chains bound to a single small RNA molecule. While the 
SRP and SRP receptor have fewer subunits in bacteria, homologs are present in 
all cells, indicating that this protein-targeting mechanism arose early in evolution 
and has been conserved. 

ER signal sequences vary greatly in amino acid sequence, but each has eight or 
more nonpolar amino acids at its center (see Table 12-3, p. 648). How can the SRP 
bind specifically to so many different sequences? The answer has come from the 
crystal structure of the SRP protein, which shows that the signal-sequence-bind- 
ing site is a large hydrophobic pocket lined by methionines. Because methionines 
have unbranched, flexible side chains, the pocket is sufficiently plastic to accom- 
modate hydrophobic signal sequences of different sequences, sizes, and shapes. 
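
The one feature that ER signal sequences share, a central stretch of eight or more nonpolar amino acids, can be expressed as a simple scan over a sequence in one-letter code. The sketch below is a crude illustration only: the set of residues counted as nonpolar is an assumption made here, the example sequences are invented, and a run of nonpolar residues alone is not a reliable predictor of a real signal sequence.

```python
# Residues treated as nonpolar in this sketch (an assumption; classifications vary).
NONPOLAR = set("AVLIMFWPGC")


def longest_nonpolar_run(sequence):
    """Length of the longest uninterrupted run of nonpolar residues."""
    longest = current = 0
    for residue in sequence.upper():
        current = current + 1 if residue in NONPOLAR else 0
        longest = max(longest, current)
    return longest


def has_hydrophobic_core(sequence, min_length=8):
    """True if the sequence contains at least min_length nonpolar residues in a row."""
    return longest_nonpolar_run(sequence) >= min_length


print(has_hydrophobic_core("MKRSLLVVALLAVLGSQE"))  # True: run of 11 nonpolar residues
print(has_hydrophobic_core("MKRSEDKTQNSRDE"))      # False: longest run is 1
```

Because the test looks only at hydrophobic character and not at a particular order of residues, many different sequences pass it, which mirrors how a flexible hydrophobic pocket can accommodate signals of varied sequence.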

The SRP is a rodlike structure, which wraps around the large ribosomal subunit, 
with one end binding to the ER signal sequence as it emerges from the ribosome 
as part of the newly made polypeptide chain; the other end blocks the elongation 
factor binding site at the interface between the large and small ribosomal subunits 
(Figure 12-36). This block halts protein synthesis as soon as the signal peptide has 
emerged from the ribosome. The transient pause presumably gives the ribosome 
enough time to bind to the ER membrane before completion of the polypeptide 
chain, thereby ensuring that the protein is not released into the cytosol. This safety 
Figure 12-35 The signal hypothesis. 
A simplified view of protein translocation 
across the ER membrane, as originally 
proposed. When the ER signal sequence 
emerges from the ribosome, it directs 
the ribosome to a translocator on 
the ER membrane that forms a pore 
in the membrane through which the 
polypeptide is translocated. A signal 
peptidase is closely associated with 
the translocator and clips off the signal 
sequence during translation, and the 
mature protein is released into the lumen 
of the ER immediately after its synthesis 
is completed. The translocator is closed 
until the ribosome has bound, so that the 
permeability barrier of the ER membrane is 
maintained at all times. 
Figure 12-36 The signal-recognition particle (SRP). (A) A mammalian SRP is a rodlike 
ribonucleoprotein complex containing six protein subunits (brown) and one RNA molecule (blue). 
The SRP RNA forms a backbone that links the protein domain containing the signal-sequence- 
binding pocket to the domain responsible for pausing translation. Crystal structures of various 
SRP pieces from different species are assembled here into a composite model to approximate the 
structure of a complete SRP. (B) The three-dimensional outline of the SRP bound to a ribosome 
was determined by cryoelectron microscopy. SRP binds to the large ribosomal subunit so that its 
signal-sequence-binding pocket is positioned near the growing polypeptide chain exit site, and its 
translational pause domain is positioned at the interface between the ribosomal subunits, where it 
interferes with elongation factor binding. (C) As a signal sequence emerges from the ribosome and 
binds to the SRP, a conformational change in the SRP exposes a binding site for the SRP receptor. 
(B, adapted from M. Halic et al., Nature 427:808-814, 2004. With permission from Macmillan 
Publishers Ltd.) 


device may be especially important for secreted and lysosomal hydrolases, which 
could wreak havoc in the cytosol; cells that secrete large amounts of hydrolases, 
however, take the added precaution of having high concentrations of hydrolase 
inhibitors in their cytosol. The pause also ensures that large portions of a protein 
that could fold into a compact structure are not made before reaching the trans- 
locator in the ER membrane. Thus, in contrast to the post-translational import of 
proteins into mitochondria and chloroplasts, chaperone proteins are not required 
to keep the protein unfolded. 

When a signal sequence binds, SRP exposes a binding site for the SRP receptor 
(see Figure 12-36B,C), which is a transmembrane protein complex in the rough 
ER membrane. The binding of the SRP to its receptor brings the SRP-ribosome 
complex to an unoccupied protein translocator in the same membrane. The SRP 
and SRP receptor are then released, and the translocator transfers the growing 
polypeptide chain across the membrane (Figure 12-37). 

This co-translational transfer process creates two spatially separate popula- 
tions of ribosomes in the cytosol. Membrane-bound ribosomes, attached to the 
Figure 12-37 How ER signal sequences 
and SRP direct ribosomes to the ER 
membrane. The SRP and its receptor 
act in concert. The SRP binds to both 
the exposed ER signal sequence and 
the ribosome, thereby inducing a pause 
in translation. The SRP receptor in the 
ER membrane, which in animal cells is 
composed of two different polypeptide 
chains, binds the SRP-ribosome complex 
and directs it to the translocator. In a 
poorly understood reaction, the SRP and 
SRP receptor are then released, leaving 
the ribosome bound to the translocator 
in the ER membrane. The translocator 
then inserts the polypeptide chain into 
the membrane and transfers it across 
the lipid bilayer. Because one of the SRP 
proteins and both chains of the SRP 
receptor contain GTP-binding domains, 
it is thought that conformational changes 
that occur during cycles of GTP binding 
and hydrolysis (discussed in Chapter 
15) ensure that SRP release occurs only 
after the ribosome has become properly 
engaged with the translocator in the ER 
membrane. The translocator is closed 
until the ribosome has bound, so that the 
permeability barrier of the ER membrane is 
maintained at all times. 
cytosolic side of the ER membrane, are engaged in the synthesis of proteins that 
are being concurrently translocated into the ER. Free ribosomes, unattached to 
any membrane, synthesize all other proteins encoded by the nuclear genome. 
Membrane-bound and free ribosomes are structurally and functionally identical. 
They differ only in the proteins they are making at any given time. 

Since many ribosomes can bind to a single mRNA molecule, a polyribosome 
is usually formed. If the mRNA encodes a protein with an ER signal sequence, 
the polyribosome becomes attached to the ER membrane, directed there by the 
signal sequences on multiple growing polypeptide chains. The individual ribo- 
somes associated with such an mRNA molecule can return to the cytosol when 
they finish translation and intermix with the pool of free ribosomes. The mRNA 
itself, however, remains attached to the ER membrane by a changing population 
of ribosomes, each transiently held at the membrane by the translocator (Figure 
12-38). 


The Polypeptide Chain Passes Through an Aqueous Channel in 
the Translocator 

It had long been debated whether polypeptide chains are transferred across the 
ER membrane in direct contact with the lipid bilayer or through a channel in a 
protein translocator. The debate ended with the identification of the translocator, 
which was shown to form a water-filled channel in the membrane through 


[Figure 12-38 labels. (A) mRNA encoding a cytosolic protein remains free in cytosol; free polyribosome in cytosol; common pool of ribosomal subunits in cytosol; polyribosome bound to ER membrane by multiple nascent polypeptide chains; signal sequence; ER lumen; ER membrane; mRNA encoding a protein targeted to ER remains membrane-bound. (B) ER membrane; polyribosome; polyribosome rosette; scale bar, 400 nm.] 

Figure 12-38 Free and membrane-bound polyribosomes. (A) A common pool of ribosomes synthesizes the proteins that 
stay in the cytosol and those that are transported into the ER. The ER signal sequence on a newly formed polypeptide chain 
binds to SRP, which directs the translating ribosome to the ER membrane. The mRNA molecule remains permanently bound 
to the ER as part of a polyribosome, while the ribosomes that move along it are recycled; at the end of each round of protein 
synthesis, the ribosomal subunits are released and rejoin the common pool in the cytosol. (B) A thin section electron micrograph 
of polyribosomes attached to the ER membrane. The plane of section in some places cuts through the ER roughly parallel to 
the membrane, giving a face-on view of the rosettelike pattern of the polyribosomes. (B, courtesy of George Palade.) 




[Figure 12-39 labels. (A) γ subunit; α subunit; β subunit; pore; displaced plug. (B) displaced plug; signal peptide; lipid bilayer; growing polypeptide chain.] 


Figure 12-39 Structure of the Sec61 complex. (A) A side view (left) and a top view (right, seen 
from the cytosol) of the structure of the Sec61 complex of the archaeon Methanococcus jannaschii. 
The Sec61α subunit has an inverted repeat structure (see Figure 11-10) and is shown in blue and 
beige to indicate this pseudo-symmetry; the two smaller β and γ subunits are shown in gray. In 
the side view, some helices in front have been omitted to make the inside of the pore visible. The 
short yellow helix is thought to form a plug that seals the pore when the translocator is closed. To 
open, the complex rearranges itself to move the plug helix out of the way, as indicated by the red 
arrow. A ring of hydrophobic amino acid side chains is thought to form a tight-fitting diaphragm 
around the translocating polypeptide chain to prevent leaks of other molecules across the membrane. 
The pore of the Sec61 complex can also open sideways at a lateral seam. (B) Models of the closed 
and open states of the translocator are shown in top view, illustrating how a signal sequence (or a 
transmembrane segment) could be released into the lipid bilayer after opening of the seam. (PDB 
codes: 1RH5 and 1RHZ.) 


which the polypeptide chain passes. The core of the translocator, called the Sec61 
complex, is built from three subunits that are highly conserved from bacteria to 
eukaryotic cells. The structure of the Sec61 complex suggests that α helices contributed 
by the largest subunit surround a central channel through which the 
polypeptide chain traverses the membrane (Figure 12-39). The channel is gated 
by a short α helix that is thought to keep the translocator closed when it is idle and 
to move aside when it is engaged in passing a polypeptide chain. According to 
this view, the pore is a dynamic gated channel that opens only transiently when a 
polypeptide chain traverses the membrane. In an idle translocator, it is important 
to keep the channel closed, so that the membrane remains impermeable to ions, 
such as Ca²⁺, which otherwise would leak out of the ER. As a polypeptide chain is 
translocating, a ring of hydrophobic amino acid side chains is thought to provide 
a flexible seal to prevent ion leaks. 

The structure of the Sec61 complex suggests that the pore can also open along 
a seam on its side. Indeed, some structures of the translocator show it locked in an 
open-seam conformation. This opening allows a translocating peptide chain lat- 
eral access into the hydrophobic core of the membrane, a process that is import- 
ant both for the release of a cleaved signal peptide into the membrane (see Figure 
12-35) and for the integration of transmembrane proteins into the bilayer, as we 
discuss later. 
Figure 12-40 A ribosome (green) bound to the ER protein translocator 
(blue). (A) A side-view reconstruction of the complex from electron 
microscopic images. (B) A view of the translocator seen from the ER lumen. 
The translocator contains Sec61, accessory proteins, and detergent used in 
the preparation. Domains of accessory proteins extend across the membrane 
and form the lumenal bulge. (C) A schematic drawing of a membrane-bound 
ribosome attached to the translocator, indicating the location of the tunnel 
in the large ribosomal subunit through which the growing polypeptide chain 
exits from the ribosome. The mRNA (not shown) would be located between 
the small and large ribosomal subunits. (Adapted from J.F. Ménetret et al., J. 
Mol. Biol. 348:445-457, 2005. With permission from Academic Press.) 


In eukaryotic cells, four Sec61 complexes form a large translocator assembly 
that can be visualized on ER-bound ribosomes after detergent solubilization of 
the ER membrane (Figure 12-40). It is likely that this assembly includes other 
membrane complexes that associate with the translocator, such as enzymes that 
modify the growing polypeptide chain, including oligosaccharide transferase and 
the signal peptidase. The assembly of a translocator with these accessory compo- 
nents is called the translocon. 


Translocation Across the ER Membrane Does Not Always Require 
Ongoing Polypeptide Chain Elongation 


As we have seen, translocation of proteins into mitochondria, chloroplasts, and 
peroxisomes occurs post-translationally, after the protein has been made and 
released into the cytosol, whereas translocation across the ER membrane usually 
occurs during translation (co-translationally). This explains why ribosomes are 
bound to the ER but not to other organelles. 

Some completely synthesized proteins, however, are imported into the ER, 
demonstrating that translocation does not always require ongoing translation. 
Post-translational protein translocation is especially common across the yeast 
ER membrane and the bacterial plasma membrane (which is thought to be evo- 
lutionarily related to the ER). To function in post-translational translocation, the 
ER translocator needs accessory proteins that feed the polypeptide chain into the 
pore and drive translocation (Figure 12-41). In bacteria, a translocation motor 
protein, the SecA ATPase, attaches to the cytosolic side of the translocator, where it 
undergoes cyclic conformational changes driven by ATP hydrolysis. Each time an 
ATP is hydrolyzed, a portion of the SecA protein inserts into the pore of the trans- 
locator, pushing a short segment of the passenger protein with it. As a result of this 
ratchet mechanism, the SecA ATPase progressively pushes the polypeptide chain 
of the transported protein across the membrane. 
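
The step size of this ratchet allows a rough estimate of how many SecA cycles a given protein needs. The sketch below is illustrative arithmetic only: it takes the figure of about 20 amino acids per cycle from the legend to Figure 12-41 and assumes one ATP hydrolyzed per cycle and no slippage; the function name `seca_cycles` is hypothetical.

```python
import math

def seca_cycles(protein_length_aa: int, step_aa: int = 20) -> int:
    """Estimate the number of SecA push cycles needed to translocate a chain.

    step_aa defaults to ~20 amino acids per cycle (Figure 12-41 legend).
    Assumes one ATP per cycle and no backsliding, which is a simplification.
    """
    if protein_length_aa < 0 or step_aa <= 0:
        raise ValueError("length must be >= 0 and step must be > 0")
    # A partial final step still costs a whole cycle, hence the ceiling.
    return math.ceil(protein_length_aa / step_aa)

if __name__ == "__main__":
    print(seca_cycles(300))
```

Under these assumptions a 300-amino-acid chain would need on the order of 15 cycles; the real number could differ if the step size varies or the chain slips back between cycles.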

Eukaryotic cells use a different set of accessory proteins that associate with the 
Sec61 complex. These proteins span the ER membrane and use a small domain on 
the lumenal side of the ER membrane to deposit an hsp70-like chaperone protein 
(called BiP, for binding protein) onto the polypeptide chain as it emerges from the 
pore into the ER lumen. ATP-dependent cycles of BiP binding and release drive 
unidirectional translocation, as described earlier for the mitochondrial hsp70 
proteins that pull proteins across mitochondrial membranes. 

Proteins that are transported into the ER by a post-translational mechanism 
are first released into the cytosol, where they bind to chaperone proteins to pre- 
vent folding, as discussed earlier for proteins destined for mitochondria and chlo- 
roplasts. 


In Single-Pass Transmembrane Proteins, a Single Internal ER 
Signal Sequence Remains in the Lipid Bilayer as a 
Membrane-spanning α Helix 

The ER signal sequence in the growing polypeptide chain is thought to trigger the 
opening of the pore in the Sec61 protein translocator: after the signal sequence is 
released from the SRP and the growing chain has reached a sufficient length, the 
Figure 12-41 Three ways in which protein translocation can be driven through structurally similar translocators. 
(A) Co-translational translocation. The ribosome is brought to the membrane by the SRP and SRP receptor and then engages 
with the Sec61 protein translocator. The growing polypeptide chain is threaded across the membrane as it is made. No 
additional energy is needed, as the only path available to the growing chain is to cross the membrane. (B) Post-translational 
translocation in eukaryotic cells requires an additional complex composed of Sec62, Sec63, Sec71, and Sec72 proteins, 
which is attached to the Sec61 translocator and deposits BiP molecules onto the translocating chain as it emerges from the 
translocator in the lumen of the ER. ATP-driven cycles of BiP binding and release pull the protein into the lumen, a mechanism 
that closely resembles the mechanism of mitochondrial import in Figure 12-23. (C) Post-translational translocation in bacteria. 
The completed polypeptide chain is fed from the cytosolic side into the bacterial homolog of the Sec61 complex (called the 
SecY complex in bacteria) in the plasma membrane by the SecA ATPase. ATP hydrolysis-driven conformational changes drive 
a pistonlike motion in SecA, each cycle pushing about 20 amino acids of the protein chain through the pore of the translocator. 
The Sec pathway used for protein translocation across the thylakoid membrane in chloroplasts uses a similar mechanism (see 
Figure 12-26B). 

Whereas the Sec61 translocator, SRP, and SRP receptor are found in all organisms, SecA is found exclusively in bacteria, and 
the Sec62, 63, 71, 72 complex is found exclusively in eukaryotic cells. (Adapted from P. Walter and A.E. Johnson, Annu. Rev. 
Cell Biol. 10:87-119, 1994. With permission from Annual Reviews.) 


signal sequence binds to a specific site inside the pore itself, thereby opening the 
pore. An ER signal sequence is therefore recognized twice: first by an SRP in the 
cytosol and then by a binding site in the pore of the protein translocator, where it 
serves as a start-transfer signal (or start-transfer peptide) that opens the pore (for 
example, see Figure 12-35 for how this works for a soluble protein). Dual recog- 
nition may help ensure that only appropriate proteins enter the lumen of the ER. 
While bound in the translocation pore, a signal sequence is in contact not only 
with the Sec61 complex, which forms the walls of the pore, but also, along the lat- 
eral seam, with the hydrophobic core of the lipid bilayer. This was shown in chem- 
ical cross-linking experiments in which the signal sequence and the hydrocarbon 
chains of lipids were covalently linked together. When the nascent polypeptide 
chain grows long enough, the ER signal peptidase cleaves off the signal sequence 
and releases it from the pore into the membrane, where it is rapidly degraded 
to amino acids by other proteases in the ER membrane. To release the signal 
sequence into the membrane, the translocator opens laterally along the seam (see 
Figures 12-35 and 12-39). The translocator is therefore gated in two directions: 
it opens to form a pore across the membrane to let the hydrophilic portions of 
proteins cross the lipid bilayer, and it opens laterally within the membrane to let 
hydrophobic portions of proteins partition into the lipid bilayer. Lateral gating of 
the pore is an essential step during the integration of transmembrane proteins. 
The integration of membrane proteins requires that some parts of the poly- 
peptide chain be translocated across the lipid bilayer whereas others are not. 
Despite this additional complexity, all modes of insertion of membrane proteins 
are simply variants of the sequence of events just described for transferring a sol- 
uble protein into the lumen of the ER. We begin by describing the three ways in 
Figure 12-42 How a single-pass 
transmembrane protein with a cleaved 
ER signal sequence is integrated into 
the ER membrane. In this protein, the 
co-translational translocation process 
is initiated by an N-terminal ER signal 
sequence (red) that functions as a start-transfer 
signal, opening the translocator as 
in Figure 12-35. In addition to this start-transfer 
sequence, however, the protein 
also contains a stop-transfer sequence 
(orange); when this sequence enters the 
translocator and interacts with a binding 
site within the pore, the translocator opens 
at the seam and discharges the protein 
laterally into the lipid bilayer, where the 
stop-transfer sequence remains to anchor 
the protein in the membrane. (In this 
figure and the two figures that follow, the 
ribosomes have been omitted for clarity.) 
which single-pass transmembrane proteins (see Figure 10-17) become inserted 
into the ER membrane. 

In the simplest case, an N-terminal signal sequence initiates translocation, just 
as for a soluble protein, but an additional hydrophobic segment in the polypep- 
tide chain stops the transfer process before the entire polypeptide chain is trans- 
located. This stop-transfer signal anchors the protein in the membrane after the 
ER signal sequence (the start-transfer signal) has been cleaved off and released 
from the translocator (Figure 12-42). The lateral gating mechanism transfers 
the stop-transfer sequence into the bilayer, where it remains as a single α-helical 
membrane-spanning segment, with the N-terminus of the protein on the lumenal 
side of the membrane and the C-terminus on the cytosolic side. 

In the other two cases, the signal sequence is internal, rather than at the N-ter- 
minal end of the protein. As for an N-terminal ER signal sequence, the SRP binds 
to an internal signal sequence by recognizing its hydrophobic α-helical features. 
The SRP brings the ribosome making the protein to the ER membrane, and the 
ER signal sequence then serves as a start-transfer signal that initiates the pro- 
tein’s translocation. After release from the translocator, the internal start-transfer 
sequence remains in the lipid bilayer as a single membrane-spanning α helix. 

Internal start-transfer sequences can bind to the translocation apparatus in 
either of two orientations; this in turn determines which protein segment (the one 
preceding or the one following the start-transfer sequence) is moved across the 
membrane into the ER lumen. In one case, the resulting membrane protein has its 
C-terminus on the lumenal side (pathway A in Figure 12-43), while in the other, it 
has its N-terminus on the lumenal side (pathway B in Figure 12-43). The orienta- 
tion of the start-transfer sequence depends on the distribution of nearby charged 
amino acids, as described in the figure legend. 
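
The charge rule stated in the legend to Figure 12-43 can be written as a small decision function. This is a minimal sketch under stated assumptions, not a validated predictor: the function name, the flank width of five residues (the legend says only "immediately" preceding or following), the choice of lysine and arginine as the positively charged amino acids, and the handling of ties are all assumptions made for illustration.

```python
POSITIVE = set("KR")  # lysine and arginine; histidine is left out (assumption)

def start_transfer_orientation(seq: str, core_start: int, core_end: int,
                               flank: int = 5) -> str:
    """Apply the flanking-charge rule to an internal start-transfer sequence.

    seq        -- one-letter amino acid sequence
    core_start -- 0-based index of the first residue of the hydrophobic core
    core_end   -- 0-based index one past the last residue of the core
    flank      -- residues inspected on each side (assumed value)
    """
    seq = seq.upper()
    before = seq[max(0, core_start - flank):core_start]
    after = seq[core_end:core_end + flank]
    n_before = sum(aa in POSITIVE for aa in before)
    n_after = sum(aa in POSITIVE for aa in after)
    if n_before > n_after:
        # More positive charge preceding the core: pathway A.
        return "A: C-terminus in ER lumen"
    if n_after > n_before:
        # More positive charge following the core: pathway B.
        return "B: N-terminus in ER lumen"
    # The legend does not say what happens when the counts are equal.
    return "undetermined"

if __name__ == "__main__":
    core = "L" * 20  # toy hydrophobic core
    print(start_transfer_orientation("MKRKR" + core + "DESGT", 5, 25))
    print(start_transfer_orientation("MDESG" + core + "KRKRT", 5, 25))
```

The toy sequences are invented; with real proteins the boundaries of the hydrophobic core would first have to be located, for example from a hydropathy plot.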


Combinations of Start-Transfer and Stop-Transfer Signals 
Determine the Topology of Multipass Transmembrane Proteins 


In multipass transmembrane proteins, the polypeptide chain passes back and 
forth repeatedly across the lipid bilayer as hydrophobic α helices (see Figure 
10-17). It is thought that an internal signal sequence serves as a start-transfer 
signal in these proteins to initiate translocation, which continues until the trans- 
locator encounters a stop-transfer sequence; in double-pass transmembrane 
proteins, for example, the polypeptide can then be released into the bilayer 
(Figure 12-44). In more complex multipass proteins, in which many hydrophobic 
α helices span the bilayer, a second start-transfer sequence reinitiates 
translocation further down the polypeptide chain until the next stop-transfer 
sequence causes polypeptide release, and so on for subsequent start-transfer and 
stop-transfer sequences (Figure 12-45 and Movie 12.5). 

Hydrophobic start-transfer and stop-transfer signal sequences both act to fix 
the topology of the protein in the membrane by locking themselves into the membrane 
as membrane-spanning α helices; and they can do this in either orientation. 
Whether a given hydrophobic signal sequence functions as a start-transfer 
or stop-transfer sequence must depend on its location in a polypeptide chain, 
since its function can be switched by changing its location in the protein by using 
recombinant DNA techniques. Thus, the distinction between start-transfer and 
stop-transfer sequences results mostly from their relative order in the growing 
polypeptide chain. It seems that the SRP begins scanning an unfolded polypep- 
tide chain for hydrophobic segments at its N-terminus and proceeds toward the 
C-terminus, in the direction that the protein is synthesized. By recognizing the 
first appropriate hydrophobic segment to emerge from the ribosome, the SRP 
sets the “reading frame” for membrane integration: after the SRP initiates trans- 
location, the translocator recognizes the next appropriate hydrophobic segment 
in the direction of transfer as a stop-transfer sequence, causing the region of the 
polypeptide chain in between to be threaded across the membrane. A similar 


Figure 12-43 Integration of a single- 
pass transmembrane protein with 
an internal signal sequence into the 
ER membrane. An internal ER signal 
sequence that functions as a start-transfer 
signal can bind to the translocator in one 
of two ways, leading to a membrane 
protein that has either its C-terminus 
(pathway A) or its N-terminus (pathway 
B) in the ER lumen. Proteins are directed 
into either pathway by features in the 
polypeptide chain flanking the internal 
start-transfer sequence: if there are 
more positively charged amino acids 
immediately preceding the hydrophobic 
core of the start-transfer sequence than 
there are following it, the membrane 
protein is inserted into the translocator 
in the orientation shown in pathway 
A, whereas if there are more positively 
charged amino acids immediately following 
the hydrophobic core of the start-transfer 
sequence than there are preceding it, 
the membrane protein is inserted into 
the translocator in the orientation shown 
in pathway B. Because translocation 
cannot start before a start-transfer 
sequence appears outside the ribosome, 
translocation of the N-terminal portion of 
the protein shown in (B) can occur only 
after this portion has been fully synthesized. 
Note that there are two ways to insert a 
single-pass membrane-spanning protein 
whose N-terminus is located in the ER 
lumen: that shown in Figure 12-42 and that 
shown here in (B). 
scanning process continues until all of the hydrophobic regions in the protein 
have been inserted into the membrane as transmembrane α helices. 
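
The bookkeeping behind this scanning model can be sketched in a few lines. The function below is a simplified illustration, not a topology predictor: it assumes that every hydrophobic segment ends up spanning the membrane, that each spanning segment places the next hydrophilic stretch on the opposite side, and that the first segment is an internal start-transfer sequence whose orientation is supplied by the caller. Proteins with a cleaved N-terminal signal sequence (Figure 12-42) are not covered, and the name `multipass_topology` is hypothetical.

```python
def multipass_topology(n_segments: int, n_terminus: str = "cytosol"):
    """Assign roles to hydrophobic segments and sides to hydrophilic stretches.

    Returns (roles, sides):
      roles[i] -- "start-transfer" or "stop-transfer" for hydrophobic segment i
      sides[j] -- "cytosol" or "lumen" for hydrophilic stretch j, where
                  sides[0] is the N-terminal tail and sides[-1] the C-terminal tail
    """
    other = {"cytosol": "lumen", "lumen": "cytosol"}
    if n_terminus not in other:
        raise ValueError("n_terminus must be 'cytosol' or 'lumen'")

    # Every membrane-spanning helix flips the side of the chain that follows it.
    sides = [n_terminus]
    for _ in range(n_segments):
        sides.append(other[sides[-1]])

    roles = []
    for i in range(n_segments):
        # The first segment always initiates translocation. A later segment is a
        # start-transfer sequence if the stretch after it ends up in the lumen,
        # and a stop-transfer sequence if that stretch stays in the cytosol.
        if i == 0 or sides[i + 1] == "lumen":
            roles.append("start-transfer")
        else:
            roles.append("stop-transfer")
    return roles, sides

if __name__ == "__main__":
    roles, sides = multipass_topology(7, "lumen")  # rhodopsin-like case
    for i, role in enumerate(roles, start=1):
        print(f"segment {i}: {role}")
    print("N-terminus:", sides[0], "| C-terminus:", sides[-1])
```

With seven segments and a lumenal N-terminus, the sketch gives a cytosolic C-terminus and three start/stop pairs after the first segment, consistent with the description of rhodopsin in Figure 12-45. With two segments and a cytosolic N-terminus, it gives the double-pass arrangement of Figure 12-44.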

Because membrane proteins are always inserted from the cytosolic side of the 
ER in this programmed manner, all copies of the same polypeptide chain will have 
the same orientation in the lipid bilayer. This generates an asymmetrical ER mem- 
brane in which the protein domains exposed on one side are different from those 
exposed on the other side. This asymmetry is maintained during the many mem- 
brane budding and fusion events that transport the proteins made in the ER to 
other cell membranes (discussed in Chapter 13). Thus, the way in which a newly 
synthesized protein is inserted into the ER membrane determines the orientation 
of the protein in all of the other membranes as well. 

When proteins are extracted with detergent from a membrane and then reconstituted 
into artificial lipid vesicles, a random mixture of right-side-out and inside-out 
protein orientations usually results. Thus, the protein asymmetry observed in 
cell membranes seems not to be an inherent property of the proteins, but instead 
results solely from the process by which proteins are inserted into the ER mem- 
brane from the cytosol. 
Figure 12-44 Integration of a double-pass 
transmembrane protein with an 
internal signal sequence into the ER 
membrane. In this protein, an internal ER 
signal sequence acts as a start-transfer 
signal (as in Figure 12-43) and initiates 
the transfer of the C-terminal part of the 
protein. At some point after a stop-transfer 
sequence has entered the translocator, 
the translocator discharges the sequence 
laterally into the membrane. 


Figure 12-45 The insertion of the 
multipass membrane protein rhodopsin 
into the ER membrane. Rhodopsin is the 
light-sensitive protein in rod photoreceptor 
cells in the mammalian retina (discussed 
in Chapter 15). (A) A hydropathy plot 
(see Figure 10-20) identifies seven short 
hydrophobic regions in rhodopsin. (B) The 
hydrophobic region nearest the N-terminus 
serves as a start-transfer sequence that 
causes the preceding N-terminal portion 
of the protein to pass across the ER 
membrane. Subsequent hydrophobic 
sequences function in alternation as start- 
transfer and stop-transfer sequences. The 
green arrows indicate the paired start and 
stop signals inserted into the translocator. 
(C) The final integrated rhodopsin has its 
N-terminus located in the ER lumen and its 
C-terminus located in the cytosol. The blue 
hexagons represent covalently attached 
oligosaccharides. 
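
A hydropathy plot of the kind mentioned in part (A) of this legend can be computed with a sliding-window average. The sketch below is one common way to do it and is not necessarily the method used for the figure: the Kyte-Doolittle hydropathy scale, the window of 19 residues, and the threshold of 1.6 are conventional choices assumed here, and the function names are hypothetical. Only the 20 standard one-letter amino acid codes are accepted.

```python
# Kyte-Doolittle hydropathy values (assumed scale; positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq: str, window: int = 19) -> list[float]:
    """Mean hydropathy of each window; entry i covers residues i..i+window-1."""
    scores = [KD[aa] for aa in seq.upper()]
    if window <= 0 or window > len(scores):
        return []
    total = sum(scores[:window])
    profile = [total / window]
    for i in range(window, len(scores)):
        # Slide the window by one residue: add the new score, drop the oldest.
        total += scores[i] - scores[i - window]
        profile.append(total / window)
    return profile

def hydrophobic_regions(profile: list[float], threshold: float = 1.6):
    """Return (first, last) window indices of each run above the threshold."""
    regions, start = [], None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:
        regions.append((start, len(profile) - 1))
    return regions

if __name__ == "__main__":
    # Toy sequence: a 20-residue leucine stretch between charged flanks.
    toy = "DEKR" * 5 + "L" * 20 + "DEKR" * 5
    profile = hydropathy_profile(toy)
    print(len(hydrophobic_regions(profile)), "candidate hydrophobic region(s)")
```

Each run above the threshold marks a candidate membrane-spanning segment; whether it actually spans the bilayer has to be established experimentally.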
ER Tail-anchored Proteins Are Integrated into the ER Membrane 
by a Special Mechanism 


Many important membrane proteins are anchored in the membrane by a C-terminal 
transmembrane, hydrophobic α helix. These ER tail-anchored proteins 
include a large number of SNARE protein subunits that guide vesicular traffic 
(discussed in Chapter 13). When such a tail-anchored protein inserts into the ER 
membrane from the cytosol, only a few amino acids that follow the transmembrane 
α helix on its C-terminal side are translocated into the ER lumen, while 
most of the protein remains in the cytosol. Because of the unique position of the 
transmembrane α helix in the protein sequence, translation terminates while the 
C-terminal amino acids that will form the transmembrane α helix have not yet 
emerged from the ribosome exit tunnel. Recognition by SRP is therefore not possible. 
It was long thought that these proteins are released from the ribosome and 
the hydrophobic C-terminal tail spontaneously partitions into the ER membrane. 
Such a mechanism could not explain, however, why ER tail-anchored proteins 
insert into the ER membrane selectively and not also into all other membranes 
in the cell. It is now clear that a specialized targeting machinery is involved that 
is fueled by ATP hydrolysis (Figure 12-46). Although the components and details 
differ, this post-translational targeting mechanism is conceptually similar to 
SRP-dependent protein targeting (see Figure 12-37). 

Not all tail-anchored proteins are inserted into the ER, however. Some pro- 
teins contain a C-terminal membrane anchor that contains additional sorting 
information that directs the protein to mitochondria or peroxisomes. How these 
proteins are sorted there remains unknown. 


Translocated Polypeptide Chains Fold and Assemble in the Lumen 
of the Rough ER 


Many of the proteins in the lumen of the ER are in transit, en route to other desti- 
nations; others, however, normally reside there and are present at high concen- 
trations. These ER resident proteins contain an ER retention signal of four amino 
acids at their C-terminus that is responsible for retaining the protein in the ER (see 
Table 12-3, p. 648; discussed in Chapter 13). Some of these proteins function as 
catalysts that help the many proteins that are translocated into the ER lumen to 
fold and assemble correctly. 

One important ER resident protein is protein disulfide isomerase (PDI), which 
catalyzes the oxidation of free sulfhydryl (SH) groups on cysteines to form disul- 
fide (S-S) bonds. Almost all cysteines in protein domains exposed to either the 
extracellular space or the lumen of organelles in the secretory and endocytic 
pathways are disulfide-bonded. By contrast, disulfide bonds form only very rarely 
in domains exposed to the cytosol, because of the reducing environment there. 
Another ER resident protein is the chaperone protein BiP. We have already discussed 
how BiP pulls proteins post-translationally into the ER through the Sec61 
ER translocator. Like other chaperones (discussed in Chapter 13), BiP recognizes 
incorrectly folded proteins, as well as protein subunits that have not yet assem- 
bled into their final oligomeric complexes. It does so by binding to exposed amino 
acid sequences that would normally be buried in the interior of correctly folded 
or assembled polypeptide chains. An example of a BiP-binding site is a stretch 
of alternating hydrophobic and hydrophilic amino acids that would normally be 
buried in a β sheet with its hydrophobic side oriented towards the hydrophobic 
core of the folded protein. The bound BiP both prevents the protein from aggre- 
gating and helps keep it in the ER (and thus out of the Golgi apparatus and later 
parts of the secretory pathway). Like some other members of the hsp70 family of 
chaperone proteins, which bind unfolded proteins and facilitate their import into 
mitochondria and chloroplasts, BiP hydrolyzes ATP to shuttle between high- and 
low-affinity binding states, which allow it to hold on to and let go of its substrate 
proteins in a dynamic cycle. 


Most Proteins Synthesized in the Rough ER Are Glycosylated by 
the Addition of a Common N-Linked Oligosaccharide 


The covalent addition of oligosaccharides to proteins is one of the major bio- 
synthetic functions of the ER. About half of the soluble and membrane-bound 
proteins that are processed in the ER—including those destined for transport to 
the Golgi apparatus, lysosomes, plasma membrane, or extracellular space—are 
glycoproteins that are modified in this way. Many proteins in the cytosol and 
nucleus are also glycosylated, but not with oligosaccharides: they carry a much 
simpler sugar modification, in which a single N-acetylglucosamine group is 
added to a serine or threonine of the protein. 

During the most common form of protein glycosylation in the ER, a pre- 
formed precursor oligosaccharide (composed of N-acetylglucosamine, mannose, 
and glucose, and containing a total of 14 sugars) is transferred en bloc to proteins. 
Because this oligosaccharide is transferred to the side-chain NH2 group of an 
asparagine in the protein, it is said to be N-linked or asparagine-linked (Figure 
12-47A). The transfer is catalyzed by a membrane-bound enzyme complex, an 
Figure 12-47 N-linked protein 
glycosylation in the rough ER. (A) Almost 
as soon as a polypeptide chain enters 
the ER lumen, it is glycosylated on target 
asparagine amino acids. The precursor 
oligosaccharide (shown in color) is attached 
only to asparagines in the sequences Asn- 
X-Ser and Asn-X-Thr (where X is any amino 
acid except proline). These sequences 
occur much less frequently in glycoproteins 
than in nonglycosylated cytosolic proteins. 
Evidently there has been selective 
pressure against these sequences during 
protein evolution, presumably because 
glycosylation at too many sites would 
interfere with protein folding. The five sugars 
in the gray box form the core region of this 
oligosaccharide. For many glycoproteins, 
only the core sugars survive the extensive 
oligosaccharide trimming that takes place 
in the Golgi apparatus. (B) The precursor 
oligosaccharide is transferred from a 
dolichol lipid anchor to the asparagine as 
an intact unit in a reaction catalyzed by a 
transmembrane oligosacchary! transferase 
enzyme. One copy of this enzyme is 
associated with each protein translocator 
in the ER membrane. (The translocator is 
not shown.) Oligosaccharyl transferase 
contains 13 transmembrane a helices and 
a large ER lumenal domain that contains 
its substrate-binding sites. The asparagine 
binds a tunnel that penetrates the enzyme 
interior. There, the amino group of the 
asparagine is twisted out of the plane that 
stabilizes the otherwise poorly reactive 
amide bond, activating it for reaction with 
the dolichol-oligosaccharide. The structure 
shown is of a prokaryotic homolog that 
closely resembles the catalytic subunit of 
the eukaryotic oligosaccharyl transferase. 
(PDB code: 3RCE.) 

oligosaccharyl transferase, which has its active site exposed on the lumenal side 
of the ER membrane; this explains why cytosolic proteins are not glycosylated 
in this way. A special lipid molecule called dolichol anchors the precursor oli- 
gosaccharide in the ER membrane. The precursor oligosaccharide is transferred 
to the target asparagine in a single enzymatic step immediately after that amino 
acid has reached the ER lumen during protein translocation. The precursor oligo- 
saccharide is linked to the dolichol lipid by a high-energy pyrophosphate bond, 
which provides the activation energy that drives the glycosylation reaction (Figure 
12-47B). One copy of oligosaccharyl transferase is associated with each protein 
translocator, allowing it to scan and glycosylate the incoming polypeptide chains 
efficiently. 
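
The sequence rule given in the legend to Figure 12-47 (Asn-X-Ser or Asn-X-Thr, where X is any amino acid except proline) is simple enough to express as a pattern search. The following Python sketch is illustrative only: the function name `find_sequons` and the example peptide are hypothetical, and a match marks only a potential site, since the presence of the sequence does not guarantee that the asparagine is actually glycosylated.

```python
import re

# Asn-X-Ser or Asn-X-Thr in one-letter code, where X is anything but proline.
# A lookahead is used so that overlapping candidate sites are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of asparagines in N-X-S/T (X != P) sequences."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical peptide, invented for illustration.
peptide = "MKNPSANGTLLNNSTQ"
print(find_sequons(peptide))  # prints [7, 12, 13]
```

The asparagine at position 3 is skipped because proline occupies the X position. Positions 12 and 13 show why a lookahead is needed: the two candidate sites overlap.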

The precursor oligosaccharide is built up sugar by sugar on the mem- 
brane-bound dolichol lipid and is then transferred to a protein. The sugars are 
first activated in the cytosol by the formation of nucleotide (UDP or GDP)-sugar 
intermediates, which then donate their sugar (directly or indirectly) to the lipid in 
an orderly sequence. Part way through this process, the lipid-linked oligosaccha- 
ride is flipped, with the help of a transporter, from the cytosolic to the lumenal 
side of the ER membrane (Figure 12-48). 

All of the diversity of the N-linked oligosaccharide structures on mature glyco- 
proteins results from the later modification of the original precursor oligosaccha- 
ride. While still in the ER, three glucoses (see Figure 12-47) and one mannose are 
quickly removed from the oligosaccharides of most glycoproteins. We shall return 
to the importance of glucose trimming shortly. This oligosaccharide “trimming,” 
or “processing,” continues in the Golgi apparatus, as we discuss in Chapter 13. 

The N-linked oligosaccharides are by far the most common oligosaccharides, 
being found on 90% of all glycoproteins. Less frequently, oligosaccharides are 
linked to the hydroxyl group on the side chain of a serine, threonine, or hydroxy- 
lysine amino acid. A first sugar of these O-linked oligosaccharides is added in the 
ER and the oligosaccharide is then further extended in the Golgi apparatus (see 
Figure 13-32). 

Figure 12-48 Synthesis of the lipid-linked precursor oligosaccharide in the
rough ER membrane. The oligosaccharide is assembled sugar by sugar onto the
carrier lipid dolichol (a polyisoprenoid; see Panel 2-5, pp. 98-99). Dolichol
is long and very hydrophobic: its 22 five-carbon units can span the thickness
of a lipid bilayer more than three times, so that the attached oligosaccharide
is firmly anchored in the membrane. The first sugar is linked to dolichol by a
pyrophosphate bridge. This high-energy bond activates the oligosaccharide for
its eventual transfer from the lipid to an asparagine side chain of a growing
polypeptide on the lumenal side of the rough ER. As indicated, the synthesis of
the oligosaccharide starts on the cytosolic side of the ER membrane and
continues on the lumenal face after the (Man)5(GlcNAc)2 lipid intermediate is
flipped across the bilayer by a transporter (which is not shown). All the
subsequent glycosyl transfer reactions on the lumenal side of the ER involve
transfers from dolichol-P-glucose and dolichol-P-mannose; these activated,
lipid-linked monosaccharides are synthesized from dolichol phosphate and
UDP-glucose or GDP-mannose (as appropriate) on the cytosolic side of the ER and
are then flipped across the ER membrane. GlcNAc = N-acetylglucosamine; Man =
mannose; Glc = glucose.

Oligosaccharides Are Used as Tags to Mark the State of Protein 
Folding 


It has long been debated why glycosylation is such a common modification of pro- 
teins that enter the ER. One particularly puzzling observation has been that some 
proteins require N-linked glycosylation for proper folding in the ER, yet the precise 
location of the oligosaccharides attached to the protein’s surface does not seem to 
matter. A clue to the role of glycosylation in protein folding came from studies of 
two ER chaperone proteins, which are called calnexin and calreticulin because 
they require Ca2+ for their activities. These chaperones are carbohydrate-binding 
proteins, or lectins, which bind to oligosaccharides on incompletely folded pro- 
teins and retain them in the ER. Like other chaperones, they prevent incompletely 
folded proteins from irreversibly aggregating. Both calnexin and calreticulin also 
promote the association of incompletely folded proteins with another ER chaper- 
one, which binds to cysteines that have not yet formed disulfide bonds. 

Calnexin and calreticulin recognize N-linked oligosaccharides that contain 
a single terminal glucose, and they therefore bind proteins only after two of the 
three glucoses on the precursor oligosaccharide have been removed during glu- 
cose trimming by ER glucosidases. When the third glucose has been removed, the 
glycoprotein dissociates from its chaperone and can leave the ER. 

How, then, do calnexin and calreticulin distinguish properly folded from 
incompletely folded proteins? The answer lies in yet another ER enzyme, a glu- 
cosyl transferase that keeps adding a glucose to those oligosaccharides that have 
lost their last glucose. It adds the glucose, however, only to oligosaccharides that 
are attached to unfolded proteins. Thus, an unfolded protein undergoes continu- 
ous cycles of glucose trimming (by glucosidase) and glucose addition (by gluco- 
syl transferase), maintaining an affinity for calnexin and calreticulin until it has 
achieved its fully folded state (Figure 12-49). 
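
The cycle of glucose trimming and glucose addition can be traced as a small state machine. The Python sketch below is a deliberately simplified illustration, not a quantitative model: `calnexin_cycle`, `rounds_to_fold`, and `max_rounds` are hypothetical names, and folding is reduced to a single assumed number of chaperone-binding rounds.

```python
def calnexin_cycle(rounds_to_fold: int, max_rounds: int = 50) -> list[str]:
    """Trace one glycoprotein through the glucose trimming/addition cycle.

    rounds_to_fold is a hypothetical parameter: the number of rounds of
    chaperone binding after which the protein counts as fully folded.
    """
    glucoses = 3   # the precursor oligosaccharide carries three glucoses
    glucoses -= 2  # ER glucosidases remove two of them
    log = [f"trimmed to {glucoses} glucose"]
    for round_no in range(1, max_rounds + 1):
        # A single terminal glucose is what calnexin and calreticulin bind.
        log.append(f"round {round_no}: bound to chaperone")
        glucoses -= 1  # glucosidase removes the last glucose, releasing the protein
        if round_no >= rounds_to_fold:
            # Folded proteins are not substrates for the glucosyl transferase.
            log.append("folded: released, can leave the ER")
            return log
        glucoses += 1  # glucosyl transferase re-marks the unfolded protein
        log.append("unfolded: glucose re-added")
    log.append("not folded within max_rounds")
    return log


for line in calnexin_cycle(rounds_to_fold=2):
    print(line)
```

The point of the sketch is that the glucosyl transferase, not the chaperone, makes the folding decision: the loop exits only when the transferase declines to add back a glucose.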


Improperly Folded Proteins Are Exported from the ER and 
Degraded in the Cytosol 


Despite all the help from chaperones, many protein molecules (more than 80% 
for some proteins) translocated into the ER fail to achieve their properly folded 
or oligomeric state. Such proteins are exported from the ER back into the cyto- 
sol, where they are degraded in proteasomes (discussed in Chapter 6). In many 
ways, the mechanism of retrotranslocation is similar to other post-translational 

Figure 12-49 The role of N-linked glycosylation in ER protein folding. The
ER-membrane-bound chaperone protein calnexin binds to incompletely folded
proteins containing one terminal glucose on N-linked oligosaccharides, trapping
the protein in the ER. Removal of the terminal glucose by a glucosidase
releases the protein from calnexin. A glucosyl transferase is the crucial
enzyme that determines whether the protein is folded properly or not: if the
protein is still incompletely folded, the enzyme transfers a new glucose from
UDP-glucose to the N-linked oligosaccharide, renewing the protein’s affinity
for calnexin and retaining it in the ER. The cycle repeats until the protein
has folded completely. Calreticulin functions similarly, except that it is a
soluble ER resident protein. Another ER chaperone, ERp57 (not shown),
collaborates with calnexin and calreticulin in retaining an incompletely folded
protein in the ER. ERp57 recognizes free sulfhydryl groups, which are a sign of
incomplete disulfide bond formation.




modes of translocation. For example, like translocation into mitochondria or 
chloroplasts, chaperone proteins are necessary to keep the polypeptide chain in 
an unfolded state prior to and during translocation. Similarly, a source of energy is 
required to provide directionality to the transport and to pull the protein into the 
cytosol. Finally, a translocator is necessary. 

Selecting proteins from the ER for degradation is a challenging process: mis- 
folded proteins or unassembled protein subunits should be degraded, but folding 
intermediates of newly made proteins should not. Help in making this distinction 
comes from the N-linked oligosaccharides, which serve as timers that measure 
how long a protein has spent in the ER. The slow trimming of a particular man- 
nose on the core oligosaccharide tree by an enzyme (a mannosidase) in the ER 
creates a new oligosaccharide structure that ER-lumenal lectins of the retrotrans- 
location apparatus recognize. Proteins that fold and exit from the ER faster than 
the mannosidase can remove its target mannose therefore escape degradation. 
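
The timer amounts to a race between folding and mannose trimming. As a rough illustration, the Python sketch below treats both as independent first-order processes. That model, the function name `escape_probability`, and the rate values are assumptions made for the example and do not come from the text.

```python
def escape_probability(k_fold: float, k_trim: float) -> float:
    """Chance that folding happens before mannose trimming.

    Assumes two independent first-order processes (an illustrative model):
    folding at rate k_fold and mannosidase trimming at rate k_trim, both in
    the same units (for example, per minute).
    """
    if k_fold < 0 or k_trim < 0 or k_fold + k_trim == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return k_fold / (k_fold + k_trim)


# Hypothetical rates: a fast folder and a slow folder facing the same timer.
for k_fold in (10.0, 0.25):
    p = escape_probability(k_fold, k_trim=1.0)
    print(f"k_fold={k_fold}: escapes degradation with probability {p:.2f}")
```

Under these assumptions only the ratio of the two rates matters, which is one way to see why a slow mannosidase spares proteins that fold quickly but still catches those that linger.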

In addition to the lectins in the ER that recognize the oligosaccharides, chap- 
erones and protein disulfide isomerases (enzymes mentioned earlier that catalyze 
the formation and breakage of S-S bonds) associate with the proteins that must be 
degraded. The chaperones prevent the unfolded proteins from aggregating, and 
the disulfide isomerases break disulfide bonds that may have formed incorrectly, 
so that a linear polypeptide chain can be translocated back into the cytosol. 

Multiple translocator complexes move different proteins from the ER mem- 
brane or lumen into the cytosol. A common feature is that they each contain an 
E3 ubiquitin ligase enzyme, which attaches polyubiquitin tags to the unfolded 
proteins as they emerge into the cytosol, marking them for destruction. Fueled 
by the energy derived from ATP hydrolysis, a hexameric ATPase of the family of 
AAA-ATPases (see Figure 6-85) pulls the unfolded protein through the transloca- 
tor into the cytosol. An N-glycanase removes its oligosaccharide chains en bloc. 
Guided by its ubiquitin tag, the deglycosylated polypeptide is rapidly fed into pro- 
teasomes, where it is degraded (Figure 12-50). 


Misfolded Proteins in the ER Activate an Unfolded Protein 
Response 


Cells carefully monitor the amount of misfolded protein in various compart- 
ments. An accumulation of misfolded proteins in the cytosol, for example, triggers 
a heat-shock response (discussed in Chapter 6), which stimulates the transcription 
of genes encoding cytosolic chaperones that help to refold the proteins. Similarly, 
an accumulation of misfolded proteins in the ER triggers an unfolded protein 
response, which includes an increased transcription of genes encoding proteins 
involved in retrotranslocation and protein degradation in the cytosol, ER chaper- 
ones, and many other proteins that help to increase the protein-folding capacity 
of the ER. 

Figure 12-50 The export and degradation of misfolded ER proteins. Misfolded
soluble proteins in the ER lumen are recognized and targeted to a translocator
complex in the ER membrane. They first interact in the ER lumen with
chaperones, disulfide isomerases, and lectins. They are then exported into the
cytosol through the translocator. In the cytosol, they are ubiquitylated,
deglycosylated, and degraded in proteasomes. Misfolded membrane proteins follow
a similar pathway but use a different translocator.

How do misfolded proteins in the ER signal to the nucleus? There are three par- 
allel pathways that execute the unfolded protein response (Figure 12-51A). The 
first pathway, which was initially discovered in yeast cells, is particularly remark- 
able. Misfolded proteins in the ER activate a transmembrane protein kinase in the 
ER, called IRE1, which causes the kinase to oligomerize and phosphorylate itself. 
(Some cell-surface receptor kinases in the plasma membrane are activated in a 

Figure 12-51 The unfolded protein response. (A) By three parallel intracellular
signaling pathways, the accumulation of misfolded proteins in the ER lumen
signals to the nucleus to activate the transcription of genes that encode
proteins that help the cell cope with misfolded proteins in the ER.
(B) Regulated RNA splicing is a key regulatory switch in pathway 1 of the
unfolded protein response (Movie 12.6).

similar way, as discussed in Chapter 15.) The oligomerization and autophosphor- 
ylation of IRE1 activates an endoribonuclease domain in the cytosolic portion 
of the same molecule, which cleaves a specific cytosolic mRNA molecule at two 
positions, excising an intron. (This is a unique exception to the rule that introns 
are spliced out while the RNA is still in the nucleus.) The separated exons are then 
joined by an RNA ligase, generating a spliced mRNA, which is translated to pro- 
duce an active transcription regulatory protein. This protein activates the tran- 
scription of genes encoding the proteins that help mediate the unfolded protein 
response (Figure 12-51B). 
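
The splicing step in this pathway (cleavage at two positions, removal of the intron, and ligation of the exons) can be pictured as a string operation. The Python sketch below is purely schematic: `ire1_splice`, the cut coordinates, and the sequence are hypothetical, and nothing here reflects how the enzyme recognizes its cleavage sites.

```python
def ire1_splice(mrna: str, cut1: int, cut2: int) -> tuple[str, str]:
    """Cleave at two positions, excise the intron, and join the exons.

    cut1 and cut2 are 0-based string offsets chosen for illustration;
    they stand in for the two endoribonuclease cleavage sites.
    """
    if not 0 <= cut1 <= cut2 <= len(mrna):
        raise ValueError("cut positions must satisfy 0 <= cut1 <= cut2 <= len(mrna)")
    exon1, intron, exon2 = mrna[:cut1], mrna[cut1:cut2], mrna[cut2:]
    # The RNA ligase step: the separated exons are joined into one mRNA.
    return exon1 + exon2, intron


# Invented sequence: upper case marks exons, lower case the excised intron.
spliced, intron = ire1_splice("AUGGCUcagcacucUGCUGA", 6, 14)
print(spliced, intron)
```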

Misfolded proteins also activate a second transmembrane kinase in the ER, 
PERK, which inhibits a translation initiation factor by phosphorylating it, thereby 
reducing the production of new proteins throughout the cell. One consequence 
of the reduction in protein synthesis is to reduce the flux of proteins into the ER, 
thereby reducing the load of proteins that need to be folded there. Some pro- 
teins, however, are preferentially translated when translation initiation factors are 
scarce (discussed in Chapter 7, p. 424), and one of these is a transcription regula- 
tor that helps activate the transcription of the genes encoding proteins active in 
the unfolded protein response. 

Finally, a third transcription regulator, ATF6, is initially synthesized as a trans- 
membrane ER protein. Because it is embedded in the ER membrane, it cannot 
activate the transcription of genes in the nucleus. When misfolded proteins accu- 
mulate in the ER, however, the ATF6 protein is transported to the Golgi appara- 
tus, where it encounters proteases that cleave off its cytosolic domain, which can 
now migrate to the nucleus and help activate the transcription of genes encoding 
proteins involved in the unfolded protein response. (This mechanism is similar 
to that described in Figure 12-16 for activation of the transcription regulator that 
controls cholesterol biosynthesis.) The relative importance of each of these three 
pathways in the unfolded protein response differs in different cell types, enabling 
each cell type to tailor the unfolded protein response to its particular needs. 


Some Membrane Proteins Acquire a Covalently Attached 
Glycosylphosphatidylinositol (GPI) Anchor 


As discussed in Chapter 10, several cytosolic enzymes catalyze the covalent addi- 
tion of a single fatty acid chain or prenyl group to selected proteins. The attached 
lipids help direct and attach these proteins to cell membranes. A related process 
is catalyzed by ER enzymes that covalently attach a glycosylphosphatidylinosi- 
tol (GPI) anchor to the C-terminus of some membrane proteins destined for the 
plasma membrane. This linkage forms in the lumen of the ER, where, at the same 
time, the transmembrane segment of the protein is cleaved off (Figure 12-52). A 
large number of plasma membrane proteins are modified in this way. Since they 
are attached to the exterior of the plasma membrane only by their GPI anchors, 

Figure 12-52 The attachment of a GPI anchor to a protein in the ER.
GPI-anchored proteins are targeted to the ER membrane by an N-terminal signal
sequence (not shown), which is removed (see Figure 12-42). Immediately after
the completion of protein synthesis, the precursor protein remains anchored in
the ER membrane by a hydrophobic C-terminal sequence of 15-20 amino acids; the
rest of the protein is in the ER lumen. Within less than a minute, an enzyme in
the ER cuts the protein free from its membrane-bound C-terminus and
simultaneously attaches the new C-terminus to an amino group on a preassembled
GPI intermediate. The sugar chain contains an inositol attached to the lipid
from which the GPI anchor derives its name. It is followed by a glucosamine and
three mannoses. The terminal mannose links to a phosphoethanolamine that
provides the amino group to attach the protein. The signal that specifies this
modification is contained within the hydrophobic C-terminal sequence and a few
amino acids adjacent to it on the lumenal side of the ER membrane; if this
signal is added to other proteins, they too become modified in this way.
Because of the covalently linked lipid anchor, the protein remains
membrane-bound, with all of its amino acids exposed initially on the lumenal
side of the ER and eventually on the exterior of the plasma membrane.

they can in principle be released from cells in soluble form in response to signals 
that activate a specific phospholipase in the plasma membrane. Trypanosome 
parasites, for example, use this mechanism to shed their coat of GPI-anchored 
surface proteins when attacked by the immune system. GPI anchors may also be 
used to direct plasma membrane proteins into lipid rafts and thus segregate the 
proteins from other membrane proteins (see Figure 10-13). 


The ER Assembles Most Lipid Bilayers 


The ER membrane is the site of synthesis of nearly all of the cell’s major classes 
of lipids, including both phospholipids and cholesterol, required for the produc- 
tion of new cell membranes. The major phospholipid made is phosphatidylcho- 
line, which can be formed in three steps from choline, two fatty acids, and glycerol 
phosphate (Figure 12-53). Each step is catalyzed by enzymes in the ER mem- 
brane, which have their active sites facing the cytosol, where all of the required 
metabolites are found. Thus, phospholipid synthesis occurs exclusively in the 
cytosolic leaflet of the ER membrane. Because fatty acids are not soluble in water, 
they are shepherded from their sites of synthesis to the ER by a fatty acid binding 
protein in the cytosol. After arrival in the ER membrane and activation with CoA, 
acyl transferases successively add two fatty acids to glycerol phosphate to produce 
phosphatidic acid. Phosphatidic acid is sufficiently water-insoluble to remain in 
the lipid bilayer; it cannot be extracted from the bilayer by the fatty acid binding 
proteins. It is therefore this first step that enlarges the ER lipid bilayer. The later 
steps determine the head group of a newly formed lipid molecule and therefore 
the chemical nature of the bilayer, but they do not result in net membrane growth. 
The two other major membrane phospholipids—phosphatidylethanolamine and 
phosphatidylserine (see Figure 10-3)—as well as the minor phospholipid phos- 
phatidylinositol (PI), are all synthesized in this way. 

Because phospholipid synthesis takes place in the cytosolic leaflet of the ER 
lipid bilayer, there needs to be a mechanism that transfers some of the newly 
formed phospholipid molecules to the lumenal leaflet of the bilayer. In synthetic 
lipid bilayers, lipids do not “flip-flop” in this way (see Figure 10-10). In the ER, 
however, phospholipids equilibrate across the membrane within minutes, which 
is almost 100,000 times faster than can be accounted for by spontaneous “flip- 
flop.” This rapid trans-bilayer movement is mediated by a poorly characterized 

Figure 12-53 The synthesis of phosphatidylcholine. As illustrated, this
phospholipid is synthesized from glycerol 3-phosphate,
cytidine-diphosphocholine (CDP-choline), and fatty acids delivered to the ER by
a cytosolic fatty acid binding protein.

phospholipid translocator called a scramblase, which nonselectively equilibrates 
phospholipids between the two leaflets of the lipid bilayer (Figure 12-54). Thus, 
the different types of phospholipids are thought to be equally distributed between 
the two leaflets of the ER membrane. 

The plasma membrane contains a different type of phospholipid translocator 
that belongs to the family of P-type pumps (discussed in Chapter 11). These flip- 
pases specifically recognize those phospholipids that contain free amino groups 
in their head groups (phosphatidylserine and phosphatidylethanolamine—see 
Figure 10-3) and transfer them from the extracellular to the cytosolic leaflet, 
using the energy of ATP hydrolysis. The plasma membrane therefore has a highly 
asymmetric phospholipid composition, which is actively maintained by the flip- 
pases (see Figure 10-15). The plasma membrane also contains a scramblase but, 
in contrast to the ER scramblase, which is always active, the plasma membrane 
enzyme is regulated and only activated in some situations, such as in apoptosis 
and in activated platelets, where it acts to abolish the lipid bilayer asymmetry; the 
resulting exposure of phosphatidylserine on the surface of apoptotic cells serves 
as a signal for phagocytic cells to ingest and degrade the dead cell. 
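
The contrast between a nonselective scramblase and a directional flippase can be made concrete with a two-compartment model. The Python sketch below is a minimal illustration under stated assumptions: the function `simulate_leaflets`, the rate constants `k_scramble` and `k_flip`, and the starting distributions are all hypothetical, and the amounts are fractions of a single lipid species such as phosphatidylserine.

```python
def simulate_leaflets(noncytosolic, cytosolic, k_scramble, k_flip,
                      steps=2000, dt=0.01):
    """Euler integration of one lipid species across the two leaflets.

    k_scramble: hypothetical rate constant for bidirectional, nonselective
                exchange between leaflets (scramblase).
    k_flip:     hypothetical rate constant for one-way, ATP-driven transfer
                from the noncytosolic to the cytosolic leaflet (flippase).
    """
    for _ in range(steps):
        # Scramblase: net flow runs down the concentration difference.
        net_scramble = k_scramble * (noncytosolic - cytosolic) * dt
        # Flippase: flow is always toward the cytosolic leaflet.
        flipped = k_flip * noncytosolic * dt
        moved = net_scramble + flipped
        noncytosolic -= moved
        cytosolic += moved
    return noncytosolic, cytosolic


# ER-like case: lipid is made in the cytosolic leaflet, scramblase only.
er = simulate_leaflets(0.0, 1.0, k_scramble=1.0, k_flip=0.0)
# Plasma-membrane-like case: flippase only, starting from an even split.
pm = simulate_leaflets(0.5, 0.5, k_scramble=0.0, k_flip=1.0)
print("ER (lumenal, cytosolic):", er)
print("plasma membrane (extracellular, cytosolic):", pm)
```

In this toy model the scramblase alone drives the two leaflets toward equal amounts, whereas the flippase alone empties the noncytosolic leaflet. Switching on `k_scramble` in the second case would erode the asymmetry, in the spirit of the regulated plasma membrane scramblase described above.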

The ER also produces cholesterol and ceramide (Figure 12-55). Ceramide is 
made by condensing the amino acid serine with a fatty acid to form the amino 
alcohol sphingosine (see Figure 10-3); a second fatty acid is then covalently added 
to form ceramide. The ceramide is exported to the Golgi apparatus, where it serves 
as a precursor for the synthesis of two types of lipids: oligosaccharide chains are 
added to form glycosphingolipids (glycolipids; see Figure 10-16), and phospho- 
choline head groups are transferred from phosphatidylcholine to other ceramide 
molecules to form sphingomyelin (discussed in Chapter 10). Thus, both glycolip- 
ids and sphingomyelin are produced relatively late in the process of membrane 
synthesis. Because they are produced by enzymes that have their active sites 
exposed to the Golgi lumen, they are found exclusively in the noncytosolic leaflet 
of the lipid bilayers that contain them. 


Figure 12-54 The role of phospholipid translocators in lipid bilayer
synthesis. (A) Because new lipid molecules are added only to the cytosolic half
of the ER membrane bilayer and lipid molecules do not flip spontaneously from
one monolayer to the other, a transmembrane phospholipid translocator (called a
scramblase) is required to transfer lipid molecules from the cytosolic half to
the lumenal half so that the membrane grows as a bilayer. The scramblase is not
specific for particular phospholipid head groups and therefore equilibrates the
different phospholipids between the two monolayers. (B) Fueled by ATP
hydrolysis, a head-group-specific flippase in the plasma membrane actively
flips phosphatidylserine and phosphatidylethanolamine directionally from the
extracellular to the cytosolic leaflet, creating the characteristically
asymmetric lipid bilayer of the plasma membrane of animal cells (see Figure
10-15).

Figure 12-55 The structure of ceramide. 

As discussed in Chapter 13, the plasma membrane and the membranes of the 
Golgi apparatus, lysosomes, and endosomes all form part of a membrane system 
that communicates with the ER by means of transport vesicles, which transfer 
both proteins and lipids. Mitochondria and plastids, however, do not belong to 
this system, and they therefore require different mechanisms to import proteins 
and lipids for growth. We have already seen that they import most of their proteins 
from the cytosol. Although mitochondria modify some of the lipids they import, 
they do not synthesize lipids de novo; instead, their lipids have to be imported 
from the ER, either directly or indirectly by way of other cell membranes. In either 
case, special mechanisms are required for the transfer. 

The details of how lipid distribution between different membranes is catalyzed 
and regulated are not known. Water-soluble carrier proteins called phospholipid 
exchange proteins (or phospholipid transfer proteins) are thought to transfer indi- 
vidual phospholipid molecules between membranes, functioning much like fatty 
acid binding proteins that shepherd fatty acids through the cytosol (see Figure 
12-54). In addition, mitochondria are often seen in close juxtaposition to ER 
membranes in electron micrographs, and specific junction complexes have been 
identified that hold the ER and outer mitochondrial membrane in close proxim- 
ity. It is thought that these junction complexes provide specific contact-depen- 
dent lipid transfer mechanisms that operate between these adjacent membranes. 


Summary 


The extensive ER network serves as a factory for the production of almost all of the 
cell’s lipids. In addition, a major portion of the cell’s protein synthesis occurs on the 
cytosolic surface of the rough ER: virtually all proteins destined for secretion or for 
the ER itself, the Golgi apparatus, the lysosomes, the endosomes, and the plasma 
membrane are first imported into the ER from the cytosol. In the ER lumen, the 
proteins fold and oligomerize, disulfide bonds are formed, and N-linked oligosac- 
charides are added. The pattern of N-linked glycosylation is used to indicate the 
extent of protein folding, so that proteins leave the ER only when they are prop- 
erly folded. Proteins that do not fold or oligomerize correctly are translocated back 
into the cytosol, where they are deglycosylated, polyubiquitylated, and degraded in 
proteasomes. If misfolded proteins accumulate in excess in the ER, they trigger an 
unfolded protein response, which activates appropriate genes in the nucleus to help 
the ER cope. 

Only proteins that carry a special ER signal sequence are imported into the ER. 
The signal sequence is recognized by a signal-recognition particle (SRP), which 
binds both the growing polypeptide chain and the ribosome and directs them to a 
receptor protein on the cytosolic surface of the rough ER membrane. This binding 
to the ER membrane initiates the translocation process that threads a loop of poly- 
peptide chain across the ER membrane through the hydrophilic pore of a protein 
translocator. 

Soluble proteins—destined for the ER lumen, for secretion, or for transfer to the 
lumen of other organelles—pass completely into the ER lumen. Transmembrane 
proteins destined for the ER or for other cell membranes are translocated part 
way across the ER membrane and remain anchored there by one or more mem- 
brane-spanning α-helical segments in their polypeptide chains. These hydropho- 
bic portions of the protein can act either as start-transfer or stop-transfer signals 
during the translocation process. When a polypeptide contains multiple, alternat- 
ing start-transfer and stop-transfer signals, it will pass back and forth across the 
bilayer multiple times as a multipass transmembrane protein. 

The asymmetry of protein insertion and glycosylation in the ER establishes the 
sidedness of the membranes of all the other organelles that the ER supplies with 
membrane proteins. 

WHAT WE DON’T KNOW 

• How do nuclear import receptors negotiate the tangled gel-like interior of a
nuclear pore complex so efficiently?

• Is the nuclear pore complex a rigid structure or can it expand and contract,
depending on the cargo transported?

• Sequence comparisons show that signal sequences for an individual protein
such as insulin are quite conserved across species, much more so than would be
expected from our current understanding that all that matters for their
function are general structural features such as hydrophobicity. What other
functions might signal sequences have that could account for their evolutionary
sequence conservation?

• How are polyribosomes on the endoplasmic reticulum membrane arranged so that
the next initiating ribosome will find an unoccupied translocator?

• Why does the signal-recognition particle have an indispensable RNA subunit?

PROBLEMS 


Which statements are true? Explain why or why not. 


12-1 Like the lumen of the ER, the interior of the nucleus 
is topologically equivalent to the outside of the cell. 


12-2 ER-bound and free ribosomes, which are structur- 
ally and functionally identical, differ only in the proteins 
they happen to be making at a particular time. 


12-3 To avoid the inevitable collisions that would occur 
if two-way traffic through a single pore were allowed, 
nuclear pore complexes are specialized so that some 
mediate import while others mediate export. 


12-4 Peroxisomes are found in only a few specialized 
types of eukaryotic cell. 


Discuss the following problems. 


12-5 What is the fate of a protein with no sorting signal? 


12-6 The rough ER is the site of synthesis of many classes 
of membrane proteins. Some of these proteins remain in 
the ER, whereas others are sorted to compartments such 
as the Golgi apparatus, lysosomes, and the plasma mem- 
brane. One measure of the difficulty of the sorting prob- 
lem is the degree of “purification” that must be achieved 
during transport from the ER. Are proteins bound for the 
plasma membrane common or rare among all ER mem- 
brane proteins? 

A few simple considerations allow one to answer 
this question. In a typical growing cell that is dividing once 
every 24 hours, the equivalent of one new plasma mem- 
brane must transit the ER every day. If the ER membrane 
is 20 times the area of a plasma membrane, what is the 
ratio of plasma membrane proteins to other membrane 
proteins in the ER? (Assume that all proteins on their way 
to the plasma membrane remain in the ER for 30 minutes 
on average before exiting, and that the ratio of proteins to 
lipids in the ER and plasma membranes is the same.) 


12-7 Before nuclear pore complexes were well under- 
stood, it was unclear whether nuclear proteins diffused 
passively into the nucleus and accumulated there by bind- 
ing to residents of the nucleus such as chromosomes, or 
whether they were actively imported and accumulated 
regardless of their affinity for nuclear components. 

A classic experiment that addressed this prob- 
lem used several forms of radioactive nucleoplasmin, 
which is a large pentameric protein involved in chromatin 
assembly. In this experiment, either the intact protein or 
the nucleoplasmin heads, tails, or heads with a single tail 
were injected into the cytoplasm of a frog oocyte or into 
the nucleus (Figure Q12-1). All forms of nucleoplasmin, 
except heads, accumulated in the nucleus when injected 
into the cytoplasm, and all forms were retained in the 
nucleus when injected there. 

A. What portion of the nucleoplasmin molecule is 
responsible for localization in the nucleus? 


Figure Q12-1 Cellular location of injected nucleoplasmin and nucleoplasmin
components (Problem 12-7). Schematic diagrams of autoradiographs show the
cytoplasm and nucleus with the location of nucleoplasmin indicated by the red
areas. Nucleoplasmin preparations shown: intact, one tail, heads only, tails
only.


B. How do these experiments distinguish between 
active transport, in which a nuclear localization signal trig- 
gers transport by the nuclear pore complex, and passive 
diffusion, in which a binding site for a nuclear component 
allows accumulation in the nucleus? 


12-8 Assuming that 32 million histone octamers are 
required to package the human genome, how many his- 
tone molecules must be transported per second per 
nuclear pore complex in cells whose nuclei contain 3000 
nuclear pores and are dividing once per day? 
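
The numbers in this problem can be combined as in the following sketch. It assumes that one full complement of histones is imported per division and that import is spread evenly over the 24-hour cycle, which is a simplification. The function and parameter names are illustrative.

```python
def histones_per_pore_per_second(octamers=32_000_000, pores=3000,
                                 division_time_s=24 * 60 * 60):
    """Average import rate of individual histone molecules through one pore."""
    histones = octamers * 8  # eight histone molecules per octamer
    return histones / (pores * division_time_s)

rate = histones_per_pore_per_second()
print(f"{rate:.2f} histones per second per pore")
```

With the values given, the result is close to one histone molecule per second per pore.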


12-9 The nuclear pore complex (NPC) creates a barrier 
to the free exchange of molecules between the nucleus and 
cytosol, but in a way that remains mysterious. In yeast, for 
example, the central pore of the NPC has a diameter of 35 
nm and is 30 nm long, which is somewhat smaller than its 
vertebrate counterpart. Even so, it is large enough to accom- 
modate virtually all components of the cytosol. Yet the pore 
allows passive diffusion of molecules only up to about 40 
kd; entry of anything larger requires help from a nuclear 
import receptor. Selective permeability is controlled by pro- 
tein components of the NPC that have unstructured, polar 
tails extending into the central pore. These tails are charac- 
terized by periodic repeats of the hydrophobic amino acids 
phenylalanine (F) and glycine (G). 

At high enough concentration (~50 mM), the 
FG-repeat domains of these proteins can form a gel, with 
a meshwork of interactions between the hydrophobic FG 
repeats (Figure Q12-2A). These gels allow passive diffu- 
sion of small molecules, but they prevent entry of larger 
proteins such as the fluorescent protein mCherry fused 
to maltose binding protein (MBP) (Figure Q12-2B). (The 
fusion to MBP makes mCherry too large to enter the 
nucleus by passive diffusion.) However, if the nuclear 
import receptor, importin, is fused to a similar protein, 
MBP-GFP, the importin-MBP-GFP fusion readily enters 
the gel (Figure Q12-2B). 


CHAPTER 12 END-OF-CHAPTER PROBLEMS 


[Figure Q12-2 panel labels: (A); (B) time points at 10 min and 30 min, importin-MBP-GFP] 



Figure Q12-2 FG-repeat gel and influx of proteins into the nucleus 
(Problem 12-9). (A) Cartoon of the meshwork (gel) formed by pairwise 
interactions between hydrophobic FG repeats. For FG-repeats 
separated by 17 amino acids, as is typical, the mesh formed by 
extended amino acid side chains would correspond to about 4 nm on 
a side, which would be large enough to account for the characteristic 
passive diffusion of proteins through nuclear pores. (B) Diffusion of 
MBP-mCherry and importin-MBP-GFP into a gel of FG-repeats. In each 
group, the solution is shown at left and the gel at right. The bright areas 
indicate regions that contain the fluorescent proteins. 


A. FG-repeats only form gels in vitro at relatively high 
concentration (50 mM). Is this concentration reasonable 
for FG repeats in the NPC core? In yeast, there are about 
5000 FG-repeats in each NPC. Given the dimensions of the 
yeast nuclear pore (35 nm diameter and 30 nm length), 
calculate the concentration of FG-repeats in the cylindri- 
cal volume of the pore. Is this concentration comparable 
to the one used in vitro? 

B. A second question is whether the diffusion of 
importin-MBP-GFP through the FG-repeat gel is fast 
enough to account for the efficient flow of materials 
between the nucleus and cytosol. From experiments of 
the type shown in Figure Q12-2B, the diffusion coefficient 
(D) of importin-MBP-GFP through the FG-repeat gel was 
determined to be about 0.1 μm²/s. The equation for diffusion 
is t = x²/2D, where t is time and x is distance. Calculate 
the time it would take importin-MBP-GFP to diffuse 
through a yeast nuclear pore (30 nm) if the pore consisted 
of a gel of FG-repeats. Does this time seem fast enough for 
the needs of a eukaryotic cell? 
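
Both calculations in parts A and B reduce to unit conversions, sketched below. The sketch treats the pore as a simple cylinder and uses the one-dimensional diffusion relation t = x²/2D quoted in the problem. The function names and the rounded value of Avogadro's number are assumptions made for illustration.

```python
import math

AVOGADRO = 6.022e23       # molecules per mole (rounded)
LITERS_PER_NM3 = 1e-24    # 1 nm^3 = 1e-27 m^3 = 1e-24 L

def fg_repeat_concentration_mM(n_repeats=5000, diameter_nm=35.0, length_nm=30.0):
    """Concentration of FG-repeats in a cylindrical pore, in millimolar."""
    volume_nm3 = math.pi * (diameter_nm / 2) ** 2 * length_nm
    moles = n_repeats / AVOGADRO
    return moles / (volume_nm3 * LITERS_PER_NM3) * 1000

def diffusion_time_ms(distance_nm=30.0, d_um2_per_s=0.1):
    """Time to diffuse a given distance, t = x^2 / 2D, in milliseconds."""
    x_um = distance_nm / 1000
    return x_um ** 2 / (2 * d_um2_per_s) * 1000

print(f"FG-repeat concentration: {fg_repeat_concentration_mM():.0f} mM")
print(f"Transit time: {diffusion_time_ms():.1f} ms")
```

With the stated dimensions, the concentration comes out at a few hundred millimolar, well above the 50 mM used in vitro, and the transit time is a few milliseconds.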


12-10 Components of the TIM complexes, the multi- 
subunit protein translocators in the mitochondrial inner 
membrane, are much less abundant than those of the TOM 
complex. They were initially identified using a genetic 
trick. The yeast Ura3 gene, whose product is an enzyme 
that is normally located in the cytosol where it is essential 
for synthesis of uracil, was modified so that the protein 
carried an import signal for the mitochondrial matrix. 
A population of cells carrying the modified Ura3 gene in 
place of the normal gene was then grown in the absence 
of uracil. Most cells died, but the rare cells that grew were 
shown to be defective for mitochondrial import. Explain 
how this selection identifies cells with defects in compo- 
nents required for import into the mitochondrial matrix. 
Why don’t normal cells with the modified Ura3 gene grow 
in the absence of uracil? Why do cells that are defective for 
mitochondrial import grow in the absence of uracil? 


12-11 If the enzyme dihydrofolate reductase (DHFR), 
which is normally located in the cytosol, is engineered to 
carry a mitochondrial targeting sequence at its N-terminus, 
it is efficiently imported into mitochondria. If the modified 
DHFR is first incubated with methotrexate, which binds 
tightly to the active site, the enzyme remains in the cyto- 
sol. How do you suppose that the binding of methotrexate 
interferes with mitochondrial import? 


12-12 Why do mitochondria need a special translocator 
to import proteins across the outer membrane, when the 
membrane already has large pores formed by porins? 


12-13 Examine the multipass transmembrane protein 
shown in Figure Q12-3. What would you predict would 
be the effect of converting the first hydrophobic trans- 
membrane segment to a hydrophilic segment? Sketch the 
arrangement of the modified protein in the ER membrane. 

Figure Q12-3 Arrangement of a multipass transmembrane protein 
in the ER membrane (Problem 12-13). Blue hexagons represent 
covalently attached oligosaccharides. The positions of positively and 
negatively charged amino acids flanking the second transmembrane 
segment are shown. 


12-14 All new phospholipids are added to the cytosolic 
leaflet of the ER membrane, yet the ER membrane has a 
symmetrical distribution of different phospholipids in its 
two leaflets. By contrast, the plasma membrane, which 
receives all its membrane components ultimately from the 
ER, has a very asymmetrical distribution of phospholipids 
in the two leaflets of its lipid bilayer. How is the symmetry 
generated in the ER membrane, and how is the asymmetry 
generated and maintained in the plasma membrane? 

Intracellular Membrane Traffic 


Every cell must eat, communicate with the world around it, and quickly respond 
to changes in its environment. To help accomplish these tasks, cells continually 
adjust the composition of their plasma membrane and internal compartments in 
rapid response to need. They use an elaborate internal membrane system to add 
and remove cell-surface proteins, such as receptors, ion channels, and transport- 
ers (Figure 13-1). Through the process of exocytosis, the secretory pathway deliv- 
ers newly synthesized proteins, carbohydrates, and lipids either to the plasma 
membrane or the extracellular space. By the converse process of endocytosis, cells 
remove plasma membrane components and deliver them to internal compart- 
ments called endosomes, from where they can be recycled to the same or different 
regions of the plasma membrane or be delivered to lysosomes for degradation. 
Cells also use endocytosis to capture important nutrients, such as vitamins, cho- 
lesterol, and iron; these are taken up together with the macromolecules to which 
they bind and are then moved on to endosomes and lysosomes, from where they 
can be transported into the cytosol for use in various biosynthetic processes. 

The interior space, or lumen, of each membrane-enclosed compartment 
along the secretory and endocytic pathways is equivalent to the lumen of most 
other membrane-enclosed compartments and to the cell exterior, in the sense 
that proteins can travel in this space without having to cross a membrane as 
they are passed from one compartment to another by means of numerous mem- 
brane-enclosed transport containers. These containers are formed from the donor 
compartment and are either small, spherical vesicles, larger irregular vesicles, or 
tubules. We shall use the term transport vesicle to apply to all forms of these con- 
tainers. 

Within a eukaryotic cell, transport vesicles continually bud off from one mem- 
brane and fuse with another, carrying membrane components and soluble lume- 
nal molecules, which are referred to as cargo (Figure 13-2). This vesicular traffic 
flows along highly organized, directional routes, which allow the cell to secrete, 
eat, and remodel its plasma membrane and organelles. The secretory pathway 
leads outward from the endoplasmic reticulum (ER) toward the Golgi apparatus 
and cell surface, with a side route leading to lysosomes, while the endocytic path- 
way leads inward from the plasma membrane. In each case, retrieval pathways 

IN THIS CHAPTER 


THE MOLECULAR MECHANISMS 
OF MEMBRANE TRANSPORT 
AND THE MAINTENANCE OF 
COMPARTMENTAL DIVERSITY 


TRANSPORT FROM THE ER 
THROUGH THE 
GOLGI APPARATUS 


TRANSPORT FROM THE 
TRANS GOLGI NETWORK TO 
LYSOSOMES 


TRANSPORT INTO THE 
CELL FROM THE PLASMA 
MEMBRANE: ENDOCYTOSIS 


TRANSPORT FROM THE TRANS 
GOLGI NETWORK TO THE CELL 
EXTERIOR: EXOCYTOSIS 


Figure 13-1 Exocytosis and 
endocytosis. (A) In exocytosis, a transport 
vesicle fuses with the plasma membrane. 
Its content is released into the extracellular 
space, while the vesicle membrane (red) 
becomes continuous with the plasma 
membrane. (B) In endocytosis, a plasma 
membrane patch (red) is internalized, 
forming a transport vesicle. Its content 
derives from the extracellular space. 

balance the flow of membrane between compartments in the opposite direction, 
bringing membrane and selected proteins back to the compartment of origin 
(Figure 13-3). 

To perform its function, each transport vesicle that buds from a compartment 
must be selective. It must take up only the appropriate molecules and must fuse 
only with the appropriate target membrane. A vesicle carrying cargo from the ER 
to the Golgi apparatus, for example, must exclude most proteins that are to stay 
in the ER, and it must fuse only with the Golgi apparatus and not with any other 
organelle. 

Figure 13-2 Vesicle transport. Transport vesicles bud off from one compartment 
and fuse with another. As they do so, they carry material as cargo from the lumen 
(the space within a membrane-enclosed compartment) and membrane of the donor 
compartment to the lumen and membrane of the target compartment, as shown. 

We begin this chapter by considering the molecular mechanisms of budding 
and fusion that underlie all vesicle transport. We then discuss the fundamental 
problem of how, in the face of this transport, the cell maintains the molecular and 

Figure 13-3 A “road-map” of the secretory and endocytic pathways. (A) In this schematic roadmap, which was introduced in Chapter 12, the 
endocytic and secretory pathways are illustrated with green and red arrows, respectively. In addition, blue arrows denote retrieval pathways for the 
backflow of selected components. (B) The compartments of the eukaryotic cell involved in vesicle transport. The lumen of each membrane-enclosed 
compartment is topologically equivalent to the outside of the cell. All compartments shown communicate with one another and the outside of the 
cell by means of transport vesicles. In the secretory pathway (red arrows), protein molecules are transported from the ER to the plasma membrane 
or (via endosomes) to lysosomes. In the endocytic pathway (green arrows), molecules are ingested in endocytic vesicles derived from the plasma 
membrane and delivered to early endosomes and then (via late endosomes) to lysosomes. Many endocytosed molecules are retrieved from early 
endosomes and returned (some via recycling endosomes) to the cell surface for reuse; similarly, some molecules are retrieved from the early and 
late endosomes and returned to the Golgi apparatus, and some are retrieved from the Golgi apparatus and returned to the ER. All of these retrieval 
pathways are shown with blue arrows, as in part (A). 

functional differences between its compartments. Finally, we consider the function 
of the Golgi apparatus, lysosomes, secretory vesicles, and endosomes, as we 
trace the pathways that connect these organelles. 


THE MOLECULAR MECHANISMS OF MEMBRANE 
TRANSPORT AND THE MAINTENANCE OF 
COMPARTMENTAL DIVERSITY 


Vesicle transport mediates a continuous exchange of components between the ten 
or more chemically distinct, membrane-enclosed compartments that collectively 
comprise the secretory and endocytic pathways. With this massive exchange, how 
can each compartment maintain its special identity? To answer this question, we 
must first consider what defines the character of a compartment. Above all, it is 
the composition of the enclosing membrane: molecular markers displayed on the 
cytosolic surface of the membrane serve as guidance cues for incoming traffic to 
ensure that transport vesicles fuse only with the correct compartment. Many of 
these membrane markers, however, are found on more than one compartment, 
and it is the specific combination of marker molecules that gives each compart- 
ment its molecular address. 

How are these membrane markers kept at high concentration on one compart- 
ment and at low concentration on another? To answer this question, we need to 
consider how patches of membrane, enriched or depleted in specific membrane 
components, bud off from one compartment and transfer to another. 

We begin by discussing how cells segregate proteins into separate membrane 
domains by assembling a special protein coat on the membrane’s cytosolic face. 
We consider how coats form, what they are made of, and how they are used to 
extract specific cargo components from a membrane and compartment lumen for 
delivery to another compartment. Finally, we discuss how transport vesicles dock 
at the appropriate target membrane and then fuse with it to deliver their cargo. 


There Are Various Types of Coated Vesicles 


Most transport vesicles form from specialized, coated regions of membranes. 
They bud off as coated vesicles, which have a distinctive cage of proteins covering 
their cytosolic surface. Before the vesicles fuse with a target membrane, they dis- 
card their coat, as is required for the two cytosolic membrane surfaces to interact 
directly and fuse. 

The coat performs two main functions that are reflected in a common two-lay- 
ered structure. First, an inner coat layer concentrates specific membrane proteins 
in a specialized patch, which then gives rise to the vesicle membrane. In this way, 
the inner layer selects the appropriate membrane molecules for transport. Sec- 
ond, an outer coat layer assembles into a curved, basketlike lattice that deforms 
the membrane patch and thereby shapes the vesicle. 

There are three well-characterized types of coated vesicles, distinguished by 
their major coat proteins: clathrin-coated, COPI-coated, and COPII-coated (Fig- 
ure 13-4). Each type is used for different transport steps. Clathrin-coated vesicles, 
for example, mediate transport from the Golgi apparatus and from the plasma 
membrane, whereas COPI- and COPII-coated vesicles most commonly mediate 
transport from the ER and from the Golgi cisternae (Figure 13-5). There is, how- 
ever, much more variety in coated vesicles and their functions than this short list 
suggests. As we discuss below, there are several types of clathrin-coated vesicles, 
each specialized for a different transport step, and the COPI- and COPII-coated 
vesicles may be similarly diverse. 


The Assembly of a Clathrin Coat Drives Vesicle Formation 


Clathrin-coated vesicles, the first coated vesicles to be identified, transport mate- 
rial from the plasma membrane and between endosomal and Golgi compart- 
ments. COPI-coated vesicles and COPII-coated vesicles transport material early 
in the secretory pathway: COPI-coated vesicles bud from Golgi compartments, 
and COPII-coated vesicles bud from the ER (see Figure 13-5). We discuss clath- 
rin-coated vesicles first, as they provide a good example of how vesicles form. 

The major protein component of clathrin-coated vesicles is clathrin itself, 
which forms the outer layer of the coat. Each clathrin subunit consists of three 
large and three small polypeptide chains that together form a three-legged struc- 
ture called a triskelion (Figure 13-6A,B). Clathrin triskelions assemble into a bas- 
ketlike framework of hexagons and pentagons to form coated pits (buds) on the 
cytosolic surface of membranes (Figure 13-7). Under appropriate conditions, 
isolated triskelions spontaneously self-assemble into typical polyhedral cages 
in a test tube, even in the absence of the membrane vesicles that these baskets 
normally enclose (Figure 13-6C,D). Thus, the clathrin triskelions determine the 
geometry of the clathrin cage (Figure 13-6E). 
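
The geometric constraint on such cages can be checked with Euler's polyhedron formula (vertices - edges + faces = 2 for any closed, sphere-like cage). The sketch below is an idealization: it assumes that one triskelion hub sits at each vertex, that exactly three edges meet at every vertex, and that the cage contains only pentagons and hexagons. The function names are illustrative.

```python
from fractions import Fraction

def euler_characteristic(n_pentagons, n_hexagons):
    """V - E + F for a cage in which three faces meet at every vertex."""
    face_corners = 5 * n_pentagons + 6 * n_hexagons
    vertices = Fraction(face_corners, 3)  # each vertex is shared by three faces
    edges = Fraction(face_corners, 2)     # each edge is shared by two faces
    faces = n_pentagons + n_hexagons
    return vertices - edges + faces

def triskelions_needed(n_hexagons, n_pentagons=12):
    """Number of vertices (one triskelion hub each) in a closed cage."""
    return (5 * n_pentagons + 6 * n_hexagons) // 3

# A closed cage requires exactly 12 pentagons, whatever the number of hexagons.
print(euler_characteristic(12, 0), euler_characteristic(12, 20))
print(triskelions_needed(20))  # soccer-ball geometry: 12 pentagons, 20 hexagons
```

The expression simplifies to one-sixth of the number of pentagons, so a closed pentagon-and-hexagon basket always needs 12 pentagons, while the number of hexagons sets its size.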


Adaptor Proteins Select Cargo into Clathrin-Coated Vesicles 


Adaptor proteins, another major coat component in clathrin-coated vesicles, 
form a discrete inner layer of the coat, positioned between the clathrin cage and 
the membrane. They bind the clathrin coat to the membrane and trap various 
transmembrane proteins, including transmembrane receptors that capture solu- 
ble cargo molecules inside the vesicle—so-called cargo receptors. In this way, the 
adaptor proteins select a specific set of transmembrane proteins, together with 
the soluble proteins that interact with them, and package them into each newly 
formed clathrin-coated transport vesicle (Figure 13-8). 

Figure 13-4 Electron micrographs of 
clathrin-coated, COPI-coated, and 
COPII-coated vesicles. All are shown in 
electron micrographs at the same scale. 
(A) Clathrin-coated vesicles. (B) COPI-coated 
vesicles and Golgi cisternae (red 
arrows) from a cell-free system in which 
COPI-coated vesicles bud in the test tube. 
(C) COPII-coated vesicles. (A and B, from 
L. Orci, B. Glick and J. Rothman, Cell 
46:171-184, 1986. With permission from 
Elsevier; C, courtesy of Charles Barlowe 
and Lelio Orci.) 


Figure 13-5 Use of different coats for 
different steps in vesicle traffic. Different 
coat proteins select different cargo and 
shape the transport vesicles that mediate the 
various steps in the secretory and endocytic 
pathways. When the same coats function 
in different places in the cell, they usually 
incorporate different coat protein subunits 
that modify their properties (not shown). Many 
differentiated cells have additional pathways 
besides those shown here, including a sorting 
pathway from the trans Golgi network to 
the apical surface of epithelial cells and a 
specialized recycling pathway for proteins 
of synaptic vesicles in the nerve terminals of 
neurons (see Figure 11-36). The arrows are 
colored as in Figure 13-3. 

Figure 13-6 The structure of a clathrin coat. (A) Electron micrograph of a clathrin triskelion shadowed with platinum. 
(B) Each triskelion is composed of three clathrin heavy chains and three clathrin light chains, as shown in the diagram. (C and 
D) A cryoelectron micrograph taken of a clathrin coat composed of 36 triskelions organized in a network of 12 pentagons 
and 8 hexagons, with some heavy chains (C) and light chains (D) highlighted (Movie 13.1). The light chains link to the actin 
cytoskeleton, which helps generate force for membrane budding and vesicle movement, and their phosphorylation regulates 
clathrin coat assembly. The interwoven legs of the clathrin triskelions form an outer shell from which the N-terminal domains 
of the triskelions protrude inward. These domains bind to the adaptor proteins shown in Figure 13-8. The coat shown was 
assembled biochemically from pure clathrin triskelions and is too small to enclose a membrane vesicle. (E) Images of clathrin- 
coated vesicles isolated from bovine brain. The clathrin coats are constructed in a similar but less regular way, from pentagons, 
a larger number of hexagons, and sometimes heptagons, resembling the architecture of deformed soccer balls. The structures 
were determined by cryoelectron microscopy and tomographic reconstruction. (A, from E. Ungewickell and D. Branton, Nature 
289:420-422, 1981; C and D, from A. Fotin et al., Nature 432:573-579, 2004. All with permission from Macmillan Publishers 
Ltd; E, from Y. Cheng et al., J. Mol. Biol. 365:892-899, 2007. With permission from Elsevier.) 


There are several types of adaptor proteins. The best characterized have four 
different protein subunits; others are single-chain proteins. Each type of adaptor 
protein is specific for a different set of cargo receptors. Clathrin-coated vesicles 
budding from different membranes use different adaptor proteins and thus pack- 
age different receptors and cargo molecules. 

The assembly of adaptor proteins on the membrane is tightly controlled, in 
part by the cooperative interaction of the adaptor proteins with other compo- 
nents of the coat. The adaptor protein AP2 serves as a well-understood example. 
When it binds to a specific phosphorylated phosphatidylinositol lipid (a phospho- 
inositide), it alters its conformation, exposing binding sites for cargo receptors in 
the membrane. The simultaneous binding to the cargo receptors and lipid head 
groups greatly enhances the binding of AP2 to the membrane (Figure 13-9). 

Because several requirements must be met simultaneously to stably bind AP2 
proteins to a membrane, the proteins act as coincidence detectors that only assem- 
ble at the right time and place. Upon binding, they induce membrane curvature, 
which makes the binding of additional AP2 proteins in its proximity more likely. 
The cooperative assembly of the AP2 coat layer then is further amplified by clath- 
rin binding, which leads to the formation and budding of a transport vesicle. 

Adaptor proteins found in other coats also bind to phosphoinositides, which 
not only have a major role in directing when and where coats assemble in the cell, 
but also are used much more widely as molecular markers of compartment iden- 
tity. This helps to control vesicular traffic, as we now discuss. 


Figure 13-7 Clathrin-coated pits and vesicles. This rapid-freeze, deep- 
etch electron micrograph shows numerous clathrin-coated pits and vesicles 
on the inner surface of the plasma membrane of cultured fibroblasts. The 
cells were rapidly frozen in liquid helium, fractured, and deep-etched to 
expose the cytoplasmic surface of the plasma membrane. (Courtesy of John 
Heuser.) 

Phosphoinositides Mark Organelles and Membrane Domains 


Although inositol phospholipids typically comprise less than 10% of the total 
phospholipids in a membrane, they have important regulatory functions. They 
can undergo rapid cycles of phosphorylation and dephosphorylation at the 3’, 4’, 
and 5’ positions of their inositol sugar head groups to produce various types of 
phosphoinositides (phosphatidylinositol phosphates, or PIPs). The interconver- 
sion of phosphatidylinositol (PI) and PIPs is highly compartmentalized: different 
organelles in the endocytic and secretory pathways have distinct sets of PI and PIP 
kinases and PIP phosphatases (Figure 13-10). The distribution, regulation, and 
local balance of these enzymes determine the steady-state distribution of each 
PIP species. As a consequence, the distribution of PIPs varies from organelle to 
organelle, and often within a continuous membrane from one region to another, 
thereby defining specialized membrane domains. 
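
As a small illustration of how phosphorylation at three ring positions generates distinct PIP species, the sketch below enumerates every combination of the 3, 4, and 5 positions and names each one by listing the ring positions in parentheses followed by the number of phosphates. The function names are illustrative and not from the text.

```python
from itertools import combinations

def pip_name(positions):
    """Name a phosphoinositide from the phosphorylated ring positions."""
    positions = tuple(sorted(positions))
    if not positions:
        return "PI"  # unphosphorylated phosphatidylinositol
    count = "" if len(positions) == 1 else str(len(positions))
    return f"PI({','.join(map(str, positions))})P{count}"

def all_pips(ring_positions=(3, 4, 5)):
    """All species with one, two, or three phosphates on the given positions."""
    return [pip_name(combo)
            for n in range(1, len(ring_positions) + 1)
            for combo in combinations(ring_positions, n)]

print(all_pips())
```

Three modifiable positions give seven possible phosphorylated species, each of which can in principle be told apart by a matching PIP-binding domain.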

Many proteins involved at different steps in vesicle transport contain domains 
that bind with high specificity to the head groups of particular PIPs, distinguishing 
one phosphorylated form from another (see Figure 13-10 E and F). Local control 
of the PI and PIP kinases and PIP phosphatases can therefore be used to rapidly 
control the binding of proteins to a membrane or membrane domain. The produc- 
tion of a particular type of PIP recruits proteins containing matching PIP-binding 
domains. The PIP-binding proteins then help regulate vesicle formation and other 
steps in the control of vesicle traffic (Figure 13-11). The same strategy is widely 
used to recruit specific intracellular signaling proteins to the plasma membrane 
in response to extracellular signals (discussed in Chapter 15). 

Figure 13-8 The assembly and 
disassembly of a clathrin coat. The 
assembly of the coat introduces curvature 
into the membrane, which leads in turn 
to the formation of a coated bud (called a 
coated pit if it is in the plasma membrane). 
The adaptor proteins bind both clathrin 
triskelions and membrane-bound cargo 
receptors, thereby mediating the selective 
recruitment of both membrane and soluble 
cargo molecules into the vesicle. Other 
membrane-bending and fission proteins 
are recruited to the neck of the budding 
vesicle, where sharp membrane curvature 
is introduced. The coat is rapidly lost 
shortly after the vesicle buds off. 


Figure 13-9 Lipid-induced conformation 
switching of AP2. The AP2 adaptor 
protein complex has four subunits (α, β2, 
μ2, and σ2). Upon interaction with the 
phosphoinositide PI(4,5)P2 (see Figure 
13-10) in the cytosolic leaflet of the plasma 
membrane, AP2 rearranges so that binding 
sites for cargo receptors become exposed. 
Each AP2 complex binds four PI(4,5)P2 
molecules (for clarity, only one is shown). 
In the open AP2 complex, the μ2 and σ2 
subunits bind the cytosolic tails of cargo 
receptors that display the appropriate 
endocytosis signals. These signals consist 
of short amino acid sequence motifs. When 
AP2 binds tightly to the membrane, it 
induces curvature, which favors the binding 
of additional AP2 complexes in the vicinity. 

The forces generated by clathrin coat assembly alone are not sufficient to 
shape and pinch off a vesicle from the membrane. Other membrane-bending 
and force-generating proteins participate at every stage of the process. Mem- 
brane-bending proteins that contain crescent-shaped domains, called BAR 
domains, bind to and impose their shape on the underlying membrane via elec- 
trostatic interactions with the lipid head groups (Figure 13-12; also see Figure 
10-40). Such BAR-domain proteins are thought to help AP2 nucleate clathrin-me- 
diated endocytosis by shaping the plasma membrane to allow a clathrin-coated 
bud to form. Some of these proteins also contain amphiphilic helices that induce 
membrane curvature after being inserted as wedges into the cytoplasmic leaflet 
of the membrane. Other BAR-domain proteins are important in shaping the neck 
of a budding vesicle, where stabilization of sharp membrane bends is essential. 
Finally, the clathrin machinery nucleates the local assembly of actin filaments 
that introduce tension to help pinch off and propel the forming vesicle away from 
the membrane. 


Cytoplasmic Proteins Regulate the Pinching-Off and Uncoating of 
Coated Vesicles 


As a clathrin-coated bud grows, soluble cytoplasmic proteins, including dynamin, 
assemble at the neck of each bud (Figure 13-13). Dynamin contains a 
PI(4,5)P2-binding domain, which tethers the protein to the membrane, and a GTPase 
domain, which regulates the rate at which vesicles pinch off from the membrane. 


Figure 13-11 The intracellular location of phosphoinositides. Different 
types of PIPs are located in different membranes and membrane domains, 
where they are often associated with specific vesicle transport events. The 
membrane of secretory vesicles, for example, contains PI(4)P. When the 
vesicles fuse with the plasma membrane, a PI 5-kinase that is localized 
there converts the PI(4)P into PI(4,5)P2. The PI(4,5)P2, in turn, helps recruit 
adaptor proteins, which initiate the formation of a clathrin-coated pit, as the 
first step in clathrin-mediated endocytosis. Once the clathrin-coated vesicle 
buds off from the plasma membrane, a PI(5)P phosphatase hydrolyzes 
PI(4,5)P2, which weakens the binding of the adaptor proteins, promoting 
vesicle uncoating. We discuss phagocytosis and the distinction between 
regulated and constitutive exocytosis later in the chapter. (Modified from 
M.A. de Matteis and A. Godi, Nat. Cell Biol. 6:487-492, 2004. With 
permission from Macmillan Publishers Ltd.) 

Figure 13-10 Phosphatidylinositol 
(PI) and phosphoinositides (PIPs). 
(A, B) The structure of PI shows the 
free hydroxyl groups in the inositol 
sugar that can in principle be modified. 
(C) Phosphorylation of one, two, or three 
of the hydroxyl groups on PI by PI and 
PIP kinases produces a variety of PIP 
species. They are named according to 
the ring position (in parentheses) and the 
number of phosphate groups (subscript) 
added to PI. PI(3,4)P2 is shown. (D) Animal 
cells have several PI and PIP kinases and 
a similar number of PIP phosphatases, 
which are localized to different organelles, 
where they are regulated to catalyze the 
production of particular PIPs. The red 
and green arrows show the kinase and 
phosphatase reactions, respectively. 
(E, F) Phosphoinositide head groups 
are recognized by protein domains that 
discriminate between the different forms. 
In this way, select groups of proteins 
containing such domains are recruited 
to regions of membrane in which these 
phosphoinositides are present. PI(3)P and 
PI(4,5)P2 are shown. (D, modified from 
M.A. de Matteis and A. Godi, Nat. Cell Biol. 
6:487-492, 2004. With permission from 
Macmillan Publishers Ltd.) 


KEY: PI(3)P, PI(4)P, PI(4,5)P2, PI(3,5)P2, PI(3,4,5)P3 

The pinching-off process brings the two noncytosolic leaflets of the membrane 
into close proximity and fuses them, sealing off the forming vesicle (see Figure 
13-2). To perform this task, dynamin recruits other proteins to the neck of the 
bud. Together with dynamin, they help bend the patch of membrane—by directly 
distorting the bilayer structure, or by changing its lipid composition through the 
recruitment of lipid-modifying enzymes, or by both mechanisms. 

Once released from the membrane, the vesicle rapidly loses its clathrin coat. A 
PIP phosphatase that is co-packaged into clathrin-coated vesicles depletes 
PI(4,5)P2 from the membrane, which weakens the binding of the adaptor proteins. In 
addition, an hsp70 chaperone protein (see Figure 6-80) functions as an uncoating 
ATPase, using the energy of ATP hydrolysis to peel off the clathrin coat. Auxilin, 
another vesicle protein, is thought to activate the ATPase. The release of the coat, 
however, must not happen prematurely, so additional control mechanisms must 
somehow prevent the clathrin from being removed before it has formed a com- 
plete vesicle (discussed below). 
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Figure 13-12 The structure of BAR 
domains. BAR-domain proteins are diverse 
and enable many membrane-bending 
processes in the cell. BAR domains are 
built from coiled coils that dimerize into 
modules with a positively charged inner 
surface, which preferentially interacts 
with negatively charged lipid head groups 
to bend membranes. Local membrane 
deformations caused by BAR-domain 
proteins facilitate the binding of additional 
BAR-domain proteins, thereby generating 
a positive feedback cycle for curvature 
propagation. Individual BAR-domain 
proteins contain a distinctive curvature and 
often have additional features that adapt 
them to their specific tasks: some have 
short amphiphilic helices that cause further 
membrane deformation by wedge insertion; 
others are flanked by PIP-binding domains 
that direct them to membranes enriched in 
cognate phosphoinositides. 
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Figure 13-13 The role of dynamin in pinching off clathrin-coated vesicles. (A) Multiple dynamin molecules assemble into a spiral around the 
neck of the forming bud. The dynamin spiral is thought to recruit other proteins to the bud neck, which, together with dynamin, destabilize the 
interacting lipid bilayers so that the noncytoplasmic leaflets flow together. The newly formed vesicle then pinches off from the membrane. Specific 
mutations in dynamin can either enhance or block the pinching-off process. (B) Dynamin was discovered as the protein defective in the shibire 
mutant of Drosophila. These mutant flies become paralyzed because clathrin-mediated endocytosis stops, and the synaptic vesicle membrane fails 
to recycle, blocking neurotransmitter release. Deeply invaginated clathrin-coated pits form in the nerve endings of the fly’s nerve cells, with a belt of 
mutant dynamin assembled around the neck, as shown in this thin-section electron micrograph. The pinching-off process fails because the required 
membrane fusion does not take place. (C, D) A model of how conformational changes in the GTPase domains of membrane-assembled dynamin 
may power a conformational change that constricts the neck of the bud. A single dynamin molecule is shown in orange in D. (B, from J.H. Koenig 
and K. Ikeda, J. Neurosci. 9:3844-3860, 1989. With permission from the Society of Neuroscience; C and D, adapted from M.G.J. Ford, S. Jenni 
and J. Nunnari, Nature 477:561-566, 2011. With permission from Macmillan Publishers.) 
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Monomeric GTPases Control Coat Assembly 


To balance the vesicle traffic to and from a compartment, coat proteins must 
assemble only when and where they are needed. While local production of PIPs 
plays a major part in regulating the assembly of clathrin coats on the plasma 
membrane and Golgi apparatus, cells superimpose additional ways of regulat- 
ing coat formation. Coat-recruitment GTPases, for example, control the assembly 
of clathrin coats on endosomes and the COPI and COPII coats on Golgi and ER 
membranes. 

Many steps in vesicle transport depend on a variety of GTP-binding pro- 
teins that control both the spatial and temporal aspects of vesicle formation and 
fusion. As discussed in Chapter 3, GTP-binding proteins regulate most processes 
in eukaryotic cells. They act as molecular switches, which flip between an active 
state with GTP bound and an inactive state with GDP bound. Two classes of pro- 
teins regulate the flipping: guanine nucleotide exchange factors (GEFs) activate 
the proteins by catalyzing the exchange of GDP for GTP, and GTPase-activating 
proteins (GAPs) inactivate the proteins by triggering the hydrolysis of the bound 
GTP to GDP (see Figures 3-68 and 15-7). Although both monomeric GTP-binding 
proteins (monomeric GTPases) and trimeric GTP-binding proteins (G proteins) 
have important roles in vesicle transport, the roles of the monomeric GTPases are 
better understood, and we focus on them here. 
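
The switch behavior described above can be sketched as a tiny two-state machine. This is only an illustrative toy, not a biochemical model: the class name `MonomericGTPase` and the methods `apply_gef` and `apply_gap` are invented for this sketch, and real exchange and hydrolysis are kinetic processes rather than instantaneous flips.

```python
from dataclasses import dataclass


@dataclass
class MonomericGTPase:
    """Toy two-state molecular switch (all names here are hypothetical)."""
    name: str
    nucleotide: str = "GDP"  # GDP-bound, so inactive by default

    @property
    def active(self) -> bool:
        # The switch is "on" only while GTP is bound.
        return self.nucleotide == "GTP"

    def apply_gef(self) -> None:
        # A GEF catalyzes the exchange of bound GDP for GTP (activation).
        if self.nucleotide == "GDP":
            self.nucleotide = "GTP"

    def apply_gap(self) -> None:
        # A GAP triggers hydrolysis of the bound GTP to GDP (inactivation).
        if self.nucleotide == "GTP":
            self.nucleotide = "GDP"


gtpase = MonomericGTPase("example GTPase")
states = [gtpase.active]      # starts inactive
gtpase.apply_gef()
states.append(gtpase.active)  # active after GEF-catalyzed exchange
gtpase.apply_gap()
states.append(gtpase.active)  # inactive again after GAP-triggered hydrolysis
print(states)
```

The point of the sketch is that the GEF and the GAP act on opposite transitions of the same switch, so the location of each regulator determines where the GTPase is turned on and off.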

Coat-recruitment GTPases are members of a family of monomeric GTPases. 
They include the ARF proteins, which are responsible for the assembly of both 
COPI and clathrin coats at Golgi membranes, and the Sar1 protein, 
which is responsible for the assembly of COPII coats at the ER membrane. 
Coat-recruitment GTPases are usually found in high concentration in the cytosol 
in an inactive, GDP-bound state. When a COPII-coated vesicle is to bud from 
the ER membrane, for example, a specific Sar1-GEF embedded in the ER membrane 
binds to cytosolic Sar1, causing the Sar1 to release its GDP and bind GTP 
in its place. (Recall that GTP is present in much higher concentration in the cytosol 
than GDP and therefore will spontaneously bind after GDP is released.) In its 
GTP-bound state, the Sar1 protein exposes an amphiphilic helix, which inserts 
into the cytoplasmic leaflet of the lipid bilayer of the ER membrane. The tightly 
bound Sar1 now recruits adaptor coat protein subunits to the ER membrane to 
initiate budding (Figure 13-14). Other GEFs and coat-recruitment GTPases operate 
in a similar way on other membranes. 

The coat-recruitment GTPases also have a role in coat disassembly. The hydro- 
lysis of bound GTP to GDP causes the GTPase to change its conformation so that 
its hydrophobic tail pops out of the membrane, causing the vesicle’s coat to dis- 
assemble. Although it is not known what triggers the GTP hydrolysis, it has been 
proposed that the GTPases work like timers, which hydrolyze GTP at slow but pre- 
dictable rates, to ensure that vesicle formation is synchronized with the require- 
ments of the moment. COPII coats accelerate GTP hydrolysis by Sar1, and a fully 
formed vesicle will be produced only when bud formation occurs faster than the 
timed disassembly process; otherwise, disassembly will be triggered before a 
vesicle pinches off, and the process will have to start again, perhaps at a more 
appropriate time and place. Once a vesicle pinches off, GTP hydrolysis releases 
Sar1, but the sealed coat is sufficiently stabilized through many cooperative interactions, 
including binding to the cargo receptors in the membrane, that it may 
stay on the vesicle until the vesicle docks at a target membrane. There, a kinase 
phosphorylates the coat proteins, which completes coat disassembly and readies 
the vesicle for fusion. 
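
The timer idea can be made concrete with a small simulation of the race between bud completion and GTP hydrolysis. The sketch below assumes, purely for illustration, that both events are exponentially distributed waiting times with rates `k_bud` and `k_hydrolysis`; these names, the rate values, and the exponential form are assumptions of the sketch and do not come from measurements. Under that assumption the fraction of attempts that yield a vesicle is k_bud / (k_bud + k_hydrolysis).

```python
import random


def fraction_budded(k_bud: float, k_hydrolysis: float,
                    trials: int = 20000, seed: int = 0) -> float:
    """Fraction of attempts in which budding finishes before GTP hydrolysis.

    Both waiting times are drawn from exponential distributions, which is
    an assumption of this sketch rather than a claim from the text.
    """
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        time_to_bud = rng.expovariate(k_bud)
        time_to_hydrolysis = rng.expovariate(k_hydrolysis)
        if time_to_bud < time_to_hydrolysis:
            successes += 1  # the vesicle pinched off before the timer expired
    return successes / trials


def analytic_fraction(k_bud: float, k_hydrolysis: float) -> float:
    """Closed-form result for two competing exponential processes."""
    return k_bud / (k_bud + k_hydrolysis)


fast = fraction_budded(k_bud=3.0, k_hydrolysis=1.0)
slow = fraction_budded(k_bud=0.3, k_hydrolysis=1.0)
print(f"fast budding: {fast:.2f}, slow budding: {slow:.2f}")
```

In this toy model, slowing bud formation relative to hydrolysis sharply reduces the yield of completed vesicles, which is the qualitative behavior the timer hypothesis predicts.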

Clathrin- and COPI-coated vesicles, by contrast, shed their coat soon after 
they pinch off. For COPI vesicles, the curvature of the vesicle membrane serves 
as a trigger to begin uncoating. An ARF-GAP is recruited to the COPI coat as it 
assembles. It interacts with the membrane, and senses the lipid packing density. 
It becomes activated when the curvature of the membrane approaches that of a 
transport vesicle. It then inactivates ARF, causing the coat to disassemble. 
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Not All Transport Vesicles Are Spherical 


Although vesicle-budding is similar at various locations in the cell, each cell 
membrane poses its own special challenges. The plasma membrane, for exam- 
ple, is comparatively flat and stiff, owing to its cholesterol-rich lipid composition 
and underlying actin-rich cortex. Thus, the coordinated action of clathrin coats 
and membrane-bending proteins has to produce sufficient force to introduce 
curvature, especially at the neck of the bud where sharp bends are required for 
the pinching-off processes. In contrast, vesicle-budding from many intracellular 
membranes occurs preferentially at regions where the membranes are already 
curved, such as the rims of the Golgi cisternae or ends of membrane tubules. In 
these places, the primary function of the coats is to capture the appropriate cargo 
proteins rather than to deform the membrane. 

Transport vesicles also occur in various sizes and shapes. Diverse COPII ves- 
icles are required for the transport of large cargo molecules. Collagen, for exam- 
ple, is assembled in the ER as 300-nm-long, stiff procollagen rods that then are 
secreted from the cell where they are cleaved by proteases to collagen, which is 
embedded into the extracellular matrix (discussed in Chapter 19). Procollagen 
rods do not fit into the 60-80 nm COPII vesicles normally observed. To circumvent 
this problem, the procollagen cargo molecules bind to transmembrane packag- 
ing proteins in the ER, which control the assembly of the COPII coat components 
(Figure 13-15). These events drive the local assembly of much larger COPII vesi- 
cles that accommodate the oversized cargo. Human mutations in genes encoding 
such packaging proteins result in collagen defects with severe consequences, such 
as skeletal abnormalities and other developmental defects. Similar mechanisms 
must regulate the sizes of vesicles required to secrete other large macromolecular 
complexes, including the lipoprotein particles that transport lipids out of cells. 
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Figure 13-14 Formation of a COPII- 
coated vesicle. (A) Inactive, soluble 
Sar1-GDP binds to a Sar1-GEF in the ER 
membrane, causing the Sar1 to release 
its GDP and bind GTP. A GTP-triggered 
conformational change in Sar1 exposes 
an amphiphilic helix, which inserts into the 
cytoplasmic leaflet of the ER membrane, 
initiating membrane bending (which is 
not shown). (B) GTP-bound Sar1 binds 
to a complex of two COPII adaptor coat 
proteins, called Sec23 and Sec24, which 
form the inner coat. Sec24 has several 
different binding sites for the cytosolic 
tails of cargo receptors. The entire surface 
of the complex that attaches to the 
membrane is gently curved, matching 
the diameter of COPII-coated vesicles. 
(C) A complex of two additional COPII 
coat proteins, called Sec13 and Sec31, 
forms the outer shell of the coat. Like 
clathrin, they can assemble on their own 
into symmetrical cages with appropriate 
dimensions to enclose a COPII-coated 
vesicle. (D) Membrane-bound, active 
Sar1-GTP recruits COPII adaptor proteins 
to the membrane. They select certain 
transmembrane proteins and cause the 
membrane to deform. The adaptor proteins 
then recruit the outer coat proteins, which 
help form a bud. A subsequent membrane 
fusion event pinches off the coated vesicle. 
Other coated vesicles are thought to form 
in a similar way. (C, modified from 
S.M. Stagg et al., Nature 439:234-238, 
2006. With permission from Macmillan 
Publishers Ltd.) 
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Many other vesicle budding events likewise involve variations of common 
mechanisms. When living cells are genetically engineered to express fluores- 
cent membrane components, the endosomes and trans Golgi network are seen 
in a fluorescence microscope to continually send out long tubules. Coat proteins 
assemble onto the membrane tubules and help recruit specific cargo. The tubules 
then either regress or pinch off with the help of dynamin-like proteins to form 
transport vesicles of different sizes and shapes. 

Tubules have a higher surface-to-volume ratio than the larger organelles from 
which they form. They are therefore relatively enriched in membrane proteins 
compared with soluble cargo proteins. As we discuss later, this property of tubules 
is an important feature for sorting proteins in endosomes. 
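
The geometric argument can be checked with a short calculation. For a sphere of radius r the surface-to-volume ratio is 3/r, and for a long cylinder of radius r (ignoring its end caps) it is 2/r, so a thin tubule has far more membrane per unit of lumen than a large, roughly spherical organelle. The radii used below (25 nm for a tubule, 250 nm for an endosome) are hypothetical round numbers chosen only to illustrate the comparison.

```python
import math


def sphere_surface_to_volume(radius_nm: float) -> float:
    """Surface-to-volume ratio of a sphere, in 1/nm (equals 3 / r)."""
    surface = 4.0 * math.pi * radius_nm ** 2
    volume = (4.0 / 3.0) * math.pi * radius_nm ** 3
    return surface / volume


def tubule_surface_to_volume(radius_nm: float, length_nm: float) -> float:
    """Surface-to-volume ratio of a cylinder wall, in 1/nm (equals 2 / r).

    End caps are ignored, a reasonable simplification for a long tubule.
    """
    surface = 2.0 * math.pi * radius_nm * length_nm
    volume = math.pi * radius_nm ** 2 * length_nm
    return surface / volume


# Hypothetical dimensions, chosen only for illustration.
endosome_ratio = sphere_surface_to_volume(radius_nm=250.0)
tubule_ratio = tubule_surface_to_volume(radius_nm=25.0, length_nm=1000.0)
enrichment = tubule_ratio / endosome_ratio
print(f"tubule/endosome surface-to-volume enrichment: {enrichment:.1f}x")
```

With these assumed dimensions the tubule carries several times more membrane per unit volume than the parent compartment, which is why membrane proteins are enriched relative to soluble contents when a tubule pinches off.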


Rab Proteins Guide Transport Vesicles to Their Target Membrane 


To ensure an orderly flow of vesicle traffic, transport vesicles must be highly accu- 
rate in recognizing the correct target membrane with which to fuse. Because of the 
diversity and crowding of membrane systems in the cytoplasm, a vesicle is likely 
to encounter many potential target membranes before it finds the correct one. 
Specificity in targeting is ensured because all transport vesicles display surface 
markers that identify them according to their origin and type of cargo, and tar- 
get membranes display complementary receptors that recognize the appropriate 
markers. This crucial process occurs in two steps. First, Rab proteins and Rab effec- 
tors direct the vesicle to specific spots on the correct target membrane. Second, 
SNARE proteins and SNARE regulators mediate the fusion of the lipid bilayers. 

Rab proteins play a central part in the specificity of vesicle transport. Like the 
coat-recruitment GTPases discussed earlier (see Figure 13-14), Rab proteins are 
also monomeric GTPases. With over 60 known members, the Rab subfamily is the 
largest of the monomeric GTPase subfamilies. Each Rab protein is associated with 
one or more membrane-enclosed organelles of the secretory or endocytic path- 
ways, and each of these organelles has at least one Rab protein on its cytosolic sur- 
face (Table 13-1). Their highly selective distribution on these membrane systems 
makes Rab proteins ideal molecular markers for identifying each membrane type 
and guiding vesicle traffic between them. Rab proteins can function on transport 
vesicles, on target membranes, or both. 

Like the coat-recruitment GTPases, Rab proteins cycle between a membrane 
and the cytosol and regulate the reversible assembly of protein complexes on the 
membrane. In their GDP-bound state, they are inactive and bound to another 
protein (Rab-GDP dissociation inhibitor, or GDI) that keeps them soluble in the 
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Figure 13-15 Packaging of procollagen 
into large tubular COPII-coated vesicles. 
The cartoons show models for two COPII 
coat assembly modes. The models are 
based on cryoelectron tomography 
images of reconstituted COPII vesicles. 
On a spherical membrane (left), the 
Sec23/24 inner coat proteins assemble in 
patches that anchor the Sec13/31 outer 
coat protein cage. The Sec13/31 rods 
assemble a cage of triangles, squares, 
and pentagons. When procollagen needs 
to be packaged (right), special packaging 
proteins sense the cargo and modify the 
coat assembly process. This interaction 
recruits the COPII inner coat protein Sec24 
and locally enhances the rate at which 
Sar1 cycles on and off the membrane 
(not shown). In addition, a monoubiquitin 
is added to the Sec31 protein, changing 
the assembly properties of the outer 
cage. Sec23/24 proteins arrange in larger 
arrays and Sec13/31 arrange in a regular 
lattice of diamond shapes. As a result, 
a large tubular vesicle is formed that can 
accommodate the large cargo molecules. 
The packaging proteins are not part of 
the budding vesicle but remain in the 
ER. (Modified from G. Zanetti et al., eLife 
2:e00951, 2013.) 
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TABLE 13-1 





cytosol; in their GTP-bound state, they are active and tightly associated with the 
membrane of an organelle or transport vesicle. Membrane-bound Rab-GEFs 
activate Rab proteins on both transport vesicle and target membranes; for some 
membrane fusion events, activated Rab molecules are required on both sides 
of the reaction. Once in the GTP-bound state and membrane-bound through a 
now-exposed lipid anchor, Rab proteins bind to other proteins, called Rab effec- 
tors, which are the downstream mediators of vesicle transport, membrane teth- 
ering, and membrane fusion (Figure 13-16). The rate of GTP hydrolysis sets the 
concentration of active Rab and, consequently, the concentration of its effectors 
on the membrane. 

In contrast to the highly conserved structure of Rab proteins, the structures 
and functions of Rab effectors vary greatly, and the same Rab proteins can often 
bind to many different effectors. Some Rab effectors are motor proteins that pro- 
pel vesicles along actin filaments or microtubules to their target membrane. Oth- 
ers are tethering proteins, some of which have long, threadlike domains that serve 
as “fishing lines” that can extend to link two membranes more than 200 nm apart; 
other tethering proteins are large protein complexes that link two membranes 
that are closer together and interact with a wide variety of other proteins that facilitate 
the membrane fusion step. The tethering complex that docks COPII-coated 
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Figure 13-16 Tethering of a transport 
vesicle to a target membrane. Rab 
effector proteins interact with active Rab 
proteins (Rab-GTPs, yellow) located on 
the target membrane, vesicle membrane, 
or both, to establish the first connection 
between the two membranes that are 
going to fuse. In the example shown here, 
the Rab effector is a filamentous tethering 
protein (dark green). Next, SNARE proteins 
on the two membranes (red and blue) pair, 
docking the vesicle to the target membrane 
and catalyzing the fusion of the two 
apposed lipid bilayers. During docking and 
fusion, a Rab-GAP (not shown) induces the 
Rab protein to hydrolyze its bound GTP to 
GDP, causing the Rab to dissociate from 
the membrane and return to the cytosol 
as Rab-GDP, where it is bound by a GDI 
protein that keeps the Rab soluble and 
inactive. 
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vesicles, for example, contains a protein kinase that phosphorylates the coat pro- 
teins to complete the uncoating process. Coupling uncoating to vesicle delivery 
helps to ensure directionality of the transport process and fusion with the proper 
membrane. Rab effectors can also interact with SNAREs to couple membrane 
tethering to fusion (see Figure 13-16). 

The assembly of Rab proteins and their effectors on a membrane is coopera- 
tive and results in the formation of large, specialized membrane patches. Rab5, 
for example, assembles on endosomes and mediates the capture of endocytic 
vesicles arriving from the plasma membrane. The experimental depletion of 
Rab5 causes disappearance of the entire endosomal and lysosomal membrane 
system, highlighting the crucial role of Rab proteins in organelle biogenesis and 
maintenance. 

A Rab5 domain concentrates tethering proteins that catch incoming vesicles. 
Its assembly on endosomal membranes begins when a Rab5-GDP/GDI complex 
encounters a Rab-GEF. GDI is released and Rab5-GDP is converted to Rab5-GTP. 
Active Rab5-GTP becomes anchored to the membrane and recruits more Rab5- 
GEF to the endosome, thereby stimulating the recruitment of more Rab5 to the 
same site. In addition, active Rab5 activates a PI 3-kinase, which locally converts 
PI to PI(3)P, which in turn binds some of the Rab effectors including tethering pro- 
teins and stabilizes their local membrane attachment (Figure 13-17). This type of 
positive feedback greatly amplifies the assembly process and helps to establish 
functionally distinct membrane domains within a continuous membrane. 
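
The amplifying effect of this feedback can be illustrated with a minimal rate model. The sketch assumes a single variable, the fraction of membrane sites occupied by active Rab5, that is recruited at a small basal rate plus a term proportional to the Rab5 already present (standing in for the recruitment of more Rab5-GEF), and that is lost at a constant rate. The equation, the parameter names, and all numerical values are assumptions made for illustration and are not taken from experiments.

```python
def simulate_rab5_domain(feedback_strength: float,
                         basal_on: float = 0.01,
                         off_rate: float = 0.2,
                         dt: float = 0.01,
                         steps: int = 5000) -> float:
    """Return the membrane-bound Rab5 fraction after Euler integration.

    Assumed toy model:
        dR/dt = (basal_on + feedback_strength * R) * (1 - R) - off_rate * R
    where R is the fraction of sites occupied by active Rab5-GTP.
    """
    bound = 0.0
    for _ in range(steps):
        # Recruitment grows with the Rab5 already on the membrane.
        recruitment = (basal_on + feedback_strength * bound) * (1.0 - bound)
        loss = off_rate * bound
        bound += dt * (recruitment - loss)
    return bound


without_feedback = simulate_rab5_domain(feedback_strength=0.0)
with_feedback = simulate_rab5_domain(feedback_strength=1.0)
print(f"no feedback: {without_feedback:.2f}, with feedback: {with_feedback:.2f}")
```

In this toy model the same weak basal recruitment produces only a sparse Rab5 population when feedback is absent, but a densely occupied domain when active Rab5 promotes its own recruitment.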

The endosomal membrane provides a striking example of how different Rab 
proteins and their effectors help to create multiple specialized membrane domains, 
each fulfilling a particular set of functions. Thus, while the Rab5 membrane domain 
receives incoming endocytic vesicles from the plasma membrane, distinct Rab11 
and Rab4 domains in the same membrane organize the budding of recycling vesi- 
cles that return proteins from the endosome to the plasma membrane. 


Rab Cascades Can Change the Identity of an Organelle 


A Rab domain can be disassembled and replaced by a different Rab domain, 
changing the identity of an organelle. Such ordered recruitment of sequentially 
acting Rab proteins is called a Rab cascade. Over time, for example, Rab5 domains 
are replaced by Rab7 domains on endosomal membranes. This converts an early 
endosome, marked by Rab5, into a late endosome, marked by Rab7. Because the 
set of Rab effectors recruited by Rab7 is different from that recruited by Rab5, 
this change reprograms the compartment: as we discuss later, it alters the mem- 
brane dynamics, including the incoming and outgoing traffic, and repositions 
the organelle away from the plasma membrane toward the cell interior. All of the 
cargo contained in the early endosome that has not been recycled to the plasma 
membrane is now part of a late endosome. This process is also referred to as endo- 
some maturation. The self-amplifying nature of the Rab domains renders the pro- 
cess of endosome maturation unidirectional and irreversible (Figure 13-18). 
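
A Rab handover of this kind can be sketched as two coupled rate equations. In the toy model below, a first Rab ("RabA", standing in for Rab5) recruits a GEF for a second Rab ("RabB", standing in for Rab7), RabB amplifies its own recruitment, and active RabB recruits a GAP that inactivates RabA. The equations, parameter names, and values are all assumptions chosen to illustrate the logic, not measured quantities.

```python
def simulate_rab_cascade(dt: float = 0.01, steps: int = 4000):
    """Euler-integrate a two-Rab handover; returns (peak_a, final_a, final_b)."""
    a_on, a_off = 0.5, 0.1        # RabA-GEF activity and basal RabA loss
    gap_by_b = 5.0                # RabA-GAP activity recruited by active RabB
    gef_by_a = 0.01               # RabB-GEF activity recruited by active RabA
    b_feedback, b_off = 1.0, 0.2  # RabB self-amplification and basal loss

    rab_a = rab_b = 0.0
    peak_a = 0.0
    for _ in range(steps):
        # RabA is activated by its GEF and inactivated faster as RabB accumulates.
        d_a = a_on * (1.0 - rab_a) - (a_off + gap_by_b * rab_b) * rab_a
        # RabB is seeded by RabA and then sustains itself through feedback.
        d_b = (gef_by_a * rab_a + b_feedback * rab_b) * (1.0 - rab_b) - b_off * rab_b
        rab_a += dt * d_a
        rab_b += dt * d_b
        peak_a = max(peak_a, rab_a)
    return peak_a, rab_a, rab_b


peak_a, final_a, final_b = simulate_rab_cascade()
print(f"RabA peak {peak_a:.2f}, RabA final {final_a:.2f}, RabB final {final_b:.2f}")
```

In this sketch RabA first builds up, then collapses as the self-amplifying RabB domain takes over. Because RabB maintains itself once established, the model does not drift back toward the RabA-dominated state, which mirrors the one-way character of the conversion described above.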












Figure 13-17 The formation of a Rab5 
domain on the endosome membrane. 
A Rab5-GEF on the endosome membrane 
binds a Rab5 protein and induces it to 
exchange GDP for GTP. GDI is lost and 
GTP binding alters the conformation of 
the Rab protein, exposing an amphiphilic 
helix and a covalently attached lipid group, 
which together anchor the Rab5-GTP 
to the membrane. Active Rab5 activates 
PI 3-kinase, which converts PI into PI(3)P. 
PI(3)P and active Rab5 together bind a 
variety of Rab effector proteins that contain 
PI(3)P-binding sites, including filamentous 
tethering proteins that catch incoming 
clathrin-coated endocytic vesicles from 
the plasma membrane. With the help of 
another effector, active Rab5 also recruits 
more Rab5-GEF, further enhancing the 
assembly of the Rab5 domain on the 
membrane. 

Controlled cycles of GTP hydrolysis 
and GDP-GTP exchange dynamically 
regulate the size and activity of such 
Rab domains. Unlike SNAREs, which are 
integral membrane proteins, the GDP/ 
GTP cycle, coupled to the membrane/ 
cytosol translocation cycle, endows 
the Rab machinery with the ability to 
undergo assembly and disassembly on 
the membrane. (Adapted from M. Zerial 
and H. McBride, Nat. Rev. Mol. Cell Biol. 
2:107-117, 2001. With permission from 
Macmillan Publishers Ltd.) 
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SNAREs Mediate Membrane Fusion 


Once a transport vesicle has been tethered to its target membrane, it unloads its 
cargo by membrane fusion. Membrane fusion requires bringing the lipid bilayers 
of two membranes to within 1.5 nm of each other so that they can merge. When 
the membranes are in such close apposition, lipids can flow from one bilayer to 
the other. For this close approach, water must be displaced from the hydrophilic 
surface of the membrane—a process that is highly energetically unfavorable and 
requires specialized fusion proteins that overcome this energy barrier. We have 
already discussed the role of dynamin in a related task during the pinching-off of 
clathrin-coated vesicles (see Figure 13-13). 

The SNARE proteins (also called SNAREs, for short) catalyze the membrane 
fusion reactions in vesicle transport. There are at least 35 different SNAREs in an 
animal cell, each associated with a particular organelle in the secretory or endo- 
cytic pathways. These transmembrane proteins exist as complementary sets, with 
v-SNAREs usually found on vesicle membranes and t-SNAREs usually found on 
target membranes (see Figure 13-16). A v-SNARE is a single polypeptide chain, 
whereas a t-SNARE is usually composed of three proteins. The v-SNAREs and 
t-SNAREs have characteristic helical domains, and when a v-SNARE interacts with 
a t-SNARE, the helical domains of one wrap around the helical domains of the 
other to form a very stable four-helix bundle. The resulting trans-SNARE complex 
locks the two membranes together. Biochemical membrane fusion assays with all 
different SNARE combinations show that v- and t-SNARE pairing is highly specific. 
The SNAREs thus provide an additional layer of specificity in the transport process 
by helping to ensure that vesicles fuse only with the correct target membrane. 

The trans-SNARE complexes catalyze membrane fusion by using the energy 
that is freed when the interacting helices wrap around each other to pull the 
membrane faces together, simultaneously squeezing out water molecules from 
the interface (Figure 13-19). When liposomes containing purified v-SNAREs are 
mixed with liposomes containing complementary t-SNAREs, their membranes 
fuse, albeit slowly. In the cell, other proteins recruited to the fusion site, presum- 
ably Rab effectors, cooperate with SNAREs to accelerate fusion. Fusion does not 
always follow immediately after v-SNAREs and t-SNAREs pair. As we discuss later, 
in the process of regulated exocytosis, fusion is delayed until secretion is triggered 
by a specific extracellular signal. 

Rab proteins, which can regulate the availability of SNARE proteins, exert an 
additional layer of control. t-SNAREs in target membranes are often associated 
with inhibitory proteins that must be released before the t-SNARE can function. 
Rab proteins and their effectors trigger the release of such SNARE inhibitory 








Figure 13-19 A model for how SNARE proteins may catalyze membrane 
fusion. Bilayer fusion occurs in multiple steps. A tight pairing between 
v- and t-SNAREs forces lipid bilayers into close apposition and expels 
water molecules from the interface. Lipid molecules in the two interacting 
(cytosolic) leaflets of the bilayers then flow between the membranes to form 
a connecting stalk. Lipids of the two noncytosolic leaflets then contact each 
other, forming a new bilayer, which widens the fusion zone (hemifusion, or 
half-fusion). Rupture of the new bilayer completes the fusion reaction. 


Figure 13-18 A model for a generic 
Rab cascade. The local activation of a 
RabA-GEF leads to assembly of a RabA 
domain on the membrane. Active RabA 
recruits its effector proteins, one of which 
is a GEF for RabB. The RabB-GEF then 
recruits RabB to the membrane, which in 
turn begins to recruit its effectors, among 
them a GAP for RabA. The RabA-GAP 
activates RabA GTP hydrolysis, leading 
to the inactivation of RabA and the 
disassembly of the RabA domain as the 
RabB domain grows. In this way, the RabA 
domain is irreversibly replaced by the RabB 
domain. In principle, this sequence can be 
continued by the recruitment of a next GEF 
by RabB. (Adapted from A.H. Hutagalung 
and P.J. Novick, Physiol. Rev. 91:119-149, 
2011. With permission from The American 
Physiological Society.) 
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proteins. In this way, SNARE proteins are concentrated and activated in the cor- 
rect location on the membrane, where tethering proteins capture incoming ves- 
icles. Rab proteins thus speed up the process by which appropriate SNARE pro- 
teins in two membranes find each other. 

For vesicle transport to operate normally, transport vesicles must incorpo- 
rate the appropriate SNARE and Rab proteins. Not surprisingly, therefore, many 
transport vesicles will form only if they incorporate the appropriate complement 
of SNARE and Rab proteins in their membrane. How this crucial control process 
operates during vesicle budding remains a mystery. 


Interacting SNAREs Need to Be Pried Apart Before They Can 
Function Again 


Most SNARE proteins in cells have already participated in multiple rounds of 
vesicle transport and are sometimes present in a membrane as stable complexes 
with partner SNAREs. The complexes have to disassemble before the SNAREs can 
mediate new rounds of transport. A crucial protein called NSF cycles between 
membranes and the cytosol and catalyzes the disassembly process. NSF is a 
hexameric ATPase of the family of AAA-ATPases (see Figure 6-85) that uses the 
energy of ATP hydrolysis to unravel the intimate interactions between the helical 
domains of paired SNARE proteins (Figure 13-20). The requirement for NSF-me- 
diated reactivation of SNAREs by SNARE complex disassembly helps prevent 
membranes from fusing indiscriminately: if the t-SNAREs in a target membrane 
were always active, any membrane containing an appropriate v-SNARE might 
fuse whenever the two membranes made contact. It is not known how the activity 
of NSF is controlled so that the SNARE machinery is activated at the right time and 
place. It is also not known how v-SNAREs are selectively retrieved and returned to 
their compartment of origin so that they can be reused in newly formed transport 
vesicles. 

Membrane fusion is important in other processes besides vesicle transport. 
The plasma membranes of a sperm and an egg fuse during fertilization, and myo- 
blasts fuse with one another during the development of multinucleate muscle 
fibers (discussed in Chapter 22). Likewise, the ER network and mitochondria fuse 
and fragment in a dynamic way (discussed in Chapters 12 and 14). All cell mem- 
brane fusions require special proteins and are tightly regulated to ensure that only 
appropriate membranes fuse. The controls are crucial for maintaining both the 
identity of cells and the individuality of each type of intracellular compartment. 

The membrane fusions catalyzed by viral fusion proteins are well understood. 
These proteins have a crucial role in permitting the entry of enveloped viruses 
(which have a lipid-bilayer-based membrane coat) into the cells that they infect 
(discussed in Chapters 5 and 23). For example, viruses such as the human immu- 
nodeficiency virus (HIV), which causes AIDS, bind to cell-surface receptors and 
then fuse with the plasma membrane of the target cell (Figure 13-21). This fusion 
event allows the viral nucleic acid inside the nucleocapsid to enter the cytosol, 
where it replicates. Other viruses, such as the influenza virus, first enter the cell 
by receptor-mediated endocytosis (discussed later) and are delivered to endo- 
somes; the low pH in endosomes activates a fusion protein in the viral envelope 
that catalyzes the fusion of the viral and endosomal membranes, releasing the 
viral nucleic acid into the cytosol. Viral fusion proteins and SNAREs promote lipid 
bilayer fusion in similar ways. 
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Figure 13-20 Dissociation of SNARE 
pairs by NSF after a membrane fusion 
cycle. After a v-SNARE and t-SNARE have 
mediated the fusion of a transport vesicle 
with a target membrane, NSF binds to 
the SNARE complex and, with the help of 
accessory proteins, hydrolyzes ATP to pry 
the SNAREs apart. 
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Summary 


Directed and selective transport of particular membrane components from one 
membrane-enclosed compartment to another in a eukaryotic cell maintains the 
differences between those compartments. Transport vesicles, which can be spheri- 
cal, tubular, or irregularly shaped, bud from specialized coated regions of the donor 
membrane. The assembly of the coat helps to collect specific membrane and soluble 
cargo molecules for transport and to drive the formation of the vesicle. 

There are various types of coated vesicles. The best characterized are clath- 
rin-coated vesicles, which mediate transport from the plasma membrane and the 
trans Golgi network, and COPI- and COPII-coated vesicles, which mediate trans- 
port between Golgi cisternae and between the ER and the Golgi apparatus, respec- 
tively. Coats have a common two-layered structure: an inner layer formed of adap- 
tor proteins links the outer layer (or cage) to the vesicle membrane and also traps 
specific cargo molecules for packaging into the vesicle. The coat is shed before the 
vesicle fuses with its appropriate target membrane. 

Local synthesis of specific phosphoinositides creates binding sites that trigger 
clathrin coat assembly and vesicle budding. In addition, monomeric GTPases help 
regulate various steps in vesicle transport, including both vesicle budding and 
docking. The coat-recruitment GTPases, including Sar1 and the ARF proteins, regulate 
coat assembly and disassembly. A large family of Rab proteins functions as 
vesicle-targeting GTPases. Rab proteins are recruited to both forming transport 
vesicles and target membranes. The assembly and disassembly of Rab proteins 
and their effectors in specialized membrane domains are dynamically controlled 
by GTP binding and hydrolysis. Active Rab proteins recruit Rab effectors, such as 
motor proteins, which transport vesicles along actin filaments or microtubules, and 
filamentous tethering proteins, which help ensure that the vesicles deliver their con- 
tents only to the appropriate target membrane. Complementary v-SNARE proteins 
on transport vesicles and t-SNARE proteins on the target membrane form stable 
trans-SNARE complexes, which force the two membranes into close apposition so 
that their lipid bilayers can fuse. 


TRANSPORT FROM THE ER THROUGH THE 
GOLGI APPARATUS 


As discussed in Chapter 12, newly synthesized proteins cross the ER membrane 
from the cytosol to enter the secretory pathway. During their subsequent transport, 
from the ER to the Golgi apparatus and from the Golgi apparatus to the cell surface 
and elsewhere, these proteins are successively modified as they pass through a 
series of compartments. Transfer from one compartment to the next involves a 
delicate balance between forward and backward (retrieval) transport pathways. 
Some transport vesicles select cargo molecules and move them to the next com- 
partment in the pathway, while others retrieve escaped proteins and return them 
to a previous compartment where they normally function. Thus, the pathway from 
the ER to the cell surface consists of many sorting steps, which continuously select 
membrane and soluble lumenal proteins for packaging and transport. 


Figure 13-21 The entry of enveloped 
viruses into cells. Electron micrographs 
showing how HIV enters a cell by fusing 
its membrane with the plasma membrane 
of the cell. (From B.S. Stein et al., Cell 
49:659-668, 1987. With permission from 
Elsevier.) 


In this section, we focus mainly on the Golgi apparatus (also called the Golgi 
complex). It is a major site of carbohydrate synthesis, as well as a sorting and dis- 
patching station for products of the ER. The cell makes many polysaccharides in 
the Golgi apparatus, including the pectin and hemicellulose of the cell wall in 
plants and most of the glycosaminoglycans of the extracellular matrix in animals 
(discussed in Chapter 19). The Golgi apparatus also lies on the exit route from the 
ER, and a large proportion of the carbohydrates that it makes are attached as oli- 
gosaccharide side chains to the many proteins and lipids that the ER sends to it. A 
subset of these oligosaccharide groups serve as tags to direct specific proteins into 
vesicles that then transport them to lysosomes. But most proteins and lipids, once 
they have acquired their appropriate oligosaccharides in the Golgi apparatus, are 
recognized in other ways for targeting into the transport vesicles going to other 
destinations. 


Proteins Leave the ER in COPII-Coated Transport Vesicles 


To initiate their journey along the secretory pathway, proteins that have entered 
the ER and are destined for the Golgi apparatus or beyond are first packaged into 
COPII-coated transport vesicles. These vesicles bud from specialized regions of 
the ER called ER exit sites, whose membrane lacks bound ribosomes. Most animal 
cells have ER exit sites dispersed throughout the ER network. 

Entry into vesicles that leave the ER can be a selective process or can happen by 
default. Many membrane proteins are actively recruited into such vesicles, where 
they become concentrated. These cargo membrane proteins display exit (trans- 
port) signals on their cytosolic surface that adaptor proteins of the inner COPII 
coat recognize (Figure 13-22); some of these components act as cargo receptors 
and are recycled back to the ER after they have delivered their cargo to the Golgi 
apparatus. Soluble cargo proteins in the ER lumen, by contrast, have exit signals 
that attach them to transmembrane cargo receptors. Proteins without exit signals 
can also enter transport vesicles, including protein molecules that normally func- 
tion in the ER (so-called ER resident proteins), some of which slowly leak out of 
the ER and are delivered to the Golgi apparatus. Different cargo proteins enter the 
transport vesicles with substantially different rates and efficiencies, which may 
result from differences in their folding and oligomerization efficiencies and kinet- 
ics, as well as the factors already discussed. The exit step from the ER is a major 
checkpoint at which quality control is exerted on the proteins that a cell secretes 
or displays on its surface, as we discussed in Chapter 12. 

The exit signals that direct soluble proteins out of the ER for transport to the 
Golgi apparatus and beyond are not well understood. Some transmembrane 
proteins that serve as cargo receptors for packaging some secretory proteins into 
COPII-coated vesicles are lectins that bind to oligosaccharides on the secreted 
Figure 13-22 The recruitment of 
membrane and soluble cargo molecules 
into ER transport vesicles. Membrane 
proteins are packaged into budding 
transport vesicles through interactions 
of exit signals on their cytosolic tails with 
adaptor proteins of the inner COPII coat. 
Some of these membrane proteins function 
as cargo receptors, binding soluble 
proteins in the ER lumen and helping to 
package them into vesicles. Other proteins 
may enter the vesicle by bulk flow. A typical 
50 nm transport vesicle contains about 200 
membrane proteins, which can be of many 
different types. As indicated, unfolded or 
incompletely assembled proteins are bound 
to chaperones and transiently retained in 
the ER compartment. 
proteins. One such lectin, for example, binds to mannose on two secreted 
blood-clotting factors (Factor V and Factor VIII), thereby packaging the proteins 
into transport vesicles in the ER; its role in protein transport was identified 
because humans who lack it owing to an inherited mutation have lowered serum 
levels of Factors V and VIII, and they therefore bleed excessively. 


Only Proteins That Are Properly Folded and Assembled Can Leave 
the ER 


To exit from the ER, proteins must be properly folded and, if they are subunits of 
multiprotein complexes, they need to be completely assembled. Those that are 
misfolded or incompletely assembled transiently remain in the ER, where they 
are bound to chaperone proteins (discussed in Chapter 6) such as BiP or calnexin. 
The chaperones may cover up the exit signals or somehow anchor the proteins 
in the ER. Such failed proteins are eventually transported back into the cytosol, 
where they are degraded by proteasomes (discussed in Chapters 6 and 12). This 
quality-control step prevents the onward transport of misfolded or misassembled 
proteins that could potentially interfere with the functions of normal proteins. 
Such failures are surprisingly common. More than 90% of the newly synthesized 
subunits of the T cell receptor (discussed in Chapter 24) and of the acetylcholine 
receptor (discussed in Chapter 11), for example, are normally degraded without 
ever reaching the cell surface where they function. Thus, cells must make a large 
excess of some protein molecules to produce a select few that fold, assemble, and 
function properly. 

Sometimes, however, there are drawbacks to the stringent quality-control 
mechanism. The predominant mutations that cause cystic fibrosis, a common 
inherited disease, result in the production of a slightly misfolded form of a plasma 
membrane protein important for Cl⁻ transport. Although the mutant protein 
would function normally if it reached the plasma membrane, it is retained in the 
ER and then is degraded by cytosolic proteasomes. This devastating disease thus 
results not because the mutation inactivates the protein but because the active 
protein is discarded before it reaches the plasma membrane. 


Vesicular Tubular Clusters Mediate Transport from the ER to the 
Golgi Apparatus 


After transport vesicles have budded from ER exit sites and have shed their coat, 
they begin to fuse with one another. The fusion of membranes from the same 
compartment is called homotypic fusion, to distinguish it from heterotypic fusion, 
in which a membrane from one compartment fuses with the membrane of a dif- 
ferent compartment. As with heterotypic fusion, homotypic fusion requires a set 
of matching SNAREs. In this case, however, the interaction is symmetrical, with 
both membranes contributing v-SNAREs and t-SNAREs (Figure 13-23). 

The structures formed when ER-derived vesicles fuse with one another are 
called vesicular tubular clusters, because they have a convoluted appearance in 
Figure 13-23 Homotypic membrane 
fusion. In step 1, NSF pries apart identical 
pairs of v-SNAREs and t-SNAREs in both 
membranes (see Figure 13-20). In steps 
2 and 3, the separated matching SNAREs 
on adjacent identical membranes interact, 
which leads to membrane fusion and the 
formation of one continuous compartment. 
Subsequently, the compartment grows 
by further homotypic fusion with vesicles 
from the same kind of membrane, 
displaying matching SNAREs. Homotypic 
fusion occurs when ER-derived transport 
vesicles fuse with one another, but also 
when endosomes fuse to generate larger 
endosomes. Rab proteins help regulate the 
extent of homotypic fusion and hence the 
size of a cell’s compartments (not shown). 
Figure 13-24 Vesicular tubular clusters. (A) An electron micrograph of vesicular tubular 
clusters forming around an exit site. Many of the vesicle-like structures seen in the micrograph 
are cross sections of tubules that extend above and below the plane of this thin section and are 
interconnected. (B) Vesicular tubular clusters move along microtubules to carry proteins from the 
ER to the Golgi apparatus. COPI coats mediate the budding of vesicles that return to the 
ER from these clusters (and from the Golgi apparatus). (A, courtesy of William Balch.) 





the electron microscope (Figure 13-24A). These clusters constitute a compart- 
ment that is separate from the ER and lacks many of the proteins that function 
in the ER. They are generated continually and function as transport containers 
that bring material from the ER to the Golgi apparatus. The clusters move quickly 
along microtubules to the Golgi apparatus with which they fuse (Figure 13-24B 
and Movie 13.2). 

As soon as vesicular tubular clusters form, they begin to bud off transport ves- 
icles of their own. Unlike the COPII-coated vesicles that bud from the ER, these 
vesicles are COPI-coated (see Figure 13-24A). COPI-coated vesicles are unique in 
that the components that make up the inner and outer coat layers are recruited as 
a preassembled complex, called coatomer. They function as a retrieval pathway, 
carrying back ER resident proteins that have escaped, as well as proteins such 
as cargo receptors and SNAREs that participated in the ER budding and vesicle 
fusion reactions. This retrieval process demonstrates the exquisite control mech- 
anisms that regulate coat assembly reactions. The COPI coat assembly begins only 
seconds after the COPII coats have been shed, and it remains a mystery how this 
switch in coat assembly is controlled. 

The retrieval (or retrograde) transport continues as the vesicular tubular clus- 
ters move toward the Golgi apparatus. Thus, the clusters continuously mature, 
gradually changing their composition as selected proteins are returned to the ER. 
The retrieval continues from the Golgi apparatus, after the vesicular tubular clus- 
ters have delivered their cargo. 


The Retrieval Pathway to the ER Uses Sorting Signals 


The retrieval pathway for returning escaped proteins back to the ER depends on 
ER retrieval signals. Resident ER membrane proteins, for example, contain signals 
that bind directly to COPI coats and are thus packaged into COPI-coated transport 
vesicles for retrograde delivery to the ER. The best-characterized retrieval signal 
of this type consists of two lysines, followed by any two other amino acids, at the 
extreme C-terminal end of the ER membrane protein. It is called a KKXX sequence, 
based on the single-letter amino acid code. 

Soluble ER resident proteins, such as BiP, also contain a short ER retrieval sig- 
nal at their C-terminal end, but it is different: it consists of a Lys-Asp-Glu-Leu or 
a similar sequence. If this signal (called the KDEL sequence) is removed from BiP 
by genetic engineering, the protein is slowly secreted from the cell. If the signal 
is transferred to a protein that is normally secreted, the protein is now efficiently 
returned to the ER, where it accumulates. 
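
The two C-terminal retrieval signals described above can be checked mechanically in a protein sequence written in the single-letter code. The following is a minimal sketch, not a validated prediction tool: the function name and the example sequences are hypothetical, only the exact KDEL tetrapeptide is matched (the "similar sequences" mentioned in the text are not covered), and the code cannot tell whether a protein is membrane-bound or soluble, which in the cell determines which signal is relevant.

```python
def classify_er_retrieval_signal(sequence: str) -> str:
    """Classify the C-terminal ER retrieval signal of a protein.

    `sequence` is a protein sequence in single-letter amino acid code.
    Returns "KDEL", "KKXX", or "none". Illustrative assumption: only the
    last four residues are inspected, and only the exact KDEL variant
    of the soluble-protein signal is recognized.
    """
    seq = sequence.strip().upper()
    if len(seq) < 4:
        return "none"
    tail = seq[-4:]
    if tail == "KDEL":
        # Lys-Asp-Glu-Leu at the extreme C-terminus (soluble ER residents).
        return "KDEL"
    if tail[:2] == "KK":
        # Two lysines followed by any two residues (ER membrane proteins).
        return "KKXX"
    return "none"


if __name__ == "__main__":
    # Hypothetical sequences, chosen only to exercise each branch.
    for name, seq in [
        ("soluble_resident", "MAAAKDEL"),
        ("membrane_resident", "MTTTKKAA"),
        ("secreted", "MAAAKDELA"),
    ]:
        print(name, classify_er_retrieval_signal(seq))
```

Note that the third example carries KDEL one residue away from the C-terminus and is therefore reported as having no signal, reflecting the requirement that the signal sit at the extreme C-terminal end.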
Unlike the retrieval signals on ER membrane proteins, which can interact 
directly with the COPI coat, soluble ER resident proteins must bind to special- 
ized receptor proteins such as the KDEL receptor—a multipass transmembrane 
protein that binds to the KDEL sequence and packages any protein displaying it 
into COPI-coated retrograde transport vesicles (Figure 13-25). To accomplish this 
task, the KDEL receptor itself must cycle between the ER and the Golgi apparatus, 
and its affinity for the KDEL sequence must differ in these two compartments. 
The receptor must have a high affinity for the KDEL sequence in vesicular tubular 
clusters and the Golgi apparatus, so as to capture escaped, soluble ER resident 
proteins that are present there at low concentration. It must have a low affinity for 
the KDEL sequence in the ER, however, to unload its cargo in spite of the very high 
concentration of KDEL-containing soluble resident proteins in the ER. 

How does the affinity of the KDEL receptor change depending on the compart- 
ment in which it resides? The answer is likely related to the lower pH in the Golgi 
compartments, which is regulated by H⁺ pumps. As we discuss later, pH-sensitive 
protein-protein interactions form the basis for many of the protein sorting steps 
in the cell. 
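
The need for different affinities in the two compartments can be made concrete with the equilibrium occupancy of a simple one-to-one binding reaction, fraction bound = [L] / ([L] + Kd). The sketch below is only an illustration under stated assumptions: the simple binding model, the ligand concentrations, and both Kd values are hypothetical numbers chosen to show the trend, not measured properties of the KDEL receptor.

```python
def fraction_bound(ligand_conc: float, kd: float) -> float:
    """Equilibrium fraction of receptor occupied for simple 1:1 binding.

    Both arguments must use the same concentration unit (micromolar here).
    """
    return ligand_conc / (ligand_conc + kd)


if __name__ == "__main__":
    # All values below are hypothetical, for illustration only.
    golgi_ligand_um = 1.0       # escaped ER residents: low concentration
    er_ligand_um = 1000.0       # ER residents in the ER: very high concentration
    kd_high_affinity_um = 0.1   # assumed Kd in the more acidic Golgi
    kd_low_affinity_um = 10000.0  # assumed Kd at the pH of the ER

    # High affinity lets the receptor capture scarce ligand in the Golgi.
    print("Golgi, high affinity:", fraction_bound(golgi_ligand_um, kd_high_affinity_um))
    # If the affinity stayed high in the ER, the receptor would remain saturated.
    print("ER, high affinity:   ", fraction_bound(er_ligand_um, kd_high_affinity_um))
    # Low affinity in the ER lets the receptor unload despite abundant ligand.
    print("ER, low affinity:    ", fraction_bound(er_ligand_um, kd_low_affinity_um))
```

Under these assumed numbers the receptor is mostly occupied in the Golgi, would stay almost fully occupied in the ER if its affinity did not change, and is mostly empty in the ER once the affinity drops, which is the behavior the retrieval cycle requires.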

Most membrane proteins that function at the interface between the ER and 
Golgi apparatus, including v- and t-SNAREs and some cargo receptors, also enter 
the retrieval pathway back to the ER. 


Many Proteins Are Selectively Retained in the Compartments in 
Which They Function 


The KDEL retrieval pathway only partly explains how ER resident proteins are 
maintained in the ER. As mentioned, cells that express genetically modified 
ER resident proteins, from which the KDEL sequence has been experimentally 
removed, secrete these proteins. But the rate of secretion is much slower than for a 
normal secretory protein. It seems that a mechanism that is independent of their 
KDEL signal normally retains ER resident proteins and that only those proteins 
that escape this retention mechanism are captured and returned via the KDEL 
receptor. A suggested retention mechanism is that ER resident proteins bind to 
one another, thus forming complexes that are too big to enter transport vesicles 
efficiently. Because ER resident proteins are present in the ER at very high con- 
centrations (estimated to be millimolar), relatively low-affinity interactions would 
suffice to retain most of the proteins in such complexes. 

Aggregation of proteins that function in the same compartment is a general 
mechanism that compartments use to organize and retain their resident proteins. 
Golgi enzymes that function together, for example, also bind to each other and are 
thereby restrained from entering transport vesicles leaving the Golgi apparatus. 
Figure 13-25 Retrieval of soluble ER 
resident proteins. ER resident proteins 
that escape from the ER are returned by 
vesicle transport. (A) The KDEL receptor 
present in both vesicular tubular clusters 
and the Golgi apparatus captures the 
soluble ER resident proteins and carries 
them in COPI-coated transport vesicles 
back to the ER. (Recall that the COPI-coated 
vesicles shed their coats as soon as 
they are formed.) Upon binding its ligands 
in the tubular cluster or Golgi, the KDEL 
receptor may change conformation, so 
as to facilitate its recruitment into budding 
COPI-coated vesicles. (B) The retrieval 
of ER proteins begins in vesicular tubular 
clusters and continues from later parts of 
the Golgi apparatus. In the environment of 
the ER, the ER resident proteins dissociate 
from the KDEL receptor, which is then 
returned to the Golgi apparatus for reuse. 
We discuss the different compartments of 
the Golgi apparatus shortly. 
The Golgi Apparatus Consists of an Ordered Series of 
Compartments 


Because it could be selectively visualized by silver stains, the Golgi apparatus was 
one of the first organelles described by early light microscopists. It consists of a 
collection of flattened, membrane-enclosed compartments called cisternae, that 
somewhat resemble a stack of pita breads. Each Golgi stack typically consists of 
four to six cisternae (Figure 13-26), although some unicellular flagellates can have 
more than 20. In animal cells, tubular connections between corresponding cister- 
nae link many stacks, thus forming a single complex, which is usually located near 
the cell nucleus and close to the centrosome (Figure 13-27A). This localization 
depends on microtubules. If microtubules are experimentally depolymerized, the 
Golgi apparatus reorganizes into individual stacks that are found throughout the 
cytoplasm, adjacent to ER exit sites. Some cells, including most plant cells, have 
hundreds of individual Golgi stacks dispersed throughout the cytoplasm where 
they are typically found adjacent to ER exit sites (Figure 13-27B). 
Figure 13-26 The Golgi apparatus. 
(A) Three-dimensional reconstruction 
from electron micrographs of the Golgi 
apparatus in a secretory animal cell. The 
cis face of the Golgi stack is that closest 
to the ER. (B) A thin-section electron 
micrograph of an animal cell. In plant 
cells, the Golgi apparatus is generally 
more distinct and more clearly separated 
from other intracellular membranes than 
in animal cells. (A, redrawn from 
A. Rambourg and Y. Clermont, Eur. J. Cell 
Biol. 51:189-200, 1990. With permission 
from Wissenschaftliche Verlagsgesellschaft; 
B, courtesy of Brij J. Gupta.) 


Figure 13-27 Localization of the Golgi 
apparatus in animal and plant cells. 
(A) The Golgi apparatus in a cultured 
fibroblast stained with a fluorescent 
antibody that recognizes a Golgi resident 
protein (bright orange). The Golgi apparatus 
is polarized, facing the direction in which 
the cell was crawling before fixation. 
(B) The Golgi apparatus in a plant cell that 
is expressing a fusion protein consisting 
of a resident Golgi enzyme fused to green 
fluorescent protein. (A, courtesy of John 
Henley and Mark McNiven; B, courtesy of 
Chris Hawes.) 
During their passage through the Golgi apparatus, transported molecules 
undergo an ordered series of covalent modifications. Each Golgi stack has two 
distinct faces: a cis face (or entry face) and a trans face (or exit face). Both cis and 
trans faces are closely associated with special compartments, each composed of a 
network of interconnected tubular and cisternal structures: the cis Golgi network 
(CGN) and the trans Golgi network (TGN), respectively. The CGN is a collection 
of fused vesicular tubular clusters arriving from the ER. Proteins and lipids enter 
the cis Golgi network and exit from the trans Golgi network, bound for the cell 
surface or another compartment. Both networks are important for protein sort- 
ing: proteins entering the CGN can either move onward in the Golgi apparatus or 
be returned to the ER. Similarly, proteins exiting from the TGN move onward and 
are sorted according to their next destination: endosomes, secretory vesicles, or 
the cell surface. They also can be returned to an earlier compartment. Some mem- 
brane proteins are retained in the part of the Golgi apparatus where they function. 

As described in Chapter 12, a single species of N-linked oligosaccharide is 
attached en bloc to many proteins in the ER and then trimmed while the pro- 
tein is still in the ER. The oligosaccharide intermediates created by the trimming 
reactions serve to help proteins fold and to help transport misfolded proteins to 
the cytosol for degradation in proteasomes. Thus, they play an important role in 
controlling the quality of proteins exiting from the ER. Once these ER functions 
have been fulfilled, the cell reutilizes the oligosaccharides for new functions. This 
begins in the Golgi apparatus, which generates the heterogeneous oligosaccha- 
ride structures seen in mature proteins. After arrival in the CGN, proteins enter 
the first of the Golgi processing compartments (the cis Golgi cisternae). They then 
move to the next compartment (the medial cisternae) and finally to the trans 
cisternae, where glycosylation is completed. The lumen of the trans cisternae is 
thought to be continuous with the TGN, the place where proteins are segregated 
into different transport packages and dispatched to their final destinations. 

The oligosaccharide processing steps occur in an organized sequence in the 
Golgi stack, with each cisterna containing a characteristic mixture of processing 
enzymes. Proteins are modified in successive stages as they move from cisterna 
to cisterna across the stack, so that the stack forms a multistage processing unit. 

Investigators discovered the functional differences between the cis, medial, 
and trans subdivisions of the Golgi apparatus by localizing the enzymes involved 
in processing N-linked oligosaccharides in distinct regions of the organelle, both 
by physical fractionation of the organelle and by labeling the enzymes in electron 
microscope sections with antibodies (Figure 13-28). The removal of mannose 
and the addition of N-acetylglucosamine, for example, occur in the cis and medial 
cisternae, while the addition of galactose and sialic acid occurs in the trans cis- 
terna and trans Golgi network. Figure 13-29 summarizes the functional compart- 
mentalization of the Golgi apparatus. 


Oligosaccharide Chains Are Processed in the Golgi Apparatus 


Whereas the ER lumen is full of soluble lumenal resident proteins and enzymes, 
the resident proteins in the Golgi apparatus are all membrane bound, as the enzy- 
matic reactions apparently occur entirely on membrane surfaces. All of the Golgi 
glycosidases and glycosyl transferases, for example, are single-pass transmem- 
brane proteins, many of which are organized in multienzyme complexes. 


Figure 13-28 Molecular compartmentalization of the Golgi apparatus. 
A series of electron micrographs shows the Golgi apparatus (A) unstained, 
(B) stained with osmium, which preferentially labels the cisternae of the 
cis compartment, and (C and D) stained to reveal the location of specific 
enzymes. Nucleoside diphosphatase is found in the trans Golgi cisternae 
(C), while acid phosphatase is found in the trans Golgi network (D). Note that 
usually more than one cisterna is stained. The enzymes are therefore thought 
to be highly enriched rather than precisely localized to a specific cisterna. 
(Courtesy of Daniel S. Friend.) 
Two broad classes of N-linked oligosaccharides, the complex oligosaccharides 
and the high-mannose oligosaccharides, are attached to mammalian glycoproteins. 
Sometimes, both types are attached (in different places) to the same 
polypeptide chain. Complex oligosaccharides are generated when the original 
N-linked oligosaccharide added in the ER is trimmed and further sugars are 
added; by contrast, high-mannose oligosaccharides are trimmed but have no new 
sugars added to them in the Golgi apparatus (Figure 13-30). The sialic acids in the 


Figure 13-30 symbol key (panel B, complex oligosaccharide): N-acetylglucosamine (GlcNAc); mannose (Man); galactose (Gal); N-acetylneuraminic acid. 
Figure 13-30 The two main classes of asparagine-linked (N-linked) oligosaccharides found 
in mature mammalian glycoproteins. (A) Both complex oligosaccharides and high-mannose 
oligosaccharides share a common core region derived from the original N-linked oligosaccharide 
added in the ER (see Figure 12-50) and typically containing two N-acetylglucosamines (GlcNAc) 
and three mannoses (Man). (B) Each complex oligosaccharide consists of a core region, together 
with a terminal region that contains a variable number of copies of a special trisaccharide unit 
(N-acetylglucosamine-galactose-sialic acid) linked to the core mannoses. Frequently, the terminal 
region is truncated and contains only GlcNAc and galactose (Gal) or just GlcNAc. In addition, 
a fucose may be added, usually to the core GlcNAc attached to the asparagine (Asn). Thus, 
although the steps of processing and subsequent sugar addition are rigidly ordered, complex 
oligosaccharides can be heterogeneous. Moreover, although the complex oligosaccharide 
shown has three terminal branches, two and four branches are also common, depending on the 
glycoprotein and the cell in which it is made. (C) High-mannose oligosaccharides are not trimmed 
back all the way to the core region and contain additional mannoses. Hybrid oligosaccharides with 
one Man branch and one GlcNAc and Gal branch are also found (not shown). 

The three amino acids indicated in (A) constitute the sequence recognized by the oligosaccharyl 
transferase enzyme that adds the initial oligosaccharide to the protein. Ser, serine; Thr, threonine; 
X, any amino acid, except proline. 
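
The three-residue recognition sequence named in this legend (Asn, then any residue except proline, then Ser or Thr) can be located in a protein sequence with a short pattern search. This is a minimal sketch under stated assumptions: the function name and the test sequence are hypothetical, and a match marks only a candidate site, because the presence of the sequence alone does not establish that an oligosaccharide is actually attached there.

```python
import re

# Asn (N), then any residue except Pro (P), then Ser (S) or Thr (T).
# The lookahead makes the match zero-width so overlapping sites are all found.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of the Asn in each candidate N-linked site.

    `sequence` is a protein sequence in single-letter amino acid code.
    """
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical sequence: NGT matches, NPS is excluded by the proline,
    # and NNS / NSS overlap.
    print(find_sequons("MNGTANPSNNSS"))
```

The lookahead is a design choice for this sketch: a plain `N[^P][ST]` pattern would consume each match and could miss a second site that begins inside the first.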
Figure 13-29 Oligosaccharide 
processing in Golgi compartments. 
The localization of each processing step 
shown was determined by a combination 
of techniques, including biochemical 
subfractionation of the Golgi apparatus 
membranes and electron microscopy after 
staining with antibodies specific for some 
of the processing enzymes. Processing 
enzymes are not restricted to a particular 
cisterna; instead, their distribution is graded 
across the stack, such that early-acting 
enzymes are present mostly in the cis Golgi 
cisternae and later-acting enzymes are 
mostly in the trans Golgi cisternae. Man, 
mannose; GlcNAc, N-acetylglucosamine; 
Gal, galactose; NANA, N-acetylneuraminic 
acid (sialic acid). 
Figure 13-31 Oligosaccharide processing in the ER and the Golgi apparatus. The processing pathway is highly ordered, 
so that each step shown depends on the previous one. Step 1: Processing begins in the ER with the removal of the glucoses 
from the oligosaccharide initially transferred to the protein. Then a mannosidase in the ER membrane removes a specific 
mannose. The remaining steps occur in the Golgi stack. Step 2: Golgi mannosidase I removes three more mannoses. Step 
3: N-acetylglucosamine transferase I then adds an N-acetylglucosamine. Step 4: Mannosidase II then removes two additional 
mannoses. This yields the final core of three mannoses that is present in a complex oligosaccharide. At this stage, the bond 
between the two N-acetylglucosamines in the core becomes resistant to attack by a highly specific endoglycosidase (Endo 
H). Since all later structures in the pathway are also Endo H-resistant, treatment with this enzyme is widely used to distinguish 
complex from high-mannose oligosaccharides. Step 5: Finally, as shown in Figure 13-30, additional N-acetylglucosamines, 
galactoses, and sialic acids are added. These final steps in the synthesis of a complex oligosaccharide occur in the cisternal 
compartments of the Golgi apparatus: three types of glycosyl transferase enzymes act sequentially, using sugar substrates 
that have been activated by linkage to the indicated nucleotide; the membranes of the Golgi cisternae contain specific carrier 
proteins that allow each sugar nucleotide to enter in exchange for the nucleoside phosphates that are released after the sugar is 
attached to the protein on the lumenal face. 

Note that, as a biosynthetic organelle, the Golgi apparatus differs from the ER: all sugars in the Golgi are assembled inside the 
lumen from sugar nucleotides, whereas in the ER, the N-linked precursor oligosaccharide is assembled partly in the cytosol and 
partly in the lumen, and most lumenal reactions use dolichol-linked sugars as their substrates (see Figure 12-51). 


complex oligosaccharides are of special importance because they bear a negative 
charge. Whether a given oligosaccharide remains high-mannose or is processed 
depends largely on its position in the protein. If the oligosaccharide is accessible 
to the processing enzymes in the Golgi apparatus, it is likely to be converted to a 
complex form; if it is inaccessible because its sugars are tightly held to the pro- 
tein’s surface, it is likely to remain in a high-mannose form. The processing that 
generates complex oligosaccharide chains follows the highly ordered pathway 
shown in Figure 13-31. 
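
The ordered trimming steps of Figure 13-31 can be followed as simple bookkeeping on sugar counts. The sketch below is an illustration, not a model of the enzymology: the class and function names are hypothetical, and the starting composition (three glucoses, nine mannoses, two N-acetylglucosamines) is an assumption taken from the standard N-linked precursor rather than from this section. Only steps 1 to 4 are tracked; the variable terminal additions of step 5 are left out.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Glycan:
    glc: int                        # glucose residues
    man: int                        # mannose residues
    glcnac: int                     # N-acetylglucosamine residues
    endo_h_resistant: bool = False  # becomes True after mannosidase II acts


# Assumed composition of the precursor transferred in the ER (see lead-in).
PRECURSOR = Glycan(glc=3, man=9, glcnac=2)

# Each step depends on the previous one, so the order of this list matters.
STEPS = [
    ("Step 1 (ER): remove the glucoses, then one mannose",
     lambda g: replace(g, glc=0, man=g.man - 1)),
    ("Step 2 (Golgi mannosidase I): remove three mannoses",
     lambda g: replace(g, man=g.man - 3)),
    ("Step 3 (GlcNAc transferase I): add one GlcNAc",
     lambda g: replace(g, glcnac=g.glcnac + 1)),
    ("Step 4 (mannosidase II): remove two mannoses",
     lambda g: replace(g, man=g.man - 2, endo_h_resistant=True)),
]


def process(glycan: Glycan, n_steps: int = len(STEPS)) -> Glycan:
    """Apply the first `n_steps` processing steps in their fixed order."""
    for _label, step in STEPS[:n_steps]:
        glycan = step(glycan)
    return glycan


if __name__ == "__main__":
    glycan = PRECURSOR
    for label, step in STEPS:
        glycan = step(glycan)
        print(f"{label}: {glycan}")
```

Starting from the assumed precursor, the mannose count falls from nine to three, which agrees with the three-mannose core described in the legend, and the Endo H-resistant flag is set only after step 4.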

Beyond these commonalities in oligosaccharide processing that are shared 
among most cells, the products of the carbohydrate modifications carried out in 
the Golgi apparatus are highly complex and have given rise to a new field of study 
called glycobiology. The human genome, for example, encodes hundreds of dif- 
ferent Golgi glycosyl transferases and many glycosidases, which are expressed dif- 
ferently from one cell type to another, resulting in a variety of glycosylated forms 
of a given protein or lipid in different cell types and at varying stages of differenti- 
ation, depending on the spectrum of enzymes expressed by the cell. The complex- 
ity of modifications is not limited to N-linked oligosaccharides but also occurs on 
O-linked sugars, as we discuss next. 


Proteoglycans Are Assembled in the Golgi Apparatus 


In addition to the N-linked oligosaccharide alterations made to proteins as they 
pass through the Golgi cisternae en route from the ER to their final destinations, 
many proteins are also modified in the Golgi apparatus in other ways. Some pro- 
teins have sugars added to the hydroxyl groups of selected serines or threonines, 
or, in some cases—such as collagens—to hydroxylated proline and lysine side 
chains. This O-linked glycosylation (Figure 13-32), like the extension of N-linked 
oligosaccharide chains, is catalyzed by a series of glycosyl transferase enzymes 
that use the sugar nucleotides in the lumen of the Golgi apparatus to add sugars 
to a protein one at a time. Usually, N-acetylgalactosamine is added first, followed 
by a variable number of additional sugars, ranging from just a few to 10 or more. 

The Golgi apparatus confers the heaviest O-linked glycosylation of all on 
mucins, the glycoproteins in mucus secretions, and on proteoglycan core pro- 
teins, which it modifies to produce proteoglycans. As discussed in Chapter 19, 
this process involves the polymerization of one or more glycosaminoglycan chains 
(long, unbranched polymers composed of repeating disaccharide units; see Fig- 
ure 19-35) onto serines on a core protein. Many proteoglycans are secreted and 
become components of the extracellular matrix, while others remain anchored 
to the extracellular face of the plasma membrane. Still others form a major com- 
ponent of slimy materials, such as the mucus that is secreted to form a protective 
coating on the surface of many epithelia. 

The sugars incorporated into glycosaminoglycans are heavily sulfated in the 
Golgi apparatus immediately after these polymers are made, thus adding a sig- 
nificant portion of their characteristically large negative charge. Some tyrosines 
in proteins also become sulfated shortly before they exit from the Golgi appara- 
tus. In both cases, the sulfation depends on the sulfate donor 3'-phosphoadenos- 
ine-5’-phosphosulfate (PAPS) (Figure 13-33), which is transported from the cyto- 
sol into the lumen of the trans Golgi network. 


What Is the Purpose of Glycosylation? 


There is an important difference between the construction of an oligosaccha- 
ride and the synthesis of other macromolecules such as DNA, RNA, and protein. 
Whereas nucleic acids and proteins are copied from a template in a repeated 
series of identical steps using the same enzyme or set of enzymes, complex carbo- 
hydrates require a different enzyme at each step, each product being recognized 
as the exclusive substrate for the next enzyme in the series. The vast abundance 
of glycoproteins and the complicated pathways that have evolved to synthesize 
them emphasize that the oligosaccharides on glycoproteins and glycosphingolip- 
ids have very important functions. 

N-linked glycosylation, for example, is prevalent in all eukaryotes, including 
yeasts. N-linked oligosaccharides also occur in a very similar form in archaeal 
cell wall proteins, suggesting that the whole machinery required for their synthe- 
sis is evolutionarily ancient. N-linked glycosylation promotes protein folding in 
two ways. First, it has a direct role in making folding intermediates more soluble, 
thereby preventing their aggregation. Second, the sequential modifications of the 
N-linked oligosaccharide establish a “glyco-code” that marks the progression of 
protein folding and mediates the binding of the protein to chaperones (discussed 
in Chapter 12) and lectins (for example, in guiding ER-to-Golgi transport). As we 
discuss later, lectins also participate in protein sorting in the trans Golgi network. 


Figure 13-32 N- and O-linked glycosylation. In each case, only the single 
sugar group that is directly attached to the protein chain is shown. 


Figure 13-33 The structure of 3'-phosphoadenosine-5'-phosphosulfate (PAPS). 

Because chains of sugars have limited flexibility, even a small N-linked oli- 
gosaccharide protruding from the surface of a glycoprotein (Figure 13-34) can 
limit the approach of other macromolecules to the protein surface. In this way, 
for example, the presence of oligosaccharides tends to make a glycoprotein more 
resistant to digestion by proteolytic enzymes. It may be that the oligosaccharides 
on cell-surface proteins originally provided an ancestral cell with a protective 
coat; compared to the rigid bacterial cell wall, such a sugar coat has the advantage 
that it leaves the cell with the freedom to change shape and move. 

The sugar chains have since become modified to serve other purposes as 
well. The mucus coat of lung and intestinal cells, for example, protects against 
many pathogens. The recognition of sugar chains by lectins in the extracellular 
space is important in many developmental processes and in cell-cell recognition: 
selectins, for example, are transmembrane lectins that function in cell-cell 
adhesion during blood cell migration, as discussed in Chapter 19. The presence 
of oligosaccharides may modify a protein’s antigenic and functional properties, 
making glycosylation an important factor in the production of proteins for 
pharmaceutical purposes. 

Glycosylation can also have important regulatory roles. Signaling through 
the cell-surface signaling receptor Notch, for example, is an important factor in 
determining the cell’s fate in development (discussed in Chapter 21). Notch is a 
transmembrane protein that is O-glycosylated by addition of a single fucose to 
some serines, threonines, and hydroxylysines. Some cell types express an addi- 
tional glycosyl transferase that adds an N-acetylglucosamine to each of these 
fucoses in the Golgi apparatus. This addition changes the specificity of Notch for 
the cell-surface signal proteins that activate it. 


Transport Through the Golgi Apparatus May Occur by Cisternal 
Maturation 


It is still uncertain how the Golgi apparatus achieves and maintains its polarized 
structure and how molecules move from one cisterna to another, and it is likely 
that more than one mechanism is involved in each case. One hypothesis, called 
the cisternal maturation model, views the Golgi cisternae as dynamic structures 
that mature from early to late by acquiring and then losing specific Golgi-resident 
proteins. According to this view, new cis cisternae continually form as vesicular 
tubular clusters arrive from the ER and progressively mature to become a medial 
cisterna and then a trans cisterna (Figure 13-35A). A cisterna therefore moves 
through the Golgi stack with cargo in its lumen. Retrograde transport of the Golgi 
enzymes by budding COPI-coated vesicles explains their characteristic distribu- 
tion. As we discuss later, when a cisterna finally moves forward to become part of 
the trans Golgi network, various types of coated vesicles bud off it until this net- 
work disappears, to be replaced by a maturing cisterna just behind. At the same 
time, other transport vesicles are continually retrieving membrane from post- 
Golgi compartments and returning it to the trans Golgi network. 

The cisternal maturation model is supported by studies using Golgi enzymes 
from different cisternae that were fluorescently labeled with different colors. Such 
studies performed in yeast cells where Golgi cisternae are not stacked reveal that 
individual Golgi cisternae change their color, thereby demonstrating that they 
change their complement of resident enzymes as they mature, even though they 
are not stacked. In further support of the model, electron microscopic observa- 
tions found that large structures such as procollagen rods in fibroblasts and scales 
in certain algae move progressively through the Golgi stack. 

Figure 13-34 The three-dimensional structure of a small N-linked 
oligosaccharide. The structure was determined by x-ray crystallographic 
analysis of a glycoprotein. This oligosaccharide contains only 6 sugars, 
whereas there are 14 sugars in the N-linked oligosaccharide that is initially 
transferred to proteins in the ER (see Figure 12-47). (A) A backbone model 
showing all atoms except hydrogens; only the asparagine of the protein is 
shown. (B) A space-filling model, with the asparagine and sugars indicated 
using the same color scheme as in (A). (B, courtesy of Richard Feldmann.) 


Figure 13-35 Two possible models explaining the organization of the Golgi apparatus and how proteins move through it. 
It is likely that the transport through the Golgi apparatus in the forward direction (red arrows) involves elements of both models. 
(A) According to the cisternal maturation model, each Golgi cisterna matures as it migrates outward through the stack. At each 
stage, the Golgi resident proteins that are carried forward in a maturing cisterna are moved backward to an earlier compartment 
in COPI-coated vesicles. When a newly formed cisterna moves to a medial position, for example, “leftover” cis Golgi enzymes 
would be extracted and transported retrogradely to a new cis cisterna behind. Likewise, the medial enzymes would be received 
by retrograde transport from the cisternae just ahead. In this way, a cis cisterna would mature to a medial and then trans cisterna 
as it moves outward. (B) In the vesicle transport model, Golgi cisternae are static compartments, which contain a characteristic 
complement of resident enzymes. The passing of molecules from cis to trans through the Golgi is accomplished by forward-moving 
transport vesicles, which bud from one cisterna and fuse with the next in a cis-to-trans direction. 


An alternative view holds that Golgi cisternae are long-lived structures that 
retain their characteristic set of Golgi-resident proteins firmly in place, and cargo 
proteins are transported from one cisterna to the next by transport vesicles 
(Figure 13-35B). According to this vesicle transport model, retrograde flow of vesicles 
retrieves escaped ER and Golgi proteins and returns them to upstream compartments. 
Directional flow could be achieved because forward-moving cargo molecules 
are selectively packaged into forward-moving vesicles. Although both 
forward- and backward-moving vesicles would likely be COPI-coated, the coats 
may contain different adaptor proteins that confer selectivity on the packaging of 
cargo molecules. Alternatively, transport vesicles shuttling between Golgi cisternae 
might not be directional at all, transporting cargo randomly back and forth; 
directional flow would then occur because of the continual input to the cis 
cisterna and output from the trans cisterna. 
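
The last possibility, that undirected shuttling can still give net forward flow, can be illustrated with a toy calculation. The sketch below is only an illustration under stated assumptions, not a model taken from the experimental literature: the function name, the three-cisterna stack, the hop fraction, and the inflow rate are all hypothetical. Cargo hops forward and backward with equal probability, enters only at the cis end, and leaves only from the trans end.

```python
def simulate_random_shuttling(n_cisternae=3, hop_fraction=0.25, inflow=1.0, steps=2000):
    """Toy model (all parameters are illustrative assumptions).

    Each step, a fraction `hop_fraction` of the cargo in every cisterna hops
    forward and the same fraction hops backward, so vesicles have no bias.
    New cargo enters only at the cis end (index 0) and leaves only when it
    hops forward out of the trans end (last index).
    Returns (cargo per cisterna, cargo leaving in the final step).
    """
    cargo = [0.0] * n_cisternae
    outflow = 0.0
    for _ in range(steps):
        updated = cargo[:]
        outflow = 0.0
        for i, amount in enumerate(cargo):
            moved = hop_fraction * amount
            # Forward hop: to the next cisterna, or out of the stack at the trans end.
            updated[i] -= moved
            if i + 1 < n_cisternae:
                updated[i + 1] += moved
            else:
                outflow += moved
            # Backward hop: to the previous cisterna; cis cargo cannot hop further back.
            if i > 0:
                updated[i] -= moved
                updated[i - 1] += moved
        updated[0] += inflow  # continual input arriving at the cis cisterna
        cargo = updated
    return cargo, outflow


levels, leaving = simulate_random_shuttling()
print([round(x, 3) for x in levels], round(leaving, 3))
```

In this sketch the cargo settles into a gradient that is highest in the cis cisterna and lowest in the trans cisterna, and the amount leaving the trans end per step comes to match the amount entering at the cis end, even though no individual hop is directional.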

The vesicle transport model is supported by experiments that show that cargo 
molecules are present in small COPI-coated vesicles and that these vesicles can 
deliver them to Golgi cisternae over large distances. In addition, when exper- 
imentally aggregated membrane proteins are introduced into Golgi cisternae, 
they can be observed staying in place, while soluble cargo, even if present as large 
aggregates, traverses the Golgi at normal rates. 

It is likely that aspects of both models are true. A stable core of long-lasting 
cisternae might exist in the center of each Golgi cisterna, while regions at the rim 
may undergo continuous maturation, perhaps utilizing Rab cascades that change 
their identity. As matured pieces of the cisternae are formed, they might break 
off and fuse with downstream cisternae by homotypic fusion mechanisms, taking 
large cargo molecules with them. In addition, small COPI-coated vesicles might 
transport small cargo in the forward direction and retrieve escaped Golgi enzymes 
and return them to their appropriate upstream cisternae. 


Golgi Matrix Proteins Help Organize the Stack 


The unique architecture of the Golgi apparatus depends on both the microtubule 
cytoskeleton, as already mentioned, and cytoplasmic Golgi matrix proteins, which 
form a scaffold between adjacent cisternae and give the Golgi stack its structural 
integrity. Some of the matrix proteins, called golgins, form long tethers composed 
of stiff coiled-coil domains with interspersed hinge regions. Golgins form a forest 
of tentacles that can extend 100-400 nm from the surface of the Golgi stack. They 
are thought to help retain Golgi transport vesicles close to the organelle through 
interactions with Rab proteins (Figure 13-36). When the cell prepares to divide, 
mitotic protein kinases phosphorylate the Golgi matrix proteins, causing the Golgi 
apparatus to fragment and disperse throughout the cytosol. The Golgi fragments 
are then distributed evenly to the two daughter cells, where the matrix proteins 
are dephosphorylated, leading to the reassembly of the Golgi stack. Similarly, 
during apoptosis, proteolytic cleavage of golgins by caspases (discussed in 
Chapter 18) fragments the Golgi apparatus as the cell self-destructs. 


Figure 13-36 A model of golgin function. Filamentous golgins anchored to Golgi 
membranes capture transport vesicles by binding to Rab proteins on the vesicle 
surface. 


Summary 


Correctly folded and assembled proteins in the ER are packaged into COPII-coated 
transport vesicles that pinch off from the ER membrane. Shortly thereafter, the vesi- 
cles shed their coat and fuse with one another to form vesicular tubular clusters. In 
animal cells, the clusters then move on microtubule tracks to the Golgi apparatus, 
where they fuse with one another to form the cis Golgi network. Any resident ER 
proteins that escape from the ER are returned there from the vesicular tubular clus- 
ters and Golgi apparatus by retrograde transport in COPI-coated vesicles. 

The Golgi apparatus, unlike the ER, contains many sugar nucleotides, which 
glycosyl transferase enzymes use to glycosylate lipid and protein molecules as they 
pass through the Golgi apparatus. The mannoses on the N-linked oligosaccharides 
that are added to proteins in the ER are often initially removed, and further sugars 
are added. Moreover, the Golgi apparatus is the site where O-linked glycosylation 
occurs and where glycosaminoglycan chains are added to core proteins to form 
proteoglycans. Sulfation of the sugars in proteoglycans and of selected tyrosines on 
proteins also occurs in a late Golgi compartment. 

The Golgi apparatus modifies the many proteins and lipids that it receives from 
the ER and then distributes them to the plasma membrane, lysosomes, and secre- 
tory vesicles. The Golgi apparatus is a polarized organelle, consisting of one or more 
stacks of disc-shaped cisternae. Each stack is organized as a series of at least three 
functionally distinct compartments, termed cis, medial, and trans cisternae. The 
cis and trans cisternae are each connected to special sorting stations, called the cis 
Golgi network and the trans Golgi network, respectively. Proteins and lipids move 
through the Golgi stack in the cis-to-trans direction. This movement may occur by 
vesicle transport, by progressive maturation of the cis cisternae as they migrate 
continuously through the stack, or, most likely, by a combination of these two 
mechanisms. Continual retrograde vesicle transport from downstream to more upstream 
cisternae is thought to keep the enzymes concentrated in the cisternae where they 
are needed. The finished new proteins end up in the trans Golgi network, which 
packages them in transport vesicles and dispatches them to their specific destina- 
tions in the cell. 


TRANSPORT FROM THE TRANS GOLGI NETWORK 
TO LYSOSOMES 


The trans Golgi network sorts all of the proteins that pass through the Golgi appa- 
ratus (except those that are retained there as permanent residents) according to 
their final destination. The sorting mechanism is especially well understood for 
those proteins destined for the lumen of lysosomes, and in this section we con- 
sider this selective transport process. We begin with a brief account of lysosome 
structure and function. 


Lysosomes Are the Principal Sites of Intracellular Digestion 


Lysosomes are membrane-enclosed organelles filled with soluble hydrolytic 
enzymes that digest macromolecules. Lysosomes contain about 40 types of 
hydrolytic enzymes, including proteases, nucleases, glycosidases, lipases, phos- 
pholipases, phosphatases, and sulfatases. All are acid hydrolases; that is, hydro- 
lases that work best at acidic pH. For optimal activity, they need to be activated 
by proteolytic cleavage, which also requires an acid environment. The lysosome 
provides this acidity, maintaining an interior pH of about 4.5-5.0. By this 
arrangement, the contents of the cytosol are doubly protected against attack by the cell’s 
own digestive system: the membrane of the lysosome keeps the digestive enzymes 
out of the cytosol, but, even if they leak out, they can do little damage at the cyto- 
solic pH of about 7.2. 
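
To put these pH values in perspective, the following minimal sketch converts them into relative H⁺ concentrations using the standard definition pH = -log10[H⁺]. The function names are illustrative; the pH values are the ones quoted above.

```python
def proton_concentration(ph: float) -> float:
    """Return [H+] in mol/L for a given pH, from pH = -log10([H+])."""
    return 10.0 ** (-ph)


def fold_difference(ph_acidic: float, ph_reference: float) -> float:
    """How many times higher [H+] is at ph_acidic than at ph_reference."""
    return proton_concentration(ph_acidic) / proton_concentration(ph_reference)


CYTOSOL_PH = 7.2  # cytosolic pH quoted in the text
for lysosome_ph in (4.5, 5.0):  # range of lysosomal pH quoted in the text
    ratio = fold_difference(lysosome_ph, CYTOSOL_PH)
    print(f"lumen pH {lysosome_ph}: [H+] is about {ratio:.0f} times the cytosolic value")
```

On these numbers, the lysosome lumen holds H⁺ at roughly 160 to 500 times the cytosolic concentration, which gives a sense of how large a change in conditions an escaped acid hydrolase would experience.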

Like all other membrane-enclosed organelles, the lysosome not only contains 
a unique collection of enzymes, but also has a unique surrounding membrane. 
Most of the lysosome membrane proteins, for example, are highly glycosylated, 
which helps to protect them from the lysosome proteases in the lumen. Transport 
proteins in the lysosome membrane carry the final products of the digestion of 
macromolecules—such as amino acids, sugars, and nucleotides—to the cytosol, 
where the cell can either reuse or excrete them. 

A vacuolar H⁺ ATPase in the lysosome membrane uses the energy of ATP 
hydrolysis to pump H⁺ into the lysosome, thereby maintaining the lumen at its 
acidic pH (Figure 13-37). The lysosome H⁺ pump belongs to the family of V-type 
ATPases and has a similar architecture to the mitochondrial and chloroplast ATP 
synthases (F-type ATPases), which convert the energy stored in H⁺ gradients into 
ATP (see Figure 11-12). By contrast to these enzymes, however, the vacuolar H⁺ 
ATPase exclusively works in reverse, pumping H⁺ into the organelle. Similar or 
identical V-type ATPases acidify all endocytic and exocytic organelles, including 
lysosomes, endosomes, some compartments of the Golgi apparatus, and many 
transport and secretory vesicles. In addition to providing a low-pH environment 
that is suitable for reactions occurring in the organelle lumen, the H⁺ gradient 
provides a source of energy that drives the transport of small metabolites across 
the organelle membrane. 


Lysosomes Are Heterogeneous 


Lysosomes are found in all eukaryotic cells. They were initially discovered by the 
biochemical fractionation of cell extracts; only later were they seen clearly in the 
electron microscope. Although lysosomes are extraordinarily diverse in shape and size, 
staining them with specific antibodies shows that they are members of a single family of 
organelles. They can also be identified by histochemical techniques that reveal which 
organelles contain acid hydrolase (Figure 13-38). 

The heterogeneous morphology of lysosomes contrasts with the relatively uni- 
form structures of many other cell organelles. The diversity reflects the wide variety 
of digestive functions that acid hydrolases mediate, including the breakdown of 
intra- and extracellular debris, the destruction of phagocytosed microorganisms, 
and the production of nutrients for the cell. Their morphological diversity, how- 
ever, also reflects the way lysosomes form. Late endosomes containing material 
received from both the plasma membrane by endocytosis and newly synthesized 
lysosomal hydrolases fuse with preexisting lysosomes to form structures that are 
sometimes referred to as endolysosomes, which then fuse with one another (Fig- 
ure 13-39). When the majority of the endocytosed material within an endolyso- 
some has been digested so that only resistant or slowly digestible residues remain, 
these organelles become “classical” lysosomes. These are relatively dense, round, 
and small, but they can enter the cycle again by fusing with late endosomes or 
endolysosomes. Thus, there is no real distinction between endolysosomes and 
lysosomes: they are the same except that they are in different stages of a maturation 
cycle. For this reason, lysosomes are sometimes viewed as a heterogeneous 
collection of distinct organelles, the common feature of which is a high content of 
hydrolytic enzymes. It is especially hard to apply a narrower definition than this in 
plant cells, as we discuss next. 


Figure 13-38 Histochemical visualization of lysosomes. These electron 
micrographs show two sections of a cell stained to reveal the location of 
acid phosphatase, a marker enzyme for lysosomes. The larger membrane-enclosed 
organelles, containing dense precipitates of lead phosphate, are 
lysosomes. Their diverse morphology reflects variations in the amount and 
nature of the material they are digesting. The precipitates are produced when 
tissue fixed with glutaraldehyde (to fix the enzyme in place) is incubated with 
a phosphatase substrate in the presence of lead ions. Red arrows in the top 
panel indicate two small vesicles thought to be carrying acid hydrolases from 
the Golgi apparatus. (Courtesy of Daniel S. Friend.) 


Plant and Fungal Vacuoles Are Remarkably Versatile Lysosomes 


Most plant and fungal cells (including yeasts) contain one or several very large, 
fluid-filled vesicles called vacuoles. They typically occupy more than 30% of the 
cell volume, and as much as 90% in some cell types (Figure 13-40). Vacuoles are 
related to animal cell lysosomes and contain a variety of hydrolytic enzymes, but 
their functions are remarkably diverse. The plant vacuole can act as a storage 
organelle for both nutrients and waste products, as a degradative compartment, 
as an economical way of increasing cell size, and as a controller of turgor pressure 
(the osmotic pressure that pushes outward on the cell wall and keeps the plant 
from wilting) (Figure 13-41). The same cell may have different vacuoles with dis- 
tinct functions, such as digestion and storage. 
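
The idea that a vacuole is an economical way of increasing cell size can be made concrete with a little arithmetic. The sketch below is a simplification that assumes the cell consists only of a vacuole plus a fixed volume of everything else; the function name is illustrative, and the 30% and 90% figures are the ones quoted above.

```python
def cell_volume_fold(vacuole_fraction: float) -> float:
    """Total cell volume relative to its non-vacuolar volume.

    Simplifying assumption: total volume = vacuole + non-vacuolar contents,
    so total / non-vacuolar = 1 / (1 - vacuole_fraction).
    """
    if not 0.0 <= vacuole_fraction < 1.0:
        raise ValueError("vacuole_fraction must be in the range [0, 1)")
    return 1.0 / (1.0 - vacuole_fraction)


for fraction in (0.30, 0.90):  # vacuole share of cell volume quoted in the text
    fold = cell_volume_fold(fraction)
    print(f"vacuole at {fraction:.0%} of the cell: total volume is about {fold:.1f} times the rest")
```

Under this assumption, a cell whose vacuole fills 90% of its volume is about ten times larger than its non-vacuolar contents alone, without any need to synthesize additional cytosol.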

The vacuole is important as a homeostatic device, enabling plant cells to withstand 
wide variations in their environment. When the pH in the environment 
drops, for example, the flux of H⁺ into the cytosol is balanced, at least in part, by 
an increased transport of H⁺ into the vacuole, which tends to keep the pH in the 
cytosol constant. Similarly, many plant cells maintain an almost constant turgor 
pressure despite large changes in the tonicity of the fluid in their immediate 
environment. They do so by changing the osmotic pressure of the cytosol and 
vacuole, in part by the controlled breakdown and resynthesis of polymers such as 
polyphosphate in the vacuole, and in part by altering the transport rates of sugars, 
amino acids, and other metabolites across the plasma membrane and the vacuolar 
membrane. The turgor pressure regulates the activities of distinct transporters 
in each membrane to control these fluxes. 


Figure 13-39 A model for lysosome maturation. Late endosomes fuse with 
preexisting lysosomes (bottom arrow) or preexisting endolysosomes (top arrow). 
Endolysosomes eventually mature into lysosomes as hydrolases complete the 
digestion of their contents, which can include intralumenal vesicles. Lysosomes 
also fuse with phagosomes, as we discuss later. 


Figure 13-40 The plant cell vacuole. (A) A confocal image of cells from an 
Arabidopsis embryo that is expressing an aquaporin-YFP (yellow fluorescent protein) 
fusion protein in its tonoplast, or vacuole membrane (green); the cell walls have been 
false-colored orange. Each cell contains several large vacuoles. (B) This electron 
micrograph of cells in a young tobacco leaf shows the cytosol as a thin layer, 
containing chloroplasts, pressed against the cell wall by the enormous vacuole. 
(A, courtesy of C. Carroll and L. Frigerio, based on S. Gattolin et al., Mol. Plant 
4:180-189, 2011. With permission from Oxford University Press; B, courtesy of 
J. Burgess.) 

Humans often harvest substances stored in plant vacuoles—from rubber to 
opium to the flavoring of garlic. Many stored products have a metabolic function. 
Proteins, for example, can be preserved for years in the vacuoles of the storage 
cells of many seeds, such as those of peas and beans. When the seeds germinate, 
these proteins are hydrolyzed, and the resulting amino acids provide a food sup- 
ply for the developing embryo. Anthocyanin pigments stored in vacuoles color 
the petals of many flowers so as to attract pollinating insects, while noxious mole- 
cules released from vacuoles when a plant is eaten or damaged provide a defense 
against predators. 


Multiple Pathways Deliver Materials to Lysosomes 


Lysosomes are meeting places where several streams of intracellular traffic con- 
verge. A route that leads outward from the ER via the Golgi apparatus delivers 
most of the lysosome’s digestive enzymes, while at least four paths from different 
sources feed substances into lysosomes for digestion. 

The best studied of these degradation paths is the one followed by macromol- 
ecules taken up from extracellular fluid by endocytosis. A similar pathway found 
in phagocytic cells, such as macrophages and neutrophils in vertebrates, is ded- 
icated to the engulfment, or phagocytosis, of large particles and microorganisms 
to form phagosomes. A third pathway called macropinocytosis specializes in the 
nonspecific uptake of fluids, membrane, and particles attached to the plasma 
membrane. We will return to discuss these pathways later in the chapter. A fourth 
pathway called autophagy originates in the cytoplasm of the cell itself and is used 
to digest cytosol and worn-out organelles, as we discuss next. The four paths to 
degradation in lysosomes are illustrated in Figure 13-42. 


Figure 13-41 The role of the vacuole in controlling the size of plant cells. A 
plant cell can achieve a large increase in volume without increasing the volume of 
the cytosol. Localized weakening of the cell wall orients a turgor-driven cell 
enlargement that accompanies the uptake of water into an expanding vacuole. The 
cytosol is eventually confined to a thin peripheral layer, which is connected to the 
nuclear region by strands of cytosol stabilized by bundles of actin filaments (not shown). 


Figure 13-42 Four pathways to degradation in lysosomes. Materials in 
each pathway are derived from a different source. Note that the autophagosome has 
a double membrane. In all cases, the final step is the fusion with lysosomes. 


Autophagy Degrades Unwanted Proteins and Organelles 


All cell types dispose of obsolete parts by a lysosome-dependent process called 
autophagy, or “self-eating.” The degradation process is important during normal 
cell growth and in development, where it helps restructure differentiating cells, 
but also in adaptive responses to stresses such as starvation and infection. Auto- 
phagy can remove large objects—macromolecules, large protein aggregates, and 
even whole organelles—that other disposal mechanisms such as proteasomal 
degradation cannot handle. Defects in autophagy may prevent cells from clearing 
away invading microbes, unwanted protein aggregates and abnormal proteins, 
and thereby contribute to diseases ranging from infectious disorders to neurode- 
generation and cancer. 

In the initial stages of autophagy, cytoplasmic cargo becomes surrounded by 
a double membrane that assembles by the fusion of small vesicles of unknown 
origin, forming an autophagosome (Figure 13-43). A few tens of different pro- 
teins have been identified in yeast and animal cells that participate in the process, 
which must be tightly regulated: either too little or too much can be deleterious. 
The whole process occurs in the following sequence of steps: 


1. Induction by activation of signaling molecules: Protein kinases (including 
the mTOR complex 1, discussed in Chapter 15) that relay information 
about the metabolic status of the cell, become activated and signal to the 
autophagic machinery. 


2. Nucleation and extension of a delimiting membrane into a crescent-shaped 
cup: Membrane vesicles, characterized by the presence of ATG9, the only 
transmembrane protein involved in the process, are recruited to an assem- 
bly site, where they nucleate autophagosome formation. ATG9 is not incor- 
porated into the autophagosome: a retrieval pathway must remove it from 
the assembling structure. 


3. Closure of the membrane cup around the target to form a sealed dou- 
ble-membrane-enclosed autophagosome. 


4. Fusion of the autophagosome with lysosomes, catalyzed by SNAREs. 


5. Digestion of the inner membrane and the lumenal contents of the auto- 
phagosome. 


Autophagy can be either nonselective or selective. In nonselective autophagy, 
a bulk portion of cytoplasm is sequestered in autophagosomes. It might occur, for 
example, in starvation conditions: when external nutrients are limiting, metabo- 
lites derived from the digestion of the captured cytosol might help the cell survive. 
In selective autophagy, specific cargo is packaged into autophagosomes that tend 
to contain little cytosol, and their shape reflects the shape of the cargo. Selective 
autophagy mediates the degradation of worn out, or otherwise unwanted, mito- 
chondria, peroxisomes, ribosomes, and ER; it can also be used to destroy invading 
microbes. 


Figure 13-43 A model of autophagy. (A) Activation of a signaling pathway 
initiates a nucleation event in the cytoplasm. A crescent of autophagosomal 
membrane grows by fusion of vesicles of unknown origin and eventually fuses 
to form a double-membrane-enclosed autophagosome, which sequesters 
a portion of the cytoplasm. The autophagosome then fuses with lysosomes 
containing acid hydrolases that digest its content. During the formation of the 
autophagosome membrane, a ubiquitin-like protein becomes activated by covalent 
attachment of a phosphatidylethanolamine lipid anchor. These proteins then mediate 
vesicle tethering and fusion, leading to the formation of a crescent-shaped membrane 
structure that assembles around its target (not shown). (B) An electron micrograph 
of an autophagosome containing a mitochondrion and a peroxisome. 
(B, courtesy of Daniel S. Friend, from D.W. Fawcett, A Textbook of Histology, 
12th ed. New York: Chapman and Hall, 1994. With permission from Kluwer.) 


The selective autophagy of worn-out or damaged mitochondria is called mitophagy. 
As discussed in Chapters 12 and 14, when mitochondria function normally, 
the inner mitochondrial membrane is energized by an electrochemical H⁺ 
gradient that drives ATP synthesis and the import of mitochondrial precursor 
proteins and metabolites. Damaged mitochondria cannot maintain the gradient, so 
protein import is blocked. As a consequence, a protein kinase called Pink1, which 
is normally imported into mitochondria, is instead retained on the mitochondrial 
surface where it recruits the ubiquitin ligase Parkin from the cytosol. Parkin 
ubiquitylates mitochondrial outer membrane proteins, which mark the organelle for 
selective destruction in autophagosomes. Mutations in Pink1 or Parkin cause a 
form of early-onset Parkinson’s disease, a degenerative disorder of the central 
nervous system. It is not known why the neurons that die prematurely in this 
disease are particularly reliant on mitophagy. 


A Mannose 6-Phosphate Receptor Sorts Lysosomal Hydrolases in 
the Trans Golgi Network 


We now consider the pathway that delivers lysosomal hydrolases from the TGN to 
lysosomes. The enzymes are first delivered to endosomes in transport vesicles that 
bud from the TGN, before they move on to endolysosomes and lysosomes (see 
Figure 13-39). The vesicles that leave the TGN incorporate the lysosomal proteins 
and exclude the many other proteins being packaged into different transport ves- 
icles for delivery elsewhere. 

How are lysosomal hydrolases recognized and selected in the TGN with the 
required accuracy? In animal cells they carry a unique marker in the form of 
mannose 6-phosphate (M6P) groups, which are added exclusively to the N-linked 
oligosaccharides of these soluble lysosomal enzymes as they pass through the 
lumen of the cis Golgi network (Figure 13-44). Transmembrane M6P receptor 
proteins, which are present in the TGN, recognize the M6P groups and bind to the 
lysosomal hydrolases on the lumenal side of the membrane and to adaptor pro- 
teins in assembling clathrin coats on the cytosolic side. In this way, the receptors 
help package the hydrolases into clathrin-coated vesicles that bud from the TGN 
and deliver their contents to early endosomes. 

The M6P receptor protein binds to M6P at pH 6.5-6.7 in the TGN lumen and 
releases it at pH 6, which is the pH in the lumen of endosomes. Thus, after the 
receptor is delivered, the lysosomal hydrolases dissociate from the M6P recep- 
tors, which are retrieved into transport vesicles that bud from endosomes. These 
vesicles are coated with retromer, a coat protein complex specialized for endo- 
some-to-TGN transport, which returns the receptors to the TGN for reuse (Figure 
13-45). 
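
The pH-dependent cycle just described can be summarized as a toy model. The sketch below is only an illustration: the single cutoff at pH 6.3 is an assumption, chosen to lie between the binding range (pH 6.5-6.7) and the release point (pH 6) given in the text, and the function and variable names are hypothetical. Real binding is graded rather than all-or-none.

```python
# Toy model of the M6P receptor cycle between the TGN and endosomes.
# ASSUMPTION: binding is treated as an all-or-none switch at pH 6.3, a value
# picked to lie between the TGN range (6.5-6.7) and the endosome value (6.0).
BINDING_CUTOFF_PH = 6.3

# Lumenal pH values taken from the text (6.6 is the middle of 6.5-6.7).
COMPARTMENT_PH = {"TGN": 6.6, "endosome": 6.0}


def receptor_binds_m6p(ph: float) -> bool:
    """Return True if the receptor holds an M6P-tagged hydrolase at this pH."""
    return ph >= BINDING_CUTOFF_PH


def trace_cycle():
    """Follow one receptor through TGN -> endosome -> TGN."""
    steps = []
    for compartment in ("TGN", "endosome", "TGN"):
        bound = receptor_binds_m6p(COMPARTMENT_PH[compartment])
        state = "hydrolase bound" if bound else "hydrolase released"
        steps.append((compartment, state))
    return steps


if __name__ == "__main__":
    for compartment, state in trace_cycle():
        print(f"{compartment}: {state}")
```

In the final step the receptor has returned empty to the TGN, so "hydrolase bound" there means that it is again able to pick up a newly made hydrolase.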

Transport in either direction requires signals in the cytoplasmic tail of the M6P 
receptor that direct this protein to the endosome or back to the TGN. These signals 
are recognized by the retromer complex that recruits M6P receptors into transport 
vesicles that bud from endosomes. The recycling of the M6P receptor resembles 
the recycling of the KDEL receptor discussed earlier, although it differs in the type 
of coated vesicles that mediate the transport. 

Not all the hydrolase molecules that are tagged with M6P get to lysosomes. 
Some escape the normal packaging process in the trans Golgi network and are 
transported “by default” to the cell surface, where they are secreted into the extra- 
cellular fluid. Some M6P receptors, however, also take a detour to the plasma 
membrane, where they recapture the escaped lysosomal hydrolases and return 
them by receptor-mediated endocytosis (discussed later) to lysosomes via early and 
late endosomes. As lysosomal hydrolases require an acidic milieu to work, they 
can do little harm in the extracellular fluid, which usually has a neutral pH of 7.4. 

For the sorting system that segregates lysosomal hydrolases and dispatches 
them to endosomes to work, the M6P groups must be added only to the appropri- 
ate glycoproteins in the Golgi apparatus. This requires specific recognition of the 
hydrolases by the Golgi enzymes responsible for adding M6P. Since all glycopro-
teins leave the ER with identical N-linked oligosaccharide chains, the signal for
adding the M6P units to oligosaccharides must reside somewhere in the polypep-
tide chain of each hydrolase. Genetic engineering experiments have revealed that
the recognition signal is a cluster of neighboring amino acids on each protein’s
surface, known as a signal patch (Figure 13-46). Since most lysosomal hydrolases
contain multiple oligosaccharides, they acquire many M6P groups, providing a
high-affinity signal for the M6P receptor.

Figure 13-44 The structure of mannose 6-phosphate on a lysosomal hydrolase.


Defects in the GlcNAc Phosphotransferase Cause a Lysosomal
Storage Disease in Humans 


Genetic defects that affect one or more of the lysosomal hydrolases cause a num- 
ber of human lysosomal storage diseases. The defects result in an accumulation 
of undigested substrates in lysosomes, with severe pathological consequences, 
most often in the nervous system. In most cases, there is a mutation in a struc- 
tural gene that codes for an individual lysosomal hydrolase. This occurs in Hurler’s 
disease, for example, in which the enzyme required for the breakdown of certain 
types of glycosaminoglycan chains is defective or missing. The most severe form 
of lysosomal storage disease, however, is a very rare inherited metabolic disor- 
der called inclusion-cell disease (I-cell disease). In this condition, almost all of the 
hydrolytic enzymes are missing from the lysosomes of many cell types, and their 
undigested substrates accumulate in these lysosomes, which consequently form 
large inclusions in the cells. The consequent pathology is complex, affecting all 
organ systems, skeletal integrity, and mental development; individuals rarely live 
beyond six or seven years. 

I-cell disease is due to a single gene defect and, like most genetic enzyme defi- 
ciencies, it is recessive—that is, it occurs only in individuals having two copies 
of the defective gene. In patients with I-cell disease, all the hydrolases missing 
from lysosomes are found in the blood: because they fail to sort properly in the 
Golgi apparatus, they are secreted rather than transported to lysosomes. The 
mis-sorting has been traced to a defective or missing GlcNAc phosphotransfer- 
ase. Because lysosomal enzymes are not phosphorylated in the cis Golgi network, 
the M6P receptors do not segregate them into the appropriate transport vesicles 
in the TGN. Instead, the lysosomal hydrolases are carried to the cell surface and 
secreted. 

In I-cell disease, the lysosomes in some cell types, such as hepatocytes, con-
tain a normal complement of lysosomal enzymes, implying that there is another
pathway for directing hydrolases to lysosomes that is used by some cell types
but not others. Alternative sorting receptors function in these M6P-independent
pathways. Similarly, an M6P-independent pathway in all cells sorts the mem-
brane proteins of lysosomes from the TGN for transport to late endosomes, and
those proteins are therefore normal in I-cell disease.

Figure 13-45 The transport of newly synthesized lysosomal hydrolases to
endosomes. The sequential action of two enzymes in the cis and trans Golgi network
adds mannose 6-phosphate (M6P) groups to the precursors of lysosomal enzymes
(see Figure 13-46). The M6P-tagged hydrolases then segregate from all other
types of proteins in the TGN because adaptor proteins (not shown) in the clathrin
coat bind the M6P receptors, which, in turn, bind the M6P-modified lysosomal
hydrolases. The clathrin-coated vesicles bud off from the TGN, shed their coat,
and fuse with early endosomes. At the lower pH of the endosome, the hydrolases
dissociate from the M6P receptors, and the empty receptors are retrieved in
retromer-coated vesicles to the TGN for further rounds of transport. In the
endosomes, the phosphate is removed from the M6P attached to the hydrolases,
which may further ensure that the hydrolases do not return to the TGN with the
receptor.


Some Lysosomes and Multivesicular Bodies Undergo Exocytosis 


Targeting of material to lysosomes is not necessarily the end of the pathway. Lyso- 
somal secretion of undigested content enables all cells to eliminate indigestible 
debris. For most cells, this seems to be a minor pathway, used only when the cells 
are stressed. Some cell types, however, contain specialized lysosomes that have 
acquired the necessary machinery for fusion with the plasma membrane. Mela- 
nocytes in the skin, for example, produce and store pigments in their lysosomes. 
These pigment-containing melanosomes release their pigment into the extra- 
cellular space of the epidermis by exocytosis. The pigment is then taken up by 
keratinocytes, leading to normal skin pigmentation. In some genetic disorders, 
defects in melanosome exocytosis block this transfer process, leading to forms 
of hypopigmentation (albinism). Under certain conditions, multivesicular bodies 
can also fuse with the plasma membrane. If that occurs, their intralumenal vesi- 
cles are released from cells. Circulating small vesicles, also called exosomes, have 
been observed in the blood and may be used to transport components between 
cells, although the importance of such a mechanism of potential communication 
between distant cells is unknown. Some exosomes may derive from direct vesicle 
budding events at the plasma membrane, which is a topologically equivalent pro- 
cess (see Figure 13-57). 


Summary 


Lysosomes are specialized for the intracellular digestion of macromolecules. They 
contain unique membrane proteins and a wide variety of soluble hydrolytic enzymes 
that operate best at pH 5, which is the internal pH of lysosomes. An ATP-driven H+
pump in the lysosomal membrane maintains this low pH. Newly synthesized lyso-
somal proteins are transported from the lumen of the ER through the Golgi apparatus;
they are then carried from the trans Golgi network to endosomes by means of clath-
rin-coated transport vesicles, before moving on to lysosomes.

The lysosomal hydrolases contain N-linked oligosaccharides that are covalently 
modified in a unique way in the cis Golgi so that their mannoses are phosphory- 
lated. These mannose 6-phosphate (M6P) groups are recognized by an M6P recep- 
tor protein in the trans Golgi network that segregates the hydrolases and helps pack- 
age them into budding transport vesicles that deliver their contents to endosomes. 
The M6P receptors shuttle back and forth between the trans Golgi network and the
endosomes. The low pH in endosomes and the removal of the phosphate from the
M6P group cause the lysosomal hydrolases to dissociate from these receptors, mak-
ing the transport of the hydrolases unidirectional. A separate transport system uses
clathrin-coated vesicles to deliver resident lysosomal membrane proteins from the
trans Golgi network to endosomes.

Figure 13-46 The recognition of a lysosomal hydrolase. A GlcNAc
phosphotransferase recognizes lysosomal hydrolases in the Golgi apparatus. The
enzyme has separate catalytic and recognition sites. The catalytic site binds both
high-mannose N-linked oligosaccharides and UDP-GlcNAc. The recognition site
binds to a signal patch that is present only on the surface of lysosomal
hydrolases. A second enzyme cleaves off the GlcNAc, leaving the mannose
6-phosphate exposed.


TRANSPORT INTO THE CELL FROM THE PLASMA 
MEMBRANE: ENDOCYTOSIS 


The routes that lead inward from the cell surface start with the process of endocy- 
tosis, by which cells take up plasma membrane components, fluid, solutes, mac- 
romolecules, and particulate substances. Endocytosed cargo includes receptor- 
ligand complexes, a spectrum of nutrients and their carriers, extracellular matrix 
components, cell debris, bacteria, viruses, and, in specialized cases, even other 
cells. Through endocytosis, the cell regulates the composition of its plasma mem- 
brane in response to changing extracellular conditions. 

In endocytosis, the material to be ingested is progressively enclosed by a small 
portion of the plasma membrane, which first invaginates and then pinches off 
to form an endocytic vesicle containing the ingested substance or particle. Most 
eukaryotic cells constantly form endocytic vesicles, a process called pinocytosis 
(“cell drinking”); in addition, some specialized cells contain dedicated pathways 
that take up large particles on demand, a process called phagocytosis (“cell eat- 
ing”). Endocytic vesicles form at the plasma membrane by multiple mechanisms 
that differ in both the molecular machinery used and how that machinery is reg- 
ulated. 

Once generated at the plasma membrane, most endocytic vesicles fuse with a 
common receiving compartment, the early endosome, where internalized cargo 
is sorted: some cargo molecules are returned to the plasma membrane, either 
directly or via a recycling endosome, and others are designated for degradation 
by inclusion in a late endosome. Late endosomes form from a bulbous, 
vacuolar portion of early endosomes by a process called endosome maturation. 
This conversion process changes the protein composition of the endosome 
membrane, patches of which invaginate and become incorporated within the 
organelles as intralumenal vesicles, while the endosome itself moves from the 
cell periphery to a location close to the nucleus. As an endosome matures, it 
ceases to recycle material to the plasma membrane and irreversibly commits 
its remaining contents to degradation: late endosomes fuse with one another 
and with lysosomes to form endolysosomes, which degrade their contents, as 
discussed earlier (Figure 13-47). 

Each of the stages of endosome maturation, from the early endosome to the
endolysosome, is connected through bidirectional vesicle transport pathways to
the TGN. These pathways allow insertion of newly synthesized materials, such as
lysosomal enzymes arriving from the ER, and the retrieval of components, such as
the M6P receptor, back into the early parts of the secretory pathway. We next dis-
cuss how the cell uses and controls the various features of endocytic trafficking.

Figure 13-47 Endosome maturation: the endocytic pathway from the plasma
membrane to lysosomes. Endocytic vesicles fuse near the cell periphery with
an early endosome, which is the primary sorting station. Tubular portions of the
early endosome bud off vesicles that recycle endocytosed cargo back to the plasma
membrane, either directly or indirectly via recycling endosomes. Recycling
endosomes can store proteins until they are needed. Conversion of early endosomes
to late endosomes is accompanied by loss of the tubular projections. Membrane
proteins destined for degradation are internalized in intralumenal vesicles. The
developing late endosome, or multivesicular body, moves on microtubules to the
cell interior. Fully matured late endosomes no longer send vesicles to the plasma
membrane, and they fuse with one another and with endolysosomes and lysosomes
to degrade their contents. Each stage of endosome maturation is connected via
transport vesicles with the TGN, providing a continuous supply of newly
synthesized lysosomal proteins.

Figure 13-48 The formation of clathrin-coated vesicles from the plasma
membrane. These electron micrographs illustrate the probable sequence of events
in the formation of a clathrin-coated vesicle from a clathrin-coated pit. The
clathrin-coated pits and vesicles shown are larger than those seen in normal-sized
cells; they are from a very large hen oocyte and they take up lipoprotein particles
to form yolk. The lipoprotein particles bound to their membrane-bound receptors
appear as a dense, fuzzy layer on the extracellular surface of the plasma membrane,
which is the inside surface of the coated pit and vesicle. (Courtesy of M.M. Perry
and A.B. Gilbert, J. Cell Sci. 39:257-272, 1979. With permission from The Company
of Biologists.)


Pinocytic Vesicles Form from Coated Pits in the Plasma
Membrane


Virtually all eukaryotic cells continually ingest portions of their plasma mem-
brane in the form of small pinocytic (endocytic) vesicles. The rate at which plasma
membrane is internalized in this process of pinocytosis varies between cell types,
but it is usually surprisingly high. A macrophage, for example, ingests 25% of its
own volume of fluid each hour. This means that it must ingest 3% of its plasma
membrane each minute, or 100% in about half an hour. Fibroblasts endocytose
at a somewhat lower rate (1% of their plasma membrane per minute), whereas
some amoebae ingest their plasma membrane even more rapidly. Since a cell's
surface area and volume remain unchanged during this process, it is clear that the 
same amount of membrane being removed by endocytosis is being added to the 
cell surface by the converse process of exocytosis. In this sense, endocytosis and 
exocytosis are linked processes that can be considered to constitute an endocytic- 
exocytic cycle. The coupling between exocytosis and endocytosis is particularly 
strict in specialized structures characterized by high membrane turnover, such as 
a nerve terminal. 
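
The turnover times quoted above follow from simple division, as the short sketch below shows. The rates are the ones given in the text; the function name is hypothetical, and the calculation assumes that the rate stays constant and that exocytosis replaces membrane as fast as it is removed.

```python
# Time for a cell to internalize an amount of membrane equal to its whole
# plasma membrane, assuming a constant rate given in percent per minute.
def minutes_to_internalize_whole_membrane(percent_per_minute: float) -> float:
    return 100.0 / percent_per_minute


# Rates quoted in the text, in percent of plasma membrane per minute.
rates = {"macrophage": 3.0, "fibroblast": 1.0}

for cell, rate in rates.items():
    minutes = minutes_to_internalize_whole_membrane(rate)
    print(f"{cell}: about {minutes:.0f} minutes")
# prints:
# macrophage: about 33 minutes
# fibroblast: about 100 minutes
```

The macrophage value of about 33 minutes is the "about half an hour" given in the text.
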
The endocytic part of the cycle often begins at clathrin-coated pits. These spe- 
cialized regions typically occupy about 2% of the total plasma membrane area. The 
lifetime of a clathrin-coated pit is short: within a minute or so of being formed, it 
invaginates into the cell and pinches off to form a clathrin-coated vesicle (Figure 
13-48). About 2500 clathrin-coated vesicles pinch off from the plasma membrane 
of a cultured fibroblast every minute. The coated vesicles are even more transient 
than the coated pits: within seconds of being formed, they shed their coat and 
fuse with early endosomes. 


Not All Pinocytic Vesicles Are Clathrin-Coated 


In addition to clathrin-coated pits and vesicles, cells can form other types of 
pinocytic vesicles, such as caveolae (from the Latin for “little cavities”), originally 
recognized by their ability to transport molecules across endothelial cells that 
form the inner lining of blood vessels. Caveolae, sometimes seen in the electron 
microscope as deeply invaginated flasks, are present in the plasma membrane 
of most vertebrate cell types (Figure 13-49). They are thought to form from lipid
rafts in the plasma membrane (discussed in Chapter 10), which are especially rich
in cholesterol, glycosphingolipids, and glycosylphosphatidylinositol (GPI)-an-
chored membrane proteins (see Figure 10-13). The major structural proteins in
caveolae are caveolins, a family of unusual integral membrane proteins that each
insert a hydrophobic loop into the membrane from the cytosolic side but do not
extend across the membrane. On their cytosolic side, caveolins are bound to large
protein complexes of cavin proteins, which are thought to stabilize the mem-
brane curvature.

In contrast to clathrin-coated and COPI- or COPII-coated vesicles, caveolae 
are usually static structures. Nonetheless, they can be induced to pinch off and 
serve as endocytic vesicles to transport cargo to early endosomes or to the plasma 
membrane on the opposite side of a polarized cell (in a process called transcyto-
sis, which we discuss later). Some animal viruses such as SV40 and papillomavirus
(which causes warts) enter cells in vesicles derived from caveolae. The viruses are 
first delivered to early endosomes and move from there in transport vesicles to the 
lumen of the ER. The viral genome exits across the ER membrane into the cytosol, 
from where it is imported into the nucleus to start the infection cycle. Cholera 
toxin (discussed in Chapters 15 and 19) also enters the cell through caveolae and 
is transported to the ER before entering the cytosol. 

Macropinocytosis is another clathrin-independent endocytic mechanism 
that can be activated in practically all animal cells. In most cell types, it does 
not operate continually but rather is induced for a limited time in response to 
cell-surface receptor activation by specific cargoes, including growth factors, inte-
grin ligands, apoptotic cell remnants, and some viruses. These ligands activate a
complex signaling pathway, resulting in a change in actin dynamics and the for- 
mation of cell-surface protrusions, called ruffles (discussed in Chapter 16). When 
ruffles collapse back onto the cell, large fluid-filled endocytic vesicles form, called 
macropinosomes (Figure 13-50), which transiently increase the bulk fluid uptake 
of a cell by up to tenfold. Macropinocytosis is a dedicated degradative pathway: 
macropinosomes acidify and then fuse with late endosomes or endolysosomes, 
without recycling their cargo back to the plasma membrane. 


Cells Use Receptor-Mediated Endocytosis to Import Selected
Extracellular Macromolecules


In most animal cells, clathrin-coated pits and vesicles provide an efficient path-
way for taking up specific macromolecules from the extracellular fluid. In this
process, called receptor-mediated endocytosis, the macromolecules bind to
complementary transmembrane receptor proteins, which accumulate in coated
pits, and then enter the cell as receptor-macromolecule complexes in clath-
rin-coated vesicles (see Figure 13-48). Because ligands are selectively captured
by receptors, receptor-mediated endocytosis provides a selective concentrating
mechanism that increases the efficiency of internalization of particular ligands
more than a hundredfold. In this way, even minor components of the extracellular
fluid can be efficiently taken up in large amounts. A particularly well-understood
and physiologically important example is the process that mammalian cells use
to import cholesterol.

Figure 13-49 Caveolae in the plasma membrane of a fibroblast. (A) This
electron micrograph shows a plasma membrane with a very high density of
caveolae. (B) This rapid-freeze deep-etch image demonstrates the characteristic
“cauliflower” texture of the cytosolic face of the caveolae membrane. The
characteristic texture is thought to result from aggregates of caveolins and
cavins. A clathrin-coated pit is also seen at the upper right. (Courtesy of
R.G.W. Anderson, from K.G. Rothberg et al., Cell 68:673-682, 1992. With
permission from Elsevier.)

Many animal cells take up cholesterol through receptor-mediated endocytosis 
and, in this way, acquire most of the cholesterol they require to make new mem- 
brane. If the uptake is blocked, cholesterol accumulates in the blood and can con- 
tribute to the formation in blood vessel (artery) walls of atherosclerotic plaques, 
deposits of lipid and fibrous tissue that can cause strokes and heart attacks by 
blocking arterial blood flow. In fact, it was a study of humans with a strong genetic 
predisposition for atherosclerosis that first revealed the mechanism of recep- 
tor-mediated endocytosis. 

Most cholesterol is transported in the blood as cholesteryl esters in the form of 
lipid-protein particles known as low-density lipoproteins (LDLs) (Figure 13-51). 
When a cell needs cholesterol for membrane synthesis, it makes transmembrane 
receptor proteins for LDL and inserts them into its plasma membrane. Once in 
the plasma membrane, the LDL receptors diffuse until they associate with clath- 
rin-coated pits that are in the process of forming. There, an endocytosis signal 
in the cytoplasmic tail of the LDL receptors binds the membrane-bound adap- 
tor protein AP2 after its conformation has been locally unlocked by binding to 
PI(4,5)P2 on the plasma membrane. Coincidence detection, as discussed earlier, 
thus imparts both efficiency and selectivity to the process (see Figure 13-9). AP2 
then recruits clathrin to initiate endocytosis. 

Since coated pits constantly pinch off to form coated vesicles, any LDL parti- 
cles bound to LDL receptors in the coated pits are rapidly internalized in coated 
vesicles. After shedding their clathrin coats, the vesicles deliver their contents to 
early endosomes. Once the LDL and LDL receptors encounter the low pH in early 
endosomes, LDL is released from its receptor and is delivered via late endosomes 
to lysosomes. There, the cholesteryl esters in the LDL particles are hydrolyzed to 
free cholesterol, which is now available to the cell for new membrane synthesis 
(Movie 13.3). If too much free cholesterol accumulates in a cell, the cell shuts off 
both its own cholesterol synthesis and the synthesis of LDL receptors, so that it 
ceases both to make and to take up cholesterol.

This regulated pathway for cholesterol uptake is disrupted in individuals who 
inherit defective genes encoding LDL receptors. The resulting high levels of blood 
cholesterol predispose these individuals to develop atherosclerosis prematurely, 
and many would die at an early age of heart attacks resulting from coronary artery 
disease if they were not treated with drugs such as statins that lower the level of 
blood cholesterol. In some cases, the receptor is lacking altogether. In others, the
receptors are defective, in either the extracellular binding site for LDL or the
intracellular binding site that attaches the receptor to the AP2 adaptor protein in
clathrin-coated pits. In the latter case, normal numbers of LDL receptors are present,
but they fail to become localized in clathrin-coated pits. Although LDL binds to
the surface of these mutant cells, it is not internalized, directly demonstrating the
importance of clathrin-coated pits for the receptor-mediated endocytosis of cho-
lesterol.

Figure 13-50 Schematic representation of macropinocytosis. Cell signaling events
lead to a reprogramming of actin dynamics, which in turn triggers the formation of
cell-surface ruffles. As the ruffles collapse back onto the cell surface, they
nonspecifically trap extracellular fluid and macromolecules and particles contained
in it, forming large vacuoles, or macropinosomes, as shown.

Figure 13-51 A low-density lipoprotein (LDL) particle. Each spherical particle has
a mass of 3 × 10^6 daltons. It contains a core of about 1500 cholesterol molecules
esterified to long-chain fatty acids. A lipid monolayer composed of about
800 phospholipid and 500 unesterified cholesterol molecules surrounds the core
of cholesteryl esters. A single molecule of apolipoprotein B, a 500,000-dalton
beltlike protein, organizes the particle and mediates the specific binding of LDL
to cell-surface LDL receptors.

More than 25 distinct receptors are known to participate in receptor-mediated 
endocytosis of different types of molecules. They all apparently use clathrin-de- 
pendent internalization routes and are guided into clathrin-coated pits by signals 
in their cytoplasmic tails that bind to adaptor proteins in the clathrin coat. Many 
of these receptors, like the LDL receptor, enter coated pits irrespective of whether 
they have bound their specific ligands. Others enter preferentially when bound 
to a specific ligand, suggesting that a ligand-induced conformational change is 
required for them to activate the signal sequence that guides them into the pits. 
Since most plasma membrane proteins fail to become concentrated in clath- 
rin-coated pits, the pits serve as molecular filters, preferentially collecting certain 
plasma membrane proteins (receptors) over others. 

Electron-microscope studies of cultured cells exposed simultaneously to dif- 
ferent labeled ligands demonstrate that many kinds of receptors can cluster in the 
same coated pit, whereas some other receptors cluster in different clathrin-coated 
pits. The plasma membrane of one clathrin-coated pit can accommodate more 
than 100 receptors of assorted varieties. 


Specific Proteins Are Retrieved from Early Endosomes and 
Returned to the Plasma Membrane 


Early endosomes are the main sorting station in the endocytic pathway, just as 
the cis and trans Golgi networks serve this function in the secretory pathway. In 
the mildly acidic environment of the early endosome, many internalized recep- 
tor proteins change their conformation and release their ligand, as already dis- 
cussed for the M6P receptors. Those endocytosed ligands that dissociate from 
their receptors in the early endosome are usually doomed to destruction in lyso- 
somes (although cholesterol is an exception, as just discussed), along with the 
other soluble contents of the endosome. Some other endocytosed ligands, how- 
ever, remain bound to their receptors, and thereby share the fate of the receptors. 

In the early endosome, the LDL receptor dissociates from its ligand, LDL, and 
is recycled back to the plasma membrane for reuse, leaving the discharged LDL to 
be carried to lysosomes (Figure 13-52). The recycling transport vesicles bud from 
long, narrow tubules that extend from the early endosomes. It is likely that the 
geometry of these tubules helps the sorting process: because tubules have a large 
membrane area enclosing a small volume, membrane proteins become enriched 
over soluble proteins. The transport vesicles return the LDL receptor directly to 
the plasma membrane. 
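
The geometric argument can be made concrete by comparing the ratio of membrane area to enclosed volume for a narrow tubule and for a round vacuole. The following is a rough sketch only: the tubule is idealized as an open cylinder and the vacuolar region as a sphere, and the dimensions (30 nm and 300 nm radii) are hypothetical values chosen for illustration, not measurements from the text.

```python
import math


def tubule_area_to_volume(radius_nm: float, length_nm: float) -> float:
    """Area/volume of a cylinder (side wall only); simplifies to 2 / radius."""
    area = 2 * math.pi * radius_nm * length_nm
    volume = math.pi * radius_nm ** 2 * length_nm
    return area / volume


def sphere_area_to_volume(radius_nm: float) -> float:
    """Area/volume of a sphere; simplifies to 3 / radius."""
    area = 4 * math.pi * radius_nm ** 2
    volume = (4 / 3) * math.pi * radius_nm ** 3
    return area / volume


# HYPOTHETICAL dimensions, for illustration only.
tubule = tubule_area_to_volume(radius_nm=30, length_nm=1000)
vacuole = sphere_area_to_volume(radius_nm=300)

print(f"tubule:  {tubule:.4f} nm^2 of membrane per nm^3 of lumen")
print(f"vacuole: {vacuole:.4f} nm^2 of membrane per nm^3 of lumen")
print(f"enrichment of membrane over lumen in the tubule: {tubule / vacuole:.1f}-fold")
```

With these assumed dimensions, a vesicle budding from the tubule carries several times more membrane per unit of lumenal volume than the vacuolar region does. This is why membrane proteins such as the LDL receptor are favored over soluble contents in the recycling vesicles.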

The transferrin receptor follows a recycling pathway similar to that of the LDL
receptor, but unlike the LDL receptor it also recycles its ligand. Transferrin is a soluble
protein that carries iron in the blood. Cell-surface transferrin receptors deliver
transferrin with its bound iron to early endosomes by receptor-mediated endocy-
tosis. The low pH in the endosome induces transferrin to release its bound iron,
but the iron-free transferrin itself (called apotransferrin) remains bound to its
receptor. The receptor-apotransferrin complex enters the tubular extensions of
the early endosome and from there is recycled back to the plasma membrane.
When the apotransferrin returns to the neutral pH of the extracellular fluid, it dis-
sociates from the receptor and is thereby freed to pick up more iron and begin
the cycle again. Thus, transferrin shuttles back and forth between the extracellu-
lar fluid and early endosomes, avoiding lysosomes and delivering iron to the cell
interior, as needed for cells to grow and proliferate.

Figure 13-52 The receptor-mediated endocytosis of LDL. Note that the LDL
dissociates from its receptors in the acidic environment of the early endosome.
After a number of steps, the LDL ends up in endolysosomes and lysosomes, where it
is degraded to release free cholesterol. In contrast, the LDL receptors are returned
to the plasma membrane via transport vesicles that bud off from the tubular
region of the early endosome, as shown. For simplicity, only one LDL receptor is
shown entering the cell and returning to the plasma membrane. Whether it is occupied
or not, an LDL receptor typically makes one round trip into the cell and back to
the plasma membrane every 10 minutes, making a total of several hundred trips in its
20-hour life-span.


Plasma Membrane Signaling Receptors are Down-Regulated by 
Degradation in Lysosomes 


A second pathway that endocytosed receptors can follow from endosomes is taken
by many signaling receptors, including opioid receptors and the receptor that 
binds epidermal growth factor (EGF). EGF is a small, extracellular signal protein 
that stimulates epidermal and various other cells to divide. Unlike LDL receptors, 
EGF receptors accumulate in clathrin-coated pits only after binding their ligand, 
and most do not recycle but are degraded in lysosomes, along with the ingested 
EGF. EGF binding therefore first activates intracellular signaling pathways and 
then leads to a decrease in the concentration of EGF receptors on the cell surface, 
a process called receptor downregulation, that reduces the cell’s subsequent sen- 
sitivity to EGF (see Figure 15-20). 
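
The three receptor itineraries described so far (for LDL, transferrin, and EGF) differ only in what happens to the receptor and to its ligand after they reach the early endosome. The small lookup table below restates them side by side. It is a summary sketch, and the class and constant names are hypothetical.

```python
from dataclasses import dataclass

RECYCLED = "recycled to the plasma membrane"
DEGRADED = "degraded in lysosomes"


@dataclass(frozen=True)
class Fate:
    receptor: str  # what happens to the receptor after endocytosis
    ligand: str    # what happens to the ligand after endocytosis


# Fates as described in the text.
FATES = {
    # LDL is released at low pH; the receptor returns to the surface.
    "LDL receptor": Fate(receptor=RECYCLED, ligand=DEGRADED),
    # Iron is released in the endosome; apotransferrin returns with the receptor.
    "transferrin receptor": Fate(receptor=RECYCLED, ligand=RECYCLED),
    # Most EGF receptors are degraded together with the ingested EGF.
    "EGF receptor": Fate(receptor=DEGRADED, ligand=DEGRADED),
}


def is_downregulated(receptor_name: str) -> bool:
    """A receptor is down-regulated if endocytosis leads to its degradation."""
    return FATES[receptor_name].receptor == DEGRADED


if __name__ == "__main__":
    for name, fate in FATES.items():
        print(f"{name}: receptor {fate.receptor}; ligand {fate.ligand}")
```

Only the EGF receptor entry satisfies `is_downregulated`, which matches the receptor downregulation described above.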

Receptor downregulation is highly regulated. The activated receptors are first 
covalently modified on the cytosolic face with the small protein ubiquitin. Unlike 
polyubiquitylation, which adds a chain of ubiquitins that typically targets a pro- 
tein for degradation in proteasomes (discussed in Chapter 6), ubiquitin tagging 
for sorting into the clathrin-dependent endocytic pathway adds just one or a few 
single ubiquitin molecules to the protein—a process called monoubiquitylation 
or multiubiquitylation, respectively. Ubiquitin-binding proteins recognize the 
attached ubiquitin and help direct the modified receptors into clathrin-coated 
pits. After delivery to the early endosome, other ubiquitin-binding proteins that 
are part of ESCRT complexes (ESCRT = Endosome Sorting Complex Required for 
Transport) recognize and sort the ubiquitylated receptors into intralumenal vesi- 
cles, which are retained in the maturing late endosome (see Figure 13-47). In this 
way, addition of ubiquitin blocks receptor recycling to the plasma membrane and 
directs the receptors into the degradation pathway, as we discuss next. 


Early Endosomes Mature into Late Endosomes 


The endosomal compartments can be made visible in the electron microscope 
by adding a readily detectable tracer molecule, such as the enzyme peroxidase, 
to the extracellular medium and allowing varying lengths of time for the cell to 
endocytose the tracer. The distribution of the molecule after its uptake reveals 
the sequence of events. Within a minute or so after adding the tracer, it starts to 
appear in early endosomes, just beneath the plasma membrane (Figure 13-53). 
By 5-15 minutes later, it has moved to late endosomes, close to the Golgi appara- 
tus and near the nucleus. 

How early endosomes arise is not entirely clear, but their membrane and 
volume are mainly derived from incoming endocytic vesicles that fuse with one 
another (Movie 13.4). Early endosomes are relatively small and patrol the cyto- 
plasm underlying the plasma membrane in jerky back-and-forth movements 
along microtubules, capturing incoming vesicles. Typically, an early endosome 
receives incoming vesicles for about 10 minutes, during which time membrane 
and fluid are rapidly recycled to the plasma membrane. Some of the incoming 
cargo, however, accumulates over the lifetime of the early endosome, eventually 
to be included in the late endosome. 

Figure 13-53 Electron micrograph of an early endosome. The endosome is labeled 
with endocytosed horseradish peroxidase, a widely used enzyme marker, detected 
in this case by an electron-dense reaction product. Many tubular extensions 
protrude from the central vacuolar space of the early endosome, which will later 
mature to give rise to a late endosome. (From J. Tooze and M. Hollinshead, 
J. Cell Biol. 118:813-830, 1992.) 




Early endosomes have tubular and vacuolar domains (see Figure 13-53). Most 
of the membrane surface is in the tubules and most of the volume is in the vacuolar 
domain. During endosome maturation, the two domains have different fates: the 
vacuolar portions of the early endosome are retained and transformed into late 
endosomes; the tubular portions shrink. Maturing endosomes, also called mul- 
tivesicular bodies, migrate along microtubules toward the cell interior, shedding 
membrane tubules and vesicles that recycle material to the plasma membrane 
and TGN, and receiving newly synthesized lysosomal proteins. As they concen- 
trate in a perinuclear region of the cell, the multivesicular bodies fuse with each 
other, and eventually with endolysosomes and lysosomes (see Figure 13-47). 

Many changes occur during the maturation process. (1) The endosome changes 
shape and location, as the tubular domains are lost and the vacuolar domains are 
thoroughly modified. (2) Rab proteins, phosphoinositide lipids, fusion machinery 
(SNAREs and tethers), and microtubule motor proteins all participate in a molec- 
ular makeover of the cytosolic face of the endosome membrane, changing the 
functional characteristics of the organelle. (3) A V-type ATPase in the endosome 
membrane pumps H⁺ from the cytosol into the endosome lumen and acidifies the 
organelle. Crucially, the increasing acidity that accompanies maturation renders 
lysosomal hydrolases increasingly more active, influencing many receptor-ligand 
interactions, thereby controlling receptor loading and unloading. (4) Intralume- 
nal vesicles sequester endocytosed signaling receptors inside the endosome, 
thus halting the receptor signaling activity. (5) Lysosome proteins are delivered 
from the TGN to the maturing endosome. Most of these events occur gradually 
but eventually lead to a complete transformation of the endosome into an early 
endolysosome. 

In addition to committing selected cargo for degradation, the maturation pro- 
cess is important for lysosome maintenance. The continual delivery of lysosome 
components from the TGN to maturing endosomes ensures a steady supply of 
new lysosome proteins. The endocytosed materials mix in early endosomes with 
newly arrived acid hydrolases. Although mild digestion may start here, many 
hydrolases are synthesized and delivered as proenzymes, called zymogens, which 
contain extra inhibitory domains that keep the hydrolases inactive until these 
domains are proteolytically removed at later stages of endosome maturation. 
Moreover, the pH in early endosomes is not low enough to activate lysosomal 
hydrolases optimally. By these means, cells can retrieve membrane proteins 
intact from early endosomes and recycle them back to the plasma membrane. 


ESCRT Protein Complexes Mediate the Formation of Intralumenal 
Vesicles in Multivesicular Bodies 


As endosomes mature, patches of their membrane invaginate into the endosome 
lumen and pinch off to form intralumenal vesicles. Because of their appearance in 
the electron microscope, such maturing endosomes are also called multivesicular 
bodies (Figure 13-54). 

The multivesicular bodies carry endocytosed membrane proteins that are to 
be degraded. As part of the protein-sorting process, receptors destined for degra- 
dation, such as the occupied EGF receptors described previously, selectively par- 
tition into the invaginating membrane of the multivesicular bodies. In this way, 
both the receptors and any signaling proteins strongly bound to them are seques- 
tered away from the cytosol where they might otherwise continue signaling. 
They also are made fully accessible to the digestive enzymes that eventually will 
degrade them (Figure 13-55). In addition to endocytosed membrane proteins, 
multivesicular bodies include the soluble content of early endosomes destined 
for late endosomes and digestion in lysosomes. 

As discussed earlier, sorting into intralumenal vesicles requires one or multi- 
ple ubiquitin tags, which are added to the cytosolic domains of membrane pro- 
teins. These tags initially help guide the proteins into clathrin-coated vesicles in 
the plasma membrane. Once delivered to the endosomal membrane, the ubiquitin 
tags are recognized again, this time by a series of cytosolic ESCRT protein 
complexes (ESCRT-0, -I, -II, and -III), which bind sequentially and ultimately 
mediate the sorting process into the intralumenal vesicles. Membrane invagination 
into multivesicular bodies also depends on a lipid kinase that phosphorylates 
phosphatidylinositol to produce PI(3)P, which serves as an additional docking site 
for the ESCRT complexes; these complexes require both PI(3)P and the presence 
of ubiquitylated cargo proteins to bind to the endosomal membrane. ESCRT-III 
forms large multimeric assemblies on the membrane that bend the membrane 
(Figure 13-56). 

Figure 13-54 Electron micrograph of a multivesicular body. The large amount 
of internal membrane will be delivered to the lysosome for digestion. (Courtesy of 
Andrew Staehelin, from A. Driouich, A. Jauneau and L.A. Staehelin, Plant 
Physiol. 113:487-492, 1997. With permission from the American Society of 
Plant Biologists.) 

Mutant cells compromised in ESCRT function display signaling defects. In 
such cells, activated receptors cannot be down-regulated by endocytosis and 
packaging into multivesicular bodies. The still-active receptors therefore mediate 
prolonged signaling, which can lead to uncontrolled cell proliferation and cancer. 

Processes that shape membranes often use similar machinery. Because of 
strong similarities in their protein sequences, researchers think that ESCRT com- 
plexes are evolutionarily related to components that mediate cell-membrane 
deformation in cytokinesis in archaea. Similarly, the ESCRT machinery that 
drives the internal budding from the endosome membrane to form intralumenal 
vesicles is also used in animal cell cytokinesis and virus budding, which are topo- 
logically equivalent, as both processes involve budding away from the cytosolic 
surface of the membrane (Figure 13-57). 


Recycling Endosomes Regulate Plasma Membrane Composition 


The fates of endocytosed receptors—and of any ligands remaining bound to 
them—vary according to the specific type of receptor. As we discussed, most 
receptors are recycled and returned to the same plasma membrane domain from 
which they came; some proceed to a different domain of the plasma membrane, 
thereby mediating transcytosis; and some progress to lysosomes, where they are 
degraded. 

Receptors on the surface of polarized epithelial cells can transfer specific 
macromolecules from one extracellular space to another by transcytosis. A new- 
born, for example, obtains antibodies from its mother’s milk (which help protect 
it against infection) by transporting them across the epithelium of its gut. The 
lumen of the gut is acidic, and, at this low pH, the antibodies in the milk bind 
to specific receptors on the apical (absorptive) surface of the gut epithelial cells. 
The receptor-antibody complexes are internalized via clathrin-coated pits and 





= ESCRT-O — 
CYTOSOL e 
LUMEN OF 7 
ENDOSOME 7 ubiquitin 


\ 


endosome 
membrane 


cargo Ku 
forming intralumenal ~ 
vesicle 





137 


Figure 13-55 The sequestration of 
endocytosed proteins into intralumenal 
vesicles of multivesicular bodies. 
Ubiquitylated membrane proteins are 
sorted into domains on the endosome 
membrane, which invaginate and pinch 
off to form intralumenal vesicles. The 
ubiquitin marker is removed and returned 
to the cytosol for reuse before the 
intralumenal vesicle closes. Eventually, 
proteases and lipases in lysosomes 
digest all of the internal membranes. 

The invagination processes are essential 
for complete digestion of endocytosed 
membrane proteins: because the outer 
membrane of the multivesicular body 
becomes continuous with the lysosomal 
membrane, which is resistant to lysosomal 
hydrolases; the hydrolases, for example, 
could not digest the cytosolic domains of 
endocytosed transmembrane proteins, 
such as the EGF receptor shown here, 

if the protein were not localized in 
intralumenal vesicles. 


Figure 13-56 Sorting of endocytosed 
membrane proteins into the intralumenal 
vesicles of a multivesicular body. 

A series of complex binding events 

passes the ubiquitylated cargo proteins 
sequentially from one ESCRT complex 
(ESCRT-O) to the next, eventually 
concentrating them in membrane areas 
that bud into the lumen of the endosome 
to form intralumenal vesicles. ESCRT-III 
assembles into expansive multimeric 
structures and mediates invagination. The 
mechanisms of how cargo molecules are 
shepherded into the vesicles and how the 
vesicles are formed without including the 
ESCRT complexes themselves remain 
unknown. ESCRT complexes are soluble in 
the cytosol, are recruited to the membrane 
sequentially as needed, and are then 
released back into the cytosol as the 
vesicle pinches off. 
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vesicles and are delivered to early endosomes. The complexes remain intact and 
are retrieved in transport vesicles that bud from the early endosome and subse- 
quently fuse with the basolateral domain of the plasma membrane. On exposure 
to the neutral pH of the extracellular fluid that bathes the basolateral surface of 
the cells, the antibodies dissociate from their receptors and eventually enter the 
baby’s bloodstream. 

The transcytotic pathway from the early endosome back to the plasma mem- 
brane is not direct. The receptors first move from the early endosome to the recy- 
cling endosome. The variety of pathways that different receptors follow from early 
endosomes implies that, in addition to binding sites for their ligands and binding 
sites for coated pits, many receptors also possess sorting signals that guide them 
into the appropriate transport pathway (Figure 13-58). 
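
The sorting rules described here and in Figure 13-58 can be restated as a small decision procedure. The Python sketch below is only an illustration: the `Receptor` fields and the signal labels "recycle" and "transcytose" are hypothetical stand-ins for real sorting signals, which the text does not specify. It encodes two rules from the figure: a receptor that is not specifically retrieved goes to lysosomes, and a ligand shares its receptor's fate only if it stays bound in the acidic endosome.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Fate(Enum):
    RECYCLING = "returned to the plasma membrane domain it came from"
    TRANSCYTOSIS = "delivered via a recycling endosome to a different domain"
    DEGRADATION = "delivered to lysosomes and degraded"


@dataclass
class Receptor:
    name: str
    # Hypothetical label for a sorting signal: "recycle", "transcytose",
    # or None when the receptor is not specifically retrieved.
    retrieval_signal: Optional[str]
    # Does the ligand stay bound in the acidic endosome?
    ligand_bound_in_acid: bool


def receptor_fate(receptor: Receptor) -> Fate:
    """Receptors that are not specifically retrieved go to lysosomes."""
    if receptor.retrieval_signal == "recycle":
        return Fate.RECYCLING
    if receptor.retrieval_signal == "transcytose":
        return Fate.TRANSCYTOSIS
    return Fate.DEGRADATION


def ligand_fate(receptor: Receptor) -> Fate:
    """A bound ligand shares its receptor's fate; a released one goes to lysosomes."""
    if receptor.ligand_bound_in_acid:
        return receptor_fate(receptor)
    return Fate.DEGRADATION


# Illustrative cases; the first two follow the examples in the text.
fc = Receptor("antibody Fc receptor", "transcytose", ligand_bound_in_acid=True)
# For a receptor that is not retrieved, either value of ligand_bound_in_acid
# gives the same result: receptor and ligand are both degraded.
egf = Receptor("occupied EGF receptor", None, ligand_bound_in_acid=True)
generic = Receptor("hypothetical recycled receptor", "recycle", ligand_bound_in_acid=False)

for r in (fc, egf, generic):
    print(f"{r.name}: receptor {receptor_fate(r).name}, ligand {ligand_fate(r).name}")
```

In this sketch the Fc receptor and its antibody are both transcytosed, the occupied EGF receptor and EGF are both degraded, and the hypothetical recycled receptor returns to the surface while its released ligand goes to lysosomes.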

Cells can regulate the release of membrane proteins from recycling endo- 
somes, thus adjusting the flux of proteins through the transcytotic pathway 
according to need. This regulation, the mechanism of which is uncertain, allows 
recycling endosomes to play an important part in adjusting the concentration of 
specific plasma membrane proteins. Fat cells and muscle cells, for example, con- 
tain large intracellular pools of the glucose transporters that are responsible for 
the uptake of glucose across the plasma membrane. These membrane transport 
proteins are stored in specialized recycling endosomes until the hormone insu- 
lin stimulates the cell to increase its rate of glucose uptake. In response to the 
insulin signal, transport vesicles rapidly bud from the recycling endosome and 
deliver large numbers of glucose transporters to the plasma membrane, thereby 
greatly increasing the rate of glucose uptake into the cell (Figure 13-59). Similarly, 
kidney cells regulate the insertion of aquaporins and V-ATPase into the plasma 
membrane to increase water and acid excretion, respectively, both in response to 
hormones. 


Specialized Phagocytic Cells Can Ingest Large Particles 


Phagocytosis is a special form of endocytosis in which a cell uses large endocytic 
vesicles called phagosomes to ingest large particles such as microorganisms and 
dead cells. Phagocytosis is distinct, both in purpose and mechanism, from mac- 
ropinocytosis, which we discussed earlier. In protozoa, phagocytosis is a form of 
feeding: large particles taken up into phagosomes end up in lysosomes, and the 
products of the subsequent digestive processes pass into the cytosol to be used as 
food. However, few cells in multicellular organisms are able to ingest such large 
particles efficiently. In the gut of animals, for example, extracellular processes 
break down food particles, and cells import the small products of hydrolysis. 

Figure 13-57 Conserved mechanism in multivesicular body formation and 
virus budding. In the two topologically equivalent processes indicated by the 
arrows, ESCRT complexes (not shown) shape membranes into buds that bulge 
away from the cytosol. 

Figure 13-58 Possible fates for transmembrane receptor proteins that 
have been endocytosed. Three pathways from the early endosomal compartment 
in an epithelial cell are shown. Retrieved receptors are returned (1) to the same 
plasma membrane domain from which they came (recycling) or (2) via a recycling 
endosome to a different domain of the plasma membrane (transcytosis). 
(3) Receptors that are not specifically retrieved from early or recycling endosomes 
follow the pathway from the endosomal compartment to lysosomes, where they are 
degraded (degradation). If the ligand that is endocytosed with its receptor stays 
bound to the receptor in the acidic environment of the endosome, it shares the 
same fate as the receptor; otherwise, it is delivered to lysosomes. Recycling 
endosomes are a way-station on the transcytotic pathway. In the transcytosis 
example shown here, an antibody Fc receptor on a gut epithelial cell binds 
antibody and is endocytosed, eventually carrying the antibody to the basolateral 
plasma membrane. The receptor is called an Fc receptor because it binds the Fc 
part of the antibody (discussed in Chapter 24). 

Phagocytosis is important in most animals for purposes other than nutrition, 
and it is carried out mainly by specialized cells, the so-called professional phagocytes. 
In mammals, two important classes of white blood cells that act as professional 
phagocytes are macrophages and neutrophils (Movie 13.5). These cells develop 
from hemopoietic stem cells (discussed in Chapter 22), and they ingest invad- 
ing microorganisms to defend us against infection. Macrophages also have an 
important role in scavenging senescent cells and cells that have died by apoptosis 
(discussed in Chapter 18). In quantitative terms, the clearance of senescent and 
dead cells is by far the most important: our macrophages, for example, 
phagocytose more than 10¹¹ senescent red blood cells in each of us every day. 

The diameter of a phagosome is determined by the size of its ingested parti- 
cles, and those particles can be almost as large as the phagocytic cell itself (Fig- 
ure 13-60). Phagosomes fuse with lysosomes, and the ingested material is then 
degraded. Indigestible substances remain in the lysosomes, forming residual bod- 
ies that can be excreted from cells by exocytosis, as mentioned earlier. Some of the 
internalized plasma membrane components never reach the lysosome, because 
they are retrieved from the phagosome in transport vesicles and returned to the 
plasma membrane. 

Some pathogenic bacteria have evolved elaborate mechanisms to prevent 
phagosome-lysosome fusion. The bacterium Legionella pneumophila, for exam- 
ple, which causes Legionnaires’ disease (discussed in Chapter 23), injects into 
its unfortunate host a Rab-modifying enzyme that causes certain Rab proteins 
to misdirect membrane traffic, thereby preventing phagosome-lysosome fusion. 
The bacterium, thus spared from lysosomal degradation, remains in the modified 
phagosome, growing and dividing as an intracellular pathogen, protected from 
the host’s adaptive immune system. 

Phagocytosis is a cargo-triggered process. That is, it requires the activation of 
cell-surface receptors that transmit signals to the cell interior. Thus, to be phago- 
cytosed, particles must first bind to the surface of the phagocyte (although not 
all particles that bind are ingested). Phagocytes have a variety of cell surface 
receptors that are functionally linked to the phagocytic machinery of the cell. 
The best-characterized triggers of phagocytosis are antibodies, which protect us 
by binding to the surface of infectious microorganisms (pathogens) and initiat- 
ing a series of events that culminate in the invader being phagocytosed. When 
antibodies initially attack a pathogen, they coat it with antibody molecules that 
bind to Fc receptors on the surface of macrophages and neutrophils, activating 
the receptors to induce the phagocytic cell to extend pseudopods, which engulf 
the particle and fuse at their tips to form a phagosome (Figure 13-61A). Localized 
actin polymerization, initiated by Rho family GTPases and their activating Rho- 
GEFs (discussed in Chapters 15 and 16), shapes the pseudopods. The activated 
Rho GTPases switch on the kinase activity of local PI kinases to produce PI(4,5)P2 
in the membrane (see Figure 13-11), which stimulates actin polymerization. To 
seal off the phagosome and complete the engulfment, actin is depolymerized by a 
PI 3-kinase that converts the PI(4,5)P2 to PI(3,4,5)P3, which is required for closure 
of the phagosome and may also contribute to reshaping the actin network to help 
drive the invagination of the forming phagosome (Figure 13-61B). In this way, 
the ordered generation and consumption of specific phosphoinositides guides 
sequential steps in phagosome formation. 

Figure 13-59 Storage of plasma membrane proteins in recycling 
endosomes. Recycling endosomes can serve as an intracellular storage site for 
specialized plasma membrane proteins that can be mobilized when needed. In 
the example shown, insulin binding to the insulin receptor triggers an intracellular 
signaling pathway that causes the rapid insertion of glucose transporters into the 
plasma membrane of a fat or muscle cell, greatly increasing its glucose intake. 

Figure 13-60 Phagocytosis by a macrophage. A scanning electron 
micrograph of a mouse macrophage phagocytosing two chemically altered red 
blood cells. The red arrows point to edges of thin processes (pseudopods) of the 
macrophage that are extending as collars to engulf the red cells. (Courtesy of Jean 
Paul Revel.) 

Several other classes of receptors that promote phagocytosis have been char- 
acterized. Some recognize complement components, which collaborate with anti- 
bodies in targeting microbes for destruction (discussed in Chapter 24). Others 
directly recognize oligosaccharides on the surface of certain pathogens. Still oth- 
ers recognize cells that have died by apoptosis. Apoptotic cells lose the asymmet- 
ric distribution of phospholipids in their plasma membrane. As a consequence, 
negatively charged phosphatidylserine, which is normally confined to the cyto- 
solic leaflet of the lipid bilayer, is now exposed on the outside of the cell, where it 
helps to trigger the phagocytosis of the dead cell. 

Remarkably, macrophages will also phagocytose a variety of inanimate parti- 
cles—such as glass or latex beads and asbestos fibers—yet they do not phagocy- 
tose live cells in their own body. The living cells display “don’t-eat-me” signals in 
the form of cell-surface proteins that bind to inhibiting receptors on the surface of 
macrophages. The inhibitory receptors recruit tyrosine phosphatases that antag- 
onize the intracellular signaling events required to initiate phagocytosis, thereby 
locally inhibiting the phagocytic process. Thus phagocytosis, like many other cell 
processes, depends on a balance between positive signals that activate the pro- 
cess and negative signals that inhibit it. Apoptotic cells are thought both to gain 
“eat-me” signals (such as extracellularly exposed phosphatidylserine) and to lose 
their “don’t-eat-me” signals, causing them to be very rapidly phagocytosed by 
macrophages. 
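
The idea of a balance between activating and inhibitory signals can be made concrete with a deliberately simple toy rule. The Python sketch below is not a model taken from the text: the function name, the signal names, and the numeric strengths are all hypothetical and chosen only for illustration. It assumes that a bound particle is engulfed when its summed "eat-me" signals exceed its summed "don't-eat-me" signals.

```python
def will_phagocytose(bound, activating, inhibitory):
    """Toy rule: engulf a bound particle only if activating signals outweigh inhibitory ones.

    `activating` and `inhibitory` map signal names to arbitrary, illustrative
    strengths; they are not measured quantities.
    """
    if not bound:
        # Particles must first bind to the surface of the phagocyte.
        return False
    return sum(activating.values()) > sum(inhibitory.values())


# A living cell displays a "don't-eat-me" protein and no "eat-me" signal.
living_cell = will_phagocytose(True, {}, {"dont_eat_me_protein": 1.0})

# An apoptotic cell exposes phosphatidylserine and has lost its inhibitory signal.
apoptotic_cell = will_phagocytose(True, {"exposed_phosphatidylserine": 1.0}, {})

# An antibody-coated pathogen activates Fc receptors on the phagocyte.
coated_pathogen = will_phagocytose(True, {"antibody_on_fc_receptor": 1.0}, {})

print(living_cell, apoptotic_cell, coated_pathogen)
```

Under these assumptions the living cell is spared, while the apoptotic cell and the antibody-coated pathogen are engulfed. A real phagocyte integrates such signals locally and through intracellular signaling, which a simple sum does not capture.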


Summary 


Cells ingest fluid, molecules, and particles by endocytosis, in which localized regions 
of the plasma membrane invaginate and pinch off to form endocytic vesicles. In 
most cells, endocytosis internalizes a large fraction of the plasma membrane every 
hour. The cells remain the same size because most of the plasma membrane compo- 
nents (proteins and lipids) that are endocytosed are continually returned to the cell 
surface by exocytosis. This large-scale endocytic-exocytic cycle is mediated largely 
by clathrin-coated pits and vesicles but clathrin-independent endocytic pathways 
also contribute. 

While many of the endocytosed molecules are quickly recycled to the plasma 
membrane, others eventually end up in lysosomes, where they are degraded. Most of 
the ligands that are endocytosed with their receptors dissociate from their receptors 
in the acidic environment of the endosome and eventually end up in lysosomes, 
while most of the receptors are recycled via transport vesicles back to the cell surface 
for reuse. Many cell-surface signaling receptors become tagged with ubiquitin when 
activated by binding their extracellular ligands. Ubiquitylation guides activated 
receptors into clathrin-coated pits, and they and their ligands are efficiently 
internalized and delivered to early endosomes. 

Figure 13-61 A neutrophil reshaping its plasma membrane during 
phagocytosis. (A) An electron micrograph of a neutrophil phagocytosing a 
bacterium, which is in the process of dividing. (B) Pseudopod extension and 
phagosome formation are driven by actin polymerization and reorganization, which 
respond to the accumulation of specific phosphoinositides in the membrane 
of the forming phagosome: PI(4,5)P2 stimulates actin polymerization, which 
promotes pseudopod formation, and then PI(3,4,5)P3 depolymerizes actin filaments 
at the base. (A, courtesy of Dorothy F. Bainton, Phagocytic Mechanisms in Health 
and Disease. New York: Intercontinental Medical Book Corporation, 1971.) 

Early endosomes rapidly mature into late endosomes. During maturation, 
patches of the endosomal membrane containing ubiquitylated receptors invagi- 
nate and pinch off to form intralumenal vesicles. This process is mediated by ESCRT 
complexes and sequesters the receptors away from the cytosol, which terminates 
their signaling activity. Late endosomes migrate along microtubules toward the 
interior of the cell where they fuse with one another and with lysosomes to form 
endolysosomes, where degradation occurs. 

In some cases, both receptor and ligand are transferred to a different plasma 
membrane domain, causing the ligand to be released at a different surface from 
where it originated, a process called transcytosis. In some cells, endocytosed plasma 
membrane proteins and lipids can be stored in recycling endosomes until they are 
needed. 


TRANSPORT FROM THE TRANS GOLGI NETWORK TO 
THE CELL EXTERIOR: EXOCYTOSIS 


Having considered the cell’s endocytic and internal digestive systems and the 
various types of incoming membrane traffic that converge on lysosomes, we now 
return to the Golgi apparatus and examine the secretory pathways that lead out- 
ward to the cell exterior. Transport vesicles destined for the plasma membrane 
normally leave the TGN in a steady stream as irregularly shaped tubules. The 
membrane proteins and the lipids in these vesicles provide new components 
for the cell’s plasma membrane, while the soluble proteins inside the vesicles 
are secreted to the extracellular space. The fusion of the vesicles with the plasma 
membrane is called exocytosis. This is the route, for example, by which cells 
secrete most of the proteoglycans and glycoproteins of the extracellular matrix, as 
discussed in Chapter 19. 

All cells require this constitutive secretory pathway, which operates contin- 
uously (Movie 13.6). Specialized secretory cells, however, have a second secre- 
tory pathway in which soluble proteins and other substances are initially stored 
in secretory vesicles for later release by exocytosis. This is the regulated secre- 
tory pathway, found mainly in cells specialized for secreting products rapidly 
on demand—such as hormones, neurotransmitters, or digestive enzymes 
(Figure 13-62). In this section, we consider the role of the Golgi apparatus in both 
of these pathways and compare the two mechanisms of secretion. 


Many Proteins and Lipids Are Carried Automatically from the 
Trans Golgi Network (TGN) to the Cell Surface 


A cell capable of regulated secretion must separate at least three classes of proteins 
before they leave the TGN—those destined for lysosomes (via endosomes), those 
destined for secretory vesicles, and those destined for immediate delivery to the 
cell surface (Figure 13-63). We have already discussed how proteins destined for 
lysosomes are tagged with M6P for packaging into specific departing vesicles, and 
analogous signals are thought to direct secretory proteins into secretory vesicles. 
The nonselective constitutive secretory pathway transports most other proteins 
directly to the cell surface. Because entry into this pathway does not require a par- 
ticular signal, it is also called the default pathway. Thus, in an unpolarized cell 
such as a white blood cell or a fibroblast, it seems that any protein in the lumen of 
the Golgi apparatus is automatically carried by the constitutive pathway to the cell 
surface unless it is specifically returned to the ER, retained as a resident protein 
in the Golgi apparatus itself, or selected for the pathways that lead to regulated 
secretion or to lysosomes. In polarized cells, where different products have to be 
delivered to different domains of the cell surface, we shall see that the options are 
more complex. 


Secretory Vesicles Bud from the Trans Golgi Network 


Cells that are specialized for secreting some of their products rapidly on demand 
concentrate and store these products in secretory vesicles (often called dense- 
core secretory granules because they have dense cores when viewed in the electron 
microscope). Secretory vesicles form from the TGN, and they release their con- 
tents to the cell exterior by exocytosis in response to specific signals. The secreted 
product can be either a small molecule (such as histamine or a neuropeptide) or 
a protein (such as a hormone or digestive enzyme). 

Proteins destined for secretory vesicles (called secretory proteins) are pack- 
aged into appropriate vesicles in the TGN by a mechanism that involves the selec- 
tive aggregation of the secretory proteins. Clumps of aggregated, electron-dense 
material can be detected by electron microscopy in the lumen of the TGN. The 
signals that direct secretory proteins into such aggregates are not well defined 
and may be quite diverse. When a gene encoding a secretory protein is artificially 
expressed in a secretory cell that normally does not make the protein, the foreign 
protein is appropriately packaged into secretory vesicles. This observation 
shows that, although the proteins that an individual cell expresses and packages 
in secretory vesicles differ, they contain common sorting signals, which function 
properly even when the proteins are expressed in cells that do not normally make 
them. 

Figure 13-62 The constitutive and regulated secretory pathways. The 
two pathways diverge in the TGN. The constitutive secretory pathway operates 
in all cells. Many soluble proteins are continually secreted from the cell by 
this pathway, which also supplies the plasma membrane with newly synthesized 
membrane lipids and proteins. Specialized secretory cells also have a regulated 
secretory pathway, by which selected proteins in the TGN are diverted into 
secretory vesicles, where the proteins are concentrated and stored until an 
extracellular signal stimulates their secretion. The regulated secretion of small 
molecules, such as histamine and neurotransmitters, occurs by a similar pathway; 
these molecules are actively transported from the cytosol into preformed secretory 
vesicles. There they are often bound to specific macromolecules (proteoglycans, 
for histamine) so that they can be stored at high concentration without generating 
an excessively high osmotic pressure. 

Figure 13-63 The three best-understood pathways of protein sorting in the 
trans Golgi network. (1) Proteins with the mannose 6-phosphate (M6P) marker (see 
Figure 13-45) are diverted to lysosomes (via endosomes) in clathrin-coated 
transport vesicles. (2) Proteins with signals directing them to secretory vesicles 
are concentrated in such vesicles as part of a regulated secretory pathway that is 
present only in specialized secretory cells. (3) In unpolarized cells, a constitutive 
secretory pathway delivers proteins with no special features to the cell surface. 
In polarized cells, such as epithelial cells, however, secreted and plasma membrane 
proteins are selectively directed to either the apical or the basolateral plasma 
membrane domain, so a specific signal must mediate at least one of these two 
pathways, as we discuss later. 

It is unclear how the aggregates of secretory proteins are segregated into secre- 
tory vesicles. Secretory vesicles have unique proteins in their membrane, some of 
which might serve as receptors for aggregated protein in the TGN. The aggregates 
are much too big, however, for each molecule of the secreted protein to be bound 
by its own cargo receptor, as occurs for transport of the lysosomal enzymes. The 
uptake of the aggregates into secretory vesicles may therefore more closely resem- 
ble the uptake of particles by phagocytosis at the cell surface, where the plasma 
membrane zippers up around large structures. 

Initially, most of the membrane of the secretory vesicles that leave the TGN is 
only loosely wrapped around the clusters of aggregated secretory proteins. Mor- 
phologically, these immature secretory vesicles resemble dilated trans Golgi cis- 
ternae that have pinched off from the Golgi stack. As the vesicles mature, they 
fuse with one another and their contents become concentrated (Figure 13-64A), 
probably as the result of both the continuous retrieval of membrane that is recy- 
cled to the TGN, and the progressive acidification of the vesicle lumen that results 
from the increasing concentration of V-type ATPases in the vesicle membrane 
that acidify all endocytic and exocytic organelles (see Figure 13-37). The degree of 
concentration of proteins during the formation and maturation of secretory ves- 
icles is only a small part of the total 200-400-fold concentration of these proteins 
that occurs after they leave the ER. Secretory and membrane proteins become 
concentrated as they move from the ER through the Golgi apparatus because of 
an extensive retrograde retrieval process mediated by COPI-coated transport ves- 
icles that carry soluble ER resident proteins back to the ER, while excluding the 
secretory and membrane proteins (see Figure 13-25). 

Membrane recycling is important for returning Golgi components to the Golgi 
apparatus, as well as for concentrating the contents of secretory vesicles. The ves- 
icles that mediate this retrieval originate as clathrin-coated buds on the surface of 
immature secretory vesicles, often being seen even on budding secretory vesicles 
that have not yet pinched off from the Golgi stack (Figure 13-64B). 

Because the final mature secretory vesicles are so densely filled with contents, 
the secretory cell can disgorge large amounts of material promptly by exocytosis 
when triggered to do so (Figure 13-65). 


Precursors of Secretory Proteins Are Proteolytically Processed 
During the Formation of Secretory Vesicles 


Concentration is not the only process to which secretory proteins are subjected as 
the secretory vesicles mature. Many protein hormones and small neuropeptides, 

Figure 13-64 The formation of 
secretory vesicles. (A) Secretory 
proteins become segregated and highly 
concentrated in secretory vesicles by two 
mechanisms. First, they aggregate in the 
ionic environment of the TGN; often the 
aggregates become more condensed as 
secretory vesicles mature and their lumen 
becomes more acidic. Second, clathrin-coated 
vesicles retrieve excess membrane 
and lumenal content present in immature 
secretory vesicles as the secretory vesicles 
mature. (B) This electron micrograph shows 
secretory vesicles forming from the TGN in 
an insulin-secreting β cell of the pancreas. 
Anti-clathrin antibodies conjugated to gold 
spheres (black dots) have been used to 
locate clathrin molecules. The immature 
secretory vesicles, which contain insulin 
precursor protein (proinsulin), contain 
clathrin patches, which are no longer 
seen on the mature secretory vesicle. (B, 
courtesy of Lelio Orci.) 

as well as many secreted hydrolytic enzymes, are synthesized as inactive precursors. 
Proteolysis is necessary to liberate the active molecules from these precursor 
proteins. The cleavages occur in the secretory vesicles and sometimes in the extra- 
cellular fluid after secretion. Additionally, many of the precursor proteins have an 
N-terminal pro-peptide that is cleaved off to yield the mature protein. These pro- 
teins are synthesized as pre-pro-proteins, the pre-peptide consisting of the ER sig- 
nal peptide that is cleaved off earlier in the rough ER (see Figure 12-36). In other 
cases, peptide signaling molecules are made as polyproteins that contain multiple 
copies of the same amino acid sequence. In still more complex cases, a variety of 
peptide signaling molecules are synthesized as parts of a single polyprotein that 
acts as a precursor for multiple end products, which are individually cleaved from 
the initial polypeptide chain. The same polyprotein may be processed in various 
ways to produce different peptides in different cell types (Figure 13-66). 
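
The cleavage rule summarized in Figure 13-66 (initial cuts next to pairs of positively charged amino acids, followed by trimming) can be sketched in a few lines of Python. This is only a toy illustration. The function name `cleave_at_dibasic_sites` and the example sequence are hypothetical, and the sequence is not that of any real prohormone. The sketch also assumes that every dibasic pair is cut and that trimming removes the pair entirely, which ignores the cell-type-specific choice of cleavage sites described above.

```python
import re

# One-letter amino acid codes: K = lysine, R = arginine.
# The pattern matches any of the four dibasic pairs: KR, KK, RK, RR.
DIBASIC_PAIR = re.compile(r"[KR]{2}")


def cleave_at_dibasic_sites(polyprotein: str) -> list[str]:
    """Split a sequence at every dibasic pair and drop the pair itself.

    Dropping the pair stands in for the trimming reactions that follow
    the initial cleavage. Empty fragments are discarded.
    """
    fragments = DIBASIC_PAIR.split(polyprotein)
    return [fragment for fragment in fragments if fragment]


# Hypothetical toy precursor, not a real prohormone sequence.
toy_precursor = "AAAKRGGGRRCCC"
print(cleave_at_dibasic_sites(toy_precursor))  # ['AAA', 'GGG', 'CCC']
```

Restricting the pattern to a subset of the sites would be one simple way to mimic a cell type that produces only some of the processing enzymes.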

Why is proteolytic processing so common in the secretory pathway? Some 
of the peptides produced in this way, such as the enkephalins (five-amino-acid 
neuropeptides with morphine-like activity), are undoubtedly too short in their 
mature forms to be co-translationally transported into the ER lumen or to include 
the necessary signal for packaging into secretory vesicles. In addition, for secreted 
hydrolytic enzymes—or any other protein whose activity could be harmful inside 
the cell that makes it—delaying activation of the protein until it reaches a secre- 
tory vesicle, or until after it has been secreted, has a clear advantage: the delay 
prevents the protein from acting prematurely inside the cell in which it is synthe- 
sized. 


Secretory Vesicles Wait Near the Plasma Membrane Until Signaled 
to Release Their Contents 


Once loaded, a secretory vesicle has to reach the site of secretion, which in some 
cells is far away from the TGN. Nerve cells are the most extreme example. Secre- 
tory proteins, such as peptide neurotransmitters (neuropeptides), which will be 
released from nerve terminals at the end of the axon, are made and packaged into 
secretory vesicles in the cell body. They then travel along the axon to the nerve 
terminals, which can be a meter or more away. As discussed in Chapter 16, motor 
proteins propel the vesicles along axonal microtubules, whose uniform orienta- 
tion guides the vesicles in the proper direction. Microtubules also guide transport 
vesicles to the cell surface for constitutive exocytosis. 

Whereas transport vesicles containing materials for constitutive release fuse 
with the plasma membrane once they arrive there, secretory vesicles in the regu- 
lated pathway wait at the membrane until the cell receives a signal to secrete, and 
they then fuse. The signal can be an electrical nerve impulse (an action potential) 
or an extracellular signal molecule, such as a hormone: in either case, it leads to a 
transient increase in the concentration of free Ca2+ in the cytosol. 


For Rapid Exocytosis, Synaptic Vesicles Are Primed at the 
Presynaptic Plasma Membrane 

Nerve cells (and some endocrine cells) contain two types of secretory vesicles. As 
for all secretory cells, these cells package proteins and neuropeptides in dense-cored 
secretory vesicles in the standard way for release by the regulated secretory 
pathway. In addition, however, they use another specialized class of tiny (50 nm 

Figure 13-65 Exocytosis of secretory 
vesicles. The process is illustrated 
schematically (top) and in an electron 
micrograph that shows the release of 
insulin from a secretory vesicle of a 
pancreatic β cell. (Courtesy of Lelio 
Orci, from L. Orci, J.-D. Vassalli and 
A. Perrelet, Sci. Am. 259:85-94, 1988.) 


Figure 13-66 Alternative processing 
pathways for the prohormone 
polyprotein proopiomelanocortin. The 
initial cleavages are made by proteases 
that cut next to pairs of positively charged 
amino acids (Lys-Arg, Lys-Lys, Arg-Lys, 
or Arg-Arg pairs). Trimming reactions 
then produce the final secreted products. 
Different cell types produce different 
concentrations of individual processing 
enzymes, so that the same prohormone 
precursor is cleaved to produce different 
peptide hormones. In the anterior lobe 
of the pituitary gland, for example, only 
corticotropin (ACTH) and β-lipotropin 
are produced from proopiomelanocortin, 
whereas in the intermediate lobe of the 
pituitary gland mainly α-melanocyte 
stimulating hormone (α-MSH), γ-lipotropin, 
β-MSH, and β-endorphin are produced: 
α-MSH from ACTH and the other three 
from β-lipotropin, as shown. 

Figure 13-67 Exocytosis of synaptic vesicles. For orientation at a synapse, see Figure 11-36. (A) The trans-SNARE complex 
responsible for docking synaptic vesicles at the plasma membrane of nerve terminals consists of three proteins. The v-SNARE 
synaptobrevin and the t-SNARE syntaxin are both transmembrane proteins, and each contributes one α helix to the complex. 
By contrast to other SNAREs discussed earlier, the t-SNARE SNAP25 is a peripheral membrane protein that contributes two α 
helices to the four-helix bundle; the two helices are connected by a loop (dashed line) that lies parallel to the membrane and has 
fatty acyl chains (not shown) attached to anchor it there. The four α helices are shown as rods for simplicity. (B) At the synapse, 
the basic SNARE machinery is modulated by the Ca2+ sensor synaptotagmin and an additional protein called complexin. 
Synaptic vesicles first dock at the membrane (Step 1), and the SNARE bundle partially assembles (Step 2), resulting in a 
“primed vesicle” that is already drawn close to the membrane. The SNARE bundle assembles further but the additional binding 
of complexin prevents fusion (Step 3). Upon arrival of an action potential, Ca2+ enters the cell and binds to synaptotagmin, 
which releases the block and opens a fusion pore (Step 4). Further rearrangements complete the fusion reaction (Step 5) and 
release the fusion machinery, which now can be reused. This elaborate arrangement allows the fusion machinery to respond 
on the millisecond time scale essential for rapid and repetitive synaptic signaling. (A, adapted from R.B. Sutton et al., Nature 
395:347-353, 1998. With permission from Macmillan Publishers Ltd.; B, adapted from Z.P. Pang and T.C. Südhof, Curr. Opin. 
Cell Biol. 22:496-505, 2010. With permission from Elsevier.) 


diameter) secretory vesicles called synaptic vesicles. These vesicles store small 
neurotransmitter molecules, such as acetylcholine, glutamate, glycine, and γ-aminobutyric 
acid (GABA), which mediate rapid signaling from a nerve cell to its target 
cell at chemical synapses. When an action potential arrives at a nerve terminal, it 
causes an influx of Ca2+ through voltage-gated Ca2+ channels, which triggers the 
synaptic vesicles to fuse with the plasma membrane and release their contents 
to the extracellular space (see Figure 11-36). Some neurons fire more than 1000 
times per second, releasing neurotransmitters each time. 

The speed of transmitter release (taking only milliseconds) indicates that the 
proteins mediating the fusion reaction do not undergo complex, multistep rear- 
rangements. Rather, after vesicles have been docked at the presynaptic plasma 
membrane, they undergo a priming step, which prepares them for rapid fusion. 
In the primed state, the SNAREs are partly paired, but their helices are not fully 
wound into the final four-helix bundle required for fusion (Figure 13-67). Proteins 
called complexins freeze the SNARE complexes in this metastable state. The 
brake imposed by the complexins is released by another synaptic vesicle protein, 
synaptotagmin, which contains Ca2+-binding domains. A rise in cytosolic Ca2+ 
triggers binding of synaptotagmin to phospholipids and to the SNAREs, displacing 
the complexins. As the SNARE bundle zippers up completely, a fusion pore 
opens and the neurotransmitters are released. At a typical synapse, only a small 
number of the docked vesicles are primed and ready for exocytosis. The use of 
only a small number of vesicles at a time allows each synapse to fire over and 
over again in quick succession. With each firing, new synaptic vesicles dock and 
become primed to replace those that have fused and released their contents. 
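
The gating logic described here (dock, prime, clamp by complexin, release of the clamp by Ca2+ binding to synaptotagmin, then fusion) can be written down as a small state machine. The sketch below is a deliberately simplified illustration, not a kinetic model. The state names and the `next_state` function are hypothetical labels chosen to follow the steps of Figure 13-67, and each call is assumed to advance the vesicle by at most one step.

```python
from enum import Enum, auto


class VesicleState(Enum):
    DOCKED = auto()            # Step 1: vesicle docks at the membrane
    PRIMED = auto()            # Step 2: SNARE bundle partially assembled
    CLAMPED = auto()           # Step 3: complexin bound, fusion blocked
    FUSION_PORE_OPEN = auto()  # Step 4: Ca2+ has bound synaptotagmin
    FUSED = auto()             # Step 5: fusion complete


def next_state(state: VesicleState, calcium_high: bool) -> VesicleState:
    """Advance one step; only the clamped state depends on Ca2+."""
    if state is VesicleState.DOCKED:
        return VesicleState.PRIMED
    if state is VesicleState.PRIMED:
        return VesicleState.CLAMPED
    if state is VesicleState.CLAMPED:
        # The complexin brake holds until cytosolic Ca2+ rises.
        return VesicleState.FUSION_PORE_OPEN if calcium_high else state
    if state is VesicleState.FUSION_PORE_OPEN:
        return VesicleState.FUSED
    return state  # FUSED is terminal in this sketch


state = VesicleState.DOCKED
for calcium_high in (False, False, False, True, True):
    state = next_state(state, calcium_high)
print(state.name)  # FUSED
```

The point of the sketch is that all of the slow preparatory steps happen before the Ca2+ signal arrives, so that only the last transitions remain once the action potential reaches the terminal.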


Synaptic Vesicles Can Form Directly from Endocytic Vesicles 


For the nerve terminal to respond rapidly and repeatedly, synaptic vesicles need 
to be replenished very quickly after they discharge. Thus, most synaptic vesicles 
are generated not from the Golgi membrane in the nerve cell body but by local 
recycling from the presynaptic plasma membrane in the nerve terminals (Figure 
13-68). Similarly, newly made membrane components of the synaptic vesicles are 
initially delivered to the plasma membrane by the constitutive secretory pathway 
and then retrieved by endocytosis. But instead of fusing with endosomes, most of 
the endocytic vesicles immediately fill with neurotransmitter to become synaptic 
vesicles. 

The membrane components of a synaptic vesicle include transporters special- 
ized for the uptake of neurotransmitter from the cytosol, where the small-mole- 
cule neurotransmitters that mediate fast synaptic signaling are synthesized. Once 
filled with neurotransmitter, the synaptic vesicles can be used again (see Figure 
13-68). Because synaptic vesicles are abundant and relatively uniform in size, 
they can be purified in large numbers and, consequently, are the best-character- 
ized organelle of the cell, in that all of their membrane components have been 
identified by quantitative proteomic analyses (Figure 13-69). Extending this analysis 
to a complete presynaptic terminal allows us to model the crowded environment 
in which these reactions occur. 


Secretory Vesicle Membrane Components Are Quickly Removed 
from the Plasma Membrane 


When a secretory vesicle fuses with the plasma membrane, its contents are 
discharged from the cell by exocytosis, and its membrane becomes part of the 
plasma membrane. Although this should greatly increase the surface area of the 
plasma membrane, it does so only transiently, because membrane components 
are removed from the surface by endocytosis almost as fast as they are added by 
exocytosis, a process reminiscent of the endocytic-exocytic cycle discussed ear- 
lier. After their removal from the plasma membrane, the proteins of the secretory 











Figure 13-68 The formation of synaptic 
vesicles in a nerve cell. These tiny uniform 
vesicles are found only in nerve cells and in 
some endocrine cells, where they store and 
secrete small-molecule neurotransmitters. 
The import of neurotransmitter directly 
into the small endocytic vesicles that form 
from the plasma membrane is mediated 
by membrane transporters that function as 
antiports and are driven by an H+ gradient 
maintained by V-ATPase H+ pumps in the 
vesicle membrane (discussed in Chapter 11). 

Figure 13-69 Scale models of a brain presynaptic 
terminal and a synaptic vesicle. The illustrations show 
sections through a presynaptic terminal (A; enlarged in 
B) and a synaptic vesicle (C) in which proteins and lipids 
are drawn to scale based on their known stoichiometry 
and either known or approximated structures. The relative 
localization of protein molecules in different regions of the 
presynaptic terminal was inferred from super-resolution 
imaging and electron microscopy. The model in (A) 
contains 300,000 proteins of 60 different kinds that vary in 
abundance from 150 copies to 20,000 copies. In the model 
in (C), only 70% of the membrane proteins present in the 
membrane are shown; a complete model would therefore 
show a membrane that is even more crowded than this 
picture suggests ( ). Each synaptic vesicle 
membrane contains 7000 phospholipid molecules and 
5700 cholesterol molecules. Each also contains close to 
50 different integral membrane protein molecules, which 
vary widely in their relative abundance and together 
contribute about 600 transmembrane α helices. The 
transmembrane v-SNARE synaptobrevin is the most 
abundant protein in the vesicle (~70 copies per vesicle). 
By contrast, the V-ATPase, which uses ATP hydrolysis to 
pump H+ into the vesicle lumen, is present in 1-2 copies 
per vesicle. The H+ gradient provides the energy for 
neurotransmitter import by an H+/neurotransmitter antiport, 
which loads each vesicle with 1800 neurotransmitter 
molecules, such as glutamate, one of which is shown 
to scale. (A and B, from B.G. Wilhelm et al., Science 
344:1023-1028, 2014. With permission from AAAS; 
C, adapted from S. Takamori et al., Cell 127:831-846, 
2006. With permission from Elsevier.) 

vesicle membrane are either recycled or shuttled to lysosomes for degradation. 
The amount of secretory vesicle membrane that is temporarily added to the 
plasma membrane can be enormous: in a pancreatic acinar cell discharging 
digestive enzymes for delivery to the gut lumen, about 900 μm² of vesicle membrane 
is inserted into the apical plasma membrane (whose area is only 30 μm²) 
when the cell is stimulated to secrete. 
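
To make the scale of this example concrete, the two areas quoted for the acinar cell can be compared directly. The short calculation below uses only those two numbers. The "no retrieval" total is a hypothetical upper bound that assumes all of the vesicle membrane is present in the apical surface at once, which the compensating endocytosis described above prevents.

```python
vesicle_membrane_um2 = 900.0  # vesicle membrane inserted on stimulation (from the text)
apical_membrane_um2 = 30.0    # resting apical plasma membrane area (from the text)

# How many times the resting apical area is delivered by exocytosis.
fold_added = vesicle_membrane_um2 / apical_membrane_um2

# Hypothetical upper bound: apical area if no membrane were retrieved.
area_without_retrieval_um2 = apical_membrane_um2 + vesicle_membrane_um2

print(f"membrane added: {fold_added:.0f} times the resting apical area")
print(f"apical area without retrieval: {area_without_retrieval_um2:.0f} um^2")
```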

Control of membrane traffic thus has a major role in maintaining the composi- 
tion of the various membranes of the cell. To maintain each membrane-enclosed 
compartment in the secretory and endocytic pathways at a constant size, the bal- 
ance between the outward and inward flows of membrane needs to be precisely 
regulated. For cells to grow, however, the forward flow needs to be greater than 
the retrograde flow, so that the membrane can increase in area. For cells to main- 
tain a constant size, the forward and retrograde flows must be equal. We still know 
very little about the mechanisms that coordinate these flows. 
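
The bookkeeping behind this argument is simple enough to sketch, even though the mechanisms that coordinate the flows are not understood. In the toy model below, a compartment gains membrane by forward flow and loses it by retrograde flow at fixed rates per time step. The function name, the units, and all of the numbers are hypothetical and were chosen only to illustrate the two cases, constant size and growth.

```python
def membrane_area_over_time(initial_area, forward_flow, retrograde_flow, steps):
    """Return the membrane area after each step of a fixed-rate model.

    Areas are in arbitrary units; flows are in area units per step.
    """
    areas = [initial_area]
    for _ in range(steps):
        areas.append(areas[-1] + forward_flow - retrograde_flow)
    return areas


# Equal flows: the membrane stays the same size.
constant_size = membrane_area_over_time(100.0, forward_flow=5.0,
                                        retrograde_flow=5.0, steps=4)

# Forward flow exceeds retrograde flow: the membrane grows.
growing_cell = membrane_area_over_time(100.0, forward_flow=5.0,
                                       retrograde_flow=4.0, steps=4)

print(constant_size)  # [100.0, 100.0, 100.0, 100.0, 100.0]
print(growing_cell)   # [100.0, 101.0, 102.0, 103.0, 104.0]
```

Even a small, sustained mismatch between the two flows changes the area steadily, which is why the balance has to be regulated precisely.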


Some Regulated Exocytosis Events Serve to Enlarge the Plasma 
Membrane 


An important task of regulated exocytosis is to deliver more membrane to enlarge 
the surface area of a cell’s plasma membrane when such a need arises. A spectac- 
ular example is the plasma membrane expansion that occurs during the cellular- 
ization process in a fly embryo, which initially is a syncytium—a single cell con- 
taining about 6000 nuclei surrounded by a single plasma membrane (see Figure 
21-15). Within tens of minutes, the embryo is converted into the same number of 
cells. This process of cellularization requires a vast amount of new plasma mem- 
brane, which is added by a carefully orchestrated fusion of cytoplasmic vesicles, 
eventually forming the plasma membranes that enclose the separate cells. Similar 
vesicle fusion events are required to enlarge the plasma membrane when other 
animal cells or plant cells divide during cytokinesis (discussed in Chapter 17). 

Many animal cells, especially those subjected to mechanical stresses, fre- 
quently experience small ruptures in their plasma membrane. In a remarkable 
process thought to involve both homotypic vesicle-vesicle fusion and exocytosis, 
a temporary cell-surface patch is quickly fashioned from locally available inter- 
nal-membrane sources, such as lysosomes. In addition to providing an emergency 
barrier against leaks, the patch reduces membrane tension over the wounded 
area, allowing the bilayer to flow back together to restore continuity and seal the 
puncture. In this membrane repair process, the fusion and exocytosis of vesicles is 
triggered by the sudden increase of Ca2+, which is abundant in the extracellular 
space and rushes into the cell as soon as the plasma membrane is punctured. 
Figure 13-70 shows four examples in which regulated exocytosis leads to plasma 
membrane expansion. 


Polarized Cells Direct Proteins from the Trans Golgi Network to the 
Appropriate Domain of the Plasma Membrane 


Most cells in tissues are polarized, with two or more molecularly and functionally 
distinct plasma membrane domains. This raises the general problem of how the 





Figure 13-70 Four examples of regulated 
exocytosis leading to plasma membrane 
enlargement. The vesicles fusing with the 
plasma membrane during cytokinesis (A) 
and phagocytosis (B) are thought to be 
derived from endosomes, whereas those 
involved in wound repair (C) may be derived 
from lysosomes. The vast amount 
of new plasma membrane inserted during 
cellularization in a fly embryo occurs by the 
fusion of cytoplasmic vesicles (D). 

delivery of membrane from the Golgi apparatus is organized so as to maintain 
the differences between one cell-surface domain and another. A typical epithe- 
lial cell, for example, has an apical domain, which faces either an internal cavity 
or the outside world and often has specialized features such as cilia or a brush 
border of microvilli. It also has a basolateral domain, which covers the rest of the 
cell. The two domains are separated by a ring of tight junctions (see Figure 19-21), 
which prevent proteins and lipids (in the outer leaflet of the lipid bilayer) from dif- 
fusing between the two domains, so that the differences between the two domains 
are maintained. 

In principle, differences between plasma membrane domains need not 
depend on the targeted delivery of the appropriate membrane components. 
Instead, membrane components could be delivered to all regions of the cell sur- 
face indiscriminately but then be selectively stabilized in some locations and 
selectively eliminated in others. Although this strategy of random delivery fol- 
lowed by selective retention or removal seems to be used in certain cases, deliver- 
ies are often specifically directed to the appropriate membrane domain. Epithelial 
cells lining the gut, for example, secrete digestive enzymes and mucus at their api- 
cal surface and components of the basal lamina at their basolateral surface. Such 
cells must have ways of directing vesicles carrying different cargoes to different 
plasma membrane domains. Proteins from the ER destined for different domains 
travel together until they reach the TGN, where they are separated and dispatched 
in secretory or transport vesicles to the appropriate plasma membrane domain 
(Figure 13-71). 

The apical plasma membrane of most epithelial cells is greatly enriched in 
glycosphingolipids, which help protect this exposed surface from damage—for 
example, from the digestive enzymes and low pH in sites such as the gut or stom- 
ach, respectively. Similarly, plasma membrane proteins that are linked to the lipid 
bilayer by a GPI anchor (see Figure 12-52) are found predominantly in the api- 
cal plasma membrane. If recombinant DNA techniques are used to attach a GPI 
anchor to a protein that would normally be delivered to the basolateral surface, 
the protein is usually delivered to the apical surface instead. GPI-anchored pro- 
teins are thought to be directed to the apical membrane because they associate 
with glycosphingolipids in lipid rafts that form in the membrane of the TGN. As 
discussed in Chapter 10, lipid rafts form in the TGN and plasma membrane when 
glycosphingolipids and cholesterol molecules self-associate (see Figure 10-13). 

Figure 13-71 Two ways of sorting 
plasma membrane proteins in a 
polarized epithelial cell. (A) In the direct 
pathway, proteins destined for different 
plasma membrane domains are sorted and 
packaged into different transport vesicles. 
The lipid-raft-dependent delivery system 
to the apical domain described in the 
text is an example of the direct pathway. 
(B) In the indirect pathway, a protein is 
retrieved from the inappropriate plasma 
membrane domain by endocytosis and 
then transported to the correct domain via 
early endosomes (that is, by transcytosis). 
The indirect pathway, for example, is used 
in liver hepatocytes to deliver proteins to 
the apical domain that lines bile ducts. 

Having selected a unique set of cargo molecules, the rafts then bud from the TGN 
into transport vesicles destined for the apical plasma membrane. This process is 
similar to the selective partitioning of some membrane proteins into the special- 
ized lipid domains in caveolae at the plasma membrane discussed earlier. 
Membrane proteins destined for delivery to the basolateral membrane con- 
tain sorting signals in their cytosolic tail. When present in an appropriate struc- 
tural context, these signals are recognized by coat proteins that package them into 
appropriate transport vesicles in the TGN. The same basolateral signals that are 
recognized in the TGN also function in early endosomes to redirect the proteins 
back to the basolateral plasma membrane after they have been endocytosed. 


Summary 


Cells can secrete molecules by exocytosis in either a constitutive or a regulated fash- 
ion. Whereas the regulated pathways operate only in specialized secretory cells, 
a constitutive secretory pathway operates in all eukaryotic cells, characterized by 
continual vesicle transport from the TGN to the plasma membrane. In the regulated 
pathways, the molecules are stored either in secretory vesicles or in synaptic vesicles, 
which do not fuse with the plasma membrane to release their contents until they 
receive an appropriate signal. Secretory vesicles containing proteins for secretion 
bud from the TGN. The secretory proteins become concentrated during the forma- 
tion and maturation of the secretory vesicles. Synaptic vesicles, which are confined 
to nerve cells and some endocrine cells, form from both endocytic vesicles and from 
endosomes, and they mediate the regulated secretion of small-molecule neurotrans- 
mitters at the axon terminals of nerve cells. 

Proteins are delivered from the TGN to the plasma membrane by the constitu- 
tive pathway unless they are diverted into other pathways or retained in the Golgi 
apparatus. In polarized cells, the transport pathways from the TGN to the plasma 
membrane operate selectively to ensure that different sets of membrane proteins, 
secreted proteins, and lipids are delivered to the different domains of the plasma 
membrane. 


WHAT WE DON’T KNOW 


• How are targeting and fusion 
proteins such as SNAREs regulated, 
so that they can be returned to their 
respective donor compartments in an 
inactive state? 

• How does a cell balance exocytic 
and endocytic events to keep its 
plasma membrane a constant size? 

• Can newly formed daughter cells 
generate a Golgi apparatus de novo, 
or do they have to inherit it? 

• How do lysosomes avoid digesting 
their own membranes? 

• How does a cell maintain the 
right amount of every component 
(organelles, molecules), and how does 
it change these amounts as needed 
(for example, to greatly expand the 
endoplasmic reticulum when the cell 
needs to produce large amounts of 
secreted proteins)? 


PROBLEMS 


Which statements are true? Explain why or why not. 


13-1 In all events involving fusion of a vesicle to a target 
membrane, the cytosolic leaflets of the vesicle and target 
bilayers always fuse together, as do the leaflets that are not 
in contact with the cytosol. 


13-2 There is one strict requirement for the exit of a protein 
from the ER: it must be correctly folded. 


13-3 All the glycoproteins and glycolipids in intracel- 
lular membranes have oligosaccharide chains facing the 
lumenal side, and all those in the plasma membrane have 
oligosaccharide chains facing the outside of the cell. 


Discuss the following problems. 


13-4 In a nondividing cell such as a liver cell, why must 
the flow of membrane between compartments be bal- 
anced, so that the retrieval pathways match the outward 
flow? Would you expect the same balanced flow in a gut 
epithelial cell, which is actively dividing? 


13-5 Enveloped viruses, which have a membrane coat, 
gain access to the cytosol by fusing with a cell membrane. 
Why do you suppose that these viruses encode their own 
special fusion protein, rather than making use of a cell's 
SNAREs? 


13-6 For fusion of a vesicle with its target membrane to 
occur, the membranes have to be brought to within 1.5 nm 
so that the two bilayers can join (Figure Q13-1). Assum- 
ing that the relevant portions of the two membranes at the 
fusion site are circular regions 1.5 nm in diameter, calcu- 
late the number of water molecules that would remain 
between the membranes. (Water is 55.5 M and the volume 
of a cylinder is πr²h.) Given that an average phospholipid 

Figure Q13-1 Close approach of a vesicle and its target membrane 
in preparation for fusion (Problem 13-6). 

occupies a membrane surface area of 0.2 nm², how many 
phospholipids would be present in each of the opposing 
monolayers at the fusion site? Are there sufficient water 
molecules to bind to the hydrophilic head groups of this 
number of phospholipids? (It is estimated that 10-12 water 
molecules are normally associated with each phospho- 
lipid head group at the exposed surface of a membrane.) 
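
One way to set up the arithmetic for Problem 13-6 is sketched below. It takes the quantities given in the problem (a circular region 1.5 nm in diameter, a 1.5 nm gap, 55.5 M water, 0.2 nm² per phospholipid, 10-12 water molecules per head group) and converts molarity to molecules per cubic nanometer. The variable names are illustrative, and the sketch assumes that the gap is a simple water-filled cylinder. Interpreting the printed numbers is left to the reader.

```python
import math

AVOGADRO = 6.022e23    # molecules per mole
NM3_PER_LITER = 1e24   # 1 L = (10^8 nm)^3 = 10^24 nm^3

diameter_nm = 1.5      # diameter of the circular fusion site (from the problem)
gap_nm = 1.5           # separation between the membranes (from the problem)
water_molar = 55.5     # concentration of water in mol/L (from the problem)
lipid_area_nm2 = 0.2   # membrane area per phospholipid (from the problem)

radius_nm = diameter_nm / 2
disc_area_nm2 = math.pi * radius_nm ** 2
gap_volume_nm3 = disc_area_nm2 * gap_nm  # volume of a cylinder, pi * r^2 * h

# Water molecules in the gap between the two membranes.
waters_per_nm3 = water_molar * AVOGADRO / NM3_PER_LITER
waters_in_gap = gap_volume_nm3 * waters_per_nm3

# Phospholipids in each opposing monolayer at the fusion site.
lipids_per_monolayer = disc_area_nm2 / lipid_area_nm2

# Water needed to hydrate the head groups of both monolayers,
# at 10 to 12 water molecules per head group.
waters_needed_low = 2 * lipids_per_monolayer * 10
waters_needed_high = 2 * lipids_per_monolayer * 12

print(f"water molecules in the gap: {waters_in_gap:.1f}")
print(f"phospholipids per monolayer: {lipids_per_monolayer:.1f}")
print(f"water needed for both monolayers: "
      f"{waters_needed_low:.0f} to {waters_needed_high:.0f}")
```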


13-7 SNAREs exist as complementary partners that 
carry out membrane fusions between appropriate ves- 
icles and their target membranes. In this way, a vesicle 
with a particular variety of v-SNARE will fuse only with a 
membrane that carries the complementary t-SNARE. In 
some instances, however, fusions of identical membranes 
(homotypic fusions) are known to occur. For example, 
when a yeast cell forms a bud, vesicles derived from the 
mother cell’s vacuole move into the bud where they fuse 
with one another to form a new vacuole. These vesicles 
carry both v-SNAREs and t-SNAREs. Are both types of 
SNAREs essential for this homotypic fusion event? 

To test this point, you have developed an ingenious 
assay for fusion of vacuolar vesicles. You prepare vesicles 
from two different mutant strains of yeast: strain B has a 
defective gene for vacuolar alkaline phosphatase (Pase); 
strain A is defective for the protease that converts the pre- 
cursor of alkaline phosphatase (pro-Pase) into its active 
form (Pase) (Figure Q13-2A). Neither strain has active 
alkaline phosphatase, but when extracts of the strains are 
mixed, vesicle fusion generates active alkaline phospha- 
tase, which can be easily measured (Figure Q13-2). 

Now you delete the genes for the vacuolar 
v-SNARE, t-SNARE, or both in each of the two yeast strains. 
You prepare vacuolar vesicles from each and test them for 
their ability to fuse, as measured by the alkaline phospha- 
tase assay (Figure Q13-2B). 

What do these data say about the requirements for 
v-SNAREs and t-SNAREs in the fusion of vacuolar vesicles? 
Does it matter which kind of SNARE is on which vesicle? 


13-8 If you were to remove the ER retrieval signal from 
protein disulfide isomerase (PDI), which is normally a sol- 
uble resident of the ER lumen, where would you expect the 
modified PDI to be located? 


13-9 The KDEL receptor must shuttle back and forth 
between the ER and the Golgi apparatus to accomplish 
its task of ensuring that soluble ER proteins are retained 
in the ER lumen. In which compartment does the KDEL 
receptor bind its ligands more tightly? In which compart- 
ment does it bind its ligands more weakly? What is thought 
to be the basis for its different binding affinities in the two 
compartments? If you were designing the system, in which 
compartment would you have the highest concentration of 
KDEL receptor? Would you predict that the KDEL receptor, 
which is a transmembrane protein, would itself possess an 
ER retrieval signal? 


13-10 How does the low pH of lysosomes protect the rest 
of the cell from lysosomal enzymes in case the lysosome 
breaks? 

Figure Q13-2 SNARE requirements for vesicle fusion (Problem 13-7). 
(A) Scheme for measuring the fusion of vacuolar vesicles. (B) Results 
of fusions of vesicles with different combinations of v-SNAREs and 
t-SNAREs. The SNAREs present on the vesicles of the two strains 
are indicated as v (v-SNARE) and t (t-SNARE). 


13-11 Melanosomes are specialized lysosomes that store 
pigments for eventual release by exocytosis. Various cells 
such as skin and hair cells then take up the pigment, which 
accounts for their characteristic pigmentation. Mouse 
mutants that have defective melanosomes often have pale 
or unusual coat colors. One such light-colored mouse, the 
Mocha mouse (Figure Q13-3), has a defect in the gene for 
one of the subunits of the adaptor protein complex AP3, 
which is associated with coated vesicles budding from the 
trans Golgi network. How might the loss of AP3 cause a 
defect in melanosomes? 


Figure Q13-3 A normal 
mouse and the Mocha 
mouse (Problem 13-11). 
In addition to its light 
coat color, the Mocha 
mouse has a poor sense 
of balance. (Courtesy of 
Margit Burmeister.) 

Energy Conversion: Mitochondria and Chloroplasts


To maintain their high degree of organization in a universe that is constantly drift- 
ing toward chaos, cells have a constant need for a plentiful supply of ATP, as we 
have explained in Chapter 2. In eukaryotic cells, most of the ATP that powers life 
processes is produced by specialized, membrane-enclosed, energy-converting 
organelles. These are of two types. Mitochondria, which occur in virtually all cells 
of animals, plants, and fungi, burn food molecules to produce ATP by oxidative 
phosphorylation. Chloroplasts, which occur only in plants and green algae, har- 
ness solar energy to produce ATP by photosynthesis. In electron micrographs, the 
most striking features of both mitochondria and chloroplasts are their extensive 
internal membrane systems. These internal membranes contain sets of mem- 
brane protein complexes that work together to produce most of the cell’s ATP. In 
bacteria, simpler versions of essentially the same protein complexes produce ATP, 
but they are located in the cell’s plasma membrane (Figure 14-1). 

Comparisons of DNA sequences indicate that the energy-converting organ- 
elles in present-day eukaryotes originated from prokaryotic cells that were endo- 
cytosed during the evolution of eukaryotes (discussed in Chapter 1). This explains 
why mitochondria and chloroplasts contain their own DNA, which still encodes 
a subset of their proteins. Over time, these organelles have lost most of their own 
genomes and become heavily dependent on proteins that are encoded by genes 
in the nucleus, synthesized in the cytosol, and then imported into the organelle. 
And the eukaryotic cells now rely on these organelles not only for the ATP they 
need for biosynthesis, solute transport, and movement, but also for many import- 
ant biosynthetic reactions that occur inside each organelle. 

The common evolutionary origin of the energy-converting machinery in mito- 
chondria, chloroplasts, and prokaryotes (archaea and bacteria) is reflected in the 
fundamental mechanism that they share for harnessing energy. This is known 
as chemiosmotic coupling, signifying a link between the chemical bond-form- 
ing reactions that generate ATP (“chemi”) and membrane transport processes 
(“osmotic”). The chemiosmotic process occurs in two linked stages, both of which
are performed by protein complexes in a membrane.

Figure 14-1 The membrane systems of bacteria, mitochondria, and chloroplasts
are related. Mitochondria and chloroplasts are cell organelles that have
originated from bacteria and have retained the bacterial energy-conversion
mechanisms. Like their bacterial ancestors, mitochondria and chloroplasts have
an outer and an inner membrane. Each of the membranes colored in this diagram
contains energy-harvesting electron-transport chains. The deep invaginations of
the mitochondrial inner membrane and the internal membrane system of the
chloroplast harbor the machinery for cellular respiration and photosynthesis,
respectively.


Stage 1: High-energy electrons (derived from the oxidation of food mole- 
cules, from pigments excited by sunlight, or from other sources described 
later) are transferred along a series of electron-transport protein com- 
plexes that form an electron-transport chain embedded in a membrane. 
Each electron transfer releases a small amount of energy that is used to 
pump protons (H+) and thereby generate a large electrochemical gradient 
across the membrane (Figure 14-2). As discussed in Chapter 11, such an 
electrochemical gradient provides a way of storing energy, and it can be 
harnessed to do useful work when ions flow back across the membrane. 


Stage 2: The protons flow back down their electrochemical gradient 
through an elaborate membrane protein machine called ATP synthase, 
which catalyzes the production of ATP from ADP and inorganic phosphate 
(Pi). This ubiquitous enzyme works like a turbine in the membrane, driven 
by protons, to synthesize ATP (Figure 14-3). In this way, the energy derived 
from food or sunlight in stage 1 is converted into the chemical energy of a 
phosphate bond in ATP. 
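
The link between the two stages can be made concrete with a small calculation. The sketch below is illustrative only: it uses the standard relation for the proton-motive force (the electrical potential across the membrane plus a pH term scaled by 2.303RT/F), and the membrane potential and pH difference passed in are assumed example values, not figures taken from this chapter. The function name proton_motive_force_mv is hypothetical.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def proton_motive_force_mv(membrane_potential_mv, delta_ph, temp_c=37.0):
    """Return the proton-motive force in millivolts.

    membrane_potential_mv: magnitude of the electrical potential across the
        membrane (assumed example input).
    delta_ph: pH difference across the membrane, alkaline side minus acidic
        side (assumed example input).
    """
    temp_k = temp_c + 273.15
    # 2.303*R*T/F gives the millivolts contributed per pH unit.
    mv_per_ph_unit = 1000.0 * math.log(10) * R * temp_k / F
    return membrane_potential_mv + mv_per_ph_unit * delta_ph


# Assumed, illustrative values: 150 mV potential, 0.5-unit pH difference.
pmf = proton_motive_force_mv(150.0, 0.5)
print(f"proton-motive force = {pmf:.0f} mV")
```

Both terms add because protons are driven in the same direction by the voltage and by the concentration difference; setting delta_ph to zero leaves only the electrical component.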


Electrons move through protein complexes in biological systems via tightly 
bound metal ions or other carriers that take up and release electrons easily, or by 
special small molecules that pick electrons up at one location and deliver them to 
another. For mitochondria, the first of these electron carriers is NAD+, a water-soluble
small molecule that takes up two electrons and one H+ derived from food
molecules (fats and carbohydrates) to become NADH. NADH transfers these elec- 
trons from the sites where the food molecules are degraded to the inner mito- 
chondrial membrane. There, the electrons from the energy-rich NADH are passed 
from one membrane protein complex to the next, passing to a lower-energy com- 
pound at each step, until they reach a final complex in which they combine with 
molecular oxygen (O2) to produce water. The energy released at each step as the 
electrons flow down this path from the energy-rich NADH to the low-energy water 
molecule drives H+ pumps in the inner mitochondrial membrane, utilizing three 
different membrane protein complexes. Together, these complexes generate the 
proton-motive force harnessed by ATP synthase to produce the ATP that serves as 
the universal energy currency throughout the cell (see Chapter 2). 

Figure 14-4 compares the electron-transport processes in mitochondria, 
which harness energy from food molecules, with those in chloroplasts, which har- 
ness energy from sunlight. The energy-conversion systems of mitochondria and 
chloroplasts can be described in similar terms, and we shall see later in the chap- 
ter that two of their key components are closely related. One of these is the ATP 
synthase, and the other is a proton pump (colored green in Figure 14-4). 

Among the crucial constituents that are unique to photosynthetic organisms 
are the two photosystems. These use the green pigment chlorophyll to capture 


Figure 14-2 Stage 1 of chemiosmotic 
coupling. Energy from sunlight or the 
oxidation of food compounds is captured 
to generate an electrochemical proton 
gradient across a membrane. The 
electrochemical gradient serves as a 
versatile energy store that drives energy- 
requiring reactions in mitochondria, 
chloroplasts, and bacteria. 

Figure 14-3 Stage 2 of chemiosmotic
coupling. An ATP synthase (yellow) 
embedded in the lipid bilayer of a 
membrane harnesses the electrochemical 
proton gradient across the membrane, 
using it as a local energy store to drive 
ATP synthesis. The red arrows show the 
direction of proton movement through the 
ATP synthase. 

Figure 14-4 Electron-transport processes. (A) The mitochondrion converts energy from chemical fuels. (B) The chloroplast
converts energy from sunlight. In both cases, electron flow is indicated by blue arrows. Each of the protein complexes (green)
is embedded in a membrane. In the mitochondrion, fats and carbohydrates from food molecules are fed into the citric acid
cycle and provide electrons to generate the energy-rich compound NADH from NAD+. These electrons then flow down an
energy gradient as they pass from one complex to the next in the electron-transport chain, until they combine with molecular
O2 in the last complex to produce water. The energy released at each stage is harnessed to pump H+ across the membrane.
In the chloroplast, by contrast, electrons are extracted from water through the action of light in the photosystem II complex and
molecular O2 is released. The electrons pass on to the next complex in the chain, which uses some of their energy to pump
protons across the membrane, before passing to photosystem I, where sunlight generates high-energy electrons that combine
with NADP+ to produce NADPH. NADPH then enters the carbon-fixation cycle along with CO2 to generate carbohydrates.


light energy and power the transfer of electrons, not unlike a photocell in a solar 
panel. The chloroplasts drive electron transfer in the direction opposite to that in 
mitochondria: electrons are taken from water to produce O2, and these electrons
are used (via NADPH, a molecule closely related to the NADH used in mitochondria)
to synthesize carbohydrates from CO2 and water. These carbohydrates then
serve as the source for all other compounds a plant cell needs. 

Thus, both mitochondria and chloroplasts make use of an electron-transfer 
chain to produce an H+ gradient that powers reactions that are critical for the cell.
However, chloroplasts generate O2 and take up CO2, whereas mitochondria consume
O2 and release CO2 (see Figure 14-4).


THE MITOCHONDRION 


Mitochondria occupy up to 20% of the cytoplasmic volume of a eukaryotic cell. 
Although they are often depicted as short, bacterium-like bodies with a diameter
of 0.5–1 μm, they are in fact remarkably dynamic and plastic, moving about
the cell, constantly changing shape, dividing, and fusing (Movie 14.1). Mito- 
chondria are often associated with the microtubular cytoskeleton (Figure 14-5), 
which determines their orientation and distribution in different cell types. Thus, 
in highly polarized cells such as neurons, mitochondria can move long distances 
(up to a meter or more in the extended axons of neurons), being propelled along 
the tracks of the microtubular cytoskeleton. In other cells, mitochondria remain 
fixed at points of high energy demand; for example, in skeletal or cardiac muscle 
cells, they pack between myofibrils, and in sperm cells they wrap tightly around 
the flagellum (Figure 14-6). 

Mitochondria also interact with other membrane systems in the cell, most 
notably the endoplasmic reticulum (ER). Contacts between mitochondria and ER 
define specialized domains thought to facilitate the exchange of lipids between 
the two membrane systems. These contacts also appear to induce mitochondrial 
fission, which, as we discuss later, is involved in the distribution and partitioning
of mitochondria within cells (Figure 14-7). 

The acquisition of mitochondria was a prerequisite for the evolution of com- 
plex animals. Without mitochondria, present-day animal cells would have had to 
generate all of their ATP through anaerobic glycolysis. When glycolysis converts 
glucose to pyruvate, it releases only a small fraction of the total free energy that 
is potentially available from glucose oxidation (see Chapter 2). In mitochondria, 
the metabolism of sugars is complete: pyruvate is imported into the mitochondrion
and ultimately oxidized by O2 to CO2 and H2O, which allows 15 times more
ATP to be made from a sugar than by glycolysis alone. As explained later, this 
became possible only when enough molecular oxygen accumulated in the Earth’s 
atmosphere to allow organisms to take full advantage, via respiration, of the large 
amounts of energy potentially available from the oxidation of organic compounds. 
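
The 15-fold figure can be checked with simple arithmetic. The sketch below assumes commonly quoted per-glucose yields that are not stated in this section: a net of 2 ATP from glycolysis alone and roughly 30 ATP when pyruvate is fully oxidized in the mitochondrion. The constant and function names are hypothetical, chosen only for illustration.

```python
# Assumed yields per glucose molecule (not given in this section):
# a net of 2 ATP from glycolysis alone, and roughly 30 ATP when
# pyruvate is completely oxidized in the mitochondrion.
ATP_GLYCOLYSIS_ONLY = 2
ATP_COMPLETE_OXIDATION = 30


def atp_yield_ratio(complete=ATP_COMPLETE_OXIDATION,
                    glycolysis=ATP_GLYCOLYSIS_ONLY):
    """Return how many times more ATP complete oxidation yields."""
    return complete / glycolysis


print(atp_yield_ratio())
```

The exact yield of complete oxidation depends on how the NADH made in the cytosol is shuttled into the mitochondrion, so the ratio should be read as an approximate value.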

Mitochondria are large enough to be seen in the light microscope, and they 
were first identified in the nineteenth century. Real progress in understanding 
their internal structure and function, however, depended on biochemical pro- 
cedures developed in 1948 for isolating intact mitochondria, and on electron 
microscopy, which was first used to look at cells at about the same time. 

Figure 14-6 Localization of mitochondria near sites of high ATP demand in cardiac muscle
and a sperm tail. (A) Cardiac muscle in the wall of the heart is the most heavily used muscle in the
body, and its continual contractions require a reliable energy supply. It has limited built-in energy
stores and has to depend on a steady supply of ATP from the copious mitochondria aligned close
to the contractile myofibrils (see Figure 16-32). (B) During sperm development, microtubules wind
helically around the flagellar axoneme, where they are thought to help localize the mitochondria in
the tail to produce the structure shown.

Figure 14-5 The relationship between mitochondria and microtubules. (A) A
light micrograph of chains of elongated mitochondria in a living mammalian
cell in culture. The cell was stained with a fluorescent dye (rhodamine 123)
that specifically labels mitochondria in living cells. (B) An immunofluorescence
micrograph of the same cell stained (after fixation) with fluorescent antibodies
that bind to microtubules. Note that the mitochondria tend to be aligned along
microtubules. (Courtesy of Lan Bo Chen.)

Figure 14-7 Interaction of mitochondria with the endoplasmic reticulum.
(A) Fluorescence light microscopy shows that tubules of the ER (green) wrap
around parts of the mitochondrial network (red) in mammalian cells. The
mitochondria then divide at the contact sites. After contact is established,
fission occurs within less than a minute, as indicated by time-lapse
microscopy. (B) Schematic drawing of an ER tubule wrapped around part of
the mitochondrial reticulum. It is thought that ER-mitochondrial contacts also
mediate the exchange of lipids between the two membrane systems.
(A, adapted from J.R. Friedman et al., Science 334:358-362, 2011.)


The Mitochondrion Has an Outer Membrane and an Inner
Membrane


Like the bacteria from which they originated, mitochondria have an outer and 
an inner membrane. The two membranes have distinct functions and properties, 
and delineate separate compartments within the organelle. The inner membrane, 
which surrounds the internal mitochondrial matrix compartment (Figure 14-8), 
is highly folded to form invaginations known as cristae (the singular is crista), 
which contain in their membranes the proteins of the electron-transport chain. 
Where the inner membrane runs parallel to the outer membrane, between the 
cristae, it is known as the inner boundary membrane. The narrow (20-30 nm) gap 
between the inner boundary membrane and the outer membrane is known as 
the intermembrane space. The cristae are about 20 nm-wide membrane discs 
or tubules that protrude deeply into the matrix and enclose the crista space. The 
crista membrane is continuous with the inner boundary membrane, and where 
their membranes join, the membrane forms narrow membrane tubes or slits, 
known as crista junctions. 

Like the bacterial outer membrane, the outer mitochondrial membrane is 
freely permeable to ions and to small molecules as large as 5000 daltons. This 

Figure 14-8 Structure of a mitochondrion. (A) Tomographic slice through a three-dimensional map of a mouse heart mitochondrion determined by
electron-microscope tomography. The outer membrane envelops the inner boundary membrane. The inner membrane is highly folded into tubular
or lamellar cristae, which crisscross the matrix. The dense matrix, which contains most of the mitochondrial protein, appears dark in the electron
microscope, whereas the intermembrane space and the crista space appear light due to their lower protein content. The inner boundary membrane
follows the outer membrane closely at a distance of ≈20 nm. The inner membrane turns sharply at the cristae junctions, where the cristae join the
inner boundary membrane. (B) Tomographic surface-rendered portion of a yeast mitochondrion, showing how flattened cristae project into the matrix
from the inner membrane (Movie 14.2). (C) Schematic drawing of a mitochondrion showing the outer membrane (gray), and the inner membrane
(yellow). Note that the inner membrane is compartmentalized into the inner boundary membrane and the crista membrane. There are three distinct
spaces: the intermembrane space, the crista space, and the matrix. (A, courtesy of Tobias Brandt; B, from K. Davies et al., Proc. Natl Acad. Sci.
USA 109:13602-13607, 2012. With permission from the National Academy of Sciences.)

Figure 14-9 Biochemical fractionation of purified mitochondria into
separate components. Large numbers of mitochondria are isolated from
homogenized tissue by centrifugation and then suspended in a medium
of low osmotic strength. In such a medium, water flows into mitochondria
and greatly expands the matrix space (yellow). While the cristae of the inner
membrane unfold to accommodate the swelling, the outer membrane,
which has no folds, breaks, releasing structures composed of the inner
membrane surrounding the matrix. These techniques have made it possible
to study the protein composition of the inner membrane (comprising a
mixture of cristae, boundary membranes, and cristae junctions), the outer
membrane, and the matrix.
(Figure labels: INTACT MITOCHONDRION; matrix; the influx of water causes the
mitochondrion to swell and the outer membrane to rupture, releasing the
contents of the intermembrane space; the inner membrane remains intact.)

is because it contains many porin molecules, a special class of β-barrel-type
membrane protein that creates aqueous pores across the membrane (see Figure 
10-23). As a consequence, the intermembrane space between the outer and inner 
membrane has the same pH and ionic composition as the cytoplasm, and there is 
no electrochemical gradient across the outer membrane. 

If purified mitochondria are gently disrupted and then fractionated (Figure
14-9), the biochemical composition of membranes and mitochondrial compartments
can be determined.


The Inner Membrane Cristae Contain the Machinery for Electron
Transport and ATP Synthesis


Unlike the outer mitochondrial membrane, the inner mitochondrial membrane
is a diffusion barrier to ions and small molecules, just like the bacterial inner
membrane. However, selected ions, most notably protons and phosphate, as well
as essential metabolites such as ATP and ADP, can pass through it by means of
special transport proteins.

The inner mitochondrial membrane is highly differentiated into functionally
distinct regions with different protein compositions. As discussed in Chapter 
10, the lateral segregation of membrane regions with different protein and lipid 
compositions is a key feature of cells. In the inner mitochondrial membrane, 
the boundary membrane region is thought to contain the machinery for protein 
import, new membrane insertion, and assembly of the respiratory-chain com- 
plexes. The membranes of the cristae, which are continuous with the boundary 
membrane, contain the ATP synthase enzyme that produces most of the cell’s 
ATP; they also contain the large protein complexes of the respiratory chain—the 
name given to the mitochondrion’s electron-transport chain. 

At the cristae junctions, where the membranes of the cristae join the boundary 
membrane, special protein complexes provide a diffusion barrier that segregates 
the membrane proteins in the two regions of the inner membrane; these complexes
are also thought to anchor the cristae to the outer membrane, thus maintaining
the highly folded topology of the inner membrane. Cristae membranes
have one of the highest protein densities of all biological membranes, with a lipid
content of 25% and a protein content of 75% by weight. The folding of the inner
membrane into cristae greatly increases the membrane area available for oxidative
phosphorylation. The area of cristae membranes can be up to 20 times larger than
the area of the cell’s plasma membrane. In total, the surface area of cristae
membranes in each human body adds up to roughly the size of a football field.


The Citric Acid Cycle in the Matrix Produces NADH 


Together with the cristae that project into it, the matrix is the major working part 
of the mitochondrion. Mitochondria can use both pyruvate and fatty acids as fuel. 
Pyruvate is derived from glucose and other sugars, whereas fatty acids are derived 
from fats. Both of these fuel molecules are transported across the inner mitochon- 
drial membrane by specialized transport proteins, and they are then converted 
to the crucial metabolic intermediate acetyl CoA by enzymes located in the mito- 
chondrial matrix (see Chapter 2). 

The acetyl groups in acetyl CoA are oxidized in the matrix via the citric acid
cycle, also called the Krebs cycle (see Figure 2-57 and Movie 2.6). The oxidation of
these carbon atoms in acetyl CoA produces CO2, which diffuses out of the mitochondrion
to be released to the environment as a waste product. More importantly,
the citric acid cycle saves a great deal of the bond energy released by this
oxidation in the form of electrons carried by NADH. This NADH transfers its elec- 
trons from the matrix to the electron-transport chain in the inner mitochondrial 
membrane, where—through the chemiosmotic coupling process described pre- 
viously (see Figures 14-2 and 14-3)—the energy that was carried by NADH elec- 
trons is converted into phosphate-bond energy in ATP. Figure 14-10 outlines this 
sequence of reactions schematically. 

The matrix contains the genetic system of the mitochondrion, including the 
mitochondrial DNA and the ribosomes. The mitochondrial DNA (see section on 
genetic systems, p. 800) is organized into compact bodies—the nucleoids—by 
special scaffolding proteins that also function as transcription regulatory proteins. 
The large number of enzymes required for the maintenance of the mitochondrial 
genetic system, as well as for many other essential reactions to be outlined next, 
accounts for the very high protein concentration in the matrix; at more than 500 
mg/mL, this concentration is close to that in a protein crystal. 


Mitochondria Have Many Essential Roles in Cellular Metabolism 


Mitochondria not only generate most of the cell’s ATP; they also provide many 
other essential resources for biosynthesis and cell growth. Before describing in 
detail the remarkable machinery of the respiratory chain, we diverge briefly to 
touch on some of these important roles. 

Figure 14-10 A summary of the energy-converting metabolism in mitochondria.
Pyruvate and fatty acids enter the mitochondrion (top of the figure) and are
broken down to acetyl CoA. The acetyl CoA is metabolized by the citric acid cycle,
which reduces NAD+ to NADH, which then passes its high-energy electrons to the first
complex in the electron-transport chain. In the process of oxidative phosphorylation,
these electrons pass along the electron-transport chain in the inner membrane
cristae to oxygen (O2). This electron transport generates a proton gradient,
which drives the production of ATP by the ATP synthase (see Figure 14-3). Electrons
from the oxidation of succinate, a reaction intermediate in the citric acid cycle (see
Panel 2-9, pp. 106-107), take a separate path to enter this electron-transport chain
(not shown, see p. 772).

The membranes that comprise the mitochondrial inner membrane (the
inner boundary membrane and the crista membrane) contain different mixtures of
proteins and they are therefore shaded differently in this diagram.

Mitochondria are critical for buffering the redox potential in the cytosol. Cells
need a constant supply of the electron acceptor NAD+ for the central reaction in
glycolysis that converts glyceraldehyde 3-phosphate to 1,3-bisphosphoglycerate
(see Figure 2-48). This NAD+ is converted to NADH in the process, and the NAD+
needs to be regenerated by transferring the high-energy NADH electrons some- 
where. The NADH electrons will eventually be used to help drive oxidative phos- 
phorylation inside the mitochondrion. But the inner mitochondrial membrane 
is impermeable to NADH. The electrons are therefore passed from the NADH to 
smaller molecules in the cytosol that can move through the inner mitochondrial 
membrane. Once in the matrix, these smaller molecules transfer their electrons to 
NAD+ to form mitochondrial NADH, after which they are returned to the cytosol
for recharging—creating a so-called shuttle system for the NADH electrons. 

In addition to ATP, biosynthesis in the cytosol requires both a constant sup- 
ply of reducing power in the form of NADPH and small carbon-rich molecules 
to serve as building blocks (discussed in Chapter 2). Descriptions of biosynthe- 
sis often state that the needed carbon skeletons come directly from the break- 
down of sugars, whereas the NADPH is produced in the cytosol by a side pathway 
for the breakdown of sugars (the pentose phosphate pathway, an alternative to 
glycolysis). But under conditions where nutrients abound and plenty of ATP is 
available, mitochondria help to generate both the reducing power and the car- 
bon-rich building blocks (the “carbon skeletons” in Panel 2-1, pp. 90-91) needed 
for cell growth. For this purpose, excess citrate produced in the mitochondrial 
matrix by the citric acid cycle (see Panel 2-9, pp. 106-107) is transported down 
its electrochemical gradient to the cytosol, where it is metabolized to produce 
essential components of the cell. Thus, for example, as part of a cell’s response 
to growth signals, large amounts of acetyl CoA are produced in the cytosol from 
citrate exported from mitochondria, accelerating the production of the fatty acids 
and sterols that build new membranes (described in Chapter 10). Cancer cells are 
frequently mutated in ways that enhance this pathway, as part of their program of 
abnormal growth (see Figure 20-26). 

The urea cycle is a central metabolic pathway in mammals that converts the 
ammonia (NH4+) produced by the breakdown of nitrogen-containing compounds
(such as amino acids) to the urea excreted in urine. Two critical steps of the urea 
cycle are carried out in the mitochondria of liver cells, while the remaining steps 
occur in the cytosol. Mitochondria also play an essential part in the metabolic 
adaptation of cells to different nutritional conditions. For example, under condi- 
tions of starvation, proteins in our bodies are broken down to amino acids, and 
the amino acids are imported into mitochondria and oxidized to produce NADH 
for ATP production. 

The biosynthesis of heme groups—which, as we shall see in the next section, 
play a central part in electron transfer—is another critical process that is shared 
between the mitochondrion and the cytoplasm. Iron-sulfur clusters, which are 
essential not only for electron transfer in the respiratory chain (see p. 766), but 
also for the maintenance and stability of the nuclear genome, are produced in 
mitochondria (and chloroplasts). Nuclear genome instability, a hallmark of can- 
cer, can sometimes be linked to the decreased function of cellular proteins that 
contain iron-sulfur clusters. 

Mitochondria also have a central role in membrane biosynthesis. Cardiolipin 
is a two-headed phospholipid (Figure 14-11) that is confined to the inner mitochondrial
membrane, where it is also produced. But mitochondria are also a major
source of phospholipids for the biogenesis of other cell membranes. Phosphati- 
dylethanolamine, phosphatidylglycerol, and phosphatidic acid are synthesized in 
the mitochondrion, while phosphatidylinositol, phosphatidylcholine, and phos- 
phatidylserine are primarily synthesized in the endoplasmic reticulum (ER). As 
described in Chapter 12, most of the cell’s membranes are assembled in the ER. 
The exchange of lipids between the ER and mitochondria is thought to occur at 
special sites of close contact (see Figure 14-7) by an as-yet unknown mechanism. 

Figure 14-11 The structure of
cardiolipin. Cardiolipin consists of two 
covalently linked phospholipid units, with a 
total of four rather than the usual two fatty 
acid chains (see Figure 10-3). Cardiolipin 
is only produced in the mitochondrial inner 
membrane, where it interacts closely with 
membrane proteins involved in oxidative 
phosphorylation and ATP transport. In 
cristae, its two juxtaposed phosphate 
groups may act as a local proton trap on 
the membrane surface. 

Finally, mitochondria are important calcium buffers, taking up calcium from
the ER and sarcoplasmic reticulum at special membrane junctions. Cellular cal- 
cium levels control muscle contraction (see Chapter 16) and alterations are impli- 
cated in neurodegeneration and apoptosis. Clearly, cells and organisms depend 
on mitochondria in many different ways. 

We now return to the central function of the mitochondrion in respiratory ATP 
generation. 


A Chemiosmotic Process Couples Oxidation Energy to ATP 
Production 


Although the citric acid cycle that takes place in the mitochondrial matrix is con- 
sidered to be part of aerobic metabolism, it does not itself use oxygen. Only the 
final step of oxidative metabolism consumes molecular oxygen (O2) directly (see 
Figure 14-10). Nearly all the energy available from metabolizing carbohydrates, 
fats, and other foodstuffs in earlier stages is saved in the form of energy-rich com- 
pounds that feed electrons into the respiratory chain in the inner mitochondrial 
membrane. These electrons, most of which are carried by NADH, finally combine 
with O2 at the end of the respiratory chain to form water. The energy released 
during the complex series of electron transfers from NADH to O2 is harnessed in
the inner membrane to generate an electrochemical gradient that drives the conversion
of ADP + Pi to ATP. For this reason, the term oxidative phosphorylation is
used to describe this final series of reactions (Figure 14-12). 

The total amount of energy released by biological oxidation in the respiratory 
chain is equivalent to that released by the explosive combustion of hydrogen 
when it combines with oxygen in a single step to form water. But the combustion 
of hydrogen in a single-step chemical reaction, which has a strongly negative ΔG,
releases this large amount of energy unproductively as heat. In the respiratory
chain, the same energetically favorable reaction H2 + ½ O2 → H2O is divided into
small steps (Figure 14-13). This stepwise process allows the cell to store nearly 
half of the total energy that is released in a useful form. At each step, the electrons, 
which can be thought of as having been removed from a hydrogen molecule to 

Figure 14-12 The major net energy
conversion catalyzed by the 
mitochondrion. In the process of oxidative 
phosphorylation, the mitochondrial inner 
membrane serves as a device that changes 
one form of chemical-bond energy to 
another, converting a major part of the 
energy of NADH oxidation into phosphate- 
bond energy in ATP. 


Figure 14-13 A comparison of biological 
oxidation with combustion. (A) If 
hydrogen were simply burned, nearly all of 
the energy would be released in the form of 
heat. (B) In biological oxidation reactions, 
about half of the released energy is stored 
in a form useful to the cell by means of the 
electron-transport chain (the respiratory 
chain) in the crista membrane of the 
mitochondrion. Only the rest of the energy 
is released as heat. In the cell, the protons 
and electrons shown here as being derived 
from He are removed from hydrogen 
atoms that are covalently linked to NADH 
molecules. 
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produce two protons, pass through a series of electron carriers in the inner mito- 
chondrial membrane. At each of three distinct steps along the way (marked by the 
three electron-transport complexes of the respiratory chain, see below), much of 
the energy is utilized for pumping protons across the membrane. At the end of the 
electron-transport chain, the electrons and protons recombine with molecular 
oxygen into water. 

Water is a very low-energy molecule and is thus very stable; it can serve as an 
electron donor only when a large amount of energy from an external source is 
spent on splitting it into protons, electrons, and molecular oxygen. This is exactly 
what happens in oxygenic photosynthesis, where the external energy source is the 
sun, as we shall see later in the section on chloroplasts (p. 782). 


The Energy Derived from Oxidation Is Stored as an 
Electrochemical Gradient 


In mitochondria, the process of electron transport begins when two electrons 
and a proton are removed from NADH (to regenerate NAD+). These electrons are
passed to the first of about 20 different electron carriers in the respiratory chain. 
The electrons start at a large negative redox potential (see Panel 14-1, p. 765)— 
that is, at a high energy level—which gradually drops as they pass along the chain. 
The proteins involved are grouped into three large respiratory enzyme complexes, 
each composed of protein subunits that sit in the inner mitochondrial membrane. 
Each complex in the chain has a higher affinity for electrons than its predecessor, 
and electrons pass sequentially from one complex to the next until they are finally 
transferred to molecular oxygen, which has the highest electron affinity of all. 

The net result is the pumping of H+ out of the matrix across the inner mem-
brane, driven by the energetically favorable flow of electrons. This transmem-
brane movement of H+ has two major consequences:


1. It generates a pH gradient across the inner mitochondrial membrane, with 
a high pH in the matrix (close to 8) and a lower pH in the intermembrane 
space. Since ions and small molecules equilibrate freely across the outer 
mitochondrial membrane, the pH in the intermembrane space is the same 
as in the cytosol (generally around pH 7.4). 


2. It generates a voltage gradient across the inner mitochondrial membrane, 
creating a membrane potential with the matrix side negative and the crista 
space side positive. 


The pH gradient (ΔpH) reinforces the effect of the membrane potential (ΔV),
because the latter acts to attract any positive ion into the matrix and to push any
negative ion out. Together, ΔpH and ΔV make up the electrochemical gradient,
which is measured in units of millivolts (mV). This gradient exerts a proton-
motive force, which tends to drive H+ back into the matrix (Figure 14-14).

Figure 14-14 The electrochemical proton gradient across the inner mitochondrial
membrane. This gradient is composed of a large force due to the membrane
potential (ΔV) and a smaller force due to the H+ concentration gradient, that is,
the pH gradient (ΔpH). Both forces combine to generate the proton-motive force,
which pulls H+ back into the mitochondrial matrix. The exact relationship between
these forces is expressed by the Nernst equation (see Panel 11-1, p. 616).




The electrochemical gradient across the inner membrane of a respiring mito- 
chondrion is typically about 180 mV (inside negative), and it consists of a mem- 
brane potential of about 150 mV and a pH gradient of about 0.5 to 0.6 pH units 
(each ΔpH of 1 pH unit is equivalent to a membrane potential of about 60 mV).
The electrochemical gradient drives not only ATP synthesis but also the transport 
of selected molecules across the inner mitochondrial membrane, including the 
import of selected proteins from the cytoplasm (discussed in Chapter 12). 
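
The arithmetic behind these figures can be sketched in a few lines of Python. This
is only an illustration of how the two components add up, not a physiological
model: the function name proton_motive_force_mv and the constant MV_PER_PH_UNIT
are labels chosen here, and the conversion of 60 mV per pH unit is the approximate
value quoted above.

```python
# Approximate conversion quoted in the text: a pH difference of 1 unit
# is equivalent to a membrane potential of about 60 mV.
MV_PER_PH_UNIT = 60.0


def proton_motive_force_mv(membrane_potential_mv, delta_ph):
    """Add the membrane potential and the pH gradient, both expressed in mV.

    Hypothetical helper for illustration only.
    """
    return membrane_potential_mv + delta_ph * MV_PER_PH_UNIT


# Values from the text: about 150 mV and a pH gradient of 0.5 to 0.6 units.
low = proton_motive_force_mv(150.0, 0.5)
high = proton_motive_force_mv(150.0, 0.6)
print(f"electrochemical gradient: {low:.0f} to {high:.0f} mV")
```

With these inputs the pH gradient contributes roughly 30 to 36 mV, which brings
the total to the figure of about 180 mV given in the text.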


Summary 


The mitochondrion performs most cellular oxidations and produces the bulk of the 
animal cell’s ATP. A mitochondrion has two separate membranes: the outer mem- 
brane and the inner membrane. The inner membrane surrounds the innermost 
space (the matrix) of the mitochondrion and forms the cristae, which project into 
the matrix. The matrix and the inner membrane cristae are the major working parts 
of the mitochondrion. The membranes that form cristae account for a major part 
of the membrane surface area in most cells, and they contain the mitochondrion’s 
electron-transport chain (the respiratory chain). 

The mitochondrial matrix contains a large variety of enzymes, including those 
that convert pyruvate and fatty acids to acetyl CoA and those that oxidize this acetyl 
CoA to CO2 through the citric acid cycle. These oxidation reactions produce large 
amounts of NADH, whose high-energy electrons are passed to the respiratory chain. 
The respiratory chain then uses the energy derived from transporting electrons from 
NADH to molecular oxygen to pump H+ out of the matrix. This produces a large
electrochemical proton gradient across the inner mitochondrial membrane, com-
posed of contributions from both a membrane potential and a pH difference. This
electrochemical gradient exerts a force to drive H+ back into the matrix. This pro-
ton-motive force is harnessed both to produce ATP and for the selective transport of
metabolites across the inner mitochondrial membrane. 


THE PROTON PUMPS OF THE ELECTRON- 
TRANSPORT CHAIN 


Having considered in general terms how a mitochondrion uses electron transport 
to generate a proton-motive force, we now turn to the molecular mechanisms 
that underlie this membrane-based energy-conversion process. In describing the 
respiratory chain of mitochondria, we accomplish the larger purpose of explain- 
ing how an electron-transport process can pump protons across a membrane. As 
stated at the beginning of this chapter, mitochondria, chloroplasts, archaea, and 
bacteria use very similar chemiosmotic mechanisms. In fact, these mechanisms 
underlie the function of all living organisms—including anaerobes that derive all 
their energy from electron transfers between two inorganic molecules, as we shall 
see later. 

We start with some of the basic principles on which all of these processes 
depend. 


The Redox Potential Is a Measure of Electron Affinities


In chemical reactions, any electrons removed from one molecule are always 
passed to another, so that whenever one molecule is oxidized, another is reduced. 
As with any other chemical reaction, the tendency of such redox reactions to 
proceed spontaneously depends on the free-energy change (ΔG) for the electron
transfer, which in turn depends on the relative affinities of the two molecules for 
electrons. 

Because electron transfers provide most of the energy for life, it is worth taking
the time to understand them. As discussed in Chapter 2, acids donate protons
and bases accept them (see Panel 2-2, p. 93). Acids and bases exist in conjugate
acid-base pairs, in which the acid is readily converted into the base by the loss of a
proton. For example, acetic acid (CH3COOH) is converted into its conjugate base,
the acetate ion (CH3COO-), in the reaction:

CH3COOH ⇌ CH3COO- + H+

In an exactly analogous way, pairs of compounds such as NADH and NAD+
are called redox pairs, since NADH is converted to NAD+ by the loss of electrons
in the reaction:

NADH ⇌ NAD+ + H+ + 2e-

NADH is a strong electron donor: because two of its electrons are engaged in 
a covalent bond which releases energy when broken, the free-energy change for 
passing these electrons to many other molecules is favorable. Energy is required 
to form this bond from NAD+, two electrons, and a proton (the same amount of 
energy that was released when the bond was broken). Therefore NAD+, the redox 
partner of NADH, is of necessity a weak electron acceptor. 

We can measure the tendency to transfer electrons from any redox pair exper- 
imentally. All that is required is the formation of an electrical circuit linking a 1:1 
(equimolar) mixture of the redox pair to a second redox pair that has been arbi- 
trarily selected as a reference standard, so that we can measure the voltage differ- 
ence between them (Panel 14-1). This voltage difference is defined as the redox 
potential; electrons move spontaneously from a redox pair like NADH/NAD+ with
a lower redox potential (a lower affinity for electrons) to a redox pair like O2/H2O
with a higher redox potential (a higher affinity for electrons). Thus, NADH is a
good molecule for donating electrons to the respiratory chain, while O2 is well
suited to act as the “sink” for electrons at the end of the chain. As explained in
Panel 14-1, the difference in redox potential, ΔE0′, is a direct measure of the stan-
dard free-energy change (ΔG°) for the transfer of an electron from one molecule
to another.


Electron Transfers Release Large Amounts of Energy 


As just discussed, those pairs of compounds that have the most negative redox 
potentials have the weakest affinity for electrons and therefore are useful as car- 
riers with a strong tendency to donate electrons. Conversely, those pairs that 
have the most positive redox potentials have the greatest affinity for electrons 
and therefore are useful as carriers with a strong tendency to accept electrons. 
A 1:1 mixture of NADH and NAD+ has a redox potential of -320 mV, indicating
that NADH has a strong tendency to donate electrons; a 1:1 mixture of H2O and
½O2 has a redox potential of +820 mV, indicating that O2 has a strong tendency to
accept electrons. The difference in redox potential is 1140 mV, which means that
the transfer of each electron from NADH to O2 under these standard conditions is
enormously favorable, since ΔG° = -109 kJ/mole, and twice this amount of energy
is gained for the two electrons transferred per NADH molecule (see Panel 14-1).
If we compare this free-energy change with that for the formation of the phospho-
anhydride bonds in ATP, where ΔG° = 30.6 kJ/mole (see Figure 2-50), we see that,
under standard conditions, the oxidation of one NADH molecule releases more
than enough energy to synthesize seven molecules of ATP from ADP and Pi. (In
the cell, the number of ATP molecules generated will be lower because the stan- 
dard conditions are far from the physiological ones; in addition, small amounts of 
energy are inevitably dissipated as heat along the way.) 
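
The calculation in this paragraph can be reproduced with a short Python sketch.
It applies the relation from Panel 14-1, ΔG° = -n(0.096)ΔE0′, to the standard redox
potentials quoted above. The function and variable names are illustrative choices
made here, and the result applies to standard conditions only, as the text notes.

```python
# Conversion factor used in Panel 14-1: kJ/mole per electron per mV.
KJ_PER_MOLE_PER_MV = 0.096


def standard_free_energy_kj(n_electrons, e_donor_mv, e_acceptor_mv):
    """Return the standard free-energy change, -n * 0.096 * (E_acceptor - E_donor)."""
    delta_e_mv = e_acceptor_mv - e_donor_mv
    return -n_electrons * KJ_PER_MOLE_PER_MV * delta_e_mv


# Standard redox potentials quoted in the text (mV).
E_NADH_MV = -320.0        # NADH/NAD+ pair
E_OXYGEN_MV = 820.0       # H2O/(1/2)O2 pair
ATP_FORMATION_KJ = 30.6   # free energy quoted for forming ATP from ADP and Pi

per_electron = standard_free_energy_kj(1, E_NADH_MV, E_OXYGEN_MV)
per_nadh = standard_free_energy_kj(2, E_NADH_MV, E_OXYGEN_MV)
atp_equivalents = -per_nadh / ATP_FORMATION_KJ

print(f"per electron: {per_electron:.0f} kJ/mole")
print(f"per NADH (two electrons): {per_nadh:.0f} kJ/mole")
print(f"ATP equivalents under standard conditions: {atp_equivalents:.1f}")
```

The ratio comes out slightly above seven, which is the basis for the statement
that one NADH releases more than enough energy for seven molecules of ATP.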


Transition Metal Ions and Quinones Accept and Release
Electrons Readily 


The electron-transport properties of the membrane protein complexes in the
respiratory chain depend upon electron-carrying cofactors, most of which are
transition metals such as Fe, Cu, Ni, and Mn, bound to proteins in the complexes.
These metals have special properties that allow them to promote both enzyme
catalysis and electron-transfer reactions. Most relevant here is the fact that their
ions exist in several different oxidation states with closely spaced redox poten-
tials, which enables them to accept or give up electrons readily; this property is
exploited by the membrane protein complexes in the respiratory chain to move
electrons both within and between complexes.


PANEL 14-1: Redox Potentials

HOW REDOX POTENTIALS ARE MEASURED

One beaker (left) contains substance A with an equimolar mixture of the reduced
(Areduced) and oxidized (Aoxidized) members of its redox pair. The other beaker
contains the hydrogen reference standard (2H+ + 2e- ⇌ H2), whose redox
potential is arbitrarily assigned as zero by international agreement. (A salt bridge
formed from a concentrated KCl solution allows K+ and Cl- to move between the
beakers, as required to neutralize the charges when electrons flow between the
beakers.) The metal wire (dark blue) provides a resistance-free path for electrons,
and a voltmeter then measures the redox potential of substance A. If electrons flow
from Areduced to H+, as indicated here, the redox pair formed by substance A is
said to have a negative redox potential. If they instead flow from H2 to Aoxidized,
the redox pair is said to have a positive redox potential.

THE STANDARD REDOX POTENTIAL, E0′

The standard redox potential for a redox pair, defined as E0, is measured for a
standard state where all of the reactants are at a concentration of 1 M, including
H+. Since biological reactions occur at pH 7, biologists instead define the standard
state as Areduced = Aoxidized and H+ = 10^-7 M. This standard redox potential is
designated by the symbol E0′, in place of E0.

NADH ⇌ NAD+ + H+ + 2e-                                   -320 mV
reduced ubiquinone ⇌ oxidized ubiquinone + 2H+ + 2e-     +30 mV
reduced cytochrome c ⇌ oxidized cytochrome c + e-        [...] mV
H2O ⇌ ½O2 + 2H+ + 2e-                                    +820 mV

CALCULATION OF ΔG° FROM REDOX POTENTIALS

To determine the energy change for an electron transfer, the ΔG° of the reaction
(kJ/mole) is calculated as follows:

ΔG° = -n(0.096) ΔE0′, where n is the number of electrons transferred across a
redox potential change of ΔE0′ millivolts (mV), and

ΔE0′ = E0′(acceptor) - E0′(donor)

EXAMPLE

1:1 mixture of NADH and NAD+; 1:1 mixture of oxidized and reduced ubiquinone

For the transfer of one electron from NADH to ubiquinone:

ΔE0′ = +30 - (-320) = +350 mV
ΔG° = -n(0.096)ΔE0′ = -1(0.096)(350) = -34 kJ/mole

The same calculation reveals that the transfer of one electron from ubiquinone to
oxygen has an even more favorable ΔG° of -76 kJ/mole. The ΔG° value for the
transfer of one electron from NADH to oxygen is the sum of these two values,
-110 kJ/mole.

EFFECT OF CONCENTRATION CHANGES

As explained in Chapter 2 (see p. 60), the actual free-energy change for a reaction,
ΔG, depends on the concentration of the reactants and generally will be different
from the standard free-energy change, ΔG°. The standard redox potentials are for a
1:1 mixture of the redox pair. For example, the standard redox potential of -320 mV
is for a 1:1 mixture of NADH and NAD+. But when there is an excess of NADH
over NAD+, electron transfer from NADH to an electron acceptor becomes more
favorable. This is reflected by a more negative redox potential and a more negative
ΔG for electron transfer.


Figure 14-15 The structure of the heme group attached covalently to
cytochrome c. The porphyrin ring of the heme is shown in red. There are six
different cytochromes in the respiratory chain. Because the hemes in different
cytochromes have slightly different structures and are kept in different local
environments by their respective proteins, each has a different affinity for an
electron, and a slightly different spectroscopic signature.
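
Panel 14-1 notes that an excess of NADH over NAD+ makes the redox potential of
the pair more negative. The size of this shift can be estimated with the standard
Nernst relation, E = E0′ + (RT/nF) ln([oxidized]/[reduced]). The sketch below is an
illustration under stated assumptions: the temperature of 25 °C, the 10:1 ratio,
and the function name redox_potential_mv are all chosen here and do not come from
the panel.

```python
import math

R = 8.314462618    # gas constant, J/(mol*K)
F = 96485.33212    # Faraday constant, C/mol
T = 298.15         # assumed temperature of 25 degrees C, in kelvin


def redox_potential_mv(e0_prime_mv, n_electrons, oxidized, reduced):
    """Nernst relation: E = E0' + (RT/nF) * ln([oxidized]/[reduced]), in mV."""
    shift_volts = (R * T) / (n_electrons * F) * math.log(oxidized / reduced)
    return e0_prime_mv + 1000.0 * shift_volts


# NADH/NAD+ pair: two electrons, standard redox potential of -320 mV.
equimolar = redox_potential_mv(-320.0, 2, oxidized=1.0, reduced=1.0)
nadh_excess = redox_potential_mv(-320.0, 2, oxidized=1.0, reduced=10.0)

print(f"1:1 NADH to NAD+:  {equimolar:.1f} mV")
print(f"10:1 NADH to NAD+: {nadh_excess:.1f} mV")
```

Under these assumptions a tenfold excess of NADH lowers the redox potential by
about 30 mV, in the direction the panel describes.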

Unlike the colorless atoms H, C, N, and O that constitute the bulk of biological 
molecules, transition metal ions are often brightly colored, which makes the pro- 
teins that contain them easy to study by spectroscopic methods using visible light. 
One family of such colored proteins, the cytochromes, contains a bound heme 
group, in which an iron atom is tightly held by four nitrogen atoms at the corners 
of a square in a porphyrin ring (Figure 14-15). Similar porphyrin rings are respon- 
sible both for the red color of blood and for the green color of leaves, binding an 
iron in hemoglobin or a magnesium in chlorophyll, respectively. 

Iron-sulfur proteins contain a second major family of electron-transfer cofac- 
tors. In this case, either two or four iron atoms are bound to an equal number of 
sulfur atoms and to cysteine side chains, forming iron-sulfur clusters in the pro- 
tein (Figure 14-16). Like the cytochrome hemes, these clusters carry one electron 
at a time. 

The simplest of the electron-transfer cofactors in the respiratory chain—and 
the only one that is not always bound to a protein—is a quinone (called ubiqui- 
none, or coenzyme Q). A quinone (Q) is a small hydrophobic molecule that is 
freely mobile in the lipid bilayer. This electron carrier can accept or donate either 
one or two electrons. Upon reduction (note that reduced quinones are called qui- 
nols), it picks up a proton from water along with each electron (Figure 14-17). 

In the mitochondrial electron-transport chain, six different cytochrome 
hemes, eight iron-sulfur clusters, three copper atoms, a flavin mononucleotide 
(another electron-transfer cofactor), and ubiquinone work in a defined sequence 
to carry electrons from NADH to O2. In total, this pathway involves more than
60 different polypeptides arranged in three large membrane protein complexes, 
each of which binds several of the above electron-carrying cofactors. 

As we would expect, the electron-transfer cofactors have increasing affinities 
for electrons (higher redox potentials) as the electrons move along the respiratory 
chain. The redox potentials have been fine-tuned during evolution by the protein 
environment of each cofactor, which alters the cofactor’s normal affinity for elec- 
trons. Because iron-sulfur clusters have a relatively low affinity for electrons, they 
predominate in the first half of the respiratory chain; in contrast, the heme cyto- 
chromes predominate further down the chain, where a higher electron affinity is 
required. 


NADH Transfers Its Electrons to Oxygen Through Three Large 
Enzyme Complexes Embedded in the Inner Membrane 


Membrane proteins are difficult to purify because they are insoluble in aqueous
solutions, and they are easily disrupted by the detergents that are required to sol-
ubilize them. But by using mild nonionic detergents, such as octylglucoside or
dodecyl maltoside (see Figure 10-28), they can be solubilized and purified in their
native form, and even crystallized for structure determination. Each of the three
different detergent-solubilized respiratory-chain complexes can be re-inserted
into artificial lipid bilayer vesicles and shown to pump protons across the mem-
brane as electrons pass through them.


Figure 14-16 The structure of an iron-sulfur cluster. These dark brown
clusters consist either of four iron and four sulfur atoms, as shown here, or
of two irons and two sulfurs linked to cysteines in the polypeptide chain via
covalent sulfur bridges, or to histidines. Although they contain several iron atoms,
each iron-sulfur cluster can carry only one electron at a time. Nine different iron-
sulfur clusters participate in electron transport in the respiratory chain.

In the mitochondrion, the three complexes are linked in series, serving as elec- 
tron-transport-driven H+ pumps that pump protons out of the matrix to acidify 
the crista space (Figure 14-18): 


1. The NADH dehydrogenase complex (often referred to as Complex I) is 
the largest of these respiratory enzyme complexes. It accepts electrons 
from NADH and passes them through a flavin mononucleotide and eight 
iron-sulfur clusters to the lipid-soluble electron carrier ubiquinone. The 
reduced ubiquinol then transfers its electrons to cytochrome c reductase. 


2. The cytochrome c reductase (also called the cytochrome b-c1 complex) is a
large membrane protein assembly that functions as a dimer. Each mono- 
mer contains three cytochrome hemes and an iron-sulfur cluster. The com- 
plex accepts electrons from ubiquinol and passes them on to the small, sol- 
uble protein cytochrome c, which is located in the crista space and carries 
electrons one at a time to cytochrome c oxidase. 


3. The cytochrome c oxidase complex contains two cytochrome hemes and 
three copper atoms. The complex accepts electrons one at a time from 
cytochrome c and passes them to molecular oxygen. In total, four electrons 
and four protons are needed to convert one molecule of oxygen to water. 
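
The stoichiometry stated in item 3 can be written out as a balanced half-reaction.
The LaTeX below simply restates it, together with the equivalent form for the two
electrons supplied by a single NADH; it adds no information beyond the counts given
above.

```latex
\[
\mathrm{O_2} + 4\,e^{-} + 4\,\mathrm{H^{+}} \longrightarrow 2\,\mathrm{H_2O}
\]
\[
\tfrac{1}{2}\,\mathrm{O_2} + 2\,e^{-} + 2\,\mathrm{H^{+}} \longrightarrow \mathrm{H_2O}
\qquad \text{(per NADH oxidized)}
\]
```

Because cytochrome c delivers electrons one at a time, four separate transfers
from cytochrome c are needed for each molecule of O2 reduced.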


We have previously discussed how the redox potential reflects electron affini- 
ties. Figure 14-19 presents an outline of the redox potentials measured along the 
respiratory chain. These potentials change in three large steps, one across each 
proton-translocating respiratory complex. The change in redox potential between 
any two electron carriers is directly proportional to the free energy released when 
an electron transfers between them. Each complex acts as an energy-conversion 
device by harnessing some of this free-energy change to pump H+ across the inner
membrane, thereby creating an electrochemical proton gradient as electrons pass
through it.

Figure 14-17 Quinone electron carriers. Ubiquinone in the lipid bilayer
picks up one H+ (red) from the aqueous 
environment for each electron (blue) it 
accepts, in two steps, from respiratory- 
chain complexes. The first step involves 
the acquisition of a proton and an electron 
and converts the ubiquinone into an 
unstable ubisemiquinone radical. In the 
second step, it becomes a fully reduced 
ubiquinone (called ubiquinol), which is 
freely mobile as an electron carrier in the 
lipid bilayer of the membrane. When the 
ubiquinol donates its electrons to the next 
complex in the chain, the two protons
are released. The long hydrophobic tail
(green) that confines ubiquinone to the 
membrane consists of 6-10 five-carbon 
isoprene units, depending on the organism. 
The corresponding electron carrier in the 
photosynthetic membranes of chloroplasts 
is plastoquinone, which has almost the 
same structure and works in the same way. 
For simplicity, we refer to both ubiquinone 
and plastoquinone in this chapter as 
quinone (abbreviated as Q). 


Figure 14-18 The path of electrons 
through the three respiratory-chain 
proton pumps. (Movie 14.3) The 
approximate size and shape of each 
complex is shown. During the transfer 
of electrons from NADH to oxygen (blue 
arrows), ubiquinone and cytochrome c 
serve as mobile carriers that ferry electrons 
from one complex to the next. During the 
electron-transfer reactions, protons are 
pumped across the membrane by each 
of the respiratory enzyme complexes, as 
indicated (red arrows). 

For historical reasons, the three 
proton pumps in the respiratory chain 
are sometimes denoted as Complex I,
Complex III, and Complex IV, according
to the order in which electrons pass
through them from NADH. Electrons from
the oxidation of succinate by succinate
dehydrogenase (designated as Complex II)
are fed into the electron-transport chain in 
the form of reduced ubiquinone. Although 
embedded in the crista membrane, 
succinate dehydrogenase does not pump 
protons and thus does not contribute to 
the proton-motive force; it is therefore not 
considered to be an integral part of the 
respiratory chain.

X-ray crystallography has elucidated the structure of each of the three respira-
tory-chain complexes in great detail, and we next examine each of them in turn to
see how they work. 


The NADH Dehydrogenase Complex Contains Separate Modules 
for Electron Transport and Proton Pumping 


The NADH dehydrogenase complex is a massive assembly of membrane and 
nonmembrane proteins that receives electrons from NADH and passes them to 
ubiquinone. In animal mitochondria, it consists of more than 40 different pro- 
tein subunits, with a molecular mass of nearly a million daltons. The x-ray struc- 
tures of the NADH dehydrogenase complex from fungi and bacteria show that it 
is L-shaped, with both a hydrophobic membrane arm and a hydrophilic arm that 
projects into the mitochondrial matrix (Figure 14-20). 

Electron transfer and proton pumping are physically separated in the NADH 
dehydrogenase complex, with electron transfer occurring in the matrix arm and 
proton pumping in the membrane arm. The NADH docks near the tip of the matrix 
arm, where it transfers its electrons via a bound flavin mononucleotide to a string 
of iron-sulfur clusters that runs down the arm, acting like a wire to carry electrons 
to a protein-bound molecule of ubiquinone. Electron transfer to the quinone 
is thought to trigger proton translocation in a set of proton pumps in the mem- 
brane arm, and for this to happen the two processes must be energetically and 
mechanically linked. A mechanical link is thought to be provided by a 6-nm long, 
amphipathic α helix that runs parallel to the membrane surface on the matrix
side of the membrane arm. This helix may act like the connecting rod in a steam 
engine to generate a mechanical, energy-transducing power stroke that links the 
quinone-binding site to the proton-translocating modules in the membrane (see 
Figure 14-20). 

The reduction of each quinone by the transfer of two electrons can cause four 
protons to be pumped out of the matrix into the crista space. In this way, NADH 
dehydrogenase generates roughly half of the total proton-motive force in mito- 
chondria. 


Cytochrome c Reductase Takes Up and Releases Protons on the 
Opposite Side of the Crista Membrane, Thereby Pumping Protons 


As described previously, when a quinone molecule (Q) accepts its two elec-
trons, it also takes up two protons to form a quinol (QH2; see Figure 14-17). In
the respiratory chain, ubiquinol transfers electrons from NADH dehydrogenase to
cytochrome c reductase. Because the protons in this QH2 molecule are taken up
from the matrix and released on the opposite side of the crista membrane, two
protons are transferred from the matrix into the crista space per pair of electrons
transferred (Figure 14-21). This vectorial transfer of protons supplements the
electrochemical proton gradient that is created by the NADH dehydrogenase pro-
ton pumping just discussed.


Figure 14-19 Redox potential changes along the mitochondrial electron-
transport chain. The redox potential (designated E0′) increases as electrons flow
down the respiratory chain to oxygen. The standard free-energy change in
kilojoules, ΔG°, for the transfer of each of the two electrons donated by an NADH
molecule can be obtained from the left-hand ordinate [ΔG° = -n(0.096) ΔE0′, where
n is the number of electrons transferred across a redox potential change of ΔE0′
mV]. Electrons flow through a respiratory enzyme complex by passing in sequence
through the multiple electron carriers in each complex (blue arrows). As indicated,
part of the favorable free-energy change is harnessed by each enzyme complex to
pump H+ across the inner mitochondrial membrane (red arrows). The NADH
dehydrogenase pumps up to two H+ per electron (four per pair of electrons), the
cytochrome c reductase complex pumps two, whereas the cytochrome c oxidase
complex pumps one per electron.

Note that NADH is not the only source of electrons for the respiratory chain.
The flavin FADH2, which is generated by fatty acid oxidation (see Figure 2-56)
and by succinate dehydrogenase in the citric acid cycle (see Figure 2-57), also
contributes. Its two electrons are passed directly to ubiquinone, bypassing NADH
dehydrogenase.


Figure 14-20 The structure of NADH dehydrogenase. (A) The model of the mitochondrial complex shown here is based on
the x-ray structure of the smaller bacterial complex, which works in the same way. The matrix arm of NADH dehydrogenase
(also known as Complex I) contains eight iron-sulfur (FeS) clusters that appear to participate in electron transport. The
membrane arm contains more than 70 transmembrane helices, forming three distinct proton-pumping modules, while the matrix arm
contains the electron-transport cofactors. (B) NADH donates two electrons, via a bound flavin mononucleotide (FMN; yellow),
to a chain of seven iron-sulfur clusters (red and yellow spheres). From the terminal iron-sulfur cluster, the electrons pass to
ubiquinone (orange). Electron transfer results in conformational changes (black arrows) that are thought to be transmitted to a
long amphipathic α helix (purple) on the matrix side of the membrane arm, which pulls on discontinuous transmembrane helices
(red) in three membrane subunits, each of which resembles an antiporter (see Chapter 11). This movement is thought to change
the conformation of charged residues in the three proton channels, resulting in the translocation of three protons out of the
matrix. A fourth proton may be translocated at the interface of the two arms (dotted line). (C) This shows the symbol for NADH
dehydrogenase used throughout this chapter. (Adapted from R.G. Efremov, R. Baradaran and L.A. Sazanov, Nature
465:441-445, 2010. PDB code: 3M9S.)

Cytochrome c reductase is a large assembly of membrane protein subunits.
Three subunits form a catalytic core that passes electrons from ubiquinol to
cytochrome c, with a structure that has been highly conserved from bacterial
ancestors (Figure 14-22). It pumps protons by a vectorial transfer of protons
that involves a binding site for a second molecule of ubiquinone; the elaborate
redox loop mechanism used is called the Q cycle because while one of the elec-
trons received from each QH2 molecule is transferred from ubiquinone through
the complex to the carrier protein cytochrome c, the other electron is recycled
back into the quinone pool. Through the mechanism illustrated in Figure 14-23,
the Q cycle increases the total amount of redox energy that can be stored in the
electrochemical proton gradient. As a result, two protons are pumped across the
crista membrane for every electron that is transferred from NADH dehydrogenase
to cytochrome c.


Figure 14-21 How a directional release and uptake of protons by a quinone
pumps protons across a membrane. Two protons are picked up on the matrix
side of the inner mitochondrial membrane when the reaction Q + 2e- + 2H+ → QH2
is catalyzed by the NADH dehydrogenase complex. This molecule of ubiquinol
(QH2) diffuses rapidly in the plane of the membrane, becoming bound to the crista
side of cytochrome c reductase. When its oxidation by cytochrome c reductase
generates two protons and two electrons (see Figure 14-17), the two protons are
released into the crista space. The flow of electrons is not shown in this diagram.


Figure 14-22 The structure of cytochrome c reductase. Cytochrome c reductase (also known as the cytochrome b-c1
complex) is a dimer of two identical 240,000-dalton halves, each composed of 11 different protein molecules in mammals.
(A) A structure graphic of the entire dimer, showing in color the three proteins that form the functional core of the enzyme
complex: cytochrome b (green) and cytochrome c1 (blue) are colored in one half, and the Rieske protein (purple) containing an
Fe2S2 iron-sulfur cluster (red and yellow) is colored in the other. These three protein subunits interact across the two halves.
(B) Transfer of electrons through cytochrome c reductase to the small, soluble carrier protein cytochrome c. Electrons entering
from ubiquinol near the matrix side of the membrane are captured by the iron-sulfur cluster of the Rieske protein, which moves
its iron-sulfur group back and forth to transfer these electrons to heme c (red). Heme c then transfers them to the carrier
molecule cytochrome c.

As detailed in Figure 14-23, only one of the two electrons from each ubiquinol is transferred through this path. To increase
proton pumping, the second ubiquinol electron is passed to a molecule of ubiquinone bound to cytochrome c reductase on the
opposite side of the membrane, near the matrix. (C) This shows the symbol for cytochrome c reductase used throughout this
chapter. (PDB code: 1EZV.)


The Cytochrome c Oxidase Complex Pumps Protons and 
Reduces O2 Using a Catalytic Iron-Copper Center


The final link in the mitochondrial electron-transport chain is cytochrome c oxi- 
dase. The cytochrome c oxidase complex accepts electrons from the soluble elec- 
tron carrier cytochrome c, and it uses yet a different, third mechanism to pump 
protons across the inner mitochondrial membrane. The structure of the mam- 
malian complex is illustrated in Figure 14-24. The atomic-resolution structures, 
combined with studies of the effect of mutations introduced into the enzyme by 
genetic engineering of the yeast and bacterial proteins, have revealed the detailed 
mechanisms of this electron-driven proton pump. 
Because oxygen has a high affinity for electrons, it can release a large amount
of free energy when it is reduced to form water. Thus, the evolution of cellular res-
piration, in which O2 is converted to water, enabled organisms to harness much
more energy than can be derived from anaerobic metabolism. As we discuss later, 
the availability of the large amount of energy released by the reduction of molec- 
ular oxygen to form water is thought to have been essential to the emergence of 
multicellular life: this would explain why all large organisms respire. The ability 
of biological systems to use O2 in this way, however, requires sophisticated chem- 
istry. Once a molecule of O2 has picked up one electron, it forms a superoxide 
radical anion (O2•-) that is dangerously reactive and rapidly takes up an additional
three electrons wherever it can get them, with destructive effects on its imme- 
diate environment. We can tolerate oxygen in the air we breathe only because 
the uptake of the first electron by the O2 molecule is slow, allowing cells to use 
enzymes to control electron uptake by oxygen. Thus, cytochrome c oxidase holds 
Figure 14-23 The two-step mechanism
of the cytochrome c reductase Q-cycle.
(A) In step 1, ubiquinol reduced by NADH
dehydrogenase docks to the cytochrome c
reductase complex. Oxidation of the quinol
produces two protons and two electrons.
The protons are released into the cristae
space. One electron passes via an iron-sulfur
cluster to heme c1, and then to the soluble
electron carrier protein cytochrome c on the
membrane surface. The second electron
passes via hemes bL and bH to a ubiquinone
(red Q) bound at a separate site near the
matrix side of the protein. Uptake of a proton
from the matrix produces a ubisemiquinone
radical (see Figure 14-17), which remains
bound to this site (red QH• in B).

(B) In step 2, a second ubiquinol (blue
QH2) docks and releases two protons and
two electrons, as described for step 1. One
electron is passed to a second cytochrome
c, whereas the other electron is accepted
by the ubisemiquinone. The ubisemiquinone
takes up a proton from the matrix and is
released into the lipid bilayer as fully reduced
ubiquinol (red QH2).

On balance, the oxidation of one ubiquinol 
in the Q cycle pumps two protons through 
the membrane by a directional release 
and uptake of protons (see Figure 14-21), 
while releasing another two into the cristae 
space. In addition, in each of the two steps 
(A) and (B), one electron is transferred to a 
cytochrome c carrier (Movie 14.4). 
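
The bookkeeping of the two steps described in this legend can be checked with a short tally. The following Python sketch is illustrative only: the class and function names are hypothetical, and it counts nothing beyond the events stated above (each step oxidizes one ubiquinol, releases two protons into the crista space, reduces one cytochrome c, and takes up one proton from the matrix; step 2 releases one regenerated ubiquinol).

```python
from dataclasses import dataclass


@dataclass
class QCycleTally:
    """Running totals for the Q cycle (hypothetical helper, for illustration)."""
    qh2_oxidized: int = 0
    qh2_regenerated: int = 0
    protons_released_to_crista: int = 0
    protons_taken_from_matrix: int = 0
    cytochrome_c_reduced: int = 0


def run_q_cycle(turns: int = 1) -> QCycleTally:
    """Tally the events of steps 1 and 2 for a given number of complete cycles."""
    tally = QCycleTally()
    for _ in range(turns):
        for step in (1, 2):
            tally.qh2_oxidized += 1                # one ubiquinol docks and is oxidized
            tally.protons_released_to_crista += 2  # its two protons go to the crista space
            tally.cytochrome_c_reduced += 1        # one electron goes to cytochrome c
            tally.protons_taken_from_matrix += 1   # the bound quinone species takes up one H+
            if step == 2:
                tally.qh2_regenerated += 1         # fully reduced ubiquinol is released
    return tally


if __name__ == "__main__":
    t = run_q_cycle(1)
    net_qh2 = t.qh2_oxidized - t.qh2_regenerated
    print(f"net ubiquinol oxidized: {net_qh2}")
    print(f"H+ released to crista space: {t.protons_released_to_crista}")
    print(f"H+ taken up from matrix: {t.protons_taken_from_matrix}")
    print(f"cytochrome c reduced: {t.cytochrome_c_reduced}")
```

Per net ubiquinol oxidized, the tally gives four protons released into the crista space, two protons taken up from the matrix, and two cytochrome c molecules reduced, which matches the balance stated in the legend.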





Figure 14-24 The structure of cytochrome c oxidase. The final complex in the mitochondrial electron-transfer chain consists of 13 different 
protein subunits, with a total mass of 204,000 daltons. (A) The entire dimeric complex is shown, positioned in the crista membrane. The highly 
conserved subunits I (green), II (purple), and III (blue) are encoded by the mitochondrial genome, and they form the functional core of the enzyme.
(B) The functional core of the complex. Electrons pass through this structure from cytochrome c via bound copper ions (blue spheres) and hemes
(red) to an O2 molecule bound between heme a3 and a copper ion. The four protons needed to reduce O2 to water are taken up from the matrix; see
also Figure 14-25. (C) This shows the symbol for cytochrome c oxidase used throughout this chapter. (PDB code: 2OCC.)
on to oxygen at a special bimetallic center, where it remains clamped between a
heme-linked iron atom and a copper ion until it has picked up a total of four elec- 
trons. Only then are the two oxygen atoms of the oxygen molecule safely released 
as two molecules of water (Figure 14-25). 

The cytochrome c oxidase reaction accounts for about 90% of the total oxygen 
uptake in most cells. This protein complex is therefore crucial for all aerobic life. 
Cyanide and azide are extremely toxic because they bind to the heme iron atoms 
in cytochrome c oxidase much more tightly than does oxygen, thereby greatly 
reducing ATP production. 


The Respiratory Chain Forms a Supercomplex in the Crista 
Membrane 


By using cryoelectron microscopy to examine proteins that have been very gently 
isolated, it can be shown that the three protein complexes that form the respira- 
tory chain assemble into an even larger supercomplex in the crista membrane. 
As illustrated in Figure 14-26, this structure is thought to help the mobile elec- 
tron carriers ubiquinone (in the crista membrane) and cytochrome c (in the crista 
space) transfer electrons with high efficiency. The formation of the supercom- 
plex depends on the presence of the mitochondrial lipid cardiolipin (see Figure 
14-11), which presumably works like a hydrophobic glue that holds the compo- 
nents together. 

In addition to the three proton pumps in the supercomplex just discussed, 
one of the enzymes in the citric acid cycle, succinate dehydrogenase, is embed- 
ded in the mitochondrial crista membrane. In the course of oxidizing succinate 
to fumarate in the matrix, this enzyme complex captures electrons in the form of 
a tightly bound FADH2 molecule (see Panel 2-9, pp. 106-107) and passes them to
Figure 14-25 The reaction of O2 with electrons in cytochrome c oxidase. Electrons from cytochrome c pass through the complex via bound
copper ions (blue spheres) and hemes (red) to an O2 molecule bound between heme a3 and a copper ion. Iron ions are shown as red spheres. The
iron atom in heme a serves as an electron queuing point where electrons are held so that they can be released to an O2 molecule (not shown) that
is held at the bimetallic center active site, which is formed by the central iron of the other heme (heme a3) and a closely apposed copper atom. The
four protons needed to reduce O2 to water are removed from the matrix. For each O2 molecule that undergoes the reaction 4e- + 4H+ + O2 → 2H2O,
another four protons are pumped out of the matrix by mechanisms that are driven by allosteric changes in protein conformation (see Figure 14-28).


THE PROTON PUMPS OF THE ELECTRON-TRANSPORT CHAIN 


[Figure 14-26 diagram labels: CRISTA SPACE; MATRIX; NADH dehydrogenase (iron-sulfur clusters, NAD+); cytochrome c reductase (iron-sulfur cluster); cytochrome c; cytochrome c oxidase]


a molecule of ubiquinone. The reduced ubiquinol then passes its two electrons 
to the respiratory chain via cytochrome c reductase (see Figure 14-18). Succinate 
dehydrogenase is not a proton pump, and it does not contribute directly to the 
electrochemical potential utilized for ATP production in mitochondria. Thus, it is 
not considered to be an integral part of the respiratory chain. 


Protons Can Move Rapidly Through Proteins Along Predefined 
Pathways 


The protons in water are highly mobile: by rapidly dissociating from one water 
molecule and associating with its neighbor, they can rapidly flit through a hydro- 
gen-bonded network of water molecules (see Figure 2-5). But how can a pro- 
ton move through the hydrophobic interior of a protein embedded in the lipid 
bilayer? Proton-translocating proteins contain so-called proton wires, which are 
rows of polar or ionic side chains, or water molecules spaced at short distances, 
so that the protons can jump from one to the next (Figure 14-27). Along such 
predefined pathways, protons move up to 40 times faster than through bulk water. 
The three-dimensional structure of cytochrome c oxidase indicates two different 
proton-uptake pathways. This confirmed earlier mutagenesis studies, which had 
shown that replacing the side chains of particular aspartate or arginine residues, 
whose side chains can bind and release protons, made the cytochrome c oxidase 
less efficient as a proton pump. 

But how can electron transport cause allosteric changes in protein conforma- 
tions that pump protons? From the most basic point of view, if electron transport 
drives sequential allosteric changes in protein conformation that alter the redox 
state of the components, these conformational changes can be connected to pro-
ton wires that allow the protein to pump H+ across the crista membrane. This
type of H+ pumping requires at least three distinct conformations for the pump
protein, as schematically illustrated in Figure 14-28. 


Figure 14-27 Proton movement through water and proteins. (A) Protons
move rapidly through water, hopping from one H2O molecule to the next
by the continuous formation and dissociation of hydronium ions, H3O+ (see
Chapter 2). In this diagram, proton jumps are indicated by red arrows.
(B) Protons can move even more rapidly through a protein along “proton
wires.” These are predefined proton paths consisting of suitably spaced
amino acid side chains that accept and release protons easily (Asp, Glu)
or carry a waterlike hydroxyl group (Ser, Thr), along with water molecules
trapped in the protein interior.
Figure 14-26 The respiratory-chain
supercomplex from bovine heart
mitochondria. The three proton-pumping
complexes of the mitochondrial respiratory
chain of mammalian mitochondria
assemble into large supercomplexes in the
crista membrane. Supercomplexes can
be isolated by mild detergent treatment
of mitochondria, and their structure
has been deciphered by single-particle
cryoelectron microscopy. The bovine
heart supercomplex has a total mass of
1.7 megadaltons. Shown is a schematic
of such a complex that consists of
NADH dehydrogenase, cytochrome c 
reductase, and cytochrome c oxidase, as 
indicated. The facing quinol-binding sites 
of NADH dehydrogenase and cytochrome 
c reductase, plus the short distance 
between the cytochrome c-binding sites in 
cytochrome c reductase and cytochrome 
c oxidase, facilitate fast, efficient electron 
transfer. Cofactors active in electron 
transport are marked as a yellow dot 
(flavin mononucleotide), red and yellow 
dots (iron-sulfur clusters), Q (quinone), red 
squares (hemes), and a blue dot (copper 
atom). Only cofactors participating in the 
linear flow of electrons from NADH to water 
are shown. Blue arrows indicate the path 
of the electrons through the supercomplex. 
(Adapted from T. Althoff et al., EMBO J.
30:4652-4664, 2011.) 
Summary


The respiratory chain embedded in the inner mitochondrial membrane contains 
three respiratory enzyme complexes, through which electrons pass on their way 
from NADH to O2. In these complexes, electrons are transferred along a series of pro-
tein-bound electron carriers, including hemes and iron-sulfur clusters. The energy 
released as the electrons move to lower and lower energy levels is used to pump 
protons by different mechanisms in the three respiratory enzyme complexes, each 
coupling lateral electron transport to vectorial proton transport across the mem- 
brane. Electrons are shuttled between enzyme complexes by the mobile electron car- 
riers ubiquinone and cytochrome c to complete the electron-transport chain. The 
path of electron flow is NADH → NADH dehydrogenase complex → ubiquinone
→ cytochrome c reductase → cytochrome c → cytochrome c oxidase complex →
molecular oxygen (O2).


ATP PRODUCTION IN MITOCHONDRIA 


As we have just discussed, the three proton pumps of the respiratory chain each 
contribute to the formation of an electrochemical proton gradient across the inner 
mitochondrial membrane. This gradient drives ATP synthesis by ATP synthase, a 
large membrane-bound protein complex that performs the extraordinary feat of 
converting the energy contained in this electrochemical gradient into biologically 
useful, chemical-bond energy in the form of ATP (see Figure 14-10). Protons flow 
down their electrochemical gradient through the membrane part of this proton 
turbine, thereby driving the synthesis of ATP from ADP and Pi in the extramem-
branous part of the complex. As discussed in Chapter 2, the formation of ATP from
ADP and inorganic phosphate is highly unfavorable energetically. As we shall see, 
ATP synthase can produce ATP only because of allosteric shape changes in this 
protein complex that directly couple ATP synthesis to the energetically favorable 
flow of protons across its membrane. 


The Large Negative Value of ΔG for ATP Hydrolysis Makes ATP
Useful to the Cell 


An average person turns over roughly 50 kg of ATP per day. In athletes running a 
marathon, this figure can go up to several hundred kilograms. The ATP produced 
in mitochondria is derived from the energy available in the intermediates NADH, 
FADH2, and GTP. These three energy-rich compounds are produced both by the
oxidation of glucose (Table 14-1A), and by the oxidation of fats (Table 14-1B; see 
also Figure 2-56). 

Glycolysis alone can produce only two molecules of ATP for every molecule of 
glucose that is metabolized, and this is the total energy yield for the fermentation 
processes that occur in the absence of O2 (discussed in Chapter 2). In oxidative 


Figure 14-28 A general model for H+
pumping coupled to electron transport.
This mechanism for H+ pumping by a
transmembrane protein is thought to
be used by NADH dehydrogenase and
cytochrome c oxidase, and by many
other proton pumps. The protein is driven
through a cycle of three conformations. In
one of these conformations, the protein has
a high affinity for H+, causing it to pick up
an H+ on the inside of the membrane. In
another conformation, the protein has a low
affinity for H+, causing it to release an H+ on
the outside of the membrane. As indicated,
the transitions from one conformation
to another occur only in one direction,
because they are driven by being
allosterically coupled to the energetically
favorable process of electron transport
(discussed in Chapter 11).




TABLE 14-1

(A) Oxidation of glucose

In cytosol (glycolysis)
1 glucose → 2 pyruvate + 2 NADH + 2 ATP

In mitochondrion (pyruvate dehydrogenase and citric acid cycle)
2 pyruvate → 2 acetyl CoA + 2 NADH
2 acetyl CoA → 6 NADH + 2 FADH2 + 2 GTP

Net result in mitochondrion
2 pyruvate → 8 NADH + 2 FADH2 + 2 GTP

(B) Oxidation of fats

In mitochondrion (fatty acid oxidation and citric acid cycle)
1 palmitoyl CoA → 8 acetyl CoA + 7 NADH + 7 FADH2
8 acetyl CoA → 24 NADH + 8 FADH2 + 8 GTP

Net result in mitochondrion
1 palmitoyl CoA → 31 NADH + 15 FADH2 + 8 GTP


phosphorylation, each pair of electrons donated by the NADH produced in mito- 
chondria can provide energy for the formation of about 2.5 molecules of ATP. Oxi- 
dative phosphorylation also produces 1.5 ATP molecules per electron pair from 
the FADH2 produced by succinate dehydrogenase in the mitochondrial matrix, 
and from the NADH molecules produced by glycolysis in the cytosol. From the 
product yields of glycolysis and the citric acid cycle, we can calculate that the 
complete oxidation of one molecule of glucose—starting with glycolysis and end- 
ing with oxidative phosphorylation—gives a net yield of about 30 molecules of 
ATP. Nearly all this ATP is produced by the mitochondrial ATP synthase. 
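
The arithmetic behind the figure of about 30 ATP can be laid out explicitly. The Python sketch below is a minimal illustration, not a metabolic model: the function name and argument names are hypothetical, each GTP is counted as one ATP equivalent (an assumption), and every FADH2 is assumed to yield 1.5 ATP, the value the text gives for the FADH2 produced by succinate dehydrogenase.

```python
# ATP formed per electron pair, as stated in the text.
ATP_PER_MITOCHONDRIAL_NADH = 2.5
ATP_PER_FADH2 = 1.5
ATP_PER_CYTOSOLIC_NADH = 1.5


def atp_yield(substrate_level_atp: float, gtp: float, mito_nadh: float,
              fadh2: float, cytosolic_nadh: float = 0.0) -> float:
    """Approximate ATP equivalents from the product yields in Table 14-1.

    GTP is counted as one ATP equivalent (assumption made for this sketch).
    """
    return (substrate_level_atp
            + gtp
            + mito_nadh * ATP_PER_MITOCHONDRIAL_NADH
            + fadh2 * ATP_PER_FADH2
            + cytosolic_nadh * ATP_PER_CYTOSOLIC_NADH)


if __name__ == "__main__":
    # Glucose: 2 ATP and 2 NADH from glycolysis; 8 NADH, 2 FADH2, 2 GTP in the mitochondrion.
    glucose = atp_yield(substrate_level_atp=2, gtp=2, mito_nadh=8,
                        fadh2=2, cytosolic_nadh=2)
    print(f"glucose: about {glucose:.0f} ATP")

    # Palmitoyl CoA: 31 NADH, 15 FADH2, 8 GTP (same assumptions; the cost of
    # activating the fatty acid is not included).
    palmitoyl_coa = atp_yield(substrate_level_atp=0, gtp=8, mito_nadh=31, fadh2=15)
    print(f"palmitoyl CoA: about {palmitoyl_coa:.0f} ATP equivalents")
```

For glucose the terms are 2 + 2 + 20 + 3 + 3 = 30, in agreement with the net yield quoted above.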

In Chapter 2, we introduced the concept of free energy (G). The free-energy
change for a reaction, ΔG, determines whether that reaction will occur in a cell.
We showed on pp. 60-63 that the ΔG for a given reaction can be written as the sum
of two parts: the first, called the standard free-energy change, ΔG°, depends only
on the intrinsic characters of the reacting molecules; the second depends only on
their concentrations. For the simple reaction A → B,

$$\Delta G = \Delta G^{\circ} + RT \ln \frac{[B]}{[A]}$$

where [A] and [B] denote the concentrations of A and B, and ln is the natural log-
arithm. ΔG° is the standard reference value, which can be seen to be equal to the
value of ΔG when the molar concentrations of A and B are equal (since ln 1 = 0).

In Chapter 2, we discussed how the large, favorable free-energy change (large
negative ΔG) for ATP hydrolysis is used, through coupled reactions, to drive
many other chemical reactions in the cell that would otherwise not occur (see
pp. 65-66). The ATP hydrolysis reaction produces two products, ADP and Pi; it is
therefore of the type A → B + C, where, as demonstrated in Figure 14-29,

$$\Delta G = \Delta G^{\circ} + RT \ln \frac{[B][C]}{[A]}$$

When ATP is hydrolyzed to ADP and Pi under the conditions that normally
exist in a cell, the free-energy change is roughly -46 to -54 kJ/mole (-11 to -13
kcal/mole). This extremely favorable ΔG depends on maintaining a high concen-
tration of ATP compared with the concentrations of ADP and Pi. When ATP, ADP,
and Pi are all present at the same concentration of 1 mole/liter (so-called standard
conditions), the ΔG for ATP hydrolysis drops to the standard free-energy change
(ΔG°), which is only -30.5 kJ/mole (-7.3 kcal/mole). At much lower concentra-
tions of ATP relative to ADP and Pi, ΔG becomes zero. At this point, the rate at
For the reaction

ATP → ADP + Pi

the following equation applies:

$$\Delta G = \Delta G^{\circ} + RT \ln \frac{[\mathrm{ADP}][\mathrm{P_i}]}{[\mathrm{ATP}]}$$

where ΔG and ΔG° are in Joules per mole, R is the gas
constant (8.3 J/mole K), T is the absolute temperature
(K), and all the concentrations are in moles per liter.

When the concentrations of all reactants are at 1 M, ΔG = ΔG°
(since RT ln 1 = 0). ΔG° is thus a constant defined as the
standard free-energy change for the reaction.

At equilibrium the reaction has no net effect on the disorder of
the universe, so ΔG = 0. Therefore, at equilibrium,

$$-RT \ln \frac{[\mathrm{ADP}][\mathrm{P_i}]}{[\mathrm{ATP}]} = \Delta G^{\circ}$$

But the concentrations of reactants at equilibrium must satisfy
the equilibrium equation:

$$\frac{[\mathrm{ADP}][\mathrm{P_i}]}{[\mathrm{ATP}]} = K$$

Therefore,

$$\Delta G^{\circ} = -RT \ln K$$

We thus see that whereas ΔG° indicates the equilibrium point for a
reaction, ΔG reveals how far the reaction is from equilibrium. ΔG is
a measure of the “driving force” for any chemical reaction, just as the
proton-motive force is the driving force for the translocation of protons.











which ADP and Pi will join to form ATP will be equal to the rate at which ATP
hydrolyzes to form ADP and Pi. In other words, when ΔG = 0, the reaction is at
equilibrium (see Figure 14-29).

It is ΔG, not ΔG°, that indicates how far a reaction is from equilibrium and
determines whether it can drive other reactions. Because the efficient conver-
sion of ADP to ATP in mitochondria maintains such a high concentration of ATP
relative to ADP and Pi, the ATP hydrolysis reaction in cells is kept very far from
equilibrium and ΔG is correspondingly very negative. Without this large disequi-
librium, ATP hydrolysis could not be used to drive the reactions of the cell. At low
ATP concentrations, many biosynthetic reactions would run backward and the 
cell would die. 
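
The dependence of ΔG on concentrations can be made concrete with a short calculation. The Python sketch below evaluates the relation ΔG = ΔG° + RT ln([ADP][Pi]/[ATP]) using the ΔG° of -30.5 kJ/mole given above. The concentrations (5 mM ATP, 0.5 mM ADP, 5 mM Pi), the temperature of 310 K, and the function names are assumptions chosen only for illustration; they are not measured values from the text.

```python
import math

R = 8.314                    # gas constant, J/(mole K)
DELTA_G0_ATP = -30.5e3       # standard free-energy change for ATP hydrolysis, J/mole


def delta_g(delta_g0: float, products: list[float], reactants: list[float],
            temp_k: float = 310.0) -> float:
    """Free-energy change (J/mole) at the given molar concentrations."""
    ratio = math.prod(products) / math.prod(reactants)
    return delta_g0 + R * temp_k * math.log(ratio)


def equilibrium_constant(delta_g0: float, temp_k: float = 310.0) -> float:
    """K from the relation delta_g0 = -RT ln K."""
    return math.exp(-delta_g0 / (R * temp_k))


if __name__ == "__main__":
    # Illustrative (assumed) cellular concentrations, in moles per liter.
    atp, adp, pi = 5e-3, 0.5e-3, 5e-3
    dg_cell = delta_g(DELTA_G0_ATP, products=[adp, pi], reactants=[atp])
    print(f"delta G in the assumed cell conditions: {dg_cell / 1000:.1f} kJ/mole")

    # Under standard conditions (all at 1 M) the logarithm vanishes.
    dg_std = delta_g(DELTA_G0_ATP, products=[1.0, 1.0], reactants=[1.0])
    print(f"delta G at 1 M each: {dg_std / 1000:.1f} kJ/mole")

    print(f"equilibrium constant K: {equilibrium_constant(DELTA_G0_ATP):.3g} moles/liter")
```

With these assumed concentrations the result falls within the range of -46 to -54 kJ/mole quoted above, whereas at 1 M each the value is simply ΔG°. Setting the concentration ratio equal to K returns a ΔG of zero, the equilibrium condition.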


The ATP Synthase Is a Nanomachine that Produces ATP by 
Rotary Catalysis 


The ATP synthase is a finely tuned nanomachine composed of 23 or more sepa- 
rate protein subunits, with a total mass of about 600,000 daltons. The ATP synthase 
can work both in the forward direction, producing ATP from ADP and phosphate 
in response to an electrochemical gradient, or in reverse, generating an electro- 
chemical gradient by ATP hydrolysis. To distinguish it from other enzymes that 
hydrolyze ATP, it is also called an F1Fo ATP synthase or F-type ATPase.
Resembling a turbine, ATP synthase is composed of both a rotor and a stator 
(Figure 14-30). To prevent the catalytic head from rotating, a stalk at the periphery 
of the complex (the stator stalk) connects the head to stator subunits embedded 
in the membrane. A second stalk in the center of the assembly (the rotor stalk) is 
connected to the rotor ring in the membrane that turns as protons flow through it, 
driven by the electrochemical gradient across the membrane. As a result, proton 


Figure 14-29 The basic relationship
between free-energy changes and
equilibrium in the ATP hydrolysis
reaction. The rate constants in boxes 1
and 2 are determined from experiments in
which product accumulation is measured
as a function of time (conc., concentration).
The equilibrium constant shown here, K,
is in units of moles per liter. (See Panel
2-7, pp. 102-103, for a discussion of
free energy and see Figure 3-44 for a
discussion of the equilibrium constant.)




flow makes the rotor stalk rotate inside the stationary head, where the catalytic 
sites that assemble ATP from ADP and Pi are located. Three α and three β subunits
of similar structure alternate to form the head. Each of the three β subunits has a
catalytic nucleotide-binding site at the α/β interface. These catalytic sites are all in
different conformations, depending on their interaction with the rotor stalk. This
stalk acts like a camshaft, the device that opens and closes the valves in a com-
bustion engine. As it rotates within the head, the stalk changes the conformations
of the β subunits sequentially. One of the possible conformations of the catalytic
sites has high affinity for ADP and Pi, and as the rotor stalk pushes the binding site
into a different conformation, these two substrates are driven to form ATP. In this 
way, the mechanical force exerted by the central rotor stalk is directly converted 
into the chemical energy of the ATP phosphate bond. 

Serving as a proton-driven turbine, the ATP synthase is driven by H+ flow into
the matrix to spin at about 8000 revolutions per minute, generating three mole- 
cules of ATP per turn. In this way, each ATP synthase can produce roughly 400 
molecules of ATP per second. 
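
The figure of roughly 400 ATP per second follows directly from the rotation rate. A minimal Python check, with a hypothetical function name, is:

```python
def atp_per_second(revolutions_per_minute: float, atp_per_turn: int = 3) -> float:
    """ATP molecules made per second by one ATP synthase at a given rotation rate."""
    return revolutions_per_minute * atp_per_turn / 60.0


if __name__ == "__main__":
    # 8000 revolutions per minute and three ATP per turn, as stated in the text.
    print(atp_per_second(8000))
```

At 8000 revolutions per minute the rotor completes about 133 turns per second, and three ATP per turn gives 400 ATP per second.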


Proton-driven Turbines Are of Ancient Origin 


The membrane-embedded rotors of ATP synthases consist of a ring of identical c 
subunits (Figure 14-31). Each c subunit is a hairpin of two membrane-spanning 
α helices that contain a proton-binding site defined by a glutamate or aspartate in
the middle of the lipid bilayer. The a subunit, which is part of the stator (see Figure 
14-30), makes two narrow channels at the interface between the rotor and stator, 
each spanning half of the membrane and converging on the proton-binding site 
at the middle of the rotor subunit. Protons flow through the two half-channels 
down their electrochemical gradient from the crista space back into the matrix. 
A negatively charged side chain in the binding site accepts a proton arriving from 
the crista space through the first half-channel, as it rotates past the a subunit. The 
bound proton then rides round in the ring for a full cycle, whereupon it is thought 
to be displaced by a positively charged arginine in the a subunit, and escapes 
Figure 14-30 ATP synthase. The
three-dimensional structure of the F1Fo
ATP synthase, determined by x-ray
crystallography. Also known as an F-type
ATPase, it consists of an Fo part (from
“oligomycin-sensitive factor”) in the
membrane and the large, catalytic F1 head
in the matrix. Under mild dissociation
conditions, this complex separates into
its F1 and Fo components, which can
be isolated and studied individually. (A)
Diagram of the enzyme complex showing
how its globular head portion (green) is
kept stationary as proton-flow across the
membrane drives a rotor (blue) that turns
inside it. (B) In bovine heart mitochondria,
the Fo rotor ring in the membrane (light
blue) has eight c subunits. It is attached
to the γ subunit of the central stalk (dark
blue) by the ε subunit (purple). The catalytic
F1 head consists of a ring of three α and
three β subunits (light and dark green), and
it directly converts mechanical energy into
chemical-bond energy in ATP, as described
in the text. The elongated peripheral stalk
of the stator (orange) is connected to the
F1 head by the small δ subunit (red) at one
end, and to the a subunit in the membrane
(pink oval) at the other. Together with the
c subunits of the ring rotating past it, the a
subunit creates a path for protons through
the membrane. (C) The symbol for ATP
synthase used throughout this book.

The closely related ATP synthases of
mitochondria, chloroplasts, and bacteria
synthesize ATP by harnessing the proton-
motive force across a membrane. This
powers the rotation of the rotor against
the stator in a counterclockwise direction,
as seen from the F1 head. The same
enzyme complex can also pump protons
against their electrochemical gradient
by hydrolyzing ATP, which then drives
the clockwise rotation of the rotor. The
direction of operation depends on the net
free-energy change (ΔG) for the coupled
processes of H+ translocation across the
membrane and the synthesis of ATP from
ADP and Pi (Movie 14.5 and Movie 14.6).

Measurement of the torque that the ATP 
synthase can produce by ATP hydrolysis 
reveals that the ATP synthase is 60 times 
more powerful than a diesel engine of equal 
dimensions. (B, courtesy of K. Davies. 
PDB codes: 2WPD, 2CLY, 2WSS, 2BO5.) 
through the second half-channel into the matrix. Thus proton flow causes the
rotor ring to spin against the stator like a proton-driven turbine. 

The mitochondrial ATP synthase is of ancient origin: essentially the same 
enzyme occurs in plant chloroplasts and in the plasma membrane of bacteria or 
archaea. The main difference between them is the number of c subunits in the 
rotor ring. In mammalian mitochondria, the ring has 8 subunits. In yeast mito- 
chondria, the number is 10; in bacteria and archaea, it ranges from 11 to 13; in 
plant chloroplasts, there are 14; and the rings of some cyanobacteria contain 15 c 
subunits. 

The c subunits in the rotor ring can be thought of as cogs in the gears of a bicy- 
cle. A high gear, with a small number of cogs, is advantageous when the supply 
of protons is limited, as in mitochondria, but a low gear, with a large number of 
cogs in the wheel, is preferable when the proton gradient is high. This is the case 
in chloroplasts and cyanobacteria, where protons produced through the action of 
sunlight are plentiful. Because each rotation produces three molecules of ATP in 
the head, the synthesis of one ATP requires around three protons in mitochondria 
but up to five in photosynthetic organisms. It is the number of c subunits in the 
ring that defines how many protons need to pass through this marvelous device 
to make each molecule of ATP, and thereby how high a ratio of ATP to ADP can be 
maintained by the ATP synthase. 
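
The gear analogy amounts to a simple ratio. The Python sketch below assumes, as the description of the rotor implies, that one proton passes through the membrane for each c subunit per full turn and that each turn yields three ATP; the dictionary and function names are hypothetical, and the ring sizes are those listed in the text.

```python
ATP_PER_TURN = 3  # three catalytic sites in the head, one ATP each per rotation

# Number of c subunits in the rotor ring, as given in the text.
C_RING_SIZES = {
    "mammalian mitochondria": 8,
    "yeast mitochondria": 10,
    "plant chloroplasts": 14,
    "some cyanobacteria": 15,
}


def protons_per_atp(c_subunits: int) -> float:
    """Protons that must cross the membrane per ATP made.

    Assumes one proton per c subunit per full turn of the rotor ring.
    """
    return c_subunits / ATP_PER_TURN


if __name__ == "__main__":
    for source, ring_size in C_RING_SIZES.items():
        print(f"{source}: {ring_size} c subunits, "
              f"{protons_per_atp(ring_size):.2f} H+ per ATP")
```

An 8-subunit ring gives about 2.7 protons per ATP ("around three"), and a 15-subunit ring gives 5, the two ends of the range described above.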

In principle, ATP synthase can also run in reverse as an ATP-powered proton 
pump that converts the energy of ATP back into a proton gradient across the mem- 
brane. In many bacteria, the rotor of the ATP synthase in the plasma membrane 
changes direction routinely, from ATP synthesis mode in aerobic respiration, to 
ATP hydrolysis mode in anaerobic metabolism. In this latter case, ATP hydroly- 
sis serves to maintain the proton gradient across the plasma membrane, which 
is used to power many other essential cell functions including nutrient transport 
and the rotation of bacterial flagella. The V-type ATPases that acidify certain cel- 
lular organelles are architecturally similar to the F-type ATP synthases, but they 
normally function in reverse (see Figure 13-37). 


Mitochondrial Cristae Help to Make ATP Synthesis Efficient 


In the electron microscope, the mitochondrial ATP synthase complexes can be 
seen to project like lollipops on the matrix side of cristae membranes. Recent 
studies by cryoelectron microscopy and tomography have shown that this large 
complex is not distributed randomly in the membrane, but forms long rows of 
dimers along the cristae ridges (Figure 14-32). The dimer rows induce or stabi- 
lize these regions of high membrane curvature, which are otherwise energetically 
unfavorable. Indeed, the formation of ATP synthase dimers and their assembly 
into rows are required for cristae formation and have far-reaching consequences 
for cellular fitness. By contrast with bacterial or chloroplast ATP synthases, which 
do not form dimers, the mitochondrial complex contains additional subunits, 
located mostly near the membrane end of the stator stalk. Several of these sub- 
units are found to be dimer-specific. If these subunits are mutated in yeast, the 
ATP synthase in the membrane remains monomeric, the mitochondria have no 
cristae, cellular respiration drops by half, and the cells grow more slowly. 


Figure 14-31 Fo ATP synthase rotor rings. (A) Atomic force microscopy 
image of ATP synthase rotors from the cyanobacterium Synechococcus 
elongatus in a lipid bilayer. Whereas 8 c subunits form the rotor in Figure 
14-30, there are 13 c subunits in this ring. (B) The x-ray structure of the Fo 
ring of the ATP synthase from Spirulina platensis, another cyanobacterium, 
shows that this rotor has 15 c subunits. In all ATP synthases, the c subunits 
are hairpins of two membrane-spanning α helices (one subunit is highlighted
in gray). The helices are highly hydrophobic, except for two glutamine
and glutamate side chains (yellow) that create proton-binding sites in the
membrane. (A, courtesy of Thomas Meier and Denys Pogoryelov; B, PDB 
code: 2WIE.) 
Electron tomography suggests that the proton pumps of the respiratory chain
are located in the membrane regions at either side of the dimer rows. Protons 
pumped into the crista space by these respiratory-chain complexes are thought 
to diffuse very rapidly along the membrane surface, with the ATP synthase rows 
creating a proton “sink” at the cristae tips (Figure 14-33). In vitro studies suggest 
that the ATP synthase needs a proton gradient of about 2 pH units to produce ATP 
at the rate required by the cell, irrespective of the membrane potential. The H+
gradient across the inner mitochondrial membrane is only 0.5 to 0.6 pH units. The 
cristae thus seem to work as proton traps that enable the ATP synthase to make 
efficient use of the protons pumped out of the mitochondrial matrix. As we shall 
see in the next section, this elaborate arrangement of membrane protein com-
plexes is absent in chloroplasts, where the H+ gradient is much higher.
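
Because pH is a logarithmic scale, the gap between the two gradients mentioned above is larger than the numbers suggest. The Python sketch below, with a hypothetical function name, converts a pH difference into the corresponding ratio of H+ concentrations; it is an illustration of the scale only and says nothing about local proton concentrations at the cristae tips.

```python
def proton_concentration_ratio(delta_ph: float) -> float:
    """Fold difference in H+ concentration across a membrane for a given pH difference."""
    return 10.0 ** delta_ph


if __name__ == "__main__":
    for delta_ph in (0.5, 0.6, 2.0):
        ratio = proton_concentration_ratio(delta_ph)
        print(f"delta pH = {delta_ph}: about {ratio:.1f}-fold difference in [H+]")
```

A difference of 2 pH units corresponds to a 100-fold difference in H+ concentration, whereas 0.5 to 0.6 pH units corresponds to only about a 3- to 4-fold difference.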


Special Transport Proteins Exchange ATP and ADP Through the 
Inner Membrane 


Like all biological membranes, the inner mitochondrial membrane contains 
numerous specific transport proteins that allow particular substances to pass 
through. One of the most abundant of these is the ADP/ATP carrier protein (Fig- 
ure 14-34). This carrier shuttles the ATP produced in the matrix through the inner 
membrane to the intermembrane space, from where it diffuses through the outer 
mitochondrial membrane to the cytosol. In exchange, ADP passes from the cytosol 
into the matrix for recycling into ATP. ATP4- has one more negative charge 
than ADP3-, and the exchange of ATP and ADP is driven by the electrochemical 
gradient across the inner membrane, so that the more negatively charged ATP is 
pushed out of the matrix, and the less negatively charged ADP is pulled in. The 
ADP/ATP carrier is but one member of a mitochondrial carrier family: the inner 
Figure 14-32 Dimers of mitochondrial 
ATP synthase in cristae membranes. 

(A) A three-dimensional map of a small 
mitochondrion obtained by electron 
microscope tomography shows that ATP 
synthases form long paired rows along 
cristae ridges. The outer membrane is 
gray, the inner membrane and cristae 
membranes have been colored light blue. 
Each head of an ATP synthase is indicated 
by a yellow sphere. (B) A three-dimensional 
map of a mitochondrial ATP synthase 
dimer in the crista membrane obtained 
by subtomogram averaging, with fitted 
x-ray structures (Movie 14.7). (A, from 
K. Davies et al., Proc. Natl Acad. Sci. USA 
108:14121-14126, 2011. With permission 
from the National Academy of Sciences; 
B, from K. Davies et al., Proc. Natl Acad. 
Sci. USA 109:13602-13607, 2012. 
With permission from the National 
Academy of Sciences.) 


Figure 14-33 ATP synthase dimers at 
cristae ridges and ATP production. At the 
crista ridges, the ATP synthases (yellow) form 
a sink for protons (red). The proton pumps 
of the electron-transport chain (green) are 
located in the membrane regions on either 
side of the crista. As illustrated, protons 
tend to diffuse along the membrane from 
their source to the proton sink created by 
the ATP synthase. This allows efficient ATP 
production despite the small H+ gradient 
between the cytosol and matrix. Red arrows 
show the direction of the proton flow. 
Figure 14-34 The ADP/ATP carrier protein. (A) The ADP/ATP carrier protein is a small 
membrane protein that carries the ATP produced on the matrix side of the inner membrane to 
the intermembrane space, and the ADP that is needed for ATP synthesis into the matrix. (B) In 
the ADP/ATP carrier, six transmembrane α helices define a cavity that binds either ADP or ATP. In 
this x-ray structure, the substrate is replaced by a tightly bound inhibitor (colored). When 
ADP binds from outside the inner membrane, it triggers a conformational change and is released 
into the matrix. In exchange, a molecule of ATP quickly binds to the matrix side of the carrier 
and is transported to the intermembrane space. From there the ATP diffuses through the outer 
mitochondrial membrane to the cytoplasm, where it powers the energy-requiring processes in the 
cell. (B, PDB code: 1OKC.) 


mitochondrial membrane contains about 20 related carrier proteins exchanging 
various other metabolites, including the phosphate that is required along with 
ADP for ATP synthesis. 
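
The charge argument for ADP/ATP exchange can be made quantitative with a small sketch. Each exchange sends ATP4- out of the matrix and brings ADP3- in, so one net negative charge leaves the matrix per cycle. The membrane potential of 150 mV used in the example is an assumed, illustrative value, and the function names are hypothetical.

```python
F = 96485.0  # Faraday constant, C/mol

ATP_CHARGE = -4  # charge on ATP, as stated in the text
ADP_CHARGE = -3  # charge on ADP, as stated in the text


def net_charge_exported() -> int:
    """Net charge leaving the matrix per exchange (ATP out, ADP in)."""
    return ATP_CHARGE - ADP_CHARGE


def exchange_energy_kj_per_mol(membrane_potential_mv: float) -> float:
    """Free energy made available per mole of exchanges by the membrane
    potential alone (matrix side negative)."""
    volts = membrane_potential_mv / 1000.0
    return abs(net_charge_exported()) * F * volts / 1000.0


print(net_charge_exported())                       # net charge per exchange
print(round(exchange_energy_kj_per_mol(150.0), 1)) # kJ/mol at an assumed 150 mV
```

With the assumed 150 mV, moving one net negative charge out of the negatively charged matrix releases about 14.5 kJ per mole of exchanges, which is why the exchange runs in the direction described.
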

In some specialized fat cells, mitochondrial respiration is uncoupled from ATP 
synthesis by the uncoupling protein, another member of the mitochondrial carrier 
family. In these cells, known as brown fat cells, most of the energy of oxidation is 
dissipated as heat rather than being converted into ATP. In the inner membranes 
of the large mitochondria in these cells, the uncoupling protein allows protons 
to move down their electrochemical gradient without passing through ATP syn- 
thase. This process is switched on when heat generation is required, causing the 
cells to oxidize their fat stores at a rapid rate and produce heat rather than ATP. 
Tissues containing brown fat serve as “heating pads,” helping to revive hibernat- 
ing animals and to protect newborn human babies from the cold. 


Chemiosmotic Mechanisms First Arose in Bacteria 


Bacteria use enormously diverse energy sources. Some, like animal cells, are aerobic; 
they synthesize ATP from sugars they oxidize to CO2 and H2O by glycolysis, 
the citric acid cycle, and a respiratory chain in their plasma membrane that is 
similar to the one in the inner mitochondrial membrane. Others are strict anaer- 
obes, deriving their energy either from glycolysis alone (by fermentation, see Fig- 
ure 2-47) or from an electron-transport chain that employs a molecule other than 
oxygen as the final electron acceptor. The alternative electron acceptor can be a 
nitrogen compound (nitrate or nitrite), a sulfur compound (sulfate or sulfite), or 
a carbon compound (fumarate or carbonate), for example. A series of electron 
carriers in the plasma membrane that are comparable to those in mitochondrial 
respiratory chains transfers the electrons to these acceptors. 

Despite this diversity, the plasma membrane of the vast majority of bacteria 
contains an ATP synthase that is very similar to the one in mitochondria. In bacteria 
that use an electron-transport chain to harvest energy, the electron-transport 
chain pumps H+ out of the cell and thereby establishes a proton-motive force 
across the plasma membrane that drives the ATP synthase to make ATP. In other 
bacteria, the ATP synthase works in reverse, using the ATP produced by glycolysis 
to pump H+ and establish a proton gradient across the plasma membrane. 

Bacteria, including the strict anaerobes, maintain a proton gradient across 
their plasma membrane that is harnessed to drive many other processes. It can 
be used to drive a flagellar motor, for example (Figure 14-35). This gradient is 
harnessed to pump Na+ out of the bacterium via a Na+-H+ antiporter that takes 
the place of the Na+-K+ pump of eukaryotic cells. The gradient is also used for the 
active inward transport of nutrients, such as most amino acids and many sugars: 
each nutrient is dragged into the cell along with one or more protons through a 
specific symporter (Figure 14-36; see also Chapter 11). In animal cells, by contrast, 
most inward transport across the plasma membrane is driven by the Na+ 
gradient (high Na+ outside, low Na+ inside) that is established by the Na+-K+ pump 
(see Figure 11-15). 

Some unusual bacteria have adapted to live in a very alkaline environment 
and yet must maintain their cytoplasm at a physiological pH. For these cells, any 
attempt to generate an electrochemical H+ gradient would be opposed by a large 
H+ concentration gradient in the wrong direction (H+ higher inside than outside). 
Presumably for this reason, some of these bacteria substitute Na+ for H+ in all of 
their chemiosmotic mechanisms. The respiratory chain pumps Na+ out of the 
cell, the transport systems and flagellar motor are driven by an inward flux of Na+, 
and a Na+-driven ATP synthase synthesizes ATP. The existence of such bacteria 
demonstrates a critical point: the principle of chemiosmosis is more fundamental 
than the proton-motive force on which it is normally based. 

As we discuss next, an ATP synthase coupled to chemiosmotic processes is 
also a central feature of plants, where it plays critical roles in both mitochondria 
and chloroplasts. 


Figure 14-36 The importance of H+-driven transport in bacteria. 
A proton-motive force generated across the plasma membrane pumps 
nutrients into the cell and expels Na+. (A) In an aerobic bacterium, 
a respiratory chain fed by the oxidation of substrates produces an 
electrochemical proton gradient across the plasma membrane. This gradient 
is then harnessed to make ATP, as well as to transport nutrients (proline, 
succinate, lactose, and lysine) into the cell and to pump Na+ out of the cell. 
(B) When the same bacterium grows under anaerobic conditions, it derives its 
ATP from glycolysis. As indicated, the ATP synthase in the plasma membrane 
then hydrolyzes some of this ATP to establish an electrochemical proton 
gradient that drives the same transport processes that depend on respiratory 
chain proton-pumping in (A). 
Figure 14-35 The rotation of the 
bacterial flagellum driven by H+ flow. 
The flagellum is attached to a series of 
protein rings (pink), which are embedded in 
the outer and inner membranes and rotate 
with the flagellum. The rotation is driven 
by a flow of protons through an outer ring 
of proteins (the stator) by mechanisms 
that may resemble those used by the ATP 
synthase. However, the flow of protons 
in the flagellar motor is always toward 
the cytosol, both during clockwise and 
counterclockwise rotation, whereas in 
ATP synthase this flow reverses with the 
direction of rotation (Movie 14.8). 
Summary 


The large amount of free energy released when H+ flows back into the matrix from 
the cristae provides the basis for ATP production on the matrix side of mitochon- 
drial cristae membranes by a remarkable protein machine—the ATP synthase. The 
ATP synthase functions like a miniature turbine, and it is a reversible device that can 
couple proton flow to either ATP synthesis or ATP hydrolysis. The transmembrane 
electrochemical gradient that drives ATP production in mitochondria also drives 
the active transport of selected metabolites across the inner mitochondrial mem- 
brane, including an efficient ADP/ATP exchange between the mitochondrion and 
the cytosol that keeps the cell’s ATP pool highly charged. The resulting high cellular 
concentration of ATP makes the free-energy change for ATP hydrolysis extremely 
favorable, allowing this hydrolysis reaction to drive a very large number of ener- 
gy-requiring processes throughout the cell. The universal presence of ATP synthase 
in bacteria, mitochondria, and chloroplasts testifies to the central importance of 
chemiosmotic mechanisms in cells. 


CHLOROPLASTS AND PHOTOSYNTHESIS 


All animals and most microorganisms rely on the continual uptake of large 
amounts of organic compounds from their environment. These compounds pro- 
vide both the carbon-rich building blocks for biosynthesis and the metabolic 
energy for life. It is likely that the first organisms on the primitive Earth had access 
to an abundance of organic compounds produced by geochemical processes, but 
it is clear that these were used up billions of years ago. Since that time, virtually 
all of the organic materials required by living cells have been produced by pho- 
tosynthetic organisms, including plants and photosynthetic bacteria. The core 
machinery that drives all photosynthesis appears to have evolved more than 3 bil- 
lion years ago in the ancestors of present-day bacteria; today it provides the only 
major solar energy storage mechanism on Earth. 

The most advanced photosynthetic bacteria are the cyanobacteria, which have 
minimal nutrient requirements. They use electrons from water and the energy of 
sunlight to convert atmospheric CO2 into organic compounds, a process called 
carbon fixation. In the course of the overall reaction $n\,\mathrm{H_2O} + n\,\mathrm{CO_2} \xrightarrow{\text{light}} (\mathrm{CH_2O})_n + n\,\mathrm{O_2}$, 
they also liberate into the atmosphere the molecular oxygen that 
then powers oxidative phosphorylation. In this way, it is thought that the evolu- 
tion of cyanobacteria from more primitive photosynthetic bacteria eventually 
made possible the development of the many different aerobic life-forms that pop- 
ulate the Earth today. 


Chloroplasts Resemble Mitochondria But Have a Separate 
Thylakoid Compartment 


Plants (including algae) developed much later than cyanobacteria, and their pho- 
tosynthesis occurs in a specialized intracellular organelle—the chloroplast (Fig- 
ure 14-37). Chloroplasts use chemiosmotic mechanisms to carry out their energy 
interconversions in much the same way that mitochondria do. Although much 
larger than mitochondria, they are organized on the same principles. They have a 
highly permeable outer membrane; a much less permeable inner membrane, in 
which membrane transport proteins are embedded; and a narrow intermembrane 
space in between. Together, these two membranes form the chloroplast envelope 
(Figure 14-37D). The inner chloroplast membrane surrounds a large space called 
the stroma, which is analogous to the mitochondrial matrix. The stroma contains 
many metabolic enzymes and, as for the mitochondrial matrix, it is the place 
where ATP is made by the head of an ATP synthase. Like the mitochondrion, the 
chloroplast has its own genome and genetic system. The stroma therefore also 
contains a special set of ribosomes, RNAs, and the chloroplast DNA. 

An important difference between the organization of mitochondria and chlo- 
roplasts is highlighted in Figure 14-38. The inner membrane of the chloroplast is 
not folded into cristae and does not contain electron-transport chains. Instead, 
the electron-transport chains, photosynthetic light-capturing systems, and ATP 
synthase are all contained in the thylakoid membrane, a separate, distinct mem- 
brane that forms a set of flattened, disc-like sacs, the thylakoids. The thylakoid 
membrane is highly folded into numerous local stacks of flattened vesicles called 
grana, interconnected by nonstacked thylakoids. The lumen of each thylakoid is 
connected with the lumen of other thylakoids, thereby defining a third internal 
compartment called the thylakoid space. This space represents a separate com- 
partment in each chloroplast that is not connected to either the intermembrane 
space or the stroma. 


Chloroplasts Capture Energy from Sunlight and Use It to Fix 
Carbon 

We can group the reactions that occur during photosynthesis in chloroplasts into 
two broad categories: 


1. The photosynthetic electron-transfer reactions (also called the “light 
reactions”) occur in two large protein complexes, called reaction centers, 
embedded in the thylakoid membrane. A photon (a quantum of light) 
knocks an electron out of the green pigment molecule chlorophyll in the 
first reaction center, creating a positively charged chlorophyll ion. This 
electron then moves along an electron-transport chain and through a sec- 
ond reaction center in much the same way that an electron moves along 
the respiratory chain in mitochondria. During this electron-transport 
process, H+ is pumped across the thylakoid membrane, and the resulting 
Figure 14-37 Chloroplasts in the cell. 
(A) Schematic cross section through the 
leaf of a green plant. (B) Light microscopy 
of a plant leaf cell—here, a mesophyll cell 
from Zinnia elegans — shows chloroplasts 
as bright green bodies, measuring several 
micrometers across, in the transparent 
cell interior. (C) The electron micrograph 
of a thin, stained section through a wheat 
leaf cell shows a thin rim of cytoplasm— 
containing chloroplasts, the nucleus, and 
mitochondria — surrounding a large, water- 
filled vacuole. (D) At higher magnification, 
electron microscopy reveals the chloroplast 
envelope membrane and the thylakoid 
membrane within the chloroplast that 
is highly folded into grana stacks 
(Movie 14.9). (B, courtesy of John Innes 
Foundation; C and D, courtesy of 
K. Plaskitt.) 


Figure 14-38 A mitochondrion and 
chloroplast compared. Chloroplasts 
are generally larger than mitochondria. In 
addition to an outer and inner envelope 
membrane, they contain the thylakoid 
membrane with its internal thylakoid space. 
The chloroplast thylakoid membrane, 
which is the site of solar energy conversion 
in plants and algae, corresponds to the 
mitochondrial cristae, which are the sites 
of energy conversion by cellular respiration. 
Unlike the crista membrane, which is 
continuous with the inner mitochondrial 
membrane at cristae junctions, the thylakoid 
membrane is not connected to the inner 
chloroplast membrane at any point. 
electrochemical proton gradient drives the synthesis of ATP in the stroma. 
As the final step in this series of reactions, electrons are loaded (together 
with H+) onto NADP+, converting it to the energy-rich NADPH molecule. 
Because the positively charged chlorophyll in the first reaction center 
quickly regains its electrons from water (H2O), O2 gas is produced as a 
by-product. All of these reactions are confined to the chloroplast. 


2. The carbon-fixation reactions do not require sunlight. Here the ATP and 
NADPH generated by the light reactions serve as the source of energy and 
reducing power, respectively, to drive the conversion of CO2 to carbohydrate. 
These carbon-fixation reactions begin in the chloroplast stroma, 
where they generate the three-carbon sugar glyceraldehyde 3-phosphate. 
This simple sugar is exported to the cytosol, where it is used to produce 
sucrose and many other organic metabolites in the leaves of the plant. The 
sucrose is then exported to meet the metabolic needs of the nonphoto- 
synthetic plant tissues, serving as a source of both carbon skeletons and 
energy for growth. 


Thus, the formation of ATP, NADPH, and O2 (which requires light energy 
directly) and the conversion of CO2 to carbohydrate (which requires light energy 
only indirectly) are separate processes (Figure 14-39). However, they are linked 
by elaborate feedback mechanisms that allow a plant to manufacture sugars only 
when it is appropriate to do so. Several of the chloroplast enzymes required for 
carbon fixation, for example, are inactive in the dark and reactivated by light-stim- 
ulated electron-transport processes. 


Carbon Fixation Uses ATP and NADPH to Convert CO2 into 
Sugars 


We have seen earlier in this chapter how animal cells produce ATP by using the 
large amount of free energy released when carbohydrates are oxidized to CO2 and 
H2O. The reverse reaction, in which plants make carbohydrate from CO2 and H2O, 
takes place in the chloroplast stroma. The large amounts of ATP and NADPH pro- 
duced by the photosynthetic electron-transfer reactions are required to drive this 
energetically unfavorable reaction. 
Figure 14-39 A summary of the energy- 
converting metabolism in chloroplasts. 
Chloroplasts require only water and 
carbon dioxide as inputs for their light- 
driven photosynthesis reactions, and 
they produce the nutrients for most other 
organisms on the planet. Each oxidation of 
two water molecules by a photochemical 
reaction center in the thylakoid membrane 
produces one molecule of oxygen, which 
is released into the atmosphere. At the 
same time, protons are concentrated 
in the thylakoid space. These protons 
create a large electrochemical gradient 
across the thylakoid membrane, which is 
utilized by the chloroplast ATP synthase to 
produce ATP from ADP and phosphate. 
The electrons withdrawn from water 
are transferred to a second type of 
photochemical reaction center to produce 
NADPH from NADP+. As indicated, the 
NADPH and ATP are fed into the carbon- 
fixation cycle to reduce carbon dioxide, 
thereby producing the precursors for 
sugars, amino acids, and fatty acids. The 
CO2 that is taken up from the atmosphere 
here is the source of the carbon atoms for 
most organic molecules on Earth. 

In a plant cell, a variety of metabolites 
produced in the chloroplast are exported to 
the cytoplasm for biosyntheses. Some of 
the sugar produced is stored in the form of 
starch granules in the chloroplast, but the 
rest is transported throughout the plant as 
sucrose or converted to starch in special 
storage tissues. These storage tissues 
serve as a major food source for animals. 
Figure 14-40 illustrates the central reaction of carbon fixation, in which an 
atom of inorganic carbon is converted to organic carbon: CO2 from the atmosphere 
combines with the five-carbon compound ribulose 1,5-bisphosphate plus 
water to yield two molecules of the three-carbon compound 3-phosphoglycer- 
ate. This carboxylation reaction is catalyzed in the chloroplast stroma by a large 
enzyme called ribulose bisphosphate carboxylase, or Rubisco for short. Because 
the reaction is so slow (each Rubisco molecule turns over only about 3 molecules 
of substrate per second, compared to 1000 molecules per second for a typical 
enzyme), an unusually large number of enzyme molecules are needed. Rubisco 
often constitutes more than 50% of the chloroplast protein mass, and it is thought 
to be the most abundant protein on Earth. In a global context, Rubisco also keeps 
the amount of the greenhouse gas CO2 in the atmosphere at a low level. 
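
The link between Rubisco's slow turnover and its abundance is simple arithmetic, sketched below using the two rates quoted in the paragraph above. The function names and the example flux are illustrative assumptions.

```python
RUBISCO_TURNOVER = 3.0     # substrate molecules per second, from the text
TYPICAL_TURNOVER = 1000.0  # substrate molecules per second, from the text


def enzyme_molecules_needed(target_flux: float, turnover: float) -> float:
    """Enzyme molecules required to sustain a flux (molecules per second)."""
    return target_flux / turnover


def fold_excess() -> float:
    """How many times more Rubisco is needed than a typical enzyme
    to carry the same flux."""
    return TYPICAL_TURNOVER / RUBISCO_TURNOVER


# Hypothetical flux of 3000 CO2 molecules fixed per second
print(enzyme_molecules_needed(3000.0, RUBISCO_TURNOVER))
print(enzyme_molecules_needed(3000.0, TYPICAL_TURNOVER))
print(round(fold_excess()))
```

For any given flux, a cell needs roughly 330 times as many Rubisco molecules as it would need of an enzyme working at the typical rate.
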

Although the production of carbohydrates from CO2 and H2O is energetically 
unfavorable, the fixation of CO2 catalyzed by Rubisco is an energetically favorable 
reaction. Carbon fixation is energetically favorable because a continuous supply of 
the energy-rich ribulose 1,5-bisphosphate is fed into the process. This compound 
is consumed by the addition of CO2, and it must be replenished. The energy and 
reducing power needed to regenerate ribulose 1,5-bisphosphate come from the 
ATP and NADPH produced by the photosynthetic light reactions. 

The elaborate series of reactions in which CO2 combines with ribulose 1,5-bisphosphate 
to produce a simple sugar (a portion of which is used to regenerate 
ribulose 1,5-bisphosphate) forms a cycle, called the carbon-fixation cycle, or the 
Calvin cycle (Figure 14-41). This cycle was one of the first metabolic pathways to 
be worked out by applying radioisotopes as tracers in biochemistry. As indicated, 
each turn of the cycle converts six molecules of 3-phosphoglycerate to three mol- 
ecules of ribulose 1,5-bisphosphate plus one molecule of glyceraldehyde 3-phos- 
phate. Glyceraldehyde 3-phosphate, the three-carbon sugar produced by the cycle, 
then provides the starting material for the synthesis of many other sugars and all 
of the other organic molecules that form the plant. 
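
The stoichiometry of one turn of the cycle can be checked by counting carbon atoms. The sketch below uses only the molecule counts given in the text (one CO2 plus one ribulose 1,5-bisphosphate yields two 3-phosphoglycerate; six 3-phosphoglycerate yield three ribulose 1,5-bisphosphate plus one glyceraldehyde 3-phosphate). The abbreviations and helper function are illustrative.

```python
# Carbon atoms per molecule, as described in the text.
CARBONS = {
    "CO2": 1,
    "RuBP": 5,  # ribulose 1,5-bisphosphate
    "3PG": 3,   # 3-phosphoglycerate
    "G3P": 3,   # glyceraldehyde 3-phosphate
}


def total_carbons(molecules: dict) -> int:
    """Sum the carbon atoms in a set of molecules given as {name: count}."""
    return sum(CARBONS[name] * count for name, count in molecules.items())


# One turn of the cycle: three carboxylations feed six 3PG into the cycle.
carboxylation_in = {"CO2": 3, "RuBP": 3}
carboxylation_out = {"3PG": 6}
cycle_out = {"RuBP": 3, "G3P": 1}

for label, stage in [("inputs", carboxylation_in),
                     ("after carboxylation", carboxylation_out),
                     ("end of turn", cycle_out)]:
    print(f"{label}: {total_carbons(stage)} carbon atoms")
```

Every stage holds 18 carbon atoms, so the three carbons gained from CO2 in each turn leave the cycle as the single molecule of glyceraldehyde 3-phosphate.
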


Sugars Generated by Carbon Fixation Can Be Stored as Starch or 
Consumed to Produce ATP 


The glyceraldehyde 3-phosphate generated by carbon fixation in the chloroplast 
stroma can be used in a number of ways, depending on the needs of the plant. 
During periods of excess photosynthetic activity, much of it is retained in the 
chloroplast stroma and converted to starch. Like glycogen in animal cells, starch 
is a large polymer of glucose that serves as a carbohydrate reserve, and it is stored 
as large granules in the chloroplast stroma. Starch forms an important part of the 
diet of all animals that eat plants. Other glyceraldehyde 3-phosphate molecules 
are converted to fat in the stroma. This material, which accumulates as fat drop- 
lets, likewise serves as an energy reserve. At night, this stored starch and fat can be 
broken down to sugars and fatty acids, which are exported to the cytosol to help 
support the metabolic needs of the plant. Some of the exported sugar enters the 




Figure 14-40 The initial reaction in 
carbon fixation. This carboxylation 
reaction allows one molecule each 
of carbon dioxide and water to be 
incorporated into organic carbon 
molecules. It is catalyzed in the chloroplast 
stroma by the abundant enzyme ribulose 
bisphosphate carboxylase, or Rubisco. As 
indicated, the product is two molecules of 
3-phosphoglycerate. 
glycolytic pathway (see Figure 2-46), where it is converted to pyruvate. Both that 
pyruvate and the fatty acids can enter the plant cell mitochondria and be fed into 
the citric acid cycle, ultimately leading to the production of large amounts of ATP 
by oxidative phosphorylation (Figure 14-42). Plants use this ATP in the same way 
that animal cells and other nonphotosynthetic organisms do to power a variety of 
metabolic reactions. 

The glyceraldehyde 3-phosphate exported from chloroplasts into the cytosol 
can also be converted into many other metabolites, including the disaccharide 
sucrose. Sucrose is the major form in which sugar is transported between the cells 
of a plant: just as glucose is transported in the blood of animals, so sucrose is 
exported from the leaves to provide carbohydrate to the rest of the plant. 


The Thylakoid Membranes of Chloroplasts Contain the Protein 
Complexes Required for Photosynthesis and ATP Generation 


We next need to explain how the large amounts of ATP and NADPH required for 
carbon fixation are generated in the chloroplast. Chloroplasts are much larger and 
less dynamic than mitochondria, but they make use of chemiosmotic energy con- 
version in much the same way. As we saw in Figure 14-38, chloroplasts and mito- 
chondria are organized on the same principles, although the chloroplast contains 
a separate thylakoid membrane system in which its chemiosmotic mechanisms 
occur. The thylakoid membranes contain two large membrane protein com- 
plexes, called photosystems, which endow plants and other photosynthetic organ- 
isms with the ability to capture and convert solar energy for their own use. Two 
other protein complexes in the thylakoid membrane that work together with the 
photosystems in photophosphorylation—the generation of ATP with sunlight— 
have mitochondrial equivalents. These are the heme-containing cytochrome 


Figure 14-41 The carbon-fixation 
cycle. This central metabolic pathway 
allows organic molecules to be produced 
from CO2 and H2O. In the first stage 
of the cycle (carboxylation), CO2 is 
added to ribulose 1,5-bisphosphate, as 
shown in Figure 14-40. In the second 
stage (reduction), ATP and NADPH are 
consumed to produce glyceraldehyde 
3-phosphate molecules. In the final stage 
(regeneration), some of the glyceraldehyde 
3-phosphate produced is used to 
regenerate ribulose 1,5-bisphosphate. 
Other glyceraldehyde 3-phosphate 
molecules are either converted to starch 
and fat in the chloroplast stroma, or 
transported out of the chloroplast into the 
cytosol. The number of carbon atoms in 
each type of molecule is indicated in yellow. 
There are many intermediates between 
glyceraldehyde 3-phosphate and ribulose 
5-phosphate, but they have been omitted 
here for clarity. The entry of water into the 
cycle is also not shown (but see Figure 
14-40). 
Figure 14-42 How chloroplasts and mitochondria collaborate to supply cells with both metabolites and ATP. (A) The 
inner chloroplast membrane is impermeable to the ATP and NADPH that are produced in the stroma during the light reactions 
of photosynthesis. These molecules are therefore funneled into the carbon-fixation cycle, where they are used to make sugars. 
The resulting sugars and their metabolites are either stored within the chloroplast—in the form of starch or fat—or exported to 
the rest of the plant cell. There, they can enter the energy-generating pathway that ends in ATP synthesis linked to oxidative 
phosphorylation inside the mitochondrion. Unlike chloroplast membranes, mitochondrial membranes contain a specific transporter 
that makes them permeable to ATP (see Figure 14-34). Note that the O2 released to the atmosphere by photosynthesis 
in chloroplasts is used for oxidative phosphorylation in mitochondria; similarly, the CO2 released by the citric acid cycle in 
mitochondria is used for carbon fixation in chloroplasts. (B) In a leaf, mitochondria (red) tend to cluster close to the chloroplasts 
(green), as seen in this light micrograph. (B, courtesy of Olivier Grandjean.) 


b6-f complex, which both functionally and structurally resembles cytochrome 
c reductase in the respiratory chain; and the chloroplast ATP synthase, which 
closely resembles the mitochondrial ATP synthase and works in the same way. 


Chlorophyll-Protein Complexes Can Transfer Either Excitation 
Energy or Electrons 


The photosystems in the thylakoid membrane are multiprotein assemblies of a 
complexity comparable to that of the protein complexes in the mitochondrial 
electron-transport chain. They contain large numbers of specifically bound chlo- 
rophyll molecules, in addition to cofactors that will be familiar from our discus- 
sion of mitochondria (heme, iron-sulfur clusters, and quinones). Chlorophyll, 
the green pigment of photosynthetic organisms, has a long hydrophobic tail that 
makes it behave like a lipid, plus a porphyrin ring that has a central Mg atom and 
an extensive system of delocalized electrons in conjugated double bonds (Figure 
14-43). When a chlorophyll molecule absorbs a quantum of sunlight (a photon), 
the energy of the photon causes one of these electrons to move from a low-energy 
molecular orbital to another orbital of higher energy. 
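
The size of the energy step involved can be estimated from the photon's wavelength using E = hc/λ. The sketch below is illustrative only: the wavelength of 680 nm (red light, in the range that chlorophyll absorbs) is an assumed example value that does not come from the text, and the function names are hypothetical.

```python
PLANCK = 6.62607015e-34        # Planck constant, J*s
LIGHT_SPEED = 2.99792458e8     # speed of light, m/s
AVOGADRO = 6.02214076e23       # molecules per mole
ELECTRON_VOLT = 1.602176634e-19  # joules per eV


def photon_energy_joules(wavelength_nm: float) -> float:
    """Energy of a single photon from E = h*c/lambda."""
    return PLANCK * LIGHT_SPEED / (wavelength_nm * 1e-9)


def photon_energy_kj_per_mol(wavelength_nm: float) -> float:
    """Energy of one mole of photons, in kJ."""
    return photon_energy_joules(wavelength_nm) * AVOGADRO / 1000.0


def photon_energy_ev(wavelength_nm: float) -> float:
    """Energy of a single photon, in electron volts."""
    return photon_energy_joules(wavelength_nm) / ELECTRON_VOLT


wavelength = 680.0  # nm, assumed example wavelength
print(f"{photon_energy_ev(wavelength):.2f} eV per photon")
print(f"{photon_energy_kj_per_mol(wavelength):.0f} kJ per mole of photons")
```

For the assumed 680 nm photon this comes to about 1.8 eV, or about 176 kJ per mole of photons; shorter wavelengths carry proportionally more energy.
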

The excited electron in a chlorophyll molecule tends to return quickly to its 
ground state, which can occur in one of three ways: 


1. By converting the extra energy into heat (molecular motion) or to some 
combination of heat and light of a longer wavelength (fluorescence); this 
is what usually happens when light is absorbed by an isolated chlorophyll 
molecule in solution. 


2. By transferring the energy—but not the electron—directly to a neighboring 
chlorophyll molecule by a process called resonance energy transfer. 


3. By transferring the excited electron with its negative charge to another 
nearby molecule, an electron acceptor, after which the positively charged 
chlorophyll returns to its original state by taking up an electron from some 
other molecule, an electron donor. 


Figure 14-43 The structure of chlorophyll. A magnesium atom is held in a 
porphyrin ring, which is related to the porphyrin ring that binds iron in heme 
(see Figure 14-15). Electrons are delocalized over the bonds shaded in blue. 
The latter two mechanisms occur when chlorophylls are attached to proteins 
in a chlorophyll-protein complex. The protein coordinates the central Mg atom in 
the chlorophyll porphyrin, most often through a histidine side chain located in the 
hydrophobic interior of a membrane, causing each of the chlorophylls in a protein 
complex to be held at exactly defined distances and orientations. The flow of exci- 
tation energy or electrons then depends on both the precise spatial arrangement 
and the local protein environment of the protein-bound chlorophylls. 

When excited by a photon, most protein-bound chlorophylls simply transmit 
the absorbed energy to another nearby chlorophyll by the process of resonance 
energy transfer. However, in a few specially positioned chlorophylls, the energy 
difference between the ground state and the excited state is just right for the pho- 
ton to trigger a light-induced chemical reaction. The special state of such chlo- 
rophyll molecules derives from their close interaction with a second chlorophyll 
molecule in the same chlorophyll-protein complex. Together, these two chloro- 
phylls form a special pair. 

The photosynthetic electron transfer process starts when a photon of suitable 
energy ionizes a chlorophyll molecule in such a special pair, dissociating it into an 
electron and a positively charged chlorophyll ion. The energized electron is passed 
rapidly to a quinone in the same protein complex, preventing its unproductive 
reassociation with the chlorophyll ion. This light-induced transfer of an electron 
from a chlorophyll to a mobile electron carrier is the central charge-separation 
step in photosynthesis, in which a chlorophyll becomes positively charged and 
an electron carrier becomes negatively charged (Figure 14-44). The chlorophyll 
ion is a very strong oxidant that is able to withdraw an electron from a low-energy 
substrate; in the first step of oxygenic photosynthesis, this low-energy substrate is 
water. 

Upon transfer to a mobile carrier in the electron-transport chain, the electron 
is stabilized as part of a strong electron donor and made available for subsequent 
reactions. These subsequent reactions require more time to complete, and they 
result in light-generated energy-rich compounds. 


A Photosystem Consists of an Antenna Complex and a Reaction 
Center 


There are two distinct types of chlorophyll-protein complexes in the photosyn- 
thetic membrane. One type, called a photochemical reaction center, contains the 
special pair of chlorophylls just described. The other type engages exclusively in 
light absorption and resonance energy transfer and is called an antenna complex. 
Together, the two types of complex make up a photosystem (Figure 14-45). 

The role of the antenna complex in the photosystem is to collect the energy of 
a sufficient number of photons for photosynthesis. Without it, the process would 
be slow and inefficient, as each reaction-center chlorophyll would absorb only 
about one light quantum per second, even in broad daylight, whereas hundreds 
per second are needed for effective photosynthesis. When light excites a chloro- 
phyll molecule in the antenna complex, the energy passes rapidly from one pro- 
tein-bound chlorophyll to another by resonance energy transfer until it reaches 
the special pair in the reaction center. The antenna complex is also known as a 
light-harvesting complex, or LHC. In addition to many chlorophyll molecules, 
an LHC contains orange carotenoid pigments. The carotenoids collect light of 
a different wavelength from that absorbed by chlorophylls, helping to make the 
antenna complex more efficient. They also have an important protective role in 
preventing the formation of harmful oxygen radicals in the photosynthetic 
membrane. 

Figure 14-44 A general scheme for the charge-separation step in a 
photosynthetic reaction center. In a reaction center, light energy is harnessed 
to generate electrons that are held at a high energy level by mobile electron 
carriers in a membrane. Light energy is thereby converted to chemical energy. 
The process starts when a photon absorbed by the special pair of chlorophylls 
in the reaction center knocks an electron out of one of the chlorophylls. The 
electron is taken up by a mobile electron carrier (orange) bound at the 
opposite membrane surface. A set of intermediary carriers embedded in the 
reaction center provides the path from the special pair to this carrier (not 
shown). The physical distance between the positively charged chlorophyll ion 
and the negatively charged electron carrier stabilizes the charge-separated 
state for a short time, during which the chlorophyll ion, a strong oxidant, 
withdraws an electron from a suitable compound (for example, from water, an 
event we will discuss in detail shortly). The electron carrier then diffuses 
away from the reaction center as a strong electron donor that will transfer its 
electron to an electron-transport chain. 


The Thylakoid Membrane Contains Two Different Photosystems 
Working in Series 


The excitation energy collected by the antenna complex is delivered to the special 
pair in the photochemical reaction center. The reaction center is a transmem- 
brane chlorophyll-protein complex that lies at the heart of photosynthesis. It har- 
bors the special pair of chlorophyll molecules, which acts as an irreversible trap 
for excitation energy (see Figure 14-45). 

Chloroplasts contain two functionally different although structurally related 
photosystems, each of which feeds electrons generated by the action of sunlight 
into an electron-transfer chain. In the chloroplast thylakoid membrane, photo- 
system I is confined to the unstacked stroma thylakoids, while the stacked grana 
thylakoids contain photosystem II. The two photosystems were named in order 
of their discovery, not of their actions in the photosynthetic pathway, and elec- 
trons are first activated in photosystem II before being transferred to photosystem 
I (Figure 14-46). The path of the electron through the two photosystems can be 
described as a Z-like trajectory and is known as the Z scheme. In the Z scheme, 
the reaction center of photosystem II first withdraws an electron from water. The 
electron passes via an electron-transport chain (composed of the electron carrier 
plastoquinone, the cytochrome b6-f complex, and the protein plastocyanin) to 
photosystem I, which propels the electron across the membrane in a second light- 
driven charge-separation reaction that leads to NADPH production. 
Figure 14-45 A photosystem. Each photosystem consists of a reaction center 
plus a number of light-harvesting antenna complexes. The solar energy for 
photosynthesis is collected by the antenna complexes, which account for most of 
the chlorophyll in a plant cell. The energy hops randomly by resonance energy 
transfer (red arrows) from one chlorophyll molecule to another, until it 
reaches the reaction center complex, where it ionizes a chlorophyll in the 
special pair. The chlorophyll special pair holds its electrons at a lower 
energy than the chlorophyll in the antenna complexes, causing the energy 
transferred to it from the antenna complex to become trapped there. Note that 
it is only energy that moves from one chlorophyll molecule to another in the 
antenna complex, not electrons (Movie 14.10). 


Figure 14-46 The Z scheme for photosynthesis. The thylakoids of plants and 
cyanobacteria contain two different photosystems, known as photosystem I and 
photosystem II, which work in series. Each of the photosystem I and II reaction 
centers receives excitation energy from its own set of tightly associated 
antenna complexes, known as LHC-I and LHC-II, by resonance energy transfer. 
Note that, for historical reasons, the two photosystems were named opposite to 
the order in which they act, with photosystem II passing its electrons to 
photosystem I. 
The Z scheme is necessary to bridge the very large energy gap between water 
and NADPH (Figure 14-47). A single quantum of visible light does not contain 
enough energy both to withdraw electrons from water, which holds on to its elec- 
trons very tightly (redox potential +820 mV) and therefore is a very poor electron 
donor, and to force them on to NADP+, which is a very poor electron acceptor 
(redox potential -320 mV). The Z scheme first evolved in cyanobacteria to enable 
them to use water as a universally available electron source. Other, simpler pho- 
tosynthetic bacteria have only one photosystem. As we shall see, they cannot use 
water as an electron source and must rely on other, more energy-rich substrates 
instead, from which electrons are more readily withdrawn. The ability to extract 
electrons from water (and thereby to produce molecular oxygen) was acquired by 
plants when their ancestors took up the endosymbiotic cyanobacteria that later 
evolved into chloroplasts (see Figure 1-31). 
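
As a rough check on the size of this gap, the standard relation dG = -nFdE can be applied to the two redox potentials quoted above. The following minimal Python sketch assumes standard conditions, two electrons per NADPH, and a Faraday constant of 96,485 C/mole; the function and variable names are illustrative only and are not taken from the text.

```python
FARADAY_C_PER_MOL = 96485.0  # Faraday constant, coulombs per mole of electrons


def delta_g_kj_per_mol(e_donor_mv, e_acceptor_mv, n_electrons=2):
    """Free-energy change for moving n electrons from donor to acceptor.

    Uses dG = -n * F * dE, where dE = E(acceptor) - E(donor) in volts.
    A positive result means the transfer is energetically uphill.
    """
    delta_e_volts = (e_acceptor_mv - e_donor_mv) / 1000.0
    return -n_electrons * FARADAY_C_PER_MOL * delta_e_volts / 1000.0


# Redox potentials quoted in the text (mV)
E_WATER_MV = 820.0   # water, a very poor electron donor
E_NADP_MV = -320.0   # NADP+, a very poor electron acceptor

gap_mv = E_WATER_MV - E_NADP_MV
dg_water_to_nadp = delta_g_kj_per_mol(E_WATER_MV, E_NADP_MV)

print(f"redox gap: {gap_mv:.0f} mV")
print(f"dG for 2 electrons: {dg_water_to_nadp:+.0f} kJ/mole")
```

Under these assumptions, moving two electrons from water to NADP+ is uphill by roughly 220 kJ/mole, which gives a sense of the energy that the two light-driven steps of the Z scheme must supply between them.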


Photosystem II Uses a Manganese Cluster to Withdraw Electrons 
From Water 


In biology, only photosystem II is able to withdraw electrons from water and to 
generate molecular oxygen as a waste product. This remarkable specialization of 
photosystem II is conferred by the unique properties of one of the two chlorophyll 
molecules of its special pair and by a manganese cluster linked to the protein. 
These chlorophyll molecules and the manganese cluster form the catalytic core of 
the photosystem II reaction center, whose mechanism is outlined in Figure 14-48. 

Water is an inexhaustible source of electrons, but it is also extremely stable; 
therefore a large amount of energy is required to make it part with its electrons. 
The only compound in living organisms that is able to achieve this feat, after 
its ionization by light, is the chlorophyll special pair called P680 (P680/P680+ 
redox potential = +1270 mV). The reaction 2H2O + 4 photons → 4H+ + 4e- + O2 is 
catalyzed by its adjacent manganese cluster. The intermediates remain firmly 
attached to the manganese cluster until two water molecules have been fully 
oxidized to O2, thus ensuring that no dangerous oxygen radicals are released as 
the reaction proceeds. The protons released by the two water molecules are 
discharged to the thylakoid space, contributing to the proton gradient across 
the thylakoid membrane (pH lower in the thylakoid space than in the stroma). 
The unique protein environment that endows life with this all-important ability 
to oxidize water has remained essentially unchanged throughout billions of 
years of evolution (Figure 14-49). 


Figure 14-47 Changes in redox potential during photosynthesis. The redox 
potential for each molecule is indicated by its position along the vertical 
axis. Photosystem II passes electrons derived from water to photosystem I, 
which in turn passes them to NADP+ through ferredoxin-NADP+ reductase. The net 
electron flow through the two photosystems is from water to NADP+, and it 
produces NADPH as well as an electrochemical proton gradient. This proton 
gradient is used by the ATP synthase to produce ATP. Details in this figure 
will be explained in the subsequent text. 

All of the oxygen in the Earth’s atmosphere has been generated in this way. 
Although the exact details of the water-oxidation reaction in photosystem II are 
still not fully understood, scientists are trying to construct an artificial system that 
mimics the process. If successful, this might provide a virtually endless supply of 
clean energy, helping to solve the world’s energy crisis. 


The Cytochrome b6-f Complex Connects Photosystem II 
to Photosystem I 


Following the path shown previously in Figure 14-48, the electrons extracted from 
water by photosystem II are transferred to plastoquinol, a strong electron donor 
similar to ubiquinol in mitochondria. This quinol, which can diffuse rapidly in the 
lipid bilayer of the thylakoid membrane, transfers its electrons to the cytochrome 
b6-f complex, whose structure is homologous to the cytochrome c reductase in 
mitochondria. The cytochrome b6-f complex pumps H+ into the thylakoid space 
using the same Q cycle that is utilized in mitochondria (see Figure 14-21), thereby 
adding to the proton gradient across the thylakoid membrane. 

The cytochrome b6-f complex forms the connecting link between photosystems 
II and I in the chloroplast electron-transport chain. It passes its electrons 
one at a time to the mobile electron carrier plastocyanin (a small 
copper-containing protein that takes the place of the cytochrome c in 
mitochondria), which will transfer them to photosystem I (Figure 14-50). As we 
discuss next, photosystem I then harnesses a second photon of light to further 
energize the electrons that it receives. 


Figure 14-48 The conversion of light energy to chemical energy in the 
photosystem II complex. (A) Schematic diagram of the photosystem II reaction 
center, whose special pair of chlorophyll molecules is designated as P680 based 
on the wavelength of its absorbance maximum (680 nm). (B) Cofactors and 
pigments at the core of the reaction center. Shown are the manganese (Mn) 
cluster, the tyrosine side chain that links it to the P680 special pair, four 
chlorophylls (green), two pheophytins (light blue), two plastoquinones (pink), 
and an iron atom (red). The path of electrons is shown by blue arrows. In the 
manganese cluster, four manganese atoms (light blue), one calcium atom 
(purple), and five oxygen atoms (red) work together to catalyze the oxidation 
of water. The water-splitting reaction occurs in four successive steps, each 
requiring the energy of one photon. Each photon turns a P680 reaction-center 
chlorophyll into a positively charged chlorophyll ion. Through an ionized 
tyrosine side chain (yellow), this chlorophyll ion pulls an electron away from 
a water molecule bound at the manganese cluster. In this way, a total of four 
electrons are withdrawn from two water molecules to generate molecular oxygen, 
which is released into the atmosphere. 

Each electron that is energized by light passes from the special pair along an 
electron-transfer chain inside the complex, along the indicated path to the 
permanently bound plastoquinone QA and then to plastoquinone QB as electron 
acceptors. Once QB has picked up two electrons (plus two protons; see Figure 
14-17), it dissociates from its binding site in the complex and enters the 
lipid bilayer as a mobile electron carrier, being immediately replaced by a 
new, nonreduced molecule of plastoquinone. Note that the chlorophylls and 
pheophytins form two symmetrical branches of a potential electron-transport 
chain. Only one branch is active, thus ensuring that the plastoquinones become 
fully reduced in minimum time. 


Figure 14-49 The structure of the complete photosystem II complex. This 
photosystem contains at least 16 protein subunits, along with 36 chlorophylls, 
two pheophytins, two hemes, and a number of protective carotenoids (colored). 
Most of these pigments and cofactors are deeply buried, tightly complexed to 
protein (gray). The path of electrons is indicated by the blue arrows, and is 
explained in Figure 14-48B. The photosystem II complex presented here is the 
cyanobacterial complex, which is simpler and more stable than the plant 
complex, which works in the same way. (PDB code: 3ARC.) 


Photosystem I Carries Out the Second Charge-Separation Step 
in the Z Scheme 


Photosystem I receives electrons from plastocyanin in the thylakoid space and 
transfers them, via a second charge-separation reaction, to the small protein 
ferredoxin on the opposite membrane surface (Figure 14-51). Then, in a final 
step, ferredoxin feeds its electrons to a membrane-associated enzyme complex, 
the ferredoxin-NADP+ reductase, which uses the electrons to produce NADPH 
from NADP+ (see Figure 14-50). 

The redox potential of the NADP+/NADPH pair (-320 mV) is already very low, 
and reduction of NADP+ therefore requires a compound with an even lower redox 
potential. This turns out to be a chlorophyll molecule near the stromal membrane 
surface of photosystem I that has a redox potential of -1000 mV (chlorophyll A0), 
making it the strongest known electron donor in biology. The reduced NADPH is 
released into the chloroplast stroma, where it is used for biosynthesis of 
glyceraldehyde 3-phosphate, amino acid precursors, and fatty acids, much of it 
to be exported to the cytoplasm. 
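
The bookkeeping for linear electron flow from water to NADPH can be tallied in a few lines of Python. This is only a sketch: it takes from the text that four photons at photosystem II remove four electrons from two water molecules, and that photosystem I spends a second photon on each electron. It additionally assumes the standard stoichiometry of two electrons per NADPH, and it ignores cyclic electron flow and the protons pumped by the Q cycle. The function name and dictionary keys are hypothetical.

```python
def linear_flow_budget(o2_molecules=1):
    """Tally inputs and outputs of linear electron flow per O2 released.

    Assumes 2 H2O -> O2 + 4 H+ + 4 e-, one photon per electron at each of
    the two photosystems, and two electrons per NADPH formed.
    """
    electrons = 4 * o2_molecules
    return {
        "water_oxidized": 2 * o2_molecules,
        "electrons": electrons,
        "protons_released_in_thylakoid_space": 4 * o2_molecules,
        "photons_photosystem_II": electrons,  # first charge separation
        "photons_photosystem_I": electrons,   # second charge separation
        "photons_total": 2 * electrons,
        "nadph_formed": electrons // 2,
    }


budget = linear_flow_budget(1)
for key, value in budget.items():
    print(f"{key}: {value}")
```

Under these assumptions, each O2 molecule released corresponds to eight absorbed photons and two NADPH molecules.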
Figure 14-50 Electron flow through the cytochrome b6-f complex to NADPH. 
The cytochrome b6-f complex is the functional equivalent of cytochrome c 
reductase (the cytochrome b-c1 complex) in mitochondria (see Figure 14-22). 
Like its mitochondrial homolog, the b6-f complex receives its electrons from a 
quinone and engages in a complicated Q cycle that pumps two protons across the 
membrane (details not shown). It hands its electrons, one at a time, to 
plastocyanin (pC). Plastocyanin diffuses along the membrane surface to 
photosystem I and transfers the electrons via ferredoxin (Fd) to the 
ferredoxin-NADP+ reductase (FNR), where they are utilized to produce NADPH. 
P700 is a special pair of chlorophylls that absorbs light of wavelength 700 nm. 


Figure 14-51 Structure and function of photosystem I. At the heart of the 
photosystem I complex assembly is the electron-transfer chain shown. At one 
end is a special pair of chlorophylls called P700 (because it absorbs light of 
700 nm wavelength), receiving electrons from plastocyanin (pC). At the other 
end are the A0 chlorophylls, which hand the electrons on to ferredoxin via two 
plastoquinones (PQ; purple) and three iron-sulfur clusters. Even though the 
roles of photosystems I and II in photosynthesis are very different, their 
central electron-transfer chains are structurally similar, indicating a common 
evolutionary origin (see Figure 14-53). Note that in photosystem I both 
branches of the electron-transfer chain are active, unlike in photosystem II 
(see Figure 14-48). (PDB code: 3LW5.) 
The Chloroplast ATP Synthase Uses the Proton Gradient 
Generated by the Photosynthetic Light Reactions to Produce ATP 


The sequence of events that results in light-driven production of ATP and NADPH 
in chloroplasts and cyanobacteria is summarized in Figure 14-52. Starting with 
the withdrawal of electrons from water, the light-driven charge-separation steps 
in photosystems II and I enable the energetically unfavorable (uphill) flow of elec- 
trons from water to NADPH (see Figure 14-47). Three small mobile electron car- 
riers—plastoquinone, plastocyanin, and ferredoxin—participate in this process. 
Together with the electron-driven proton pump of the cytochrome b6-f complex, 
the photosystems generate a large proton gradient across the thylakoid mem- 
brane. The ATP synthase molecules embedded in the thylakoid membranes then 
harness this proton gradient to produce large amounts of ATP in the chloroplast 
stroma, mimicking the synthesis of ATP in the mitochondrial matrix. 

The linear Z scheme for photosynthesis thus far discussed can switch to a 
circular mode of electron flow through photosystem I and the b6-f complex. Here, 
the reduced ferredoxin diffuses back to the b6-f complex to reduce plastoquinone, 
instead of passing its electrons to the ferredoxin-NADP+ reductase enzyme 
complex. This, in effect, turns photosystem I into a light-driven proton pump, 
thereby increasing the proton gradient and thus the amount of ATP made by the 
ATP synthase. An elaborate set of regulatory mechanisms controls this switch, 
which enables the chloroplast to generate either more NADPH (linear mode) or 
more ATP (circular mode), depending on the metabolic needs of the cell. 


All Photosynthetic Reaction Centers Have Evolved From a 
Common Ancestor 


Evidence for the prokaryotic origins of mitochondria and chloroplasts abounds 
in their genetic systems, as we will see in the next section. But strong and direct 
evidence for the evolutionary origins of chloroplasts can also be found in the 
molecular structures of photosynthetic reaction centers revealed in recent years 
by crystallography. The positions of the chlorophylls in the special pair and the 
two branches of the electron-transfer chain are basically the same in photosystem 
I, photosystem II, and the photochemical reaction centers of photosynthetic bac- 
teria (Movie 14.11). As a result, one can conclude that they all have evolved from 
a common ancestor. Evidently, the molecular architecture of the photosynthetic 
reaction center originated only once and has remained essentially unchanged 
during evolution. By contrast, the less critical antenna systems have evolved in 
several different ways and are correspondingly diverse in present-day photosyn- 
thetic organisms (Figure 14-53). 
Figure 14-52 Summary of electron and proton movements during photosynthesis 
in the thylakoid membrane. Electrons are withdrawn, through the action of light 
energy, from a water molecule that is held by the manganese cluster in 
photosystem II. The electrons pass on to plastoquinone, which delivers them to 
the cytochrome b6-f complex that resembles the cytochrome c reductase of 
mitochondria and the b-c complex of bacteria. They are then carried to 
photosystem I by the soluble electron carrier plastocyanin, the functional 
equivalent of cytochrome c in mitochondria. From photosystem I they are 
transferred to ferredoxin-NADP+ reductase (FNR) by the soluble carrier 
ferredoxin (Fd; a small protein containing an iron-sulfur center). Protons are 
pumped into the thylakoid space by the cytochrome b6-f complex, in the same way 
that protons are pumped into mitochondrial cristae by cytochrome c reductase 
(see Figure 14-21). In addition, the H+ released into the thylakoid space by 
water oxidation, and the H+ consumed during NADPH formation in the stroma, 
contribute to the generation of the electrochemical H+ gradient across the 
thylakoid membrane. As illustrated, this gradient drives ATP synthesis by an 
ATP synthase that sits in the same membrane (see Figure 14-47). 
The Proton-Motive Force for ATP Production in Mitochondria and 
Chloroplasts Is Essentially the Same 


The proton gradient across the thylakoid membrane depends both on the 
proton-pumping activity of the cytochrome b6-f complex and on the photosynthetic 
activity of the two photosystems, which in turn depends on light intensity. In 
chloroplasts exposed to light, H+ is pumped out of the stroma (pH around 8, 
similar to the mitochondrial matrix) into the thylakoid space (pH 5-6), creating 
a gradient of 2-3 pH units across the thylakoid membrane, representing a 
proton-motive force of about 180 mV. This is very similar to the proton-motive 
force in respiring mitochondria. However, a membrane potential across the inner 
mitochondrial membrane makes the largest contribution to the proton-motive 
force that drives the mitochondrial ATP synthase to make ATP, whereas an H+ 
gradient predominates for chloroplasts. 
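
The link between a pH difference and a proton-motive force in millivolts can be made explicit with the usual relation, in which each pH unit contributes 2.303RT/F. The sketch below works with magnitudes only and assumes a temperature of 25°C; the default membrane potential of zero reflects the statement that the H+ gradient predominates in chloroplasts. The function names are illustrative, and the constants are standard values rather than figures from the text.

```python
import math

GAS_CONSTANT = 8.314462618   # J / (mol K)
FARADAY = 96485.33212        # C / mol


def mv_per_ph_unit(temp_c=25.0):
    """Millivolts contributed by a one-unit pH difference: 2.303 * R * T / F."""
    temp_k = temp_c + 273.15
    return 1000.0 * math.log(10) * GAS_CONSTANT * temp_k / FARADAY


def proton_motive_force_mv(delta_ph, membrane_potential_mv=0.0, temp_c=25.0):
    """Magnitude of the proton-motive force from its two components."""
    return membrane_potential_mv + mv_per_ph_unit(temp_c) * delta_ph


for delta_ph in (2.0, 3.0):
    pmf = proton_motive_force_mv(delta_ph)
    print(f"delta pH = {delta_ph:.0f}: about {pmf:.0f} mV")
```

With these assumptions, one pH unit is worth about 59 mV, so a gradient of 3 pH units alone corresponds to a little under 180 mV, in line with the value quoted above.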

In contrast to mitochondrial ATP synthase, which forms long rows of dimers 
along the cristae ridges, the chloroplast ATP synthase is monomeric and located 
in flat membrane regions (Figure 14-54). Evidently, the H+ gradient across the 
thylakoid membrane is high enough for ATP synthesis without the need for the 
elaborate arrangement of ATP synthase seen in mitochondria. 


Chemiosmotic Mechanisms Evolved in Stages 


The first living cells on Earth may have consumed geochemically produced 
organic molecules and generated their ATP by fermentation. Because oxygen was 
not yet present in the atmosphere, such anaerobic fermentation reactions would 
have dumped organic acids (such as lactic or formic acids, for example) into the 
environment (see Figure 2-47). Perhaps such acids lowered the pH of the 
environment, favoring the survival of cells that evolved transmembrane proteins 
that could pump H+ out of the cytosol, thereby preventing the cell from becoming 
too acidic (stage 1 in Figure 14-55). One of these pumps may have used the energy 
available from ATP hydrolysis to eject H+ from the cell; such a proton pump could 
have been the ancestor of present-day ATP synthases. 


Figure 14-53 Evolution of photosynthetic reaction centers. Pigments involved 
in light-harvesting are colored green; those involved in the central 
photochemical events are colored red. (A) The primitive photochemical reaction 
center of purple bacteria contains two related protein subunits, L and M, that 
bind the pigments involved in the central process of photosynthesis, including 
a special pair of chlorophyll molecules. Electrons are fed into the excited 
chlorophylls by a cytochrome. LH1 is a bacterial antenna complex. 
(B) Photosystem II contains the D1 and D2 proteins, which are homologous to the 
L and M subunits in (A). The excited P680 chlorophyll in the special pair 
withdraws electrons from water held by the manganese cluster. LHC-II is the 
light-harvesting complex that feeds energy into the core antenna proteins. 
(C) Photosystem I contains the Psa A and Psa B proteins, each of which is 
equivalent to a fusion of the D1 or D2 protein to a core antenna protein of 
photosystem II. The loosely bound plastocyanin (pC) feeds electrons into the 
excited chlorophyll pair. As indicated, in photosystem I, electrons are passed 
from a bound quinone (Q) through a series of three iron-sulfur centers (red 
circles). (Modified from K. Rhee, E. Morris, J. Barber and W. Kühlbrandt, Nature 
396:283-286, 1998; and W. Kühlbrandt, Nature 411:896-899, 2001. With 
permission from Macmillan Publishers Ltd.) 

As the Earth’s supply of geochemically produced nutrients began to dwindle, 
organisms that could find a way to pump H+ without consuming ATP would have 
been at an advantage: they could save the small amounts of ATP they derived 
from the fermentation of increasingly scarce foodstuffs to fuel other important 
activities. This need to conserve resources might have led to the evolution of 
electron-transport proteins that allowed cells to use the movement of electrons 
between molecules of different redox potentials as a source of energy for pumping 
H+ across the plasma membrane (stage 2 in Figure 14-55). Some of these cells 
might have used the nonfermentable organic acids that neighboring cells had 
excreted as waste to provide the electrons needed to feed this electron-transport 
system. Some present-day bacteria grow on formic acid, for example, using the 
small amount of redox energy derived from the transfer of electrons from formic 
acid to fumarate to pump H+. 


Figure 14-55 How ATP synthesis by chemiosmosis might have evolved 
in stages. The first stage could have involved the evolution of an ATPase that 
pumped protons out of the cell using the energy of ATP hydrolysis. Stage 2 
could have involved the evolution of a different proton pump, driven by an 
electron-transport chain. Stage 3 would then have linked these two systems 
together to generate a primitive ATP synthase that used the protons pumped 
by the electron-transport chain to synthesize ATP. An early bacterium with 
this final system would have had a selective advantage over bacteria with 
neither of the systems or only one. 
Figure 14-54 A comparison of H+ concentrations and the arrangement of ATP 
synthase in mitochondria and chloroplasts. In both organelles, the pH in the 
intermembrane space is 7.4, as in the cytoplasm. The pH of the mitochondrial 
matrix and the pH of the chloroplast stroma are both about 8 (light gray). The 
pH in the thylakoid space is around 5.5, depending on photosynthetic activity. 
This results in a high proton-motive force across the thylakoid membrane, 
consisting largely of the H+ gradient (a high permeability of this membrane to 
Mg2+ and Cl- ions allows the flow of these ions to dissipate most of the 
membrane potential). 

In contrast to chloroplasts, the H+ gradient across the inner mitochondrial 
membrane is insufficient for ATP production, and mitochondria need a membrane 
potential to bring the proton-motive force to the same level as in 
chloroplasts. The arrangement of the mitochondrial ATP synthase in rows of 
dimers along the cristae ridges (see Figure 14-32) next to the 
respiratory-chain proton pumps may help the flow of protons along the membrane 
surface toward the ATP synthase, as the availability of protons is limiting for 
ATP production. In the chloroplast, the ATP synthase is distributed randomly in 
thylakoid membranes. 
Eventually, some bacteria would have developed H+-pumping electron-transport 
systems that were so efficient that they could harvest more redox energy than 
they needed to maintain their internal pH. Such cells would probably have 
generated large electrochemical proton gradients, which they could then use to 
produce ATP. Protons could leak back into the cell through the ATP-driven H+ pumps, 
essentially running them in reverse so that they synthesized ATP (stage 3 in Figure 
14-55). Because such cells would require much less of the dwindling supply of 
fermentable nutrients, they would have proliferated at the expense of their neigh- 
bors. 


By Providing an Inexhaustible Source of Reducing Power, 
Photosynthetic Bacteria Overcame a Major Evolutionary Obstacle 


The gradual depletion of nutrients from the environment on the early Earth meant 
that organisms had to find some alternative source of carbon to make the sugars 
that serve as the precursors for so many other cell components. Although the CO2 
in the atmosphere provides an abundant potential carbon source, to convert it 
into an organic molecule such as a carbohydrate requires reducing the fixed CO2 
with a strong electron donor, such as NADPH, which can generate (CH2O) units 
from CO2 (see Figure 14-41). Early in cellular evolution, strong reducing agents 
(electron donors) are thought to have been plentiful. But once an ancestor of ATP 
synthase began to generate most of the ATP, it would have become imperative for 
cells to evolve a new way of generating strong reducing agents. 

A major evolutionary breakthrough in energy metabolism came with the 
development of photochemical reaction centers that could use the energy of sun- 
light to produce molecules such as NADPH. It is thought that this occurred early 
in the process of cellular evolution in the ancestors of the green sulfur bacteria. 
Present-day green sulfur bacteria use light energy to transfer hydrogen atoms 
(as an electron plus a proton) from H2S to NADPH, thereby producing the strong 
reducing power required for carbon fixation. Because the redox potential of H2S is 
much lower than that of H2O (-230 mV for H2S compared with +820 mV for H2O), 
one quantum of light absorbed by the single photosystem in these bacteria is suf- 
ficient to generate NADPH via a relatively simple photosynthetic electron-trans- 
port chain. 
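
The difference between the two electron donors can be put side by side with a short calculation. This sketch simply subtracts the NADP+/NADPH redox potential (-320 mV, quoted earlier in the chapter) from the donor potentials given above; it uses standard potentials only and says nothing about actual cellular concentrations. The names used are illustrative.

```python
E_NADP_MV = -320  # NADP+/NADPH redox potential quoted earlier in the chapter

# Donor redox potentials quoted in the text (mV)
DONORS_MV = {"H2S": -230, "H2O": 820}


def uphill_gap_mv(e_donor_mv, e_acceptor_mv=E_NADP_MV):
    """Redox gap (mV) that light energy must bridge per electron."""
    return e_donor_mv - e_acceptor_mv


gaps = {name: uphill_gap_mv(e_mv) for name, e_mv in DONORS_MV.items()}
for name, gap in gaps.items():
    print(f"{name} -> NADP+: {gap} mV uphill")
```

On these figures the gap from H2S to NADP+ is 90 mV, compared with 1140 mV from water, which illustrates why a single photosystem suffices for the green sulfur bacteria.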


The Photosynthetic Electron-Transport Chains of Cyanobacteria 
Produced Atmospheric Oxygen and Permitted New Life-Forms 


The next evolutionary step, which is thought to have occurred with the 
development of the cyanobacteria perhaps 3 billion years ago, was the evolution 
of organisms capable of using water as the electron source for CO2 reduction. 
This entailed the evolution of a water-splitting enzyme and also required the 
addition of a second photosystem, acting in series with the first, to bridge the 
large gap in redox potential between H2O and NADPH. The biological consequences 
of this evolutionary step were far-reaching. For the first time, there would 
have been organisms that could survive on water, CO2, and sunlight (plus a few 
trace elements). These cells would have been able to spread and evolve in ways 
denied to the earlier photosynthetic bacteria, which needed H2S or organic acids 
as a source of electrons. Consequently, large amounts of biologically 
synthesized, reduced organic materials accumulated and oxygen entered the 
atmosphere for the first time. 

Oxygen is highly toxic because the oxidation of biological molecules alters 
their structure and properties indiscriminately and irreversibly. Most anaerobic 
bacteria, for example, are rapidly killed when exposed to air. Thus, organisms 
on the primitive Earth would have had to evolve protective mechanisms against 
the rising O2 levels in the environment. Late evolutionary arrivals, such as our- 
selves, have numerous detoxifying mechanisms that protect our cells from the ill 
effects of oxygen. Even so, an accumulation of oxidative damage to our macro- 
molecules has been postulated to contribute to human aging, as we discuss in the 
next section. 
Figure 14-56 Major events during the evolution of living organisms on Earth. 
With the evolution of the membrane-based process of photosynthesis, organisms 
were able to make their own organic molecules from CO2 gas. The delay of more 
than 10^9 years between the appearance of bacteria that split water and released 
O2 during photosynthesis and the accumulation of high levels of O2 in the 
atmosphere is thought to be due to the initial reaction of the oxygen with the 
abundant ferrous iron (Fe2+) that was dissolved in the early oceans. Only when 
the ferrous iron was used up would oxygen have started to accumulate in the 
atmosphere. In response to the rising oxygen levels, nonphotosynthetic 
oxygen-consuming organisms evolved, and the concentration of oxygen in the 
atmosphere equilibrated at its present-day level. 


The increase in atmospheric O2 was very slow at first and would have allowed 
a gradual evolution of protective devices. For example, the early seas contained 
large amounts of iron in its reduced, ferrous state (Fe2+), and nearly all the O2 produced 
by early photosynthetic bacteria would have been used up in oxidizing Fe2+ 
to ferric Fe3+. This conversion caused the precipitation of huge amounts of stable 
oxides, and the extensive banded iron formations in sedimentary rocks, beginning 
about 2.7 billion years ago, help to date the spread of the cyanobacteria. By 
about 2 billion years ago, the supply of Fe2+ was exhausted, and the deposition 
of further iron precipitates ceased. Geological evidence reveals how O2 levels in 
the atmosphere have changed over billions of years, approximating current levels 
only about 0.5 billion years ago (Figure 14-56). 

The availability of O2 enabled the rise of bacteria that developed an aerobic 
metabolism to make their ATP. These organisms could harness the large amount 
of energy released by breaking down carbohydrates and other reduced organic 
molecules all the way to CO2 and H2O, as explained when we discussed mitochondria. 
Components of preexisting electron-transport complexes were modified 
to produce a cytochrome oxidase, so that the electrons obtained from organic 
or inorganic substrates could be transported to O2 as the terminal electron accep- 
tor. Some present-day purple photosynthetic bacteria can switch between photo- 
synthesis and respiration depending on the availability of light and O2, with only 
relatively minor reorganizations of their electron-transport chains. 

In Figure 14-57, we relate these postulated evolutionary pathways to different 
types of bacteria. By necessity, evolution is always conservative, taking parts of 
the old and building on them to create something new. Thus, parts of the elec- 
tron-transport chains that were derived to service anaerobic bacteria 3-4 billion 
years ago survive, in altered form, in the mitochondria and chloroplasts of today’s 
higher eukaryotes. A good example is the overall similarity in structure and function 
between the cytochrome c reductase that pumps H+ in the central segment of 
the mitochondrial respiratory chain and the analogous cytochrome b6-f complex 
in the electron-transport chains of both bacteria and chloroplasts, revealing their 
common evolutionary origin (Figure 14-58). 
Figure 14-57 Evolutionary scheme showing the postulated origins of mitochondria and chloroplasts and their bacterial 
ancestors. The consumption of oxygen by respiration is thought to have first developed about 2 billion years ago. Nucleotide-sequence 
analyses suggest that an endosymbiotic oxygen-evolving cyanobacterium (cyan) gave rise to chloroplasts (dark 
green), while mitochondria arose from an α-proteobacterium. The nearest relatives of mitochondria (pink) are members of 
three closely related groups of α-proteobacteria (the rhizobacteria, agrobacteria, and rickettsias) known to form intimate 
associations with present-day eukaryotic cells. Proteobacteria are pink, purple photosynthetic bacteria are purple, and other 
photosynthetic bacteria are light green. 


Summary 


Chloroplasts and photosynthetic bacteria have the unique ability to harness the 
energy of sunlight to produce energy-rich compounds. This is achieved by the pho- 
tosystems, in which chlorophyll molecules attached to proteins are excited when 
hit by a photon. Photosystems are composed of an antenna complex that collects 
solar energy and a photochemical reaction center, in which the collected energy is 
funneled to a chlorophyll molecule held in a special position, enabling it to with- 
draw electrons from an electron donor. Chloroplasts and cyanobacteria contain 
two distinct photosystems. The two photosystems are normally linked in series in 
the Z scheme, and they transfer electrons from water to NADP+ to form NADPH, 
generating a transmembrane electrochemical potential. One of the two photosys- 
tems—photosystem II—can split water by removing electrons from this ubiqui- 
tous, low-energy compound. All the molecular oxygen (O2) in our atmosphere is 
a by-product of the water-splitting reaction in this photosystem. The three-dimen- 
sional structures of photosystems I and II are strikingly similar to the photosystems 
of purple photosynthetic bacteria, demonstrating a remarkable degree of conserva- 
tion over billions of years of evolution. 

The two photosystems and the cytochrome b6-f complex reside in the thylakoid 
membrane, a separate membrane system in the central stroma compartment of the 
chloroplast that is differentiated into stacked grana and unstacked stroma thylakoids. 
Electron-transport processes in the thylakoid membrane cause protons to be 
released into the thylakoid space. The backflow of protons through the chloroplast 
ATP synthase then generates ATP. This ATP is used in conjunction with the NADPH 
produced by photosynthesis to drive a large number of biosynthetic reactions in 
the chloroplast stroma, including the carbon-fixation cycle, which generates large 
amounts of carbohydrates from CO2. 

In the early evolution of life, cyanobacteria overcame a major obstacle in devis- 
ing a way to use solar energy to split water and fix carbon dioxide. Cyanobacteria 
produced both abundant organic nutrients and molecular oxygen, enabling the rise 
of a multitude of aerobic life-forms. The chloroplasts in plants have evolved from 
a cyanobacterium that was endocytosed long ago by an aerobic eukaryotic host 
organism. 
Figure 14-58 A comparison of three electron-transport chains discussed in this chapter. 
Bacteria, chloroplasts, and mitochondria all contain a membrane-bound enzyme complex that 
resembles the cytochrome c reductase of mitochondria. These complexes all accept electrons from 
a quinone carrier (Q) and pump H+ across their respective membranes. Moreover, in reconstituted 
in vitro systems, the different complexes can substitute for one another, and the structures of 
their protein components reveal that they are evolutionarily related. Note that the purple nonsulfur 
bacteria use a cyclic flow of electrons to produce a large electrochemical proton gradient that drives 
a reverse electron flow through NADH dehydrogenase to produce NADH from NAD+ + H+ + e-. 
THE GENETIC SYSTEMS OF MITOCHONDRIA AND 
CHLOROPLASTS 


As we discussed in Chapter 1, mitochondria and chloroplasts are thought to have 
evolved from endosymbiotic bacteria (see Figures 1-29 and 1-31). Both types 
of organelles still contain their own genomes (Figure 14-59). As we will discuss 
shortly, they also retain their own biosynthetic machinery for making RNA and 
organellar proteins. 

Like bacteria, mitochondria and chloroplasts proliferate by growth and divi- 
sion of an existing organelle. In actively dividing cells, each type of organelle must 
double in mass in each cell generation and then be distributed into each daughter 
cell. In addition, nondividing cells must replenish organelles that are degraded as 
part of the continual process of organelle turnover, or produce additional organ- 
elles as the need arises. Organelle growth and proliferation are therefore care- 
fully controlled. The process is complicated because mitochondrial and chloro- 
plast proteins are encoded in two places: the nuclear genome and the separate 
genomes harbored in the organelles themselves. The biogenesis of mitochondria 
and chloroplasts thus requires contributions from two separate genetic systems, 
which must be closely coordinated. 

Most organellar proteins are encoded by the nuclear DNA. The organelle 
imports these proteins from the cytosol, after they have been synthesized on cyto- 
solic ribosomes, through the mitochondrial protein translocases of the outer and 
inner mitochondrial membrane—TOM and TIM. In Chapter 12, we discussed 
how this happens. Here, we describe the organelle genomes and genetic systems, 
and consider the consequences of separate organelle genomes for the cell and the 
organism as a whole. 


The Genetic Systems of Mitochondria and Chloroplasts Resemble 
Those of Prokaryotes 


As discussed in Chapter 12, it is thought that eukaryotic cells originated through 
a symbiotic relationship between an archaeon and an aerobic bacterium (a pro- 
teobacterium). The two organisms are postulated to have merged to form the 
ancestor of all nucleated cells, with the archaeon providing the nucleus and the 
proteobacterium serving as a respiring, ATP-producing endosymbiont—one that 
would eventually evolve into the mitochondrion (see Figure 12-3). This most likely 
occurred roughly 1.6 billion years ago, when oxygen had entered the atmosphere 
in substantial amounts (see Figure 14-56). The chloroplast was derived later, after 
the plant and animal lineages diverged, through endocytosis of an oxygen-pro- 
ducing cyanobacterium. 

This endosymbiont hypothesis of organelle development receives strong sup- 
port from the observation that the genetic systems of mitochondria and chloro- 
plasts are similar to those of present-day bacteria. For example, chloroplast ribo- 
somes are very similar to bacterial ribosomes, both in their structure and in their 
sensitivity to various antibiotics (such as chloramphenicol, streptomycin, erythromycin, 
and tetracycline). In addition, protein synthesis in chloroplasts starts with 
N-formylmethionine, as in bacteria, and not with methionine as in the cytosol of 
eukaryotic cells. Although mitochondrial genetic systems are much less similar to 
those of present-day bacteria than are the genetic systems of chloroplasts, their 
ribosomes are also sensitive to antibacterial antibiotics, and protein synthesis in 
mitochondria also starts with N-formylmethionine. 


Figure 14-59 Staining of nuclear and mitochondrial DNA. In this confocal 
micrograph of a human fibroblast, the nuclear DNA is stained with the dye DAPI 
(blue) and mitochondrial DNA is visualized with fluorescent antibodies that bind 
DNA (green). The mitochondria are stained with fluorescent antibodies that 
recognize a specialized protein translocase specific to the outer mitochondrial 
membrane (red). Numerous copies of the mitochondrial genome are distributed 
in distinct nucleoids throughout the mitochondria that snake through the 
cytoplasm. (From C. Kukat et al., Proc. Natl Acad. Sci. USA 108:13534-13539, 
2011. With permission from the National Academy of Sciences.) 
The processes of organelle DNA transcription, protein synthesis, and DNA 
replication take place where the genome is located: in the matrix of mitochondria 
or the stroma of chloroplasts. Although the enzymes that mediate these genetic 
processes are unique to the organelle, and resemble those of bacteria (or even 
of bacterial viruses) rather than their eukaryotic analogs, the nuclear genome 
encodes the vast majority of these enzymes. Indeed, most present-day mitochon- 
drial and chloroplast proteins are encoded by genes that reside in the cell nucleus. 


Over Time, Mitochondria and Chloroplasts Have Exported Most of 
Their Genes to the Nucleus by Gene Transfer 


The nature of the organelle genes located in the nucleus of the cell demonstrates 
that an extensive transfer of genes from organelle to nuclear DNA has occurred 
in the course of eukaryotic evolution. Such successful gene transfer is expected 
to be rare, because any gene moved from the organelle needs to adapt to both 
nuclear transcription and cytoplasmic translation requirements. In addition, the 
protein needs to acquire a signal sequence that directs it to the correct organ- 
elle after its synthesis in the cytosol. By comparing the genes in the mitochon- 
dria from different organisms, we can infer that some of the gene transfers to the 
nucleus occurred relatively recently. The smallest and presumably most highly 
evolved mitochondrial genomes, for example, encode only a few hydrophobic 
inner-membrane proteins of the electron-transport chain, plus ribosomal RNAs 
(rRNAs) and transfer RNAs (tRNAs). Other mitochondrial genomes that have 
remained more complex tend to contain this same subset of genes along with 
others (Figure 14-60). The most complex mitochondrial genomes include genes 
that encode components of the mitochondrial genetic system, such as RNA poly- 
merase subunits and ribosomal proteins; these same genes are found in the cell 
nucleus in yeast and all animal cells. 

The proteins that are encoded by genes in the organellar DNA are synthesized 
on ribosomes within the organelle, using organelle-produced messenger RNA 
(mRNA) to specify their amino acid sequence (Figure 14-61). The protein traf- 
fic between the cytosol and these organelles seems to be unidirectional: proteins 
are normally not exported from mitochondria or chloroplasts to the cytosol. An 
important exception occurs when a cell is about to undergo apoptosis. As will be 
discussed in detail in Chapter 18, during apoptosis the mitochondrion releases 
proteins (most notably cytochrome c) from the crista space through its outer 
mitochondrial membrane, as part of an elaborate signaling pathway that is triggered 
to cause cells to undergo programmed cell death. 

Figure 14-60 Comparison of mitochondrial genomes. Less complex mitochondrial genomes encode subsets 
of the proteins and ribosomal RNAs that are encoded by larger mitochondrial genomes. In this comparison, there are 
only five genes that are shared by the six mitochondrial genomes; these encode ribosomal RNAs (rns and rnl), cytochrome 
b (cob), and two cytochrome oxidase subunits (cox1 and cox3). Blue indicates ribosomal RNAs; green, ribosomal proteins; 
and brown, components of the respiratory chain and other proteins. (Adapted from M.W. Gray, G. Burger and B.F. Lang, 
Science 283:1476-1481, 1999.) 

Figure 14-61 Biogenesis of the respiratory-chain proteins in human mitochondria. Most of the protein components of the 
respiratory chain are encoded by nuclear DNA. The transcripts of nuclear genes are translated on cytoplasmic ribosomes 
(green), which are distinct from the mitochondrial ribosomes. The nuclear-encoded mitochondrial proteins (dark green) are 
imported into mitochondria through two protein translocases called TOM and TIM, and constitute the vast majority 
of the approximately 1000 different protein species present in mammalian mitochondria. The nuclear-encoded 
mitochondrial proteins in humans include the majority of the oxidative phosphorylation system subunits, all 
proteins needed for expression and maintenance of mtDNA, and all proteins of the mitochondrial ribosomes. 
The 13 mtDNA-encoded proteins are all part of the oxidative phosphorylation system. (Adapted from 
N.G. Larsson, Annu. Rev. Biochem. 79:683-706, 2010.) 


The Fission and Fusion of Mitochondria Are Topologically Complex 
Processes 


In mammalian cells, mitochondrial DNA makes up less than 1% of the total cel- 
lular DNA. In other cells, however, such as the leaves of higher plants or the very 
large egg cells of amphibians, a much larger fraction of the cellular DNA may be 
present in mitochondria or chloroplasts (Table 14-2), and a large fraction of the 
total RNA and protein synthesis takes place in the organelles. 

Mitochondria and chloroplasts are large enough to be visible by light micros- 
copy in living cells. For example, mitochondria can be visualized by expressing in 
cells a genetically engineered fusion of a mitochondrial protein linked to green 
fluorescent protein (GFP), or cells can be incubated with a fluorescent dye that 
is specifically taken up by mitochondria because of their membrane potential. 
Such images demonstrate that the mitochondria in living cells are dynamic, 
frequently dividing by fission, fusing, and changing shape (Figure 14-62 and 
Movie 14.12). The fission of mitochondria may be necessary so that small parts of 
the network can pinch off and reach remote regions of the cell, for example in the 
thin, extended axon and dendrites of a neuron. 


TABLE 14-2 
*The large variation in the number and size of mitochondria per cell in yeasts is due to mitochondrial fusion and fission. 
**In maize, the amount of chloroplast DNA drops precipitously in mature leaves, after cell division ceases: the chloroplast 
DNA is degraded and stable mRNAs persist to provide for protein synthesis. 


Figure 14-62 The mitochondrial reticulum is dynamic. (A) In yeast cells, mitochondria form a continuous reticulum on the 
cytoplasmic side of the plasma membrane (stereo pair). (B) A balance between fission and fusion determines the arrangement 
of the mitochondria in different cells. (C) Time-lapse fluorescent microscopy shows the dynamic behavior of the mitochondrial 
network in a yeast cell. In addition to shape changes, fission and fusion constantly remodel the network (red arrows). These 
pictures were taken at 3-minute intervals. (A and C, from J. Nunnari et al., Mol. Biol. Cell 8:1233-1242, 1997. With permission 
from the American Society for Cell Biology.) 

The fission and fusion of mitochondria are topologically complex processes that 
must ensure the integrity of the separate mitochondrial compartments defined by 
the inner and outer membranes. These processes control the number and shape 
of mitochondria, which can vary dramatically in different cell types, ranging from 
multiple spherical or wormlike organelles to a highly branched, net-shaped single 
organelle called a reticulum. Each depends on its own special set of proteins. The 
mitochondrial fission machine works by assembling dynamin-related GTPases 
(discussed in Chapter 13) into helical oligomers that cause local constrictions in 
tubular mitochondria. GTP hydrolysis then generates the mechanical force that 
severs the inner and outer mitochondrial membranes in one step (Figure 14-63). 
Mitochondrial fusion requires two separate machineries, one each for the outer 
and the inner membrane (Figure 14-64). In addition to GTP hydrolysis for force 
generation, both mechanisms also depend on the mitochondrial proton-motive 
force for reasons that are still unknown. 


Animal Mitochondria Contain the Simplest Genetic Systems 
Known 


Comparisons of DNA sequences in different organisms reveal that, in vertebrates 
(including ourselves), the mutation rate during evolution has been roughly 100 
times greater in the mitochondrial genome than in the nuclear genome. This 
difference is likely to be due to lower fidelity of mitochondrial DNA replication, 
inefficient DNA repair, or both, given that the mechanisms that perform these processes 
in the organelle are relatively simple compared with those in the nucleus. 
As discussed in Chapter 4, the relatively high rate of evolution of animal mitochondrial 
genes makes a comparison of mitochondrial DNA sequences especially 
useful for estimating the dates of relatively recent evolutionary events, such as the 
steps in primate evolution. 


Figure 14-63 A model for mitochondrial division. Dynamin-1 (yellow) exists as dimers in the cytosol, which form larger 
oligomeric structures in a process that requires GTP hydrolysis. Dynamin assemblies interact with the 
outer mitochondrial membrane through special adaptor proteins, forming a 
spiral of GTP-dynamin around the mitochondrion that causes a constriction. 
A concerted GTP-hydrolysis event in the dynamin subunits is then thought to 
produce the conformational changes that result in fission. (Adapted from S. 
Hoppins, L. Lackner and J. Nunnari, Annu. Rev. Biochem. 76:751-780, 2007.) 

There are 13 protein-encoding genes in human mitochondrial DNA (Figure 
14-65). These code for hydrophobic components of the respiratory-chain com- 
plexes and of ATP synthase. In contrast, roughly 1000 mitochondrial proteins are 
encoded in the nucleus, produced on cytosolic ribosomes, and imported by the 
protein import machinery in the outer and inner membrane (discussed in Chap- 
ter 12). It has been suggested that the cytosolic production of hydrophobic mem- 
brane proteins and their import into the organelle may present a problem to the 
cell, and that this is the reason why their genes have remained in the mitochon- 
drion. However, some of the most hydrophobic mitochondrial proteins, such as 
the c subunit of the ATP synthase rotor ring, are imported from the cytosol in some 
species (though they are mitochondrially encoded in others). And the parasites 
Plasmodium falciparum and Leishmania tarentolae, which spend most of their 
life cycles inside cells of their host organisms, have retained only two or three 
mitochondrially encoded proteins. 

The size range of mitochondrial DNAs is similar to that of viral DNAs. The 
mitochondrial DNA in Plasmodium falciparum (the human malaria parasite) has 
less than 6000 nucleotide pairs, whereas the mitochondrial DNAs of some land 
plants contain more than 300,000 nucleotide pairs (Figure 14-66). In animals, the 
mitochondrial genome is a simple DNA circle of about 16,600 nucleotide pairs 
(less than 0.001% of the nuclear genome), and it is nearly the same size in organ- 
isms as different from us as Drosophila and sea urchins. 
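
The size comparisons above are easy to check with a few lines of arithmetic. The sketch below uses only the figures quoted in this paragraph, with one exception: the human nuclear genome size of about 3.2 billion nucleotide pairs is an assumption brought in for illustration and is not stated in this section. The variable and function names are likewise just illustrative choices.

```python
# Sizes in nucleotide pairs, taken from the text (bounds, not exact values).
ANIMAL_MTDNA = 16_600          # typical animal mitochondrial genome
PLASMODIUM_MTDNA_MAX = 6_000   # "less than 6000"
PLANT_MTDNA_MIN = 300_000      # "more than 300,000" in some land plants

# Assumption (not from the text): approximate human haploid nuclear genome size.
HUMAN_NUCLEAR_GENOME = 3.2e9


def percent_of(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


mtdna_percent = percent_of(ANIMAL_MTDNA, HUMAN_NUCLEAR_GENOME)
size_range_fold = PLANT_MTDNA_MIN / PLASMODIUM_MTDNA_MAX

print(f"mtDNA is about {mtdna_percent:.5f}% of the nuclear genome")
print(f"mitochondrial genomes span more than a {size_range_fold:.0f}-fold size range")
```

With the assumed nuclear genome size, the animal mitochondrial genome works out to roughly 0.0005% of the nuclear genome, consistent with the "less than 0.001%" quoted above, and the two bounds given for Plasmodium and land plants imply at least a 50-fold range in mitochondrial genome size.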


Mitochondria Have a Relaxed Codon Usage and Can Have a 
Variant Genetic Code 


The human mitochondrial genome has several surprising features that distin- 
guish it from nuclear, chloroplast, and bacterial genomes: 


1. Dense gene packing. Unlike other genomes, the human mitochondrial 
genome seems to contain almost no noncoding DNA: nearly every nucle- 
otide seems to be part of a coding sequence, either for a protein or for one 
of the rRNAs or tRNAs. Since these coding sequences run directly into each 
other, there is very little room left for regulatory DNA sequences. 


2. Relaxed codon usage. Whereas 30 or more tRNAs specify amino acids in the 
cytosol and in chloroplasts, only 22 tRNAs are required for mitochondrial 
protein synthesis. The normal codon-anticodon pairing rules are relaxed 
in mitochondria, so that many tRNA molecules recognize any one of the 
four nucleotides in the third (wobble) position. Such “2 out of 3” pairing 
allows one tRNA to pair with any one of four codons and permits protein 
synthesis with fewer tRNA molecules. 


3. Variant genetic code. Perhaps most surprising, comparisons of mitochondrial 
gene sequences and the amino acid sequences of the corresponding 
proteins indicate that the genetic code is different: 4 of the 64 codons have 
different “meanings” from those of the same codons in other genomes 
(Table 14-3). 


Figure 14-64 A model for mitochondrial fusion. The fusions of the outer and inner 
mitochondrial membranes are coordinated sequential events, each of which requires 
a separate set of protein factors. Outer membrane fusion is brought about by an 
outer-membrane GTPase (purple), which forms an oligomeric complex that includes 
subunits anchored in the two membranes to be fused. Fusion of outer membranes 
requires GTP and an H+ gradient across the inner membrane. For fusion of the inner 
membrane, a dynamin-related protein forms an oligomeric tethering complex 
(blue) that includes subunits anchored in the two inner membranes to be fused. 
Fusion of the inner membranes requires GTP and the electrical component of the 
potential across the inner membrane. (Adapted from S. Hoppins, L. Lackner 
and J. Nunnari, Annu. Rev. Biochem. 76:751-780, 2007.) 


Figure 14-65 The organization of the human mitochondrial genome. The 
human mitochondrial genome of ~16,600 nucleotide pairs contains 2 rRNA genes, 
22 tRNA genes, and 13 protein-coding sequences. There are two transcriptional 
promoters, one for each strand of the mitochondrial DNA (mtDNA). The DNAs of 
many other animal mitochondrial genomes have been completely sequenced. Most of 
these animal mitochondrial DNAs encode precisely the same genes as humans, with 
the gene order being identical for animals ranging from fish to mammals. 


The close similarity of the genetic code in all organisms provides strong evi- 
dence that they all have evolved from a common ancestor. How, then, do we 
explain the differences in the genetic code in many mitochondria? A hint comes 
from the finding that the mitochondrial genetic code in different organisms is not 
the same. In the mitochondrion with the largest number of genes in Figure 14-60, 
that of the protozoan Reclinomonas, the genetic code is unchanged from the stan- 
dard genetic code of the cell nucleus. Yet UGA, which is a stop codon elsewhere, 
is read as tryptophan in the mitochondria of mammals, fungi, and invertebrates. 
Similarly, the codon AGG normally codes for arginine, but it codes for stop in 
the mitochondria of mammals and codes for serine in the mitochondria of Dro- 
sophila (see Table 14-3). Such variation suggests that a random drift can occur 
in the genetic code in mitochondria. Presumably, the unusually small number of 
proteins encoded by the mitochondrial genome makes an occasional change in 
the meaning of a rare codon tolerable, whereas such a change in a larger genome 
would alter the function of many proteins and thereby destroy the cell. 
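
To make the codon reassignments and the “2 out of 3” pairing rule concrete, the following Python sketch compares the standard code with the vertebrate mitochondrial code. The two 64-letter strings list the meaning of each codon in U, C, A, G order, a layout commonly used in reference tables of genetic codes. The reassignments of AUA and AGA are taken from such tables rather than from the text above, which names only UGA and AGG. The tRNA count rests on a deliberately simplified model that we assume purely for illustration: one tRNA reads a whole four-codon box when all four codons have the same meaning, and otherwise one tRNA reads each pair of codons ending in a pyrimidine (U or C) or in a purine (A or G). All function names are hypothetical.

```python
from itertools import product

BASES = "UCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]  # UUU, UUC, UUA, ...

# One letter per codon, in the order of CODONS; "*" marks a stop codon.
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
VERT_MITO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"

standard_code = dict(zip(CODONS, STANDARD))
mito_code = dict(zip(CODONS, VERT_MITO))


def reassigned_codons(code_a, code_b):
    """Return the codons whose meaning differs between two codes."""
    return [c for c in CODONS if code_a[c] != code_b[c]]


def translate(rna, code):
    """Translate an RNA string codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = code[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def count_trnas(code):
    """Count tRNAs needed under the simplified pairing model described above."""
    total = 0
    for b1, b2 in product(BASES, repeat=2):
        box = [code[b1 + b2 + b3] for b3 in BASES]
        if len(set(box)) == 1 and box[0] != "*":
            total += 1  # one tRNA reads all four codons of the box
        else:
            # Split the box into codons ending in U/C and codons ending in A/G.
            for pair in (box[:2], box[2:]):
                total += len({aa for aa in pair if aa != "*"})
    return total


print(reassigned_codons(standard_code, mito_code))
print(translate("AUGUGAAUAAGG", standard_code))
print(translate("AUGUGAAUAAGG", mito_code))
print(count_trnas(mito_code))
```

Under these assumptions the script reports UGA, AUA, AGA, and AGG as the four reassigned codons, and the same short message is read as a single methionine by the standard code but as Met-Trp-Met by the mitochondrial code. The simplified count for the mitochondrial code comes to 22, which matches the number of tRNAs given in the text. The model ignores the actual chemistry of wobble pairing, so it should not be used to estimate tRNA numbers for the cytosol or for chloroplasts.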

Interestingly, in many species, one or two tRNAs for mitochondrial protein 
synthesis are encoded in the nucleus. Some parasites, for example trypano- 
somes, have not retained any tRNA genes in their mitochondrial DNA. Instead, 
the required tRNAs are all produced in the cytosol and are thought to be imported 
into the mitochondrion by special tRNA translocases that are distinct from the 
mitochondrial protein import system. 


TABLE 14-3 

AGA, AGG    STOP 

*Red italics indicate that the code differs from the “Universal” code. 
Figure 14-66 Comparison of various sizes of mitochondrial genomes with 
the genome of bacterial ancestors. The complete DNA sequences for thousands 
of mitochondrial genomes have been determined. The lengths of a few of these 
mitochondrial DNAs are shown to scale, as circles for those genomes thought to be 
circular and lines for linear genomes. The largest circle represents the genome of 
Rickettsia prowazekii, a small pathogenic bacterium whose genome most closely 
resembles that of mitochondria. The size of mitochondrial genomes does not correlate 
well with the number of proteins encoded in them: while human mitochondrial DNA 
encodes 13 proteins, the 22-fold larger mitochondrial DNA of Arabidopsis thaliana 
encodes only 32 proteins, that is, about 2.5-fold as many as human mitochondrial 
DNA. The extra DNA that is found in Arabidopsis, Marchantia, and other plant 
mitochondria may be “junk DNA”, that is, noncoding DNA with no apparent function. 
The mitochondrial DNA of the protozoan Reclinomonas americana has 98 genes. 
(Adapted from M.W. Gray, G. Burger and B.F. Lang, Science 283:1476-1481, 1999.) 
Chloroplasts and Bacteria Share Many Striking Similarities 


The chloroplast genomes of land plants range in size from 70,000 to 200,000 nucle- 
otide pairs. More than 300 chloroplast genomes have now been sequenced. Many 
are surprisingly similar, even in distantly related plants (such as tobacco and liv- 
erwort), and even those of green algae are closely related (Figure 14-67). Chloro- 
plast genes are involved in three main processes: transcription, translation, and 
photosynthesis. Plant chloroplast genomes typically encode 80-90 proteins and 
around 45 RNAs, including 37 or more tRNAs. As in mitochondria, most of the 
organelle-encoded proteins are part of larger protein complexes that also contain 
one or more subunits encoded in the nucleus and imported from the cytosol. 

The genomes of chloroplasts and bacteria have striking similarities. Basic reg- 
ulatory sequences, such as transcription promoters and terminators, are virtually 
identical. The amino acid sequences of the proteins encoded in chloroplasts are 
clearly recognizable as bacterial, and several clusters of genes with related func- 
tions (such as those encoding ribosomal proteins) are organized in the same way 
in the genomes of chloroplasts, the bacterium E. coli, and cyanobacteria. 

The mechanisms by which chloroplasts and bacteria divide are also similar. 
Both utilize FtsZ proteins, which are self-assembling GTPases related to tubulins 
(see Chapter 16). Bacterial FtsZ is a soluble protein that assembles into a dynamic 
ring of membrane-attached protofilaments beneath the plasma membrane in the 
middle of the dividing cell. The FtsZ ring acts as a scaffold for recruitment of other 
cell-division proteins and generates a contractile force that results in membrane 
constriction and eventually in cell division. Presumably, chloroplasts divide in 
very much the same way. Although both employ membrane-interacting GTPases, 
the mechanisms by which mitochondria and chloroplasts divide are fundamen- 
tally different. The machinery for chloroplast division acts from the inside, as in 
bacteria, while the dynamin-like GTPases divide mitochondria from the outside 
(see Figure 14-63). The chloroplasts have remained closer to their bacterial ori- 
gins than have mitochondria, since the eukaryotic mechanisms of membrane 
constriction and vesicle formation have been adapted for mitochondrial fission. 

The RNA editing and RNA processing that is prevalent in chloroplasts owes 
everything to their eukaryotic hosts. This RNA processing includes the genera- 
tion of transcript 5’ and 3’ termini and the cleavage of polycistronic transcripts. 
In addition, an RNA editing process converts specific C residues to U and can 
change the amino acid specified by the edited codon. These and other RNA-based 
processes are catalyzed by protein families that are not found in prokaryotes. One 
can ask why the expression of so few chloroplast genes needs to be so complex. 
One explanation is that the expression of chloroplast and nuclear genes must 
be closely coordinated. More generally, the bacterial concept of the operon as 
a co-regulated set of genes in a single transcription unit has been largely abandoned 
in chloroplasts. Polycistronic transcripts are cleaved into smaller fragments, 
which then require splicing or RNA editing to become functional. 


Figure 14-67 The organization of the liverwort chloroplast genome. The chloroplast 
genome organization is similar in all higher plants, although the size varies from species to 
species, depending on how much of the DNA surrounding the genes encoding the chloroplast’s 
16S and 23S ribosomal RNAs is present in two copies. 


Organelle Genes Are Maternally Inherited in Animals and Plants 


In Saccharomyces cerevisiae (baker’s yeast), when two haploid cells mate, they 
are equal in size and contribute equal amounts of mitochondrial DNA to the dip- 
loid zygote. Mitochondrial inheritance in yeasts is therefore biparental: both par- 
ents contribute equally to the mitochondrial gene pool of the progeny. However, 
during the course of the subsequent asexual, vegetative growth, the mitochondria 
become distributed more or less randomly to daughter cells. After a few genera- 
tions, the mitochondria of any given cell contain only the DNA from one or the 
other parent cell, because only a small sample of the mitochondrial DNA passes 
from the mother cell to the bud of the daughter cell. This process is known as 
mitotic segregation, and it gives rise to a distinct form of inheritance that is called 
non-Mendelian, or cytoplasmic inheritance, in contrast to the Mendelian inheri- 
tance of nuclear genes. 
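
The way random sampling drives mitotic segregation can be illustrated with a small simulation. This is only a sketch under stated assumptions: the copy number (50), the size of the sample passed to the bud (10), and the rule that the sample is amplified proportionally back to full copy number are all hypothetical values chosen for illustration, not measurements from yeast.

```python
import random


def generations_to_fixation(n_copies=50, sample_size=10, frac_a=0.5,
                            max_generations=1000, seed=0):
    """Follow one lineage of buds until all mtDNA copies derive from one parent.

    Returns (generation, parent), where parent is "A" or "B", or (None, None)
    if both types are still present after max_generations.
    All default numbers are illustrative assumptions.
    """
    if sample_size <= 0 or n_copies % sample_size != 0:
        raise ValueError("n_copies must be a positive multiple of sample_size")
    rng = random.Random(seed)
    n_a = round(n_copies * frac_a)  # copies inherited from parent A in the zygote
    for generation in range(1, max_generations + 1):
        pool = ["A"] * n_a + ["B"] * (n_copies - n_a)
        # Only a small random sample of the mother's mtDNA enters the bud.
        bud_sample = rng.sample(pool, sample_size)
        # Assumed: the sample is copied proportionally back to full copy number.
        n_a = bud_sample.count("A") * (n_copies // sample_size)
        if n_a == 0:
            return generation, "B"
        if n_a == n_copies:
            return generation, "A"
    return None, None


if __name__ == "__main__":
    print(generations_to_fixation())
```

Running the function with different seeds shows that the lineage ends up with DNA from only one parent, but which parent and after how many generations varies from run to run, as expected for a random process.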

The inheritance of mitochondria in animals and plants is quite different. In 
these organisms, the egg cell contributes much more cytoplasm to the zygote than 
does the male gamete (sperm in animals, pollen in plants). For example, a typi- 
cal human oocyte contains about 100,000 copies of maternal mitochondrial DNA, 
whereas a sperm cell contains only a few. In addition, an active process ensures 
that the sperm mitochondria do not compete with those in the egg. As sperm 
mature, the DNA is degraded in their mitochondria. Sperm mitochondria are also 
specifically recognized and then eliminated from the fertilized egg cell by autophagy
in very much the same way that damaged mitochondria are removed (by ubiqui- 
tylation followed by delivery to lysosomes, as discussed in Chapter 13). Because of 
these two processes, the mitochondrial inheritance in both animals and plants is 
uniparental. More precisely, the mitochondrial DNA passes from one generation 
to the next by maternal inheritance. 

In about two-thirds of higher plants, the chloroplast precursors from the male 
parent (contained in pollen grains) fail to enter the zygote, so that chloroplast 
as well as mitochondrial DNA is maternally inherited. In other plants, the chlo- 
roplast precursors from the pollen grains enter the zygote, making chloroplast 
inheritance biparental. In such plants, defective chloroplasts are a cause of var- 
iegation: a mixture of normal and defective chloroplasts in a zygote may sort out 
by mitotic segregation during plant growth and development, thereby producing 
alternating green and white patches in leaves. Leaf cells in the green patches con- 
tain normal chloroplasts, while those in the white patches contain defective chlo- 
roplasts (Figure 14-68). 


Mutations in Mitochondrial DNA Can Cause Severe Inherited 
Diseases 


In humans, as we have explained, all the mitochondrial DNA in a fertilized egg 
cell is inherited from the mother. Some mothers carry a mixed population of both 
mutant and normal mitochondrial genomes. Their daughters and sons will inherit 
this mixture of normal and mutant mitochondrial DNAs and be healthy unless the 
process of mitotic segregation results in a majority of defective mitochondria in a 
particular tissue. Muscle and the nervous system are most at risk. Because they 
need particularly large amounts of ATP, muscle and nerve cells are particularly 
dependent on fully functional mitochondria. 

Numerous diseases in humans are caused by mutations in mitochondrial 
DNA. These diseases are recognized by their passage from affected mothers to
both their daughters and their sons, with the daughters but not the sons producing
children with the disease. As expected from the random nature of mitotic segregation,
the symptoms of these diseases vary greatly between different family
members, including not only the severity and age of onset, but also which tissue
is affected. There are also mitochondrial diseases that are caused by mutations
in nuclear-encoded mitochondrial proteins; these diseases are inherited in the
regular, Mendelian fashion.

Figure 14-68 A variegated leaf. In the white patches, the plant cells have inherited
a defective chloroplast. (Courtesy of John Innes Foundation.)


The Accumulation of Mitochondrial DNA Mutations Is a 
Contributor to Aging 


Mitochondria are marvels of efficiency in energy conversion, and they supply the 
cells of our body with a readily available source of energy in the form of ATP. But in 
highly developed, long-lived animals such as ourselves, the cells in our body age 
and eventually die. A factor in this inevitable process is the accumulation of dele- 
tions and point mutations in mitochondrial DNA. Oxidative damage to the cell by 
reactive oxygen species (ROS) such as H₂O₂, superoxide, or hydroxyl radicals also
increases with age. The mitochondrial respiratory chain is the main source of ROS 
in animal cells, and animals in which mitochondrial superoxide dismutase—the 
main ROS scavenger—has been knocked out, die prematurely. 

The less complex DNA replication and repair systems in mitochondria mean 
that accidents are corrected less efficiently. This results in a 100-fold higher occur- 
rence of deletions and point mutations than in nuclear DNA. Mathematical mod- 
eling suggests that most of these mutations and lesions are acquired in childhood 
or early adult life, and then proliferate by clonal expansion in later life. Due to 
mitotic segregation, some cells will accumulate higher levels of faulty mitochon- 
drial DNA than others. Above some threshold, serious deficiencies in respirato- 
ry-chain function will develop, producing cells that are senescent. In many organs 
of the human body, senescent cells with high levels of mitochondrial DNA dam- 
age are intermingled with normal cells, resulting in a mosaic of cells with and 
without respiratory-chain deficiency. 

The main role of mitochondrial fusion in cellular physiology is most likely to 
ensure an even distribution of mitochondrial DNA throughout the mitochondrial 
reticulum, and to prevent the accumulation of damaged DNA in one part of the 
network. When the fusion machinery is defective, DNA is lost from a subset of the 
mitochondria in the cell. Loss of mitochondrial DNA leads to a loss of respirato- 
ry-chain function, and it can cause disease. 

All of the considerations just discussed have suggested to some scientists that 
changes in our mitochondria are major contributors to human aging. However, 
there are many other processes that tend to go wrong as cells and tissues age, as 
one might expect given the incredible complexity of human cell biology. Despite 
intensive research, the issue remains unresolved. 


Why Do Mitochondria and Chloroplasts Maintain a Costly 
Separate System for DNA Transcription and Translation? 


Why do mitochondria and chloroplasts require their own separate genetic sys- 
tems, when other organelles that share the same cytoplasm, such as peroxisomes 
and lysosomes, do not? The question is not trivial, because maintaining a sepa- 
rate genetic system is costly: more than 90 proteins—including many ribosomal 
proteins, aminoacyl-tRNA synthetases, DNA polymerase, RNA polymerase, and 
RNA-processing and RNA-modifying enzymes—must be encoded by nuclear 
genes specifically for this purpose. Moreover, as we have seen, the mitochondrial 
genetic system entails the risk of aging and disease. 

A possible reason for maintaining this costly and potentially hazardous 
arrangement is the highly hydrophobic nature of the nonribosomal proteins 
encoded by organelle genes. This may make their production in and import from 
the cytoplasm simply too difficult and energy-consuming. It is also possible that 
the evolution (and eventual elimination) of the organellar genetic systems is still
ongoing, but for now there is no alternative for the cell than to maintain separate
genetic systems for its nuclear, mitochondrial, and chloroplast genes.


Summary 


Mitochondria are organelles that allow eukaryotes to carry out oxidative phosphor- 
ylation, while chloroplasts are organelles that allow plants to carry out photosyn- 
thesis. Presumably as a result of their prokaryotic origins, each organelle maintains 
and reproduces itself in a highly coordinated process that requires the contribu- 
tion of two separate genetic systems—one in the organelle and the other in the cell 
nucleus. The vast majority of the proteins in these organelles are encoded by nuclear 
DNA, synthesized in the cytosol, and then imported individually into the organ- 
elle. Other organelle proteins, as well as organelle ribosomal and transfer RNAs, are 
encoded by the organelle DNA; these are synthesized in the organelle itself. 

The ribosomes of chloroplasts closely resemble bacterial ribosomes, while the ori- 
gin of mitochondrial ribosomes is more difficult to trace. Extensive protein similar- 
ities, however, suggest that both organelles originated when a primitive eukaryotic 
cell entered into a stable endosymbiotic relationship with a bacterium. Although 
some of the genes of these former bacteria still function to make organelle proteins 
and RNA, most of them have been transferred into the nuclear genome, where they 
encode bacteria-like enzymes that are synthesized on cytosolic ribosomes and then 
imported into the organelle. The mitochondrial DNA replication and DNA repair 
processes are substantially less effective than the corresponding processes in the cell 
nucleus. Damage therefore accumulates in the genome of mitochondria over time; 
this damage may be a substantial contributor to the aging of cells and organisms, 
and it can cause serious diseases.


WHAT WE DON’T KNOW


• What structures are needed to form the barriers that separate and maintain the
differentiated membrane domains in a single continuous membrane, as for the cristae
and inner boundary membrane in mitochondria?

• How does a eukaryotic cell regulate the many functions of mitochondria,
including ATP production?

• What are the origins and evolutionary history of photosynthetic complexes?
Are there undiscovered types of photosynthesis present on Earth to help answer
this question?

• Why is the mutation rate so much higher in mitochondria than in the nucleus
(and chloroplasts)? Could this high rate have been useful to the cell?

• What mechanisms and pathways have been used during evolution to transfer
genes from the mitochondrion to the nucleus?


PROBLEMS 


Which statements are true? Explain why or why not. 


14-1 The three respiratory enzyme complexes in the
mitochondrial inner membrane tend to associate with 
each other in ways that facilitate the correct transfer of 
electrons between appropriate complexes. 


14-2 The number of c subunits in the rotor ring of ATP
synthase defines how many protons need to pass through 
the turbine to make each molecule of ATP. 


14-3 Mutations that are inherited according to Mende- 
lian rules affect nuclear genes; mutations whose inheri- 
tance violates Mendelian rules are likely to affect organelle 
genes. 


Discuss the following problems. 


14-4 In the 1860s, Louis Pasteur noticed that when he 
added O₂ to a culture of yeast growing anaerobically on
glucose, the rate of glucose consumption declined dra- 
matically. Explain the basis for this result, which is known 
as the Pasteur effect. 


14-5 Heart muscle gets most of the ATP needed to power
its continual contractions through oxidative phosphorylation.
When oxidizing glucose to CO₂, heart muscle consumes
O₂ at a rate of 10 µmol/min per g of tissue, in order
to replace the ATP used in contraction and give a steady-state
ATP concentration of 5 µmol/g of tissue. At this rate,
how many seconds would it take the heart to consume
an amount of ATP equal to its steady-state levels? (Complete
oxidation of one molecule of glucose to CO₂ yields
30 ATP, 26 of which are derived by oxidative phosphorylation
using the 12 pairs of electrons captured in the electron
carriers NADH and FADH₂.)


14-6 Both H⁺ and Ca²⁺ are ions that move through the
cytosol. Why is the movement of H⁺ ions so much faster
than that of Ca²⁺ ions? How do you suppose the speed of
these two ions would be affected by freezing the solution? 
Would you expect them to move faster or slower? Explain 
your answer. 


14-7 If isolated mitochondria are incubated with a 
source of electrons such as succinate, but without oxygen, 
electrons enter the respiratory chain, reducing each of the 
electron carriers almost completely. When oxygen is then 
introduced, the carriers become oxidized at different rates 
(Figure Q14-1). How does this result allow you to order
the electron carriers in the respiratory chain? What is their
order?


14-8 Normally, the flow of electrons to O₂ is tightly
linked to the production of ATP via the electrochemical 
gradient. If ATP synthase is inhibited, for example, elec- 
trons do not flow down the electron-transport chain and 
respiration ceases. Since the 1940s, several substances— 
such as 2,4-dinitrophenol—have been known to uncou- 
ple electron flow from ATP synthesis. Dinitrophenol was 
once prescribed as a diet drug to aid in weight loss. How 
would an uncoupler of oxidative phosphorylation pro- 
mote weight loss? Why do you suppose dinitrophenol is 
no longer prescribed? 


14-9 In actively respiring liver mitochondria, the pH in 
the matrix is about half a pH unit higher than it is in the 
cytosol. Assuming that the cytosol is at pH 7 and the matrix 
is a sphere with a diameter of 1 µm [V = (4/3)πr³], calculate
the total number of protons in the matrix of a respiring
liver mitochondrion. If the matrix began at pH 7 (equal to 
that in the cytosol), how many protons would have to be 
pumped out to establish a matrix pH of 7.5 (a difference of 
0.5 pH units)? 
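
One way to set up the arithmetic for this kind of problem is sketched below. It assumes only the definitions pH = -log10[H⁺] and V = (4/3)πr³, an Avogadro number rounded to 6 × 10²³, and the conversion 1 µm³ = 10⁻¹⁵ liters; the function name is hypothetical.

```python
import math

AVOGADRO = 6.0e23          # particles per mole (rounded)
LITERS_PER_UM3 = 1e-15     # 1 cubic micrometer expressed in liters


def free_protons_in_sphere(ph, diameter_um):
    """Number of free H+ ions in a sphere of the given diameter at the given pH."""
    radius_um = diameter_um / 2
    volume_liters = (4 / 3) * math.pi * radius_um ** 3 * LITERS_PER_UM3
    molar_concentration = 10 ** (-ph)  # [H+] in moles per liter
    return molar_concentration * AVOGADRO * volume_liters


if __name__ == "__main__":
    for ph in (7.0, 7.5):
        print(ph, free_protons_in_sphere(ph, diameter_um=1.0))
```

The calculation counts only free protons; it ignores buffering, which in a real matrix would greatly increase the number of protons that must be moved to change the pH.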


14-10 ATP synthase is the world’s smallest rotary motor.
Passage of H⁺ ions through the membrane-embedded
portion of ATP synthase (the Fo component) causes rotation
of the single, central, axle-like γ subunit inside the
head group. The tripartite head is composed of the three
αβ dimers, the β subunit of which is responsible for synthesis
of ATP. The rotation of the γ subunit induces conformational
changes in the αβ dimers that allow ADP and
Pi to be converted into ATP. A variety of indirect evidence
had suggested rotary catalysis by ATP synthase, but seeing
is believing.

To demonstrate rotary motion, a modified form of
the α₃β₃γ complex was used. The β subunits were modified
so they could be firmly anchored to a solid support and the
γ subunit was modified (on the end that normally inserts
into the Fo component in the inner membrane) so that a
fluorescently tagged, readily visible filament of actin could
be attached (Figure Q14-2A). This arrangement allows
rotations of the γ subunit to be visualized as revolutions
of the long actin filament. In these experiments, ATP synthase
was studied in the reverse of its normal mechanism
by allowing it to hydrolyze ATP. At low ATP concentrations,
the actin filament was observed to revolve in steps of 120°
and then pause for variable lengths of time, as shown in
Figure Q14-2B.

A. Why does the actin filament revolve in steps with
pauses in between? What does this rotation correspond to
in terms of the structure of the α₃β₃γ complex?

B. In its normal mode of operation inside the cell,
how many ATP molecules do you suppose would be synthesized
for each complete 360° rotation of the γ subunit?
Explain your answer.


Figure Q14-2 Experimental set-up for observing rotation of the γ subunit of ATP
synthase (Problem 14-10). (A) The immobilized α₃β₃γ complex. The β subunits are
anchored to a solid support and a fluorescent actin filament is attached to the
γ subunit. (B) Stepwise revolution of the actin filament. The indicated trace is
a typical example from one experiment. The inset shows the positions in the
revolution at which the actin filament pauses. (B, from R. Yasuda et al., Cell
93:1117-1124, 1998. With permission from Elsevier.)


14-11 How much energy is available in visible light? How
much energy does sunlight deliver to Earth? How efficient
are plants at converting light energy into chemical energy?
The answers to these questions provide an important
backdrop to the subject of photosynthesis.

Each quantum or photon of light has energy hν,
where h is Planck’s constant (6.6 × 10⁻³⁷ kJ sec/photon)
and ν is the frequency in sec⁻¹. The frequency of light is
equal to c/λ, where c is the speed of light (3.0 × 10¹⁷ nm/sec)
and λ is the wavelength in nm. Thus, the energy (E) of
a photon is


E = hν = hc/λ


A. Calculate the energy of a mole of photons (6 × 10²³
photons/mole) at 400 nm (violet light), at 680 nm (red
light), and at 800 nm (near-infrared light).

B. Bright sunlight strikes Earth at the rate of about 1.3 
kJ/sec per square meter. Assuming for the sake of calcula- 
tion that sunlight consists of monochromatic light of wave- 
length 680 nm, how many seconds would it take for a mole 
of photons to strike a square meter? 

C. Assuming that it takes eight photons to fix one
molecule of CO₂ as carbohydrate under optimal conditions
(8-10 photons is the currently accepted value), calculate
how long it would take a tomato plant with a leaf
area of 1 square meter to make a mole of glucose from CO₂.
Assume that photons strike the leaf at the rate calculated
above and, furthermore, that all the photons are absorbed
and used to fix CO₂.

D. If it takes 468 kJ/mole to fix a mole of CO₂ into
carbohydrate, what is the efficiency of conversion of light
energy into chemical energy after photon capture? Assume
again that eight photons of red light (680 nm) are required
to fix one molecule of CO₂.
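
The relation E = hc/λ used throughout this problem can be evaluated with a few lines of code. The sketch below uses the rounded constants given in the problem statement; the function and constant names are hypothetical and chosen only for illustration.

```python
PLANCK_KJ_SEC = 6.6e-37        # Planck's constant, kJ sec/photon (value from the problem)
LIGHT_SPEED_NM_SEC = 3.0e17    # speed of light, nm/sec (value from the problem)
PHOTONS_PER_MOLE = 6e23        # rounded Avogadro number (value from the problem)


def photon_energy_kj(wavelength_nm):
    """Energy of a single photon, in kJ, from E = hc/wavelength."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return PLANCK_KJ_SEC * LIGHT_SPEED_NM_SEC / wavelength_nm


def energy_per_mole_kj(wavelength_nm):
    """Energy of one mole of photons, in kJ/mole."""
    return photon_energy_kj(wavelength_nm) * PHOTONS_PER_MOLE


if __name__ == "__main__":
    for wavelength in (400, 680, 800):
        print(wavelength, "nm:", energy_per_mole_kj(wavelength), "kJ/mole")
```

Because the energy is inversely proportional to wavelength, violet photons carry about twice the energy of near-infrared photons at 800 nm.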


14-12 In chloroplasts, protons are pumped out of the 
stroma across the thylakoid membrane, whereas in mito- 
chondria, they are pumped out of the matrix across the 
crista membrane. Explain how this arrangement allows 
chloroplasts to generate a larger proton gradient across 
the thylakoid membrane than mitochondria can generate 
across the inner membrane. 


14-13 Examine the variegated leaf shown in Figure Q14-3. 
Yellow patches surrounded by green are common, but 
there are no green patches surrounded by yellow. Propose 
an explanation for this phenomenon.


Cell Signaling


When things change, cells respond. Every cell, from the humble bacterium to 
the most sophisticated eukaryotic cell, monitors its intracellular and extracellu- 
lar environment, processes the information it gathers, and responds accordingly. 
Unicellular organisms, for example, modify their behavior in response to changes 
in environmental nutrients or toxins. The cells of multicellular organisms detect 
and respond to countless internal and extracellular signals that control their 
growth, division, and differentiation during development, as well as their behav- 
ior in adult tissues. At the heart of all these communication systems are regulatory 
proteins that produce chemical signals, which are sent from one place to another 
in the body or within a cell, usually being processed along the way and integrated 
with other signals to provide clear and effective communication. 

The study of cell signaling has traditionally focused on the mechanisms by 
which eukaryotic cells communicate with each other using extracellular signal 
molecules such as hormones and growth factors. In this chapter, we describe the 
features of some of these cell-cell communication systems, and we use them to 
illustrate the general principles by which any regulatory system, inside or outside 
the cell, is able to generate, process, and respond to signals. Our main focus is on 
animal cells, but we end by considering the special features of cell signaling in 
plants. 


PRINCIPLES OF CELL SIGNALING 


Long before multicellular creatures roamed the Earth, unicellular organisms 
had developed mechanisms for responding to physical and chemical changes in 
their environment. These almost certainly included mechanisms for responding 
to the presence of other cells. Evidence comes from studies of present-day uni- 
cellular organisms such as bacteria and yeasts. Although these cells lead mostly 
independent lives, they can communicate and influence one another’s behavior. 
Many bacteria, for example, respond to chemical signals that are secreted by their 
neighbors and accumulate at higher population density. This process, called quo- 
rum sensing, allows bacteria to coordinate their behavior, including their motility, 
antibiotic production, spore formation, and sexual conjugation. Similarly, yeast 
cells communicate with one another in preparation for mating. The budding yeast 
Saccharomyces cerevisiae provides a well-studied example: when a haploid indi- 
vidual is ready to mate, it secretes a peptide mating factor that signals cells of the 
opposite mating type to stop proliferating and prepare to mate. The subsequent
fusion of two haploid cells of opposite mating type produces a diploid zygote.

Intercellular communication achieved an astonishing level of complexity during
the evolution of multicellular organisms. These organisms are tight-knit soci-
eties of cells, in which the well-being of the individual cell is often set aside for the 
benefit of the organism as a whole. Complex systems of intercellular communica- 
tion have evolved to allow the collaboration and coordination of different tissues 
and cell types. Bewildering arrays of signaling systems govern every conceivable 
feature of cell and tissue function during development and in the adult.

Communication between cells in multicellular organisms is mediated mainly
by extracellular signal molecules. Some of these operate over long distances,
signaling to cells far away; others signal only to immediate neighbors. Most cells
in multicellular organisms both emit and receive signals. Reception of the signals 
depends on receptor proteins, usually (but not always) at the cell surface, which 
bind the signal molecule. The binding activates the receptor, which in turn acti- 
vates one or more intracellular signaling pathways or systems. These systems 
depend on intracellular signaling proteins, which process the signal inside the 
receiving cell and distribute it to the appropriate intracellular targets. The tar- 
gets that lie at the end of signaling pathways are generally called effector proteins, 
which are altered in some way by the incoming signal and implement the appro- 
priate change in cell behavior. Depending on the signal and the type and state of 
the receiving cell, these effectors can be transcription regulators, ion channels, 
components of a metabolic pathway, or parts of the cytoskeleton (Figure 15-1). 

The fundamental features of cell signaling have been conserved throughout the 
evolution of the eukaryotes. In budding yeast, for example, the response to mating 
factor depends on cell-surface receptor proteins, intracellular GTP-binding pro- 
teins, and protein kinases that are clearly related to functionally similar proteins 
in animal cells. Through gene duplication and divergence, however, the signaling 
systems in animals have become much more elaborate than those in yeasts; the 
human genome, for example, contains more than 1500 genes that encode recep- 
tor proteins, and the number of different receptor proteins is further increased by 
alternative RNA splicing and post-translational modifications. 


Extracellular Signals Can Act Over Short or Long Distances 


Many extracellular signal molecules remain bound to the surface of the signal- 
ing cell and influence only cells that contact it (Figure 15-2A). Such contact- 
dependent signaling is especially important during development and in immune 
responses. Contact-dependent signaling during development can sometimes 
operate over relatively large distances if the communicating cells extend long thin 
processes to make contact with one another. 


Figure 15-1 A simple intracellular 
signaling pathway activated by an 
extracellular signal molecule. The signal 
molecule usually binds to a receptor 
protein that is embedded in the plasma 
membrane of the target cell. The receptor 
activates one or more intracellular signaling 
pathways, involving a series of signaling 
proteins. Finally, one or more of the 
intracellular signaling proteins alters the 
activity of effector proteins and thereby the 
behavior of the cell.

In most cases, however, signaling cells secrete signal molecules into the extracellular
fluid. Often, the secreted molecules are local mediators, which act only
on cells in the local environment of the signaling cell. This is called paracrine 
signaling (Figure 15-2B). Usually, the signaling and target cells in paracrine 
signaling are of different cell types, but cells may also produce signals that they 
themselves respond to: this is referred to as autocrine signaling. Cancer cells, for 
example, often produce extracellular signals that stimulate their own survival and 
proliferation. 

Large multicellular organisms like us need long-range signaling mechanisms 
to coordinate the behavior of cells in remote parts of the body. Thus, they have 
evolved cell types specialized for intercellular communication over large dis- 
tances. The most sophisticated of these are nerve cells, or neurons, which typically 
extend long, branching processes (axons) that enable them to contact target cells 
far away, where the processes terminate at the specialized sites of signal trans- 
mission known as chemical synapses. When a neuron is activated by stimuli from 
other nerve cells, it sends electrical impulses (action potentials) rapidly along its 
axon; when the impulse reaches the synapse at the end of the axon, it triggers 
secretion of a chemical signal that acts as a neurotransmitter. The tightly orga- 
nized structure of the synapse ensures that the neurotransmitter is delivered spe- 
cifically to receptors on the postsynaptic target cell (Figure 15-2C). The details of 
this synaptic signaling process are discussed in Chapter 11. 

A quite different strategy for signaling over long distances makes use of endo- 
crine cells, which secrete their signal molecules, called hormones, into the 
bloodstream. The blood carries the molecules far and wide, allowing them to act 
on target cells that may lie anywhere in the body (Figure 15-2D). 



Extracellular Signal Molecules Bind to Specific Receptors 


Cells in multicellular animals communicate by means of hundreds of kinds of 
extracellular signal molecules. These include proteins, small peptides, amino
acids, nucleotides, steroids, retinoids, fatty acid derivatives, and even dissolved
gases such as nitric oxide and carbon monoxide. Most of these signal molecules
are released into the extracellular space by exocytosis from the signaling cell, as
discussed in Chapter 13. Some, however, are emitted by diffusion through the
signaling cell’s plasma membrane, whereas others are displayed on the external
surface of the cell and remain attached to it, providing a signal to other cells only
when they make contact. Transmembrane signal proteins may operate in this
way, or their extracellular domains may be released from the signaling cell’s
surface by proteolytic cleavage and then act at a distance.

Figure 15-2 Four forms of intercellular signaling. (A) Contact-dependent signaling
requires cells to be in direct membrane-membrane contact. (B) Paracrine signaling
depends on local mediators that are released into the extracellular space and
act on neighboring cells. (C) Synaptic signaling is performed by neurons that
transmit signals electrically along their axons and release neurotransmitters at
synapses, which are often located far away from the neuronal cell body.
(D) Endocrine signaling depends on endocrine cells, which secrete hormones
into the bloodstream for distribution throughout the body. Many of the same
types of signaling molecules are used in paracrine, synaptic, and endocrine
signaling; the crucial differences lie in the speed and selectivity with which the
signals are delivered to their targets.

Regardless of the nature of the signal, the target cell responds by means of a 
receptor, which binds the signal molecule and then initiates a response in the 
target cell. The binding site of the receptor has a complex structure that is shaped 
to recognize the signal molecule with high specificity, helping to ensure that the 
receptor responds only to the appropriate signal and not to the many other signaling
molecules surrounding the cell. Many signal molecules act at very low concentrations
(typically < 10⁻⁸ M), and their receptors usually bind them with high
affinity (dissociation constant Kd < 10⁻⁸ M; see Figure 3-44).
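
To see why a low dissociation constant matters at such low signal concentrations, the fraction of receptors occupied at equilibrium can be estimated. The sketch below assumes the simplest possible case, which is not stated in the text: one ligand binding one site, with free ligand in large excess over receptor, so that the bound fraction is [L]/([L] + Kd). The function name is hypothetical.

```python
def fraction_bound(ligand_molar, kd_molar):
    """Equilibrium fraction of receptors occupied for simple one-site binding.

    Assumes free ligand concentration is not depleted by binding.
    """
    if ligand_molar < 0 or kd_molar <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_molar / (ligand_molar + kd_molar)


if __name__ == "__main__":
    kd = 1e-8  # illustrative Kd, in M
    for ligand in (1e-10, 1e-9, 1e-8, 1e-7):
        print(f"[L] = {ligand:.0e} M -> fraction bound = {fraction_bound(ligand, kd):.3f}")
```

Under these assumptions, half of the receptors are occupied when the signal concentration equals Kd, so a receptor with a Kd far above the prevailing signal concentration would remain almost entirely unoccupied.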

In most cases, receptors are transmembrane proteins on the target-cell sur- 
face. When these proteins bind an extracellular signal molecule (a ligand), they 
become activated and generate various intracellular signals that alter the behav- 
ior of the cell. In other cases, the receptor proteins are inside the target cell, and 
the signal molecule has to enter the cell to bind to them: this requires that the 
signal molecule be sufficiently small and hydrophobic to diffuse across the target 
cell’s plasma membrane (Figure 15-3). This chapter focuses primarily on signal- 
ing through cell-surface receptors, but we will briefly describe signaling through 
intracellular receptors later in the chapter. 


Each Cell Is Programmed to Respond to Specific Combinations of 
Extracellular Signals 


A typical cell in a multicellular organism is exposed to hundreds of different signal 
molecules in its environment. The molecules can be soluble, bound to the extra- 
cellular matrix, or bound to the surface of a neighboring cell; they can be stimula- 
tory or inhibitory; they can act in innumerable different combinations; and they 
can influence almost any aspect of cell behavior. The cell responds to this blizzard 
of signals selectively, in large part by expressing only those receptors and intracel- 
lular signaling systems that respond to the signals that are required for the regula- 
tion of that cell. 

Most cells respond to many different signals in the environment, and some 
of these signals may influence the response to other signals. One of the key chal- 
lenges in cell biology is to determine how a cell integrates all of this signaling 
information in order to make decisions—to divide, to move, to differentiate, and 
so on. Many cells, for example, require a specific combination of extracellular sur- 
vival factors to allow the cell to continue living; when deprived of these signals, 
the cell activates a suicide program and kills itself—usually by apoptosis, a form 
of programmed cell death, as discussed in Chapter 18. Cell proliferation often 
depends on a combination of signals that promote both cell division and survival, 
as well as signals that stimulate cell growth (Figure 15-4). On the other hand, dif- 
ferentiation into a nondividing state (called terminal differentiation) frequently 
requires a different combination of survival and differentiation signals that must 
override any signal to divide. 

In principle, the hundreds of signal molecules that an animal makes can 
be used in an almost unlimited number of combinations to control the diverse 
behaviors of its cells in highly specific ways. Relatively small numbers of types of 
signal molecules and receptors are sufficient. The complexity lies in the ways in 
which cells respond to the combinations of signals that they receive. 
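The combinatorial logic described above can be caricatured as a small decision function. This is a toy sketch, not a model of any real cell: the signal names and the order of the rules are assumptions, loosely following the scheme of survival, growth-and-division, and differentiation signals.

```python
def cell_response(signals: set[str]) -> str:
    """Toy rule set: the response depends on which signal classes are present."""
    # All signal names below are hypothetical placeholders.
    survival = {"survival_A", "survival_B", "survival_C"}
    proliferation = {"growth_D", "division_E"}
    differentiation = {"differentiation_F", "differentiation_G"}

    if not survival <= signals:
        return "apoptosis"        # deprived of the required survival signals
    if differentiation <= signals:
        return "differentiate"    # differentiation signals override division
    if proliferation <= signals:
        return "grow and divide"
    return "survive"


if __name__ == "__main__":
    base = {"survival_A", "survival_B", "survival_C"}
    print(cell_response(set()))
    print(cell_response(base))
    print(cell_response(base | {"growth_D", "division_E"}))
```

The point of the sketch is only that the same small vocabulary of signals yields different outcomes depending on the combination received.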

A signal molecule often has different effects on different types of target cells.
The neurotransmitter acetylcholine (Figure 15-5A), for example, decreases the
rate of action potential firing in heart pacemaker cells (Figure 15-5B) and
stimulates the production of saliva by salivary gland cells (Figure 15-5C), even
though the receptors are the same on both cell types. In skeletal muscle,
acetylcholine causes the cells to contract by binding to a different receptor
protein (Figure 15-5D). The different effects of acetylcholine in these cell
types result from differences in the intracellular signaling proteins, effector
proteins, and genes that are activated. Thus, an extracellular signal itself has
little information content; it simply induces the cell to respond according to
its predetermined state, which depends on the cell’s developmental history and
the specific genes it expresses.

Figure 15-3 The binding of extracellular signal molecules to either
cell-surface or intracellular receptors. (A) Most signal molecules are
hydrophilic and are therefore unable to cross the target cell’s plasma membrane
directly; instead, they bind to cell-surface receptors, which in turn generate
signals inside the target cell (see Figure 15-1). (B) Some small signal
molecules, by contrast, diffuse across the plasma membrane and bind to receptor
proteins inside the target cell, either in the cytosol or in the nucleus (as
shown here). Many of these small signal molecules are hydrophobic and poorly
soluble in aqueous solutions; they are therefore transported in the bloodstream
and other extracellular fluids bound to carrier proteins, from which they
dissociate before entering the target cell.

Figure 15-4 An animal cell’s dependence on multiple extracellular signal
molecules. Each cell type displays a set of receptors that enables it to respond
to a corresponding set of signal molecules produced by other cells. These signal
molecules work in various combinations to regulate the behavior of the cell. As
shown here, an individual cell often requires multiple signals to survive (blue
arrows) and additional signals to grow and divide (red arrows) or differentiate
(green arrows). If deprived of appropriate survival signals, a cell will undergo
a form of cell suicide known as apoptosis. The actual situation is even more
complex. Although not shown, some extracellular signal molecules act to inhibit
these and other cell behaviors, or even to induce apoptosis.

Figure 15-5 Various responses induced by the neurotransmitter acetylcholine.
(A) The chemical structure of acetylcholine. (B-D) Different cell types are
specialized to respond to acetylcholine in different ways. In some cases (B and
C), acetylcholine binds to similar receptor proteins (G-protein-coupled
receptors; see Figure 15-6), but the intracellular signals produced are
interpreted differently in cells specialized for different functions. In other
cases (D), the receptor protein is also different (here, an ion-channel-coupled
receptor; see Figure 15-6).


There Are Three Major Classes of Cell-Surface Receptor Proteins


Most extracellular signal molecules bind to specific receptor proteins on the sur- 
face of the target cells they influence and do not enter the cytosol or nucleus. 
These cell-surface receptors act as signal transducers by converting an extracel- 
lular ligand-binding event into intracellular signals that alter the behavior of the 
target cell. 

Most cell-surface receptor proteins belong to one of three classes, defined by 
their transduction mechanism. Ion-channel-coupled receptors, also known as 
transmitter-gated ion channels or ionotropic receptors, are involved in rapid synap- 
tic signaling between nerve cells and other electrically excitable target cells such 
as nerve and muscle cells (Figure 15-6A). This type of signaling is mediated by a 
small number of neurotransmitters that transiently open or close an ion channel 
formed by the protein to which they bind, briefly changing the ion permeability 
of the plasma membrane and thereby changing the excitability of the postsyn- 
aptic target cell. Most ion-channel-coupled receptors belong to a large family of 
homologous, multipass transmembrane proteins. Because they are discussed in 
detail in Chapter 11, we will not consider them further here. 

G-protein-coupled receptors act by indirectly regulating the activity of a
separate plasma-membrane-bound target protein, which is generally either an
enzyme or an ion channel. A trimeric GTP-binding protein (G protein) mediates
the interaction between the activated receptor and this target protein (Figure
15-6B). The activation of the target protein can change the concentration of one
or more small intracellular signaling molecules (if the target protein is an
enzyme), or it can change the ion permeability of the plasma membrane (if the
target protein is an ion channel). The small intracellular signaling molecules
act in turn to alter the behavior of yet other signaling proteins in the cell.

Figure 15-6 Three classes of cell-surface receptors. (A) Ion-channel-coupled
receptors (also called transmitter-gated ion channels),

Enzyme-coupled receptors either function as enzymes or associate directly 
with enzymes that they activate (Figure 15-6C). They are usually single-pass 
transmembrane proteins that have their ligand-binding site outside the cell and 
their catalytic or enzyme-binding site inside. Enzyme-coupled receptors are het- 
erogeneous in structure compared with the other two classes; the great majority, 
however, are either protein kinases or associate with protein kinases, which phos- 
phorylate specific sets of proteins in the target cell when activated. 

There are also some types of cell-surface receptors that do not fit easily into 
any of these classes but have important functions in controlling the specializa- 
tion of different cell types during development and in tissue renewal and repair in 
adults. We discuss these in a later section, after we explain how G-protein-coupled 
receptors and enzyme-coupled receptors operate. First, we continue our general 
discussion of the principles of signaling via cell-surface receptors. 


Cell-Surface Receptors Relay Signals Via Intracellular Signaling 
Molecules 


Numerous intracellular signaling molecules relay signals received by cell-surface 
receptors into the cell interior. The resulting chain of intracellular signaling events 
ultimately alters effector proteins that are responsible for modifying the behavior 
of the cell (see Figure 15-1). 

Some intracellular signaling molecules are small chemicals, which are often 
called second messengers (the “first messengers” being the extracellular signals). 
They are generated in large amounts in response to receptor activation and diffuse 
away from their source, spreading the signal to other parts of the cell. Some, such 
as cyclic AMP and Ca²⁺, are water-soluble and diffuse in the cytosol, while others,
such as diacylglycerol, are lipid-soluble and diffuse in the plane of the plasma
membrane. In either case, they pass the signal on by binding to and altering the 
behavior of selected signaling or effector proteins. 

Most intracellular signaling molecules are proteins, which help relay the sig- 
nal into the cell by either generating second messengers or activating the next 
signaling or effector protein in the pathway. Many of these proteins behave like 
molecular switches. When they receive a signal, they switch from an inactive to an 
active state, until another process switches them off, returning them to their inac- 
tive state. The switching off is just as important as the switching on. If a signaling 
pathway is to recover after transmitting a signal so that it can be ready to transmit 
another, every activated molecule in the pathway must return to its original, unac- 
tivated state. 

The largest class of molecular switches consists of proteins that are activated 
or inactivated by phosphorylation (discussed in Chapter 3). For these proteins, 
the switch is thrown in one direction by a protein kinase, which covalently adds 
one or more phosphate groups to specific amino acids on the signaling protein, 
and in the other direction by a protein phosphatase, which removes the phos- 
phate groups (Figure 15-7A). The activity of any protein regulated by phosphory- 
lation depends on the balance between the activities of the kinases that phos- 
phorylate it and of the phosphatases that dephosphorylate it. About 30-50% of 
human proteins contain covalently attached phosphate, and the human genome 
encodes about 520 protein kinases and about 150 protein phosphatases. A typical 
mammalian cell makes use of hundreds of distinct types of protein kinases at any 
moment. 
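The idea that activity depends on the balance between kinases and phosphatases can be made concrete with a minimal sketch. It assumes the simplest possible kinetics, in which phosphorylation and dephosphorylation are both first-order, so that the phosphorylated fraction f obeys df/dt = k_kin(1 − f) − k_phos·f. The rate constants and the function name are hypothetical.

```python
def phosphorylated_fraction(kinase_rate: float, phosphatase_rate: float) -> float:
    """Steady-state phosphorylated fraction for
    df/dt = kinase_rate * (1 - f) - phosphatase_rate * f.
    Setting df/dt = 0 gives f = kinase_rate / (kinase_rate + phosphatase_rate)."""
    total = kinase_rate + phosphatase_rate
    if kinase_rate < 0 or phosphatase_rate < 0 or total == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return kinase_rate / total


if __name__ == "__main__":
    # Hypothetical rate constants (per second), before and after kinase activation.
    print(phosphorylated_fraction(0.1, 1.0))   # phosphatase dominates
    print(phosphorylated_fraction(10.0, 1.0))  # kinase dominates
```

In this simplified picture, neither enzyme alone sets the state of the switch; only the ratio of the two activities matters.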

Protein kinases attach phosphate to the hydroxyl group of specific amino 
acids on the target protein. There are two main types of protein kinase. The great 
majority are serine/threonine kinases, which phosphorylate the hydroxyl groups 
of serines and threonines in their targets. Others are tyrosine kinases, which
phosphorylate proteins on tyrosines. The two types of protein kinase are closely
related members of a large family, differing primarily in the structure of their pro- 
tein substrate binding sites. 

Many intracellular signaling proteins controlled by phosphorylation are them- 
selves protein kinases, and these are often organized into kinase cascades. In such 
a cascade, one protein kinase, activated by phosphorylation, phosphorylates the 
next protein kinase in the sequence, and so on, relaying the signal onward and, in 
some cases, amplifying it or spreading it to other signaling pathways. 

The other important class of molecular switches consists of GTP-binding pro- 
teins (discussed in Chapter 3). These proteins switch between an “on” (actively 
signaling) state when GTP is bound and an “off” state when GDP is bound. In the 
“on” state, they usually have intrinsic GTPase activity and shut themselves off by 
hydrolyzing their bound GTP to GDP (Figure 15-7B). There are two major types 
of GTP-binding proteins. Large, trimeric GTP-binding proteins (also called G pro- 
teins) help relay signals from G-protein-coupled receptors that activate them (see 
Figure 15-6B). Small monomeric GTPases (also called monomeric GTP-binding 
proteins) help relay signals from many classes of cell-surface receptors. 

Specific regulatory proteins control both types of GTP-binding proteins. 
GTPase-activating proteins (GAPs) drive the proteins into an “off” state by 
increasing the rate of hydrolysis of bound GTP. Conversely, guanine nucleotide 
exchange factors (GEFs) activate GTP-binding proteins by promoting the release 
of bound GDP, which allows a new GTP to bind. In the case of trimeric G proteins, 
the activated receptor serves as the GEF. Figure 15-8 illustrates the regulation of
monomeric GTPases. 
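The GTP-binding switch and its two regulators can be sketched as a two-state machine. This is a deliberately minimal illustration: the class and method names are hypothetical, and the slow intrinsic GTPase activity is ignored, so that only GEF and GAP change the state.

```python
from enum import Enum


class NucleotideState(Enum):
    GDP_BOUND = "off"
    GTP_BOUND = "on"


class GtpBindingProtein:
    """Two-state switch; the intrinsic GTPase activity is not modeled."""

    def __init__(self) -> None:
        self.state = NucleotideState.GDP_BOUND

    def gef(self) -> None:
        """A GEF promotes release of bound GDP; GTP then binds (switch on)."""
        self.state = NucleotideState.GTP_BOUND

    def gap(self) -> None:
        """A GAP increases the rate of GTP hydrolysis to GDP (switch off)."""
        self.state = NucleotideState.GDP_BOUND

    @property
    def signaling(self) -> bool:
        return self.state is NucleotideState.GTP_BOUND


if __name__ == "__main__":
    protein = GtpBindingProtein()
    protein.gef()
    print(protein.state.value)  # on
    protein.gap()
    print(protein.state.value)  # off
```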

Not all molecular switches in signaling systems depend on phosphorylation 
or GTP binding. We see later that some signaling proteins are switched on or off 
by the binding of another signaling protein or a second messenger such as cyclic 
AMP or Ca²⁺, or by covalent modifications other than phosphorylation or
dephosphorylation, such as ubiquitylation (discussed in Chapter 3).

For simplicity, we often portray a signaling pathway as a series of activation 
steps (see Figure 15-1). It is important to note, however, that most signaling path- 
ways contain inhibitory steps, and a sequence of two inhibitory steps can have the 
same effect as one activating step (Figure 15-9). This double-negative activation 
is very common in signaling systems, as we will see when we describe specific 
pathways later in this chapter. 
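The equivalence of two inhibitory steps to one activating step can be written out as Boolean logic. The sketch below follows the simple system of a kinase, an inhibitor protein, and a transcription regulator; treating each component as simply on or off is an assumption made for illustration, and the function name is hypothetical.

```python
def gene_expression_on(upstream_signal: bool) -> bool:
    """Four-step pathway containing two sequential inhibitory steps."""
    kinase_active = upstream_signal          # signal activates the kinase
    inhibitor_bound = not kinase_active      # active kinase inhibits the inhibitor
    regulator_active = not inhibitor_bound   # bound inhibitor inhibits the regulator
    return regulator_active                  # active regulator turns on expression


if __name__ == "__main__":
    for signal in (False, True):
        print(signal, "->", gene_expression_on(signal))
```

Because the two negations cancel, the output simply follows the input, exactly as it would with a single activating step.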


Intracellular Signals Must Be Specific and Precise in a Noisy
Cytoplasm


Ideally, an activated intracellular signaling molecule should interact only with
the appropriate downstream targets, and, likewise, the targets should only be
activated by the appropriate upstream signal. In reality, however, intracellular
signaling molecules share the cytoplasm with a crowd of closely related signaling
molecules that control a diverse array of cellular processes. It is inevitable
that an occasional signaling molecule will bind or modify the wrong partner,
potentially creating unwanted cross-talk and interference between signaling
systems. How does a signal remain strong, precise, and specific under these
noisy conditions?

Figure 15-7 Two types of intracellular signaling proteins that act as molecular
switches. (A) A protein kinase covalently adds a phosphate from ATP to the
signaling protein, and a protein phosphatase removes the phosphate. Although not
shown, many signaling proteins are activated by dephosphorylation rather than by
phosphorylation. (B) A GTP-binding protein is induced to exchange its bound GDP
for GTP, which activates the protein; the protein then inactivates itself by
hydrolyzing its bound GTP to GDP.

The first line of defense comes from the high affinity and specificity of the 
interactions between signaling molecules and their correct partners compared to 
the relatively low affinity of the interactions between inappropriate partners. The 
binding of a signaling molecule to the correct target is determined by precise and 
complex interactions between complementary surfaces on the two molecules. 
Protein kinases, for example, contain active sites that recognize a specific amino 
acid sequence around the phosphorylation site on the correct target protein, and 
they often contain additional docking sites that promote a specific, high-affinity 
interaction with the target. These and related mechanisms help provide a strong 
and persistent interaction between the correct partners, reducing the likelihood 
of inappropriate interactions with other proteins. 

Another important way that cells avoid responses to unwanted background 
signals depends on the ability of many downstream target proteins to sim- 
ply ignore such signals. These proteins respond only when the upstream signal 
reaches a high concentration or activity level. Consider a signaling pathway in 
which a protein kinase activates some downstream target protein by phosphorylation.
If a response is triggered only when more than half of the target proteins are
phosphorylated, then there will be little harm done if a small number of them are 
occasionally phosphorylated by some inappropriate protein kinase. Furthermore, 
constitutively active protein phosphatases will further reduce the impact of back- 
ground phosphorylation by rapidly removing much of it. In these and other ways, 
intracellular signaling systems filter out noise, generating little or no response to 
low levels of stimuli. 
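The threshold argument above can be sketched in a few lines. The example uses the "more than half" criterion from the text; the molecule counts are hypothetical, and a real target would respond along a graded curve rather than with a sharp cutoff.

```python
def response_triggered(phosphorylated: int, total: int, threshold: float = 0.5) -> bool:
    """True only if the phosphorylated fraction of target proteins exceeds the threshold."""
    if total <= 0 or not 0 <= phosphorylated <= total:
        raise ValueError("need total > 0 and 0 <= phosphorylated <= total")
    return phosphorylated / total > threshold


if __name__ == "__main__":
    total_targets = 1000  # hypothetical number of target proteins per cell
    print(response_triggered(30, total_targets))   # background phosphorylation only
    print(response_triggered(800, total_targets))  # genuine upstream signal
```

A small amount of inappropriate phosphorylation therefore leaves the output unchanged, which is the sense in which the threshold filters out noise.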

Figure 15-8 The regulation of a monomeric GTPase. GTPase-activating proteins
(GAPs) inactivate the protein by stimulating it to hydrolyze its bound GTP to
GDP, which remains tightly bound to the inactivated GTPase. Guanine nucleotide
exchange factors (GEFs) activate the inactive protein by stimulating it to
release its GDP; because the concentration of GTP in the cytosol is 10 times
greater than the concentration of GDP, the protein rapidly binds GTP and is
thereby activated.

Figure 15-9 A sequence of two inhibitory signals produces a positive signal.
(A) In this simple signaling system, a transcription regulator is kept in an
inactive state by a bound inhibitor protein. In response to some upstream
signal, a protein kinase is activated and phosphorylates the inhibitor, causing
its dissociation from the transcription regulator and thereby activating gene
expression. (B) This signaling pathway can be diagrammed as a sequence of four
steps, including two sequential inhibitory steps that are equivalent to a single
activating step.

Cells in a population often exhibit random variation in the concentration or
activity of their intracellular signaling molecules. Similarly, individual molecules 
in a large population of molecules vary in their activity or interactions with other 
molecules. This signal variability introduces another form of noise that can inter- 
fere with the precision and efficiency of signaling. Most signaling systems, how- 
ever, are built to generate remarkably robust and precise responses even when 
upstream signals are variable or even when some components of the system 
are disabled. In many cases, this robustness depends on the presence of backup 
mechanisms: for example, a signal might employ two parallel pathways to acti- 
vate a single common downstream target protein, allowing the response to occur 
even if one pathway is crippled. 


Intracellular Signaling Complexes Form at Activated Receptors 


One simple and effective strategy for enhancing the specificity of interactions 
between signaling molecules is to localize them in the same part of the cell or 
even within large protein complexes, thereby ensuring that they interact only with 
each other and not with inappropriate partners. Such mechanisms often involve 
scaffold proteins, which bring together groups of interacting signaling proteins 
into signaling complexes, often before a signal has been received (Figure 15-10A). 
Because the scaffold holds the proteins in close proximity, they can interact at 
high local concentrations and be sequentially activated rapidly, efficiently, and 
selectively in response to an appropriate extracellular signal, avoiding unwanted 
cross-talk with other signaling pathways. 

In other cases, such signaling complexes form only transiently in response 
to an extracellular signal and rapidly disassemble when the signal is gone. They 
often assemble around a receptor after an extracellular signal molecule has acti- 
vated it. In many of these cases, the cytoplasmic tail of the activated receptor is 
phosphorylated during the activation process, and the phosphorylated amino 
acids then serve as docking sites for the assembly of other signaling proteins 
(Figure 15-10B). In yet other cases, receptor activation leads to the production 
of modified phospholipid molecules (called phosphoinositides) in the adjacent 
plasma membrane, which then recruit specific intracellular signaling proteins to 
this region of membrane, where they are activated (Figure 15-10C). 


Modular Interaction Domains Mediate Interactions Between 
Intracellular Signaling Proteins 


Simply bringing intracellular signaling proteins together into close proximity is 
sometimes sufficient to activate them. Thus, induced proximity, where a signal 
triggers assembly of a signaling complex, is commonly used to relay signals from 
protein to protein along a signaling pathway. The assembly of such signaling com- 
plexes depends on various highly conserved, small interaction domains, which 
are found in many intracellular signaling proteins. Each of these compact pro- 
tein modules binds to a particular structural motif in another protein or lipid. The 
recognized motif in the interacting protein can be a short peptide sequence, a 
covalent modification (such as a phosphorylated amino acid), or another protein 
domain. The use of modular interaction domains presumably facilitated the evo- 
lution of new signaling pathways; because it can be inserted at many locations in 
a protein without disturbing the protein’s folding or function, a new interaction 
domain added to an existing signaling protein could connect the protein to addi- 
tional signaling pathways. 

There are many types of interaction domains in signaling proteins. Src homol- 
ogy 2 (SH2) domains and phosphotyrosine-binding (PTB) domains, for example, 
bind to phosphorylated tyrosines in a particular peptide sequence on activated 
receptors or intracellular signaling proteins. Src homology 3 (SH3) domains bind 
to short, proline-rich amino acid sequences. Some pleckstrin homology (PH) 
domains bind to the charged head groups of specific phosphoinositides that are 
produced in the plasma membrane in response to an extracellular signal; they
enable the protein they are part of to dock on the membrane and interact with
other similarly recruited signaling proteins (see Figure 15-10C). Some signaling
proteins consist solely of two or more interaction domains and function only as
adaptors to link two other proteins together in a signaling pathway.


Interaction domains enable signaling proteins to bind to one another in multiple
specific combinations. Like Lego® bricks, the proteins can form linear or
branching chains or three-dimensional networks, which determine the route
followed by the signaling pathway. As an example, Figure 15-11 illustrates how
some interaction domains mediate the formation of a large signaling complex
around the receptor for the hormone insulin.
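The modular nature of interaction domains lends itself to a simple lookup table. The sketch below records the domain-to-motif pairings described above and asks whether a protein carrying a given set of domains could dock on a partner displaying a given set of motifs. The function name and the motif labels are hypothetical, and the table ignores the sequence context that real domains also require.

```python
# Domain -> motif recognized, following the pairings given in the text.
# Only some PH domains bind phosphoinositides; the table simplifies this.
DOMAIN_BINDS = {
    "SH2": "phosphotyrosine",
    "PTB": "phosphotyrosine",
    "SH3": "proline-rich sequence",
    "PH": "phosphoinositide",
}


def can_dock(domains: list[str], motifs: set[str]) -> bool:
    """True if any domain on the protein recognizes any motif on the partner."""
    return any(DOMAIN_BINDS.get(domain) in motifs for domain in domains)


if __name__ == "__main__":
    adaptor = ["SH2", "SH3", "SH3"]  # an adaptor built only from interaction domains
    print(can_dock(adaptor, {"phosphotyrosine"}))        # binds via its SH2 domain
    print(can_dock(adaptor, {"proline-rich sequence"}))  # binds via an SH3 domain
    print(can_dock(adaptor, {"phosphoinositide"}))       # no PH domain, no docking
```

An adaptor with one SH2 domain and two SH3 domains can thus bridge a phosphorylated protein and a proline-rich partner without having any catalytic activity of its own.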

Figure 15-10 Three types of intracellular signaling complexes. (A) A receptor
and some of the intracellular signaling proteins it activates in sequence are
preassembled into a signaling complex on the inactive receptor by a large
scaffold protein. (B) A signaling complex assembles transiently on a receptor
only after the binding of an extracellular signal molecule has activated the
receptor; here, the activated receptor phosphorylates itself at multiple sites,
which then act as docking sites for intracellular signaling proteins.
(C) Activation of a receptor leads to the increased phosphorylation of specific
phospholipids (phosphoinositides) in the adjacent plasma membrane; these then
serve as docking sites for specific intracellular signaling proteins, which can
now interact with each other.

Another way of bringing receptors and intracellular signaling proteins together
is to concentrate them in a specific region of the cell. An important example is the 
primary cilium that projects like an antenna from the surface of most vertebrate 
cells (discussed in Chapter 16). It is usually short and nonmotile and has micro- 
tubules in its core, and a number of surface receptors and signaling proteins are 
concentrated there. We shall see later that light and smell receptors are also highly 
concentrated in specialized cilia. 


The Relationship Between Signal and Response Varies in Different 
Signaling Pathways 


The function of an intracellular signaling system is to detect and measure a spe- 
cific stimulus in one location of a cell and then generate an appropriately timed 
and measured response at another location. The system accomplishes this task 
by sending information in the form of molecular “signals” from the sensor to the 
target, often through a series of intermediaries that do not simply pass the signal 
along but process it in various ways. Not all signaling systems work in precisely
the same way: each has evolved specialized behaviors that produce a response 
that is appropriate for the cell function that system controls. In the following para- 
graphs, we list some of these behaviors and describe how they vary in different 
systems, as a foundation for more detailed discussions later. 


1. Response timing varies dramatically in different signaling systems, accord- 
ing to the speed required for the response. In some cases, such as synaptic 
signaling (see Figure 15-2C), the response can occur within milliseconds. 
In other cases, as in the control of cell fate by morphogens during develop- 
ment, a full response can require hours or days. 


2. Sensitivity to extracellular signals can vary greatly. Hormones tend to act 
at very low concentrations on their distant target cells, which are therefore 
highly sensitive to low concentrations of signal. Neurotransmitters, on the 
other hand, operate at much higher concentrations at a synapse, reducing 
the need for high sensitivity in postsynaptic receptors. Sensitivity is often 
controlled by changes in the number or affinity of the receptors on the tar- 
get cell. A particularly important mechanism for increasing the sensitivity 
of a signaling system is signal amplification, whereby a small number of 
activated cell-surface receptors evoke a large intracellular response either 
by producing large amounts of a second messenger or by activating many 
copies of a downstream signaling protein. 


3. Dynamic range of a signaling system is related to its sensitivity. Some
systems, like those involved in simple developmental decisions, are responsive
over a narrow range of extracellular signal concentrations. Other systems,
like those controlling vision or the metabolic response to some hormones,
are highly responsive over a much broader range of signal strengths. We
will see that broad dynamic range is often achieved by adaptation mechanisms
that adjust the responsiveness of the system according to the prevailing
amount of signal.


Figure 15-11 A specific signaling complex formed using modular interaction
domains. This example is based on the insulin receptor, which is an
enzyme-coupled receptor (a receptor tyrosine kinase, discussed later). First,
the activated receptor phosphorylates itself on tyrosines, and one of the
phosphotyrosines then recruits a docking protein called insulin receptor
substrate-1 (IRS1) via a PTB domain of IRS1; the PH domain of IRS1 also binds to
specific phosphoinositides on the inner surface of the plasma membrane. Then,
the activated receptor phosphorylates IRS1 on tyrosines, and one of these
phosphotyrosines binds the SH2 domain of the adaptor protein Grb2. Next, Grb2
uses one of its two SH3 domains to bind to a proline-rich region of a protein
called Sos, which relays the signal downstream by acting as a GEF (see Figure
15-8) to activate a monomeric GTPase called Ras (not shown). Sos also binds to
phosphoinositides in the plasma membrane via its PH domain. Grb2 uses its other
SH3 domain to bind to a proline-rich sequence in a scaffold protein. The
scaffold protein binds several other signaling proteins, and the other
phosphorylated tyrosines on IRS1 recruit additional signaling proteins that have
SH2 domains (not shown).


4. Persistence of a response can vary greatly. A transient response of less than 
a second is appropriate in some synaptic responses, for example, while a 
prolonged or even permanent response is required in cell fate decisions 
during development. Numerous mechanisms, including positive feedback, 
can be used to alter the duration and reversibility of a response. 


5. Signal processing can convert a simple signal into a complex response. In 
many systems, for example, a gradual increase in an extracellular signal 
is converted into an abrupt, switchlike response. In other cases, a sim- 
ple input signal is converted into an oscillatory response, produced by a 
repeating series of transient intracellular signals. Feedback usually lies at 
the heart of biochemical switches and oscillators, as we describe later. 


6. Integration allows a response to be governed by multiple inputs. As dis- 
cussed earlier, for example, specific combinations of extracellular signals 
are generally required to stimulate complex cell behaviors such as cell sur- 
vival and proliferation (see Figure 15-4). The cell therefore has to integrate 
information coming from multiple signals, which often depends on intra- 
cellular coincidence detectors; these proteins are equivalent to AND gates 
in the microprocessor of a computer, in that they are only activated if they 
receive multiple converging signals (Figure 15-12). 


7. Coordination of multiple responses in one cell can be achieved by a single 
extracellular signal. Some extracellular signal molecules, for example, stim- 
ulate a cell to both grow and divide. This coordination generally depends 
on mechanisms for distributing a signal to multiple effectors, by creating 
branches in the signaling pathway. In some cases, the branching of signal- 
ing pathways can allow one signal to modulate the strength of a response to 
other signals. 

Given the complexity that arises from behaviors like signal integration, distri- 
bution, and feedback, it is clear that signaling systems rarely depend on a simple 
linear sequence of steps but are often more like a network, in which information 
flows not just forward but in multiple directions—and sometimes even backward. 
A major research challenge is to understand the nature of these networks and the 
response behaviors they can achieve. 
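The comparison of a coincidence detector to an AND gate, made in item 6 above, can be stated directly in code. The sketch treats a protein with two phosphorylation sites, each the target of a different pathway, as active only when both sites are modified. The function and parameter names are hypothetical.

```python
def coincidence_detector(site_a_phosphorylated: bool, site_b_phosphorylated: bool) -> bool:
    """Active only when both sites are phosphorylated (logical AND)."""
    return site_a_phosphorylated and site_b_phosphorylated


if __name__ == "__main__":
    # Print the full truth table for the two converging signals.
    for signal_a in (False, True):
        for signal_b in (False, True):
            active = coincidence_detector(signal_a, signal_b)
            print(f"A={signal_a!s:5} B={signal_b!s:5} -> active={active}")
```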


The Speed of a Response Depends on the Turnover of Signaling 
Molecules 


The speed of any signaling response depends on the nature of the intracellular 
signaling molecules that carry out the target cell’s response. When the response 
requires only changes in proteins already present in the cell, it can occur very rap- 
idly: an allosteric change in a neurotransmitter-gated ion channel (discussed in 
Chapter 11), for example, can alter the plasma membrane electrical potential in 
milliseconds, and responses that depend solely on protein phosphorylation can 
occur within seconds. When the response involves changes in gene expression 
and the synthesis of new proteins, however, it usually requires many minutes or 
hours, regardless of the mode of signal delivery (Figure 15-13). 

Figure 15-12 Signal integration. Extracellular signals A and B activate
different intracellular signaling pathways, each of which leads to the
phosphorylation of protein Y but at different sites on the protein. Protein Y is
activated only when both of these sites are phosphorylated, and therefore it
becomes active only when signals A and B are simultaneously present. Such
proteins are often called coincidence detectors.

It is natural to think of intracellular signaling systems in terms of the changes
produced when an extracellular signal is delivered. But it is just as important
to consider what happens when the signal is withdrawn. During development,
transient extracellular signals often produce lasting effects: they can trigger
a change in the cell’s development that persists indefinitely through cell
memory mechanisms, as we discuss later (and in Chapters 7 and 22). In most cases
in adult tissues, however, the response fades when a signal ceases. Often the
effect is transient because the signal exerts its effects by altering the
concentrations of intracellular molecules that are short-lived (unstable),
undergoing continual turnover. Thus, once the extracellular signal is gone, the
degradation of the old molecules quickly wipes out all traces of the signal’s
action. It follows that the speed with which a cell responds to signal removal
depends on the rate of destruction, or turnover, of the intracellular molecules
that the signal affects.

It is also true, although much less obvious, that this turnover rate can deter- 
mine the promptness of the response when an extracellular signal arrives. Con- 
sider, for example, two intracellular signaling molecules, X and Y, both of which 
are normally maintained at a steady-state concentration of 1000 molecules per 
cell. The cell synthesizes and degrades molecule Y at a rate of 100 molecules per 
second, with each molecule having an average lifetime of 10 seconds. Molecule X 
has a turnover rate that is 10 times slower than that of Y: it is both synthesized and 
degraded at a rate of 10 molecules per second, so that each molecule has an aver- 
age lifetime in the cell of 100 seconds. If a signal acting on the cell causes a tenfold 
increase in the synthesis rates of both X and Y with no change in the molecu- 
lar lifetimes, at the end of 1 second the concentration of Y will have increased 
by nearly 900 molecules per cell (10 × 100 − 100), while the concentration of X
will have increased by only 90 molecules per cell. In fact, after a molecule’s syn- 
thesis rate has been either increased or decreased abruptly, the time required for 
the molecule to shift halfway from its old to its new equilibrium concentration is 
equal to its half-life—that is, equal to the time that would be required for its con- 
centration to fall by half if all synthesis were stopped (Figure 15-14). 
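
The arithmetic above can be checked with a short calculation. The sketch below is only an illustration: it assumes simple first-order degradation (each molecule is destroyed with a constant probability per unit time), and the function names are hypothetical. Under that assumption the figure of nearly 900 molecules for Y is the estimate from the initial net rate; the exact exponential solution gives a somewhat smaller gain, because degradation speeds up as the concentration rises.

```python
import math

def concentration(t, old_rate, new_rate, lifetime):
    """Molecules per cell t seconds after the synthesis rate changes.

    Assumes first-order degradation: dN/dt = rate - N / lifetime.
    """
    n_old = old_rate * lifetime   # steady state before the change
    n_new = new_rate * lifetime   # steady state after the change
    return n_new + (n_old - n_new) * math.exp(-t / lifetime)

def half_life(lifetime):
    """Half-life of a molecule with the given average lifetime."""
    return lifetime * math.log(2)

# Molecule Y: 100 molecules/s, lifetime 10 s. Molecule X: 10 molecules/s, lifetime 100 s.
# In both cases the synthesis rate is increased tenfold at t = 0.
y_gain = concentration(1.0, 100, 1000, 10) - 1000
x_gain = concentration(1.0, 10, 100, 100) - 1000
print(f"after 1 s: Y gains {y_gain:.0f} molecules, X gains {x_gain:.0f}")

# After one half-life the concentration is midway between old and new steady states.
y_halfway = concentration(half_life(10), 100, 1000, 10)
print(f"Y after one half-life: {y_halfway:.0f} (midpoint of 1000 and 10,000 is 5500)")
```

In this model the molecule with the tenfold shorter lifetime also approaches its new steady state ten times faster, which is the point made in Figure 15-14.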

The same principles apply to proteins and small molecules, whether the mol- 
ecules are in the extracellular space or inside cells. Many intracellular proteins 
have short half-lives, some surviving for less than 10 minutes. In most cases, these 
are key regulatory proteins whose concentrations are rapidly controlled in the cell 
by changes in their rates of synthesis. 

As we have seen, many cell responses to extracellular signals depend on the 
conversion of intracellular signaling proteins from an inactive to an active form, 
rather than on their synthesis or degradation. Phosphorylation or the binding of 
GTP, for example, commonly activates signaling proteins. Even in these cases, 
however, the activation must be rapidly and continuously reversed (by dephos- 
phorylation or GTP hydrolysis to GDP, respectively, in these examples) to make 
rapid signaling possible. These inactivation processes play a crucial part in deter- 
mining the magnitude, rapidity, and duration of the response. 


Figure 15-13 Slow and rapid responses to an extracellular signal. Certain types of
signal-induced cellular responses, such as increased cell growth and division,
involve changes in gene expression and the synthesis of new proteins; they
therefore occur slowly, often starting an hour or more after the signal is received.
Other responses, such as changes in cell movement, secretion, or metabolism, need
not involve changes in gene transcription and therefore occur much more quickly,
often starting in seconds or minutes; they may involve the rapid phosphorylation of
effector proteins in the cytoplasm, for example. Synaptic responses mediated by
changes in membrane potential are even quicker and can occur in milliseconds (not
shown). Some signaling systems generate both rapid and slow responses as shown
here, allowing the cell to respond quickly to a signal while simultaneously
initiating a more long-term, persistent response.


Cells Can Respond Abruptly to a Gradually Increasing Signal


Some signaling systems are capable of generating a smoothly graded response 
over a wide range of extracellular signal concentrations (Figure 15-15, blue line); 
such systems are useful, for example, in the fine tuning of metabolic processes 
by some hormones. Other signaling systems generate significant responses only 
when the signal concentration rises beyond some threshold value. These abrupt 
responses are of two types. One is a sigmoidal response, in which low concentra- 
tions of stimulus do not have much effect, but then the response rises steeply and 
continuously at intermediate stimulus levels (Figure 15-15, red line). Such sys- 
tems provide a filter to reduce inappropriate responses to low-level background 
signals but respond with high sensitivity when the stimulus falls within a small 
range of physiological signal concentrations. A second type of abrupt response 
is the discontinuous or all-or-none response, in which the response switches 
on completely (and often irreversibly) when the signal reaches some threshold 
concentration (Figure 15-15, green line). Such responses are particularly useful 
for controlling the choice between two alternative cell states, and they generally 
involve positive feedback, as we describe in more detail shortly. 

Cells use a variety of molecular mechanisms to produce a sigmoidal response 
to increasing signal concentrations. In one mechanism, more than one intracel- 
lular signaling molecule must bind to its downstream target protein to induce a 
response. As we discuss later, for example, four molecules of the second messen- 
ger cyclic AMP must bind simultaneously to each molecule of cyclic-AMP-depen- 
dent protein kinase (PKA) to activate the kinase. A similar sharpening of response 
is seen when the activation of an intracellular signaling protein requires phos- 
phorylation at more than one site. Such responses become sharper as the number 
of required molecules or phosphate groups increases, and if the number is large 
enough, responses become almost all-or-none (Figure 15-16). 
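
The sharpening effect of requiring several bound molecules can be illustrated with a Hill-type curve. This is a minimal sketch, not the calculation behind Figure 15-16: it assumes that activation follows signal^n / (K^n + signal^n), where n stands in for the number of effector molecules that must bind together, and the names `activation` and `k_half` are hypothetical.

```python
def activation(signal, n, k_half=1.0):
    """Fraction of target protein that is active, using a Hill-type curve.

    n is the number of effector molecules that must bind together;
    k_half is the signal concentration that gives half-maximal activation.
    """
    return signal**n / (k_half**n + signal**n)

# Compare the response just below and just above the half-maximal concentration.
for n in (1, 2, 8, 16):
    low, high = activation(0.5, n), activation(2.0, n)
    print(f"n = {n:2d}: response at 0.5 x K = {low:.4f}, at 2 x K = {high:.4f}")
```

With n = 1 the response changes only modestly across this fourfold range of signal, whereas with n = 16 it goes from almost nothing to almost maximal, approaching the all-or-none behavior described in the text.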

Responses are also sharpened when an intracellular signaling molecule acti- 
vates one enzyme and also inhibits another enzyme that catalyzes the oppo- 
site reaction. A well-studied example of this common type of regulation is the 
stimulation of glycogen breakdown in skeletal muscle cells induced by the
hormone adrenaline (epinephrine). Adrenaline’s binding to a G-protein-coupled
cell-surface receptor increases the intracellular concentration of cyclic AMP,
which both activates an enzyme that promotes glycogen breakdown and inhibits
an enzyme that promotes glycogen synthesis.

Figure 15-14 The importance of rapid turnover. The graphs show the predicted
relative rates of change in the intracellular concentrations of molecules with
differing turnover times when their synthesis rates are either (A) decreased or
(B) increased suddenly by a factor of 10. In both cases, the concentrations of those
molecules that are normally degraded rapidly in the cell (red lines) change quickly,
whereas the concentrations of those that are normally degraded slowly (green lines)
change proportionally more slowly. The numbers (in blue) on the right are the
half-lives assumed for each of the different molecules.

Figure 15-15 Signal processing can produce smoothly graded or switchlike
responses. Some cell responses increase gradually as the concentration of
extracellular signal molecule increases, eventually reaching a plateau as the
signaling pathway is saturated, resulting in a hyperbolic response curve (blue line).
In other cases, the signaling system reduces the response at low signal
concentrations and then produces a steeper response at some intermediate signal
concentration, resulting in a sigmoidal response curve (red line). In still other
cases, the response is more abrupt and switchlike; the cell switches completely
between a low and high response, without any stable intermediate response (green
line).


Positive Feedback Can Generate an All-or-None Response 


Like intracellular metabolic pathways (discussed in Chapter 2) and the systems 
controlling gene activity (Chapter 7), most intracellular signaling systems incor- 
porate feedback loops, in which the output of a process acts back to regulate that 
same process. We discussed the mathematical analysis of feedback loops in Chap- 
ter 8. In positive feedback, the output stimulates its own production; in negative 
feedback, the output inhibits its own production (Figure 15-17). Feedback loops 
are of great general importance in biology, and they regulate many chemical and 
physical processes in cells. Those that regulate cell signaling can either operate 
exclusively within the target cell or involve the secretion of extracellular signals. 
Here, we focus on those feedback loops that operate entirely within the target cell; 
even the simplest of these loops can produce complex and interesting effects.

Positive feedback in a signaling pathway can transform the behavior of the 
responding cell. If the positive feedback is of only moderate strength, its effect will 
be simply to steepen the response to the signal, generating a sigmoidal response 
like those described earlier; but if the feedback is strong enough, it can produce 
an all-or-none response (see Figure 15-15). This response goes hand in hand with 
a further property: once the responding system has switched to the high level of 
activation, this condition is often self-sustaining and can persist even after the
signal strength drops back below its critical value. In such a case, the system is
said to be bistable: it can exist in either a “switched-off” or a “switched-on” state,
and a transient stimulus can flip it from the one state to the other (Figure 15-18A
and B).

Figure 15-16 Activation curves for an allosteric protein as a function of effector
molecule concentration. The curves show how the sharpness of the activation
response increases with an increase in the number of allosteric effector molecules
that must be bound simultaneously to activate the target protein. The curves shown
are those expected, under certain conditions, if the activation requires the
simultaneous binding of 1, 2, 8, or 16 effector molecules.
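
A small simulation can make the idea of bistability concrete. The sketch below is a toy model under stated assumptions, not the computation used for Figure 15-18: the active fraction E of a kinase is driven by a transient stimulus, is inactivated at a constant rate, and (when feedback is present) promotes its own activation through a cooperative term. The equation and every parameter value are hypothetical choices made only to show the qualitative behavior.

```python
def simulate(feedback, t_end=100.0, dt=0.01):
    """Euler integration of the active fraction E of a kinase.

    dE/dt = (stimulus + feedback * E^2 / (K^2 + E^2)) * (1 - E) - decay * E
    All parameter values are illustrative assumptions.
    """
    K, decay = 0.5, 0.2
    e, trace = 0.0, []
    for step in range(round(t_end / dt)):
        t = step * dt
        stimulus = 0.5 if 5.0 <= t < 15.0 else 0.0   # transient input signal
        activation = stimulus + feedback * e**2 / (K**2 + e**2)
        e += dt * (activation * (1.0 - e) - decay * e)
        trace.append(e)
    return trace

without_fb = simulate(feedback=0.0)
with_fb = simulate(feedback=1.0)
print(f"final activity without feedback: {without_fb[-1]:.3f}")
print(f"final activity with feedback:    {with_fb[-1]:.3f}")
```

Without feedback, the activity rises while the stimulus is present and then decays back toward zero. With feedback, the same transient stimulus leaves the system in a stable high-activity state long after the stimulus has gone.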

Through positive feedback, a transient extracellular signal can induce long- 
term changes in cells and their progeny that can persist for the lifetime of the 
organism. The signals that trigger muscle-cell specification, for example, turn on 
the transcription of a series of genes that encode muscle-specific transcription 
regulatory proteins, which stimulate the transcription of their own genes, as well 
as genes encoding various other muscle-cell proteins; in this way, the decision 
to become a muscle cell is made permanent. This type of cell memory, which 
depends on positive feedback, is one of the basic ways in which a cell can undergo 
a lasting change of character without any alteration in its DNA sequence. 

Studies of signaling responses in large populations of cells can give the false 
impression that a response is smoothly graded, even when strong positive feed- 
back is causing an abrupt, discontinuous switch in the response in individual cells. 
Only by studying the response in single cells is it possible to see its all-or-none 
character (Figure 15-19). The misleading smooth response in a cell population
is due to the random, intrinsic variability in signaling systems that we described
earlier: not all cells in a population respond identically to the same concentration
of extracellular signal, especially at intermediate signal concentrations where
the receptor is only partially occupied.
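
The way in which population averaging disguises an all-or-none response can be sketched numerically. In the toy model below, each cell switches fully on when the signal exceeds its own threshold, and the thresholds vary from cell to cell; the log-normal distribution, its parameters, and the function name are assumptions chosen for illustration, not measured values.

```python
import random

def fraction_responding(signal, thresholds):
    """Population-average response when every cell is strictly on (1) or off (0)."""
    responses = [1.0 if signal >= th else 0.0 for th in thresholds]
    return sum(responses) / len(responses)

rng = random.Random(0)  # fixed seed so that the sketch is reproducible
# Hypothetical cell-to-cell variability: log-normally distributed thresholds.
thresholds = [rng.lognormvariate(0.0, 0.5) for _ in range(1000)]

for signal in (0.25, 0.5, 1.0, 2.0, 4.0):
    average = fraction_responding(signal, thresholds)
    print(f"signal {signal:4.2f}: population-average response {average:.2f}")
```

Although no individual cell in this model ever has an intermediate response, the population average rises smoothly with signal concentration, just as a biochemical measurement on pooled cells would.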


Negative Feedback is a Common Motif in Signaling Systems 


By contrast with positive feedback, negative feedback counteracts the effect of a 
stimulus and thereby abbreviates and limits the level of the response, making the 
system less sensitive to perturbations (see Chapter 8). As with positive feedback, 
however, qualitatively different responses can be obtained when the feedback
operates more powerfully. A delayed negative feedback with a long enough delay
can produce responses that oscillate. The oscillations may persist for as long as the
stimulus is present (Figure 15-18C and D) or they may even be generated
spontaneously, without need of an external signal to drive them. Many such
oscillators also contain positive feedback loops that generate sharper oscillations.
Later in this chapter, we will encounter specific examples of oscillatory behavior in
the intracellular responses to extracellular signals; all of them depend on negative
feedback, generally accompanied by positive feedback.

Figure 15-17 Positive and negative feedback. In these simple examples, a stimulus
activates protein A, which, in turn, activates protein B. Protein B then acts back to
either increase or decrease the activity of A.

Figure 15-18 Some effects of simple feedback. The graphs show the computed
effects of simple positive and negative feedback loops (see Chapter 8). In each
case, the input signal is an activated protein kinase (S) that phosphorylates and
thereby activates another protein kinase (E); a protein phosphatase (I)
dephosphorylates and inactivates the activated E kinase. In the graphs, the red line
indicates the activity of the E kinase over time; the underlying blue bar indicates
the time for which the input signal (activated S kinase) is present. (A) Diagram of
the positive feedback loop, in which the activated E kinase acts back to promote its
own phosphorylation and activation; the basal activity of the I phosphatase
dephosphorylates activated E at a steady, low rate. (B) The top graph shows that,
without feedback, the activity of the E kinase is simply proportional (with a short
lag) to the level of stimulation by the S kinase. The bottom graph shows that, with
the positive feedback loop, the transient stimulation by S kinase switches the
system from an “off” state to an “on” state, which then persists after the stimulus
has been removed. (C) Diagram of the negative feedback loop, in which the
activated E kinase phosphorylates and activates the I phosphatase, thereby
increasing the rate at which the phosphatase dephosphorylates and inactivates the
phosphorylated E kinase. (D) The top graph shows, again, the response in E kinase
activity without feedback. The other graphs show the effects on E kinase activity of
negative feedback operating after a short or long delay. With a short delay, the
system shows a strong, brief response when the signal is abruptly changed, and the
feedback then drives the response back down to a lower level. With a long delay,
the feedback produces sustained oscillations for as long as the stimulus is present.

If negative feedback operates with a short delay, the system behaves like a 
change detector. It gives a strong response to a stimulus, but the response rap- 
idly decays even while the stimulus persists; if the stimulus is suddenly increased, 
however, the system responds strongly again, but, again, the response rapidly 
decays. This is the phenomenon of adaptation, which we now discuss. 
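
The change-detector behavior produced by negative feedback with a short delay can be sketched with two coupled rate equations. This is a toy model under stated assumptions, not the model behind Figure 15-18D: a response R is driven by the stimulus S and is inactivated by an inhibitory activity F, and F in turn follows R with a lag. The equations, the variable names, and all parameter values are hypothetical.

```python
def simulate(t_end=60.0, dt=0.001, delay=2.0):
    """Euler integration of a response R held in check by a lagging inhibitor F.

    dR/dt = S - (0.1 + F) * R      (F speeds up the inactivation of R)
    dF/dt = (5 * R - F) / delay    (F follows R after a short delay)
    """
    r, f, trace = 0.0, 0.0, []
    for step in range(round(t_end / dt)):
        t = step * dt
        s = 1.0 if t < 30.0 else 2.0   # stimulus switched on at t = 0, doubled at t = 30
        dr = s - (0.1 + f) * r
        df = (5.0 * r - f) / delay
        r, f = r + dt * dr, f + dt * df
        trace.append(r)
    return trace

trace = simulate()
half = len(trace) // 2
print(f"first step:  peak {max(trace[:half]):.2f}, adapted level {trace[half - 1]:.2f}")
print(f"second step: peak {max(trace[half:]):.2f}, adapted level {trace[-1]:.2f}")
```

In this sketch each abrupt increase in the stimulus produces a brief peak, after which the response settles to a lower level even though the stimulus persists. The adaptation here is only partial; the degree of adaptation in a real pathway depends on its molecular details.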


Cells Can Adjust Their Sensitivity to a Signal 


In responding to many types of stimuli, cells and organisms are able to detect the 
same percentage of change in a signal over a wide range of stimulus strengths. 
The target cells accomplish this through a reversible process of adaptation, or 
desensitization, whereby a prolonged exposure to a stimulus decreases the cells’ 
response to that level of stimulus. In chemical signaling, adaptation enables cells 
to respond to changes in the concentration of an extracellular signal molecule 
(rather than to the absolute concentration of the signal) over a very wide range 
of signal concentrations. The underlying mechanism is negative feedback that 
operates with a short delay: a strong response modifies the signaling machinery 
involved, such that the machinery resets itself to become less responsive to the 
same level of signal (see Figure 15-18D, middle graph). Owing to the delay, how- 
ever, a sudden increase in the signal is able to stimulate the cell again for a short 
period before the negative feedback has time to kick in. 

Adaptation to a signal molecule can occur in various ways. It can result from 
inactivation of the receptors themselves. The binding of signal molecules to 
cell-surface receptors, for example, may induce the endocytosis and temporary 
sequestration of the receptors in endosomes. In some cases, such signal-induced 
receptor endocytosis leads to the destruction of the receptors in lysosomes, a pro- 
cess referred to as receptor down-regulation (in other cases, however, activated 
receptors continue to signal after they have been endocytosed). Receptors can 
also become inactivated on the cell surface—for example, by becoming phos- 
phorylated—with a short delay following their activation. Adaptation can also 
occur at sites downstream of the receptors, either by a change in intracellular 
signaling proteins involved in transducing the extracellular signal or by the pro- 
duction of an inhibitor protein that blocks the signal transduction process. These 
various adaptation mechanisms are compared in Figure 15-20. 

Figure 15-19 The importance of examining individual cells to detect all-or-none
responses to increasing concentrations of an extracellular signal. In these
experiments, immature frog eggs (oocytes) were stimulated with increasing
concentrations of the hormone progesterone. The response was assessed by
analyzing the activation of MAP kinase (discussed later), which is one of the protein
kinases activated by phosphorylation in the response. The amount of
phosphorylated (activated) MAP kinase in extracts of the oocytes was assessed
biochemically. In (A), extracts of populations of stimulated oocytes were analyzed,
and the activation of MAP kinase appeared to increase progressively with
increasing progesterone concentration. There are two possible ways of explaining
this result: (B) MAP kinase could have increased gradually in each individual cell
with increasing progesterone concentration; or (C) individual cells could have
responded in an all-or-none way, with the gradual increase in total MAP kinase
activation reflecting the increasing number of cells responding with increasing
progesterone concentration. When extracts of individual oocytes were analyzed, it
was found that cells had either very low amounts or very high amounts, but not
intermediate amounts, of the activated kinase, indicating that the response was
essentially all-or-none at the level of individual cells, as diagrammed in (C).
Subsequent studies revealed that this all-or-none response is due in part to strong
positive feedback in the progesterone signaling system. (Adapted from J.E. Ferrell
and E.M. Machleder, Science 280:895-898, 1998. With permission from AAAS.)

Figure 15-20 Some ways in which target cells can become adapted (desensitized)
to an extracellular signal molecule. The mechanisms shown here that operate at the
level of the receptor often involve phosphorylation or ubiquitylation of the
receptor proteins.

Though bewildering in their complexity, the multiple cross-regulatory signaling
pathways and feedback loops that we describe in this chapter are not just a
haphazard tangle, but a highly evolved system for processing and interpreting
the vast number of signals that impinge upon animal cells. The whole molecular
control network, leading from the receptors at the cell surface to the genes in
the nucleus, can be viewed as a computing device; and, like that other biological 
computing device, the brain, it presents one of the hardest problems in biology. 
We can identify the components and discover how they work individually. We can 
understand how small subsets of components work together as regulatory mod- 
ules, noise filters, or adaptation mechanisms, as we have seen. However, it is a 
much more difficult task to understand how the system works as a whole. This 
is not only because the system is complex; it is also because the way in which it 
behaves is strongly dependent on the quantitative details of the molecular inter- 
actions, and, for most animal cells, we have only rough qualitative information. A 
major challenge for the future of signaling research is to develop more sophisti- 
cated quantitative and computational methods for the analysis of signaling sys- 
tems, as described in Chapter 8. 


Summary 


Each cell in a multicellular animal is programmed to respond to a specific set of 
extracellular signal molecules produced by other cells. The signal molecules act by 
binding to a complementary set of receptor proteins expressed by the target cells. 
Most extracellular signal molecules activate cell-surface receptor proteins, which 
act as signal transducers, converting the extracellular signal into intracellular ones 
that alter the behavior of the target cell. Activated receptors relay the signal into 
the cell interior by activating intracellular signaling proteins. Some of these signal- 
ing proteins transduce, amplify, or spread the signal as they relay it, while others 
integrate signals from different signaling pathways. Some function as switches that 
are transiently activated by phosphorylation or GTP binding. Large signaling com- 
plexes form by means of modular interaction domains in the signaling proteins, 
which allow the proteins to form functional signaling networks. 

Target cells use various mechanisms, including feedback loops, to adjust the 
ways in which they respond to extracellular signals. Positive feedback loops can 
help cells to respond in an all-or-none fashion to a gradually increasing concen- 
tration of an extracellular signal and to convert a short-lasting signal into a long- 
lasting, or even irreversible, response. Negative feedback allows cells to adapt to a 
signal molecule, which enables them to respond to small changes in the concentra- 
tion of the signal molecule over a large concentration range. 


SIGNALING THROUGH G-PROTEIN-COUPLED RECEPTORS


G-protein-coupled receptors (GPCRs) form the largest family of cell-surface 
receptors, and they mediate most responses to signals from the external world, as 
well as signals from other cells, including hormones, neurotransmitters, and local 
mediators. Our senses of sight, smell, and taste depend on them. There are more 
than 800 GPCRs in humans, and in mice there are about 1000 concerned with 
the sense of smell alone. The signal molecules that act on GPCRs are as varied in 
structure as they are in function and include proteins and small peptides, as well 
as derivatives of amino acids and fatty acids, not to mention photons of light and 
all the molecules that we can smell or taste. The same signal molecule can activate 
many different GPCR family members; for example, adrenaline activates at least 
9 distinct GPCRs, acetylcholine another 5, and the neurotransmitter serotonin at 
least 14. The different receptors for the same signal are usually expressed in differ- 
ent cell types and elicit different responses. 

Despite the chemical and functional diversity of the signal molecules that acti- 
vate them, all GPCRs have a similar structure. They consist of a single polypeptide 
chain that threads back and forth across the lipid bilayer seven times, forming a 
cylindrical structure, often with a deep ligand-binding site at its center (Figure 
15-21). In addition to their characteristic orientation in the plasma membrane, 
they all use G proteins to relay the signal into the cell interior. 

The GPCR superfamily includes rhodopsin, the light-activated protein in the 
vertebrate eye, as well as the large number of olfactory receptors in the vertebrate 
nose. Other family members are found in unicellular organisms: the receptors in 
yeasts that recognize secreted mating factors are an example. It is likely that the 
GPCRs that mediate cell-cell signaling in multicellular organisms evolved from 
the sensory receptors in their unicellular eukaryotic ancestors. 

It is remarkable that almost half of all known drugs work through GPCRs or the 
signaling pathways GPCRs activate. Of the many hundreds of genes in the human 
genome that encode GPCRs, about 150 encode orphan receptors, for which the 
ligand is unknown. Many of them are likely targets for new drugs that remain to 
be discovered. 


Trimeric G Proteins Relay Signals From GPCRs 


When an extracellular signal molecule binds to a GPCR, the receptor undergoes 
a conformational change that enables it to activate a trimeric GTP-binding pro- 
tein (G protein), which couples the receptor to enzymes or ion channels in the 
membrane. In some cases, the G protein is physically associated with the recep- 
tor before the receptor is activated, whereas in others it binds only after receptor 
activation. There are various types of G proteins, each specific for a particular set 
of GPCRs and for a particular set of target proteins in the plasma membrane. They 
all have a similar structure, however, and operate similarly. 

G proteins are composed of three protein subunits: α, β, and γ. In the unstimulated
state, the α subunit has GDP bound and the G protein is inactive (Figure 15-22).
When a GPCR is activated, it acts like a guanine nucleotide exchange factor (GEF)
and induces the α subunit to release its bound GDP, allowing GTP to bind in its
place. GTP binding then causes an activating conformational change in the Gα
subunit, releasing the G protein from the receptor and triggering dissociation of
the GTP-bound Gα subunit from the Gβγ pair, both of which then interact with
various targets, such as enzymes and ion channels in the plasma membrane,
which relay the signal onward (Figure 15-23).

The α subunit is a GTPase and becomes inactive when it hydrolyzes its bound
GTP to GDP. The time required for GTP hydrolysis is usually short because the
GTPase activity is greatly enhanced by the binding of the α subunit to a second
protein, which can be either the target protein or a specific regulator of G protein
signaling (RGS). RGS proteins act as α-subunit-specific GTPase-activating
proteins (GAPs) (see Figure 15-8), and they help shut off G-protein-mediated
responses in all eukaryotes. There are about 25 RGS proteins encoded in the
human genome, each of which interacts with a particular set of G proteins.
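
The effect of accelerating GTP hydrolysis can be sketched with a two-state view of the G-protein cycle. This is a deliberately simplified illustration: it treats the α subunit as either GDP-bound or GTP-bound, with one rate constant for nucleotide exchange and one for hydrolysis. The function names and the rate constants are hypothetical and are not measured values.

```python
import math

def active_fraction(k_exchange, k_hydrolysis):
    """Steady-state fraction of α subunits that carry GTP.

    Two-state cycle: GDP-bound -> GTP-bound at k_exchange (the receptor acting
    as a GEF), GTP-bound -> GDP-bound at k_hydrolysis (the intrinsic GTPase,
    optionally accelerated by an RGS protein). Rates are per second.
    """
    return k_exchange / (k_exchange + k_hydrolysis)

def shutoff_half_time(k_hydrolysis):
    """Time for half of the active subunits to inactivate once exchange stops."""
    return math.log(2) / k_hydrolysis

# Hypothetical rate constants chosen only to illustrate the trend.
slow, fast = 0.05, 5.0   # hydrolysis without and with a GAP such as an RGS protein
print(f"half-time to switch off: {shutoff_half_time(slow):.1f} s "
      f"vs {shutoff_half_time(fast):.2f} s")
print(f"active fraction with receptor on (k_exchange = 1.0): "
      f"{active_fraction(1.0, slow):.2f} vs {active_fraction(1.0, fast):.2f}")
```

In this picture a faster hydrolysis rate shortens the time needed to switch the response off, at the cost of a lower steady-state level of active α subunit for a given rate of exchange.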
Figure 15-21 A G-protein-coupled receptor (GPCR). (A) GPCRs that bind small
ligands such as adrenaline have small extracellular domains, and the ligand usually
binds deep within the plane of the membrane to a site that is formed by amino
acids from several transmembrane segments. GPCRs that bind protein ligands have
a large extracellular domain (not shown here) that contributes to ligand binding.
(B) The structure of the β2-adrenergic receptor, a receptor for the neurotransmitter
adrenaline, illustrates the typical cylindrical arrangement of the seven
transmembrane helices in a GPCR. The ligand (orange) binds in a pocket between
the helices, resulting in conformational changes on the cytoplasmic surface of the
receptor that promote G-protein activation (not shown). (PDB code: 3P0G.)


Some G Proteins Regulate the Production of Cyclic AMP


Cyclic AMP (cAMP) acts as a second messenger in some signaling pathways. 
An extracellular signal can increase cAMP concentration more than twentyfold 
in seconds (Figure 15-24). As explained earlier (see Figure 15-14), such a rapid 
response requires balancing a rapid synthesis of the molecule with its rapid 
breakdown or removal. Cyclic AMP is synthesized from ATP by an enzyme called
adenylyl cyclase, and it is rapidly and continuously destroyed by cyclic AMP
phosphodiesterases (Figure 15-25). Adenylyl cyclase is a large, multipass
transmembrane protein with its catalytic domain on the cytosolic side of the plasma
membrane. There are at least eight isoforms in mammals, most of which are
regulated by both G proteins and Ca²⁺.

Figure 15-22 The structure of an inactive G protein. (A) Note that both the α and
the γ subunits have covalently attached lipid molecules (red tails) that help bind
them to the plasma membrane, and the α subunit has GDP bound. (B) The
three-dimensional structure of the inactive, GDP-bound form of a G protein called
Gs, which interacts with numerous GPCRs, including the β2-adrenergic receptor
shown in Figures 15-21 and 15-23. The α subunit contains the GTPase domain and
binds to one side of the β subunit. The γ subunit binds to the opposite side of the
β subunit, and the β and γ subunits together form a single functional unit. The
GTPase domain of the α subunit contains two major subdomains: the “Ras”
domain, which is related to other GTPases and provides one face of the
nucleotide-binding pocket; and the alpha-helical or “AH” domain, which clamps
the nucleotide in place. (B, based on D.G. Lambright et al., Nature 379:311-319,
1996. With permission from Macmillan Publishers Ltd.)

Figure 15-23 Activation of a G protein by an activated GPCR. Binding of an
extracellular signal molecule to a GPCR changes the conformation of the receptor,
which allows the receptor to bind and alter the conformation of a trimeric
G protein. The AH domain of the G protein α subunit moves outward to open the
nucleotide-binding site, thereby promoting dissociation of GDP. GTP binding then
promotes closure of the nucleotide-binding site, triggering conformational changes
that cause dissociation of the α subunit from the receptor and from the βγ complex.
The GTP-bound α subunit and the βγ complex each regulate the activities of
downstream signaling molecules (not shown). The receptor stays active while the
extracellular signal molecule is bound to it, and it can therefore catalyze the
activation of many G-protein molecules (Movie 15.1).

[Figure 15-24 panels: (A) time 0 sec; (B) time 20 sec, + serotonin. Scale bar, 20 µm.]

Many extracellular signals work by increasing cAMP concentrations inside
the cell. These signals activate GPCRs that are coupled to a stimulatory G protein
(Gs). The activated α subunit of Gs binds and thereby activates adenylyl cyclase.
Other extracellular signals, acting through different GPCRs, reduce cAMP levels
by activating an inhibitory G protein (Gi), which then inhibits adenylyl cyclase.

Both Gs and Gi are targets for medically important bacterial toxins. Cholera
toxin, which is produced by the bacterium that causes cholera, is an enzyme that
catalyzes the transfer of ADP ribose from intracellular NAD⁺ to the α subunit of
Gs. This ADP ribosylation alters the α subunit so that it can no longer hydrolyze its
bound GTP, causing it to remain in an active state that stimulates adenylyl cyclase
indefinitely. The resulting prolonged elevation in cAMP concentration within
intestinal epithelial cells causes a large efflux of Cl⁻ and water into the gut, thereby
causing the severe diarrhea that characterizes cholera. Pertussis toxin, which is
made by the bacterium that causes pertussis (whooping cough), catalyzes the
ADP ribosylation of the α subunit of Gi, preventing the protein from interacting
with receptors; as a result, the G protein remains in the inactive GDP-bound state
and is unable to regulate its target proteins. These two toxins are widely used in
experiments to determine whether a cell’s GPCR-dependent response to a signal
is mediated by Gs or by Gi.

Some of the responses mediated by a Gs-stimulated increase in cAMP concentration
are listed in Table 15-1. As the table shows, different cell types respond
differently to an increase in cAMP concentration. Some cell types, such as fat cells, 
activate adenylyl cyclase in response to multiple hormones, all of which thereby 
stimulate the breakdown of triglyceride (the storage form of fat) to fatty acids. 
Individuals with genetic defects in the Gs α subunit show decreased responses to
certain hormones, resulting in metabolic abnormalities, abnormal bone develop- 
ment, and mental retardation. 


Cyclic-AMP-Dependent Protein Kinase (PKA) Mediates Most of 
the Effects of Cyclic AMP 


In most animal cells, cAMP exerts its effects mainly by activating cyclic-AMP- 
dependent protein kinase (PKA). This kinase phosphorylates specific serines or 


Figure 15-25 The synthesis and degradation of cyclic AMP. In a reaction 
catalyzed by the enzyme adenylyl cyclase, cyclic AMP (cAMP) is synthesized 
from ATP through a cyclization reaction that removes two phosphate
groups as pyrophosphate (PPi); a pyrophosphatase drives this synthesis by
hydrolyzing the released pyrophosphate to phosphate (not shown). Cyclic
AMP is short-lived (unstable) in the cell because it is hydrolyzed by specific
phosphodiesterases to form 5'-AMP, as indicated.


Figure 15-24 An increase in cyclic AMP 
in response to an extracellular signal. 
This nerve cell in culture is responding to 
the neurotransmitter serotonin, which acts 
through a GPCR to cause a rapid rise in 
the intracellular concentration of cyclic 
AMP. To monitor the cyclic AMP level, the 
cell has been loaded with a fluorescent 
protein that changes its fluorescence when 
it binds cyclic AMP. Blue indicates a low 
level of cyclic AMP, yellow an intermediate 
level, and red a high level. (A) In the resting 
cell, the cyclic AMP level is about 5 × 10⁻⁸
M. (B) Twenty seconds after the addition
of serotonin to the culture medium,
the intracellular level of cyclic AMP has
increased to more than 10⁻⁶ M in the
relevant parts of the cell, an increase of 
more than twentyfold. (From B.J. Bacskai 
et al., Science 260:222-226, 1993. With 
permission from AAAS.) 





[Figure 15-25 diagram labels: adenylyl cyclase; cyclic AMP phosphodiesterase.]




TABLE 15-1 Some Hormone-Induced Cell Responses Mediated by Cyclic AMP

Target tissue    Hormone                                Major response
Thyroid gland    Thyroid-stimulating hormone (TSH)      Thyroid hormone synthesis and secretion
Adrenal cortex   Adrenocorticotrophic hormone (ACTH)    Cortisol secretion
Ovary            Luteinizing hormone (LH)               Progesterone secretion
Heart            Adrenaline                             Increase in heart rate and force of contraction
Fat              Adrenaline, ACTH, glucagon, TSH        Triglyceride breakdown


threonines on selected target proteins, including intracellular signaling proteins 
and effector proteins, thereby regulating their activity. The target proteins differ 
from one cell type to another, which explains why the effects of cAMP vary so 
markedly depending on the cell type (see Table 15-1). 

In the inactive state, PKA consists of a complex of two catalytic subunits and 
two regulatory subunits. The binding of cAMP to the regulatory subunits alters 
their conformation, causing them to dissociate from the complex. The released 
catalytic subunits are thereby activated to phosphorylate specific target proteins 
(Figure 15-26). The regulatory subunits of PKA (also called A-kinase) are impor- 
tant for localizing the kinase inside the cell: special A-kinase anchoring proteins 
(AKAPs) bind both to the regulatory subunits and to a component of the cyto- 
skeleton or a membrane of an organelle, thereby tethering the enzyme complex 
to a particular subcellular compartment. Some AKAPs also bind other signaling 
proteins, forming a signaling complex. An AKAP located around the nucleus of 
heart muscle cells, for example, binds both PKA and a phosphodiesterase that 
hydrolyzes cAMP. In unstimulated cells, the phosphodiesterase keeps the local 
cAMP concentration low, so that the bound PKA is inactive; in stimulated cells, 
cAMP concentration rapidly rises, overwhelming the phosphodiesterase and acti- 
vating the PKA. Among the target proteins that PKA phosphorylates and activates 
in these cells is the adjacent phosphodiesterase, which rapidly lowers the cAMP 
concentration again. This negative feedback arrangement converts what might 
otherwise be a prolonged PKA response into a brief, local pulse of PKA activity. 
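
The pulse-shaping effect of this feedback loop can be illustrated with a minimal numerical sketch. The model below is an assumption made for illustration only: it lumps PKA activity together with the local cAMP level, treats the anchored phosphodiesterase as a single "active fraction," and uses arbitrary rate constants (`stimulus`, `basal_pde`, `pka_boost`, `k_on`, `k_off`) that are not measured values.

```python
def simulate_camp_pulse(stimulus=1.0, basal_pde=0.1, pka_boost=2.0,
                        k_on=0.5, k_off=0.1, dt=0.01, n_steps=4000):
    """Euler integration of a two-variable toy model (arbitrary units).

    camp       : local cAMP level; PKA activity is assumed to track it
    pde_active : fraction of anchored phosphodiesterase activated by PKA
    """
    camp, pde_active = 0.0, 0.0
    trace = []
    for _ in range(n_steps):
        # cAMP is made at a constant rate once the cell is stimulated and is
        # destroyed by basal plus PKA-activated phosphodiesterase.
        d_camp = stimulus - (basal_pde + pka_boost * pde_active) * camp
        # PKA (proportional to cAMP here) activates the phosphodiesterase;
        # a phosphatase-like term (k_off) switches it off again.
        d_pde = k_on * camp * (1.0 - pde_active) - k_off * pde_active
        camp += d_camp * dt
        pde_active += d_pde * dt
        trace.append(camp)
    return trace


with_feedback = simulate_camp_pulse()
without_feedback = simulate_camp_pulse(pka_boost=0.0)
print(f"with feedback:    peak {max(with_feedback):.2f}, final {with_feedback[-1]:.2f}")
print(f"without feedback: peak {max(without_feedback):.2f}, final {without_feedback[-1]:.2f}")
```

With the feedback term included, the simulated cAMP level overshoots and then settles well below its peak; with `pka_boost=0.0` it simply climbs toward a high plateau. The sketch is only meant to show the qualitative difference between a sustained response and a brief pulse.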

Whereas some responses mediated by cAMP occur within seconds (see Figure 
15-24), others depend on changes in the transcription of specific genes and take 
hours to develop fully. In cells that secrete the peptide hormone somatostatin, 
Figure 15-26 The activation of cyclic-
AMP-dependent protein kinase (PKA).
The binding of cAMP to the regulatory 
subunits of the PKA tetramer induces a 
conformational change, causing these 
subunits to dissociate from the catalytic 
subunits, thereby activating the kinase 
activity of the catalytic subunits. The 
release of the catalytic subunits requires the 
binding of more than two cAMP molecules 
to the regulatory subunits in the tetramer. 
This requirement greatly sharpens the 
response of the kinase to changes in cAMP 
concentration, as discussed earlier (see 
Figure 15-16). Mammalian cells have at 
least two types of PKAs: type I is mainly in
the cytosol, whereas type II is bound via its
regulatory subunits and special anchoring 
proteins to the plasma membrane, nuclear 
membrane, mitochondrial outer membrane, 
and microtubules. In both types, once the 
catalytic subunits are freed and active, they 
can migrate into the nucleus (where they 
can phosphorylate transcription regulatory 
proteins), while the regulatory subunits 
remain in the cytoplasm. 
for example, cAMP activates the gene that encodes this hormone. The regulatory
region of the somatostatin gene contains a short cis-regulatory sequence, called 
the cyclic AMP response element (CRE), which is also found in the regulatory 
region of many other genes activated by cAMP. A specific transcription regulator 
called CRE-binding (CREB) protein recognizes this sequence. When PKA is acti- 
vated by cAMP, it phosphorylates CREB on a single serine; phosphorylated CREB 
then recruits a transcriptional coactivator called CREB-binding protein (CBP), 
which stimulates the transcription of the target genes (Figure 15-27). Thus, CREB 
can transform a short cAMP signal into a long-term change in a cell, a process 
that, in the brain, is thought to play an important part in some forms of learning 
and memory. 


Some G Proteins Signal Via Phospholipids 


Many GPCRs exert their effects through G proteins that activate the
plasma-membrane-bound enzyme phospholipase C-β (PLCβ). Table 15-2 lists some
examples of responses activated in this way. The phospholipase acts on a
phosphorylated inositol phospholipid (a phosphoinositide) called phosphatidylinositol
4,5-bisphosphate [PI(4,5)P2], which is present in small amounts in the inner half
of the plasma membrane lipid bilayer (Figure 15-28). Receptors that activate this
inositol phospholipid signaling pathway mainly do so via a G protein called Gq,
which activates phospholipase C-β in much the same way that Gs activates adenylyl
cyclase. The activated phospholipase then cleaves the PI(4,5)P2 to generate two


Figure 15-27 How a rise in intracellular
cyclic AMP concentration can alter 
gene transcription. The binding of an 
extracellular signal molecule to its GPCR 
activates adenylyl cyclase via Gs and 
thereby increases cAMP concentration in 
the cytosol. This rise activates PKA, and 
the released catalytic subunits of PKA 

can then enter the nucleus, where they 
phosphorylate the transcription regulatory 
protein CREB. Once phosphorylated, 
CREB recruits the coactivator CBP, which 
stimulates gene transcription. In some 
cases, at least, the inactive CREB protein is 
bound to the cyclic AMP response element 
(CRE) in DNA before it is phosphorylated 
(not shown). See Movie 15.2. 
TABLE 15-2

Signal molecule    Major response
Acetylcholine      Amylase secretion
Acetylcholine      Muscle contraction


products: inositol 1,4,5-trisphosphate (IP3) and diacylglycerol. At this step, the 
signaling pathway splits into two branches. 

IP3 is a water-soluble molecule that leaves the plasma membrane and diffuses 
rapidly through the cytosol. When it reaches the endoplasmic reticulum (ER), it 
binds to and opens IP3-gated Ca2+-release channels (also called IP3 receptors)
in the ER membrane. Ca2+ stored in the ER is released through the open channels,
quickly raising the concentration of Ca2+ in the cytosol (Figure 15-29). The
increase in cytosolic Ca2+ propagates the signal by influencing the activity of
Ca2+-sensitive intracellular proteins, as we describe shortly.

At the same time that the IP3 produced by the hydrolysis of PI(4,5)P2 is
increasing the concentration of Ca2+ in the cytosol, the other cleavage product of
the PI(4,5)P2, diacylglycerol, is exerting different effects. It also acts as a second
messenger, but it remains embedded in the plasma membrane, where it has several
potential signaling roles. One of its major functions is to activate a protein kinase
called protein kinase C (PKC), so named because it is Ca2+-dependent. The initial
rise in cytosolic Ca2+ induced by IP3 alters the PKC so that it translocates from the
cytosol to the cytoplasmic face of the plasma membrane. There it is activated by
the combination of Ca2+, diacylglycerol, and the negatively charged membrane
phospholipid phosphatidylserine (see Figure 15-29). Once activated, PKC phos- 
phorylates target proteins that vary depending on the cell type. The principles are 
the same as discussed earlier for PKA, although most of the target proteins are 
different. 

Diacylglycerol can be further cleaved to release arachidonic acid, which can 
either act as a signal in its own right or be used in the synthesis of other small lipid 
signal molecules called eicosanoids. Most vertebrate cell types make eicosanoids, 
including prostaglandins, which have many biological activities. They participate 
Figure 15-28 The hydrolysis of PI(4,5)P2
by phospholipase C-β. Two second
messengers are produced directly
from the hydrolysis of PI(4,5)P2: inositol
1,4,5-trisphosphate (IP3), which diffuses
through the cytosol and releases Ca2+
from the endoplasmic reticulum, and
diacylglycerol, which remains in the
membrane and helps to activate protein
kinase C (PKC; see Figure 15-29). There
are several classes of phospholipase
C: these include the β class, which is
activated by GPCRs; as we see later, the
γ class is activated by a class of enzyme-
coupled receptors called receptor tyrosine
kinases (RTKs).
in pain and inflammatory responses, for example, and many anti-inflammatory
drugs (such as aspirin, ibuprofen, and cortisone) act in part by inhibiting their 
synthesis. 


Ca2+ Functions as a Ubiquitous Intracellular Mediator


Many extracellular signals, and not just those that work via G proteins, trigger an
increase in cytosolic Ca2+ concentration. In muscle cells, Ca2+ triggers contraction,
and in many secretory cells, including nerve cells, it triggers secretion. Ca2+
has numerous other functions in a variety of cell types. Ca2+ is such an effective
signaling mediator because its concentration in the cytosol is normally very low
(~10⁻⁷ M), whereas its concentration in the extracellular fluid (~10⁻³ M) and in
the lumen of the ER [and sarcoplasmic reticulum (SR) in muscle] is high. Thus,
there is a large gradient tending to drive Ca2+ into the cytosol across both the
plasma membrane and the ER or SR membrane. When a signal transiently opens
Ca2+ channels in these membranes, Ca2+ rushes into the cytosol, and the resulting
10-20-fold increase in the local Ca2+ concentration activates Ca2+-responsive
proteins in the cell.
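
To put rough numbers on this gradient, the short calculation below uses the two concentrations quoted above. The Nernst equilibrium potential is not discussed in this passage; it is added here as an assumption-laden illustration of how steep the gradient is, taking a temperature of 37°C and ignoring every ion other than Ca2+.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def nernst_potential_mV(conc_out, conc_in, valence, temp_kelvin=310.15):
    """Equilibrium potential (inside relative to outside) for one ion species."""
    return 1000.0 * (R * temp_kelvin) / (valence * F) * math.log(conc_out / conc_in)


cytosolic_ca = 1e-7       # M, resting cytosol (value quoted in the text)
extracellular_ca = 1e-3   # M, extracellular fluid (value quoted in the text)

gradient = extracellular_ca / cytosolic_ca
e_ca = nernst_potential_mV(extracellular_ca, cytosolic_ca, valence=2)

print(f"concentration ratio (outside/inside): {gradient:.0f}-fold")
print(f"Ca2+ equilibrium potential at 37 C:   {e_ca:.0f} mV")
# A 10-20-fold rise from the resting level corresponds to 1-2 micromolar.
print(f"stimulated cytosolic Ca2+: {10 * cytosolic_ca:.0e} to {20 * cytosolic_ca:.0e} M")
```

Because the resting cytosolic level is so low, even the 10-20-fold rise leaves the cytosolic concentration far below the extracellular one.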

Some stimuli, including membrane depolarization, membrane stretch,
and certain extracellular signals, activate Ca2+ channels in the plasma membrane,
resulting in Ca2+ influx from outside the cell. Other signals, including the
GPCR-mediated signals described earlier, act primarily through IP3 receptors to
stimulate Ca2+ release from intracellular stores in the ER (see Figure 15-29). The
ER membrane also contains a second type of regulated Ca2+ channel called the
ryanodine receptor (so called because it is sensitive to the plant alkaloid
ryanodine), which opens in response to rising Ca2+ levels and thereby amplifies the
Ca2+ signal, as we describe shortly.

Several mechanisms rapidly terminate the Ca2+ signal and are also responsible
for keeping the concentration of Ca2+ in the cytosol low in resting cells. Most
importantly, there are Ca2+-pumps in the plasma membrane and the ER membrane
that use the energy of ATP hydrolysis to pump Ca2+ out of the cytosol. Cells
such as muscle and nerve cells, which make extensive use of Ca2+ signaling, have
an additional Ca2+ transporter (a Na+-driven Ca2+ exchanger) in their plasma
membrane that couples the efflux of Ca2+ to the influx of Na+.


Feedback Generates Ca2+ Waves and Oscillations


The IP3 receptors and ryanodine receptors of the ER membrane have an important
feature: they are both stimulated by low to moderate cytoplasmic Ca2+
concentrations. This Ca2+-induced calcium release (CICR) results in positive feedback,


Figure 15-29 How GPCRs increase
cytosolic Ca2+ and activate protein
kinase C. The activated GPCR
stimulates the plasma-membrane-bound
phospholipase C-β (PLCβ) via a G protein
called Gq. The α subunit and βγ complex of
Gq are both involved in this activation. Two
second messengers are produced when
PI(4,5)P2 is hydrolyzed by activated PLCβ.
Inositol 1,4,5-trisphosphate (IP3) diffuses
through the cytosol and releases Ca2+
from the ER by binding to and opening
IP3-gated Ca2+-release channels (IP3
receptors) in the ER membrane. The large
electrochemical gradient for Ca2+ across
this membrane causes Ca2+ to escape
into the cytosol when the release channels
are opened. Diacylglycerol remains in the
plasma membrane and, together with
phosphatidylserine (not shown) and Ca2+,
helps to activate protein kinase C (PKC),
which is recruited from the cytosol to the
cytosolic face of the plasma membrane.
Of the 10 or more distinct isoforms of
PKC in humans, at least 4 are activated by
diacylglycerol (Movie 15.3).
which has a major impact on the properties of the Ca2+ signal. The importance of
this feedback is seen clearly in studies with Ca2+-sensitive fluorescent indicators,
such as aequorin or fura-2 (discussed in Chapter 9), which allow researchers to
monitor cytosolic Ca2+ in individual cells under a microscope (Figure 15-30 and
Movie 15.4).

When cells carrying a Ca2+ indicator are treated with a small amount of an
extracellular signal molecule that stimulates IP3 production, tiny bursts of Ca2+
are seen in one or more discrete regions of the cell. These Ca2+ puffs or sparks
reflect the local opening of small groups of IP3-gated Ca2+-release channels in the
ER. Because various Ca2+-binding proteins act as Ca2+ buffers and restrict the
diffusion of Ca2+, the signal often remains localized to the site where the Ca2+ enters
the cytosol. If the extracellular signal is sufficiently strong and persistent,
however, the local Ca2+ concentration can reach a sufficient level to activate nearby
IP3 receptors and ryanodine receptors, resulting in a regenerative wave of Ca2+
release that moves through the cytosol (Figure 15-31), much like an action
potential in an axon.
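
The way a regenerative wave spreads in one direction can be sketched with a very simple cellular automaton. This is an illustrative assumption, not a model taken from the text: the ER membrane is reduced to a row of release sites, each of which is resting, active (releasing Ca2+), or refractory, and a resting site fires whenever a neighboring site was active in the previous step.

```python
RESTING, ACTIVE, REFRACTORY = 0, 1, 2


def step(sites):
    """Advance every release site by one time step."""
    new_sites = []
    for i, state in enumerate(sites):
        if state == ACTIVE:
            # A site that has just released Ca2+ becomes temporarily inhibited.
            new_sites.append(REFRACTORY)
        elif state == REFRACTORY:
            new_sites.append(RESTING)
        else:
            left = sites[i - 1] if i > 0 else RESTING
            right = sites[i + 1] if i < len(sites) - 1 else RESTING
            # Ca2+ released by an active neighbor triggers this resting site.
            new_sites.append(ACTIVE if ACTIVE in (left, right) else RESTING)
    return new_sites


def run_wave(n_sites=12, n_steps=8):
    """Start with one active site at the left edge and record each step."""
    sites = [RESTING] * n_sites
    sites[0] = ACTIVE
    history = [sites]
    for _ in range(n_steps):
        sites = step(sites)
        history.append(sites)
    return history


for row in run_wave():
    # '.' resting, '*' active, '-' refractory
    print("".join(".*-"[state] for state in row))
```

In this sketch the active front advances one site per step and never turns back, because the sites behind it are refractory. The front also keeps its full amplitude as it travels, unlike a diffusing burst of Ca2+ that becomes more dilute with distance.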
Figure 15-30 The fertilization of an egg
by a sperm triggers a wave of cytosolic
Ca2+. This starfish egg was injected with
a Ca2+-sensitive fluorescent dye before
it was fertilized. A wave of cytosolic Ca2+
(red), released from the ER, sweeps
across the egg from the site of sperm
entry (arrow). This Ca2+ wave changes
the egg cell surface, preventing the
entry of other sperm, and it also initiates
embryonic development (Movie 15.5).
The initial increase in Ca2+ is thought to
be caused by a sperm-specific form of
PLC (PLCζ) that the sperm brings into the
egg cytoplasm when it fuses with the egg;
the PLCζ cleaves PI(4,5)P2 to produce
IP3, which releases Ca2+ from the egg
ER. The released Ca2+ stimulates further
Ca2+ release from the ER, producing the
spreading wave, as we explain in Figure
15-31. (Courtesy of Stephen A. Stricker.)


Figure 15-31 Positive and negative
feedback produce Ca2+ waves and
oscillations. This diagram shows IP3
receptors and ryanodine receptors on
a portion of the ER membrane: active
receptors are in green; inactive receptors
are in red. When a small amount of cytosolic
IP3 activates a cluster of IP3 receptors at
one site on the ER membrane (top), the
local release of Ca2+ promotes the opening
of nearby IP3 and ryanodine receptors,
resulting in more Ca2+ release. This
positive feedback (indicated by positive
signs) produces a regenerative wave of
Ca2+ release that spreads across the cell
(see Figure 15-30). These waves of Ca2+
release move more quickly across the cell
than would be possible by simple diffusion.
Also, unlike a diffusing burst of Ca2+ ions,
which will become more dilute as it spreads,
the regenerative wave produces a high
Ca2+ concentration across the entire cell.
Eventually, the local Ca2+ concentration
inactivates IP3 receptors and ryanodine
receptors (middle; indicated by red negative
signs), shutting down the Ca2+ release.
Ca2+-pumps reduce the local cytosolic Ca2+
concentration to its normal low levels. The
result is a Ca2+ spike: positive feedback
drives a rapid rise in cytosolic Ca2+, and
negative feedback sends it back down
again. The Ca2+ channels remain refractory
to further stimulation for some period of
time, delaying the generation of another
Ca2+ spike (bottom). Eventually, however,
the negative feedback wears off, allowing
IP3 to trigger another Ca2+ wave. The end
result is repeated Ca2+ oscillations (see
Figure 15-32). Under some conditions, these
oscillations can be seen as repeating narrow
waves of Ca2+ moving across the cell.
Another important property of IP3 receptors and ryanodine receptors is that
they are inhibited, after some delay, by high Ca2+ concentrations (a form of negative
feedback). Thus, the rise in Ca2+ in a stimulated cell leads to inhibition of Ca2+
release; because Ca2+ pumps remove the cytosolic Ca2+, the Ca2+ concentration
falls (see Figure 15-31). The decline in Ca2+ eventually relieves the negative
feedback, allowing cytosolic Ca2+ to rise again. As in other cases of delayed negative
feedback (see Figure 15-18), the result is an oscillation in the Ca2+ concentration.
These oscillations persist for as long as receptors are activated at the cell surface,
and their frequency reflects the strength of the extracellular stimulus (Figure
15-32). The frequency, amplitude, and breadth of oscillations can also be modulated
by other signaling mechanisms, such as phosphorylation, which influence
the Ca2+ sensitivity of Ca2+ channels or affect other components in the signaling
system.
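
The link between stimulus strength and spike frequency can be caricatured with a toy integrate-and-fire counter. Everything in it is an assumption chosen for illustration: the stimulus is an integer amount of "drive" added per time step, a spike of fixed size occurs whenever the accumulated drive reaches a threshold, and the release channels then stay refractory for a fixed number of steps.

```python
def count_spikes(stimulus, threshold=100, refractory_steps=5, n_steps=1000):
    """Count Ca2+ spikes produced by a constant stimulus (arbitrary units).

    stimulus         : integer drive added per step while channels can respond
    threshold        : accumulated drive needed to trigger one spike
    refractory_steps : steps after a spike during which channels are inhibited
    """
    drive = 0
    refractory = 0
    spikes = 0
    for _ in range(n_steps):
        if refractory > 0:
            # Delayed negative feedback: no build-up while channels are inhibited.
            refractory -= 1
            continue
        drive += stimulus
        if drive >= threshold:
            # Positive feedback is treated as all-or-none: every spike is the same size.
            spikes += 1
            drive = 0
            refractory = refractory_steps
    return spikes


for strength in (1, 2, 4):
    print(f"stimulus {strength}: {count_spikes(strength)} spikes in 1000 steps")
```

A stronger stimulus reaches the threshold sooner after each refractory period, so the spikes come more often while their size stays the same. This is the qualitative pattern described above, in which frequency rather than amplitude reports the strength of the stimulus.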

The frequency of Ca2+ oscillations can be translated into a frequency-dependent
cell response. In some cases, the frequency-dependent response itself is also
oscillatory: in hormone-secreting pituitary cells, for example, stimulation by an
extracellular signal induces repeated Ca2+ spikes, each of which is associated with
a burst of hormone secretion. In other cases, the frequency-dependent response
is non-oscillatory: in some types of cells, for instance, one frequency of Ca2+ spikes
activates the transcription of one set of genes, while a higher frequency activates
the transcription of a different set. How do cells sense the frequency of Ca2+ spikes
and change their response accordingly? The mechanism presumably depends on
Ca2+-sensitive proteins that change their activity as a function of Ca2+-spike
frequency. A protein kinase that acts as a molecular memory device seems to have
this remarkable property, as we discuss next.


Ca2+/Calmodulin-Dependent Protein Kinases Mediate Many
Responses to Ca2+ Signals


Various Ca2+-binding proteins help to relay the cytosolic Ca2+ signal. The most
important is calmodulin, which is found in all eukaryotic cells and can constitute
as much as 1% of a cell's total protein mass. Calmodulin functions as a
multipurpose intracellular Ca2+ receptor, governing many Ca2+-regulated processes.
It consists of a highly conserved, single polypeptide chain with four high-affinity
Ca2+-binding sites (Figure 15-33A). When activated by Ca2+ binding, it undergoes
a conformational change. Because two or more Ca2+ ions must bind before
Figure 15-32 Vasopressin-induced Ca2+
oscillations in a liver cell. The cell was
loaded with the Ca2+-sensitive protein
aequorin and then exposed to increasing
concentrations of the peptide signal
molecule vasopressin, which activates a
GPCR and thereby PLCβ (see Table 15-2).
Note that the frequency of the Ca2+ spikes
increases with an increasing concentration
of vasopressin but that the amplitude of
the spikes is not affected. Each spike lasts
about 7 seconds. (Adapted from
N.M. Woods, K.S.R. Cuthbertson and
P.H. Cobbold, Nature 319:600-602,
1986. With permission from Macmillan
Publishers Ltd.)
calmodulin adopts its active conformation, the protein displays a sigmoidal
response to increasing concentrations of Ca2+ (see Figure 15-16).
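
A sigmoidal response of this kind is often approximated with a Hill function. The sketch below is a simplification adopted for illustration: it treats the number of Ca2+ ions that must bind as a Hill exponent (`n_sites`) and uses a hypothetical half-maximal concentration (`k_half`) expressed in arbitrary units, neither of which is a measured property of calmodulin.

```python
def fraction_active(ca, k_half, n_sites):
    """Hill approximation: fraction of protein in the active conformation."""
    return ca ** n_sites / (k_half ** n_sites + ca ** n_sites)


k_half = 1.0  # Ca2+ level giving half-maximal activation (arbitrary units)
for ca in (0.25, 0.5, 1.0, 2.0, 4.0):
    one_site = fraction_active(ca, k_half, n_sites=1)
    four_sites = fraction_active(ca, k_half, n_sites=4)
    print(f"Ca2+ = {ca:4.2f}   1 site: {one_site:.2f}   4 sites: {four_sites:.2f}")
```

With a single binding event, activation rises gradually across the whole concentration range. When several ions must bind, the same fourfold change in Ca2+ around `k_half` takes the protein from almost fully off to almost fully on.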

The allosteric activation of calmodulin by Ca2+ is analogous to the activation
of PKA by cyclic AMP, except that Ca2+/calmodulin has no enzymatic activity
itself but instead acts by binding to and activating other proteins. In some cases,
calmodulin serves as a permanent regulatory subunit of an enzyme complex, but
usually the binding of Ca2+ instead enables calmodulin to bind to various target
proteins in the cell to alter their activity.

When an activated molecule of Ca2+/calmodulin binds to its target protein,
the calmodulin further changes its conformation, the nature of which depends on
the specific target protein (Figure 15-33B). Among the many targets calmodulin
regulates are enzymes and membrane transport proteins. As one example,
Ca2+/calmodulin binds to and activates the plasma membrane Ca2+-pump that uses
ATP hydrolysis to pump Ca2+ out of cells. Thus, whenever the concentration of
Ca2+ in the cytosol rises, the pump is activated, which helps to return the cytosolic
Ca2+ level to resting levels.

Many effects of Ca2+, however, are more indirect and are mediated by protein
phosphorylations catalyzed by a family of protein kinases called
Ca2+/calmodulin-dependent kinases (CaM-kinases). Some CaM-kinases phosphorylate
transcription regulators, such as the CREB protein (see Figure 15-27), and in this way
activate or inhibit the transcription of specific genes.

One of the best-studied CaM-kinases is CaM-kinase II, which is found in most 
animal cells but is especially enriched in the nervous system. It constitutes up 
to 2% of the total protein mass in some regions of the brain, and it is highly con- 
centrated in synapses. CaM-kinase II has several remarkable properties. To begin 
with, it has a spectacular quaternary structure: twelve copies of the enzyme are 
assembled into a stacked pair of rings, with kinase domains on the outside linked 
to a central hub (Figure 15-34). This structure helps the enzyme function as a 
molecular memory device, switching to an active state when exposed to
Ca2+/calmodulin and then remaining active even after the Ca2+ signal has decayed.
This is because adjacent kinase subunits can phosphorylate each other (a process
called autophosphorylation) when Ca2+/calmodulin activates them (Figure
15-34). Once a kinase subunit is autophosphorylated, it remains active even in
the absence of Ca2+, thereby prolonging the duration of the kinase activity beyond
that of the initial activating Ca2+ signal. The enzyme maintains this activity until a
protein phosphatase removes the autophosphorylation and shuts the kinase off.
CaM-kinase II activation can thereby serve as a memory trace of a prior Ca2+ pulse,
and it seems to have a role in some types of memory and learning in the vertebrate 
nervous system. Mutant mice that lack a brain-specific form of the enzyme have 
specific defects in their ability to remember where things are. 
Figure 15-33 The structure of
Ca2+/calmodulin. (A) The molecule has a
dumbbell shape, with two globular ends,
which can bind to many target proteins.
The globular ends are connected by a long,
exposed α helix, which allows the protein to
adopt a number of different conformations,
depending on the target protein it interacts
with. Each globular head has two Ca2+-binding
sites (Movie 15.6). (B) Shown is
the major structural change that occurs in
Ca2+/calmodulin when it binds to a target
protein (in this example, a peptide that
consists of the Ca2+/calmodulin-binding
domain of a Ca2+/calmodulin-dependent
protein kinase). Note that the
Ca2+/calmodulin has “jack-knifed” to
surround the peptide. When it binds
to other targets, it can adopt different
conformations. (A, based on x-ray
crystallographic data from Y.S. Babu et al.,
Nature 315:37-40, 1985. With permission
from Macmillan Publishers Ltd; B, based
on x-ray crystallographic data from
W.E. Meador, A.R. Means, and
F.A. Quiocho, Science 257:1251-1255,
1992, and on nuclear magnetic resonance
(NMR) spectroscopy data from M. Ikura et
al., Science 256:632-638, 1992.)
Figure 15-34 The stepwise activation of CaM-kinase II. (A) Each CaM-kinase II protein has two major domains: an
amino-terminal kinase domain (green) and a carboxyl-terminal hub domain (blue), linked by a regulatory segment. Six CaM-kinase
II proteins are assembled into a giant ring in which the hub domains interact tightly to produce a central structure that is
surrounded by kinase domains. The complete enzyme contains two stacked rings, for a total of 12 kinase proteins, but only one
ring is shown here for clarity. When the enzyme is inactive, the ring exists in a dynamic equilibrium between two states. The first
(upper left) is a compact state, in which the kinase domains interact with the hub, so that the regulatory segment is buried in the
kinase active site and thereby blocks catalytic activity. In the second inactive state (upper middle), a kinase domain has popped
out and is linked to the central hub by its regulatory segment, which continues to inhibit the kinase but is now accessible to
Ca2+/calmodulin. If present, Ca2+/calmodulin will bind the regulatory segment and prevent it from inhibiting the kinase, thereby
locking the kinase in an active state (upper right). If the adjacent kinase subunit also pops out from the hub, it will also be
activated by Ca2+/calmodulin, and the two kinases will then phosphorylate each other on their regulatory segments (lower
right). This autophosphorylation further activates the enzyme. It also prolongs the activity of the enzyme in two ways. First,
it traps the bound Ca2+/calmodulin so that it does not dissociate from the enzyme until cytosolic Ca2+ levels return to basal
values for at least 10 seconds (not shown). Second, it converts the enzyme to a Ca2+-independent form, so that the kinase
remains active even after the Ca2+/calmodulin dissociates from it (lower left). This activity continues until the action of a protein
phosphatase overrides the autophosphorylation activity of CaM-kinase II. (B) This structural model of the enzyme is based on
x-ray crystallographic analysis.

The remarkable dodecameric structure of the enzyme allows it to achieve a broad range of intermediate activity states in
response to different Ca2+ oscillation frequencies: higher frequencies tend to cause more subunits in the enzyme to reach the
phosphorylated active state (see Figure 15-35). The behavior of CaM-kinase II is also controlled by the length of the linker
segment between the kinase and hub domains. The linker is longer in some isoforms of the enzyme; in these isoforms, the
kinase domains tend to pop out of the ring more frequently, making it more sensitive to Ca2+. These and other mechanisms
allow the cell to tailor the responsiveness of the enzyme to the needs of different types of neurons. (Adapted from L.H. Chao et
al., Cell 146:732-745, 2011. PDB code: 3SOA.)


Another remarkable property of CaM-kinase II is that the enzyme can use its
intrinsic memory mechanism to decode the frequency of Ca2+ oscillations. This
property is thought to be especially important at a nerve cell synapse, where
changes in intracellular Ca2+ levels in a postsynaptic cell as a result of neural activity
can lead to long-term changes in the subsequent effectiveness of that synapse
(discussed in Chapter 11). When CaM-kinase II is exposed to both a protein
phosphatase and repetitive pulses of Ca2+/calmodulin at different frequencies that
mimic those observed in stimulated cells, the enzyme’s activity increases steeply
as a function of pulse frequency (Figure 15-35).
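
The ratchet-like behavior behind this frequency decoding can be sketched in a few lines. The model is a deliberately crude assumption: each Ca2+/calmodulin pulse autophosphorylates a fixed fraction (`gain`) of the subunits that are still inactive, and the phosphatase removes phosphate exponentially with a time constant (`decay_time`) between pulses. Both parameters are hypothetical, and the sketch does not attempt to reproduce the steepness of the measured response.

```python
import math


def camkii_activity(spike_interval, n_spikes=50, gain=0.2, decay_time=5.0):
    """Fraction of autophosphorylated subunits just after the last spike.

    spike_interval : time between Ca2+ spikes (same units as decay_time)
    gain           : fraction of inactive subunits phosphorylated per spike
    decay_time     : time constant of dephosphorylation by the phosphatase
    """
    activity = 0.0
    decay = math.exp(-spike_interval / decay_time)
    for _ in range(n_spikes):
        # The phosphatase erodes the memory of earlier spikes ...
        activity *= decay
        # ... and each new spike phosphorylates part of what is still inactive.
        activity += gain * (1.0 - activity)
    return activity


for interval in (20.0, 5.0, 1.0):
    print(f"interval {interval:4.1f}: activity after 50 spikes = "
          f"{camkii_activity(interval):.2f}")
```

When the spikes are far apart, the activity falls back almost to zero before the next one arrives, so it never climbs above the level set by a single spike. When the spikes arrive faster than the phosphatase can reset the enzyme, the activity builds up from one spike to the next and settles at a higher level.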


Some G Proteins Directly Regulate Ion Channels


G proteins do not act exclusively by regulating the activity of membrane-bound
enzymes that alter the concentration of cyclic AMP or Ca2+ in the cytosol. The
α subunit of one type of G protein (called G12), for example, activates a guanine
nucleotide exchange factor (GEF) that activates a monomeric GTPase of the Rho
family (discussed later and in Chapter 16), which regulates the actin cytoskeleton.

In some other cases, G proteins directly activate or inactivate ion channels in 
the plasma membrane of the target cell, thereby altering the ion permeability— 
and hence the electrical excitability—of the membrane. As an example, acetyl- 
choline released by the vagus nerve reduces the heart rate (see Figure 15-5B). This 
effect is mediated by a special class of acetylcholine receptors that activate the 
Gi protein discussed earlier. Once activated, the α subunit of Gi inhibits adenylyl
cyclase (as described previously), while the βγ subunits bind to K+ channels in
the heart muscle cell plasma membrane and open them. The opening of these
K+ channels makes it harder to depolarize the cell and thereby contributes to the
inhibitory effect of acetylcholine on the heart. (These acetylcholine receptors,
which can be activated by the fungal alkaloid muscarine, are called muscarinic
acetylcholine receptors to distinguish them from the very different nicotinic
acetylcholine receptors, which are ion-channel-coupled receptors on skeletal muscle
and nerve cells that can be activated by the binding of nicotine, as well as by
acetylcholine.)

Other G proteins regulate the activity of ion channels less directly, either by 
stimulating channel phosphorylation (by PKA, PKC, or CaM-kinase, for example) 
or by causing the production or destruction of cyclic nucleotides that directly acti- 
vate or inactivate ion channels. These cyclic-nucleotide-gated ion channels have a 
crucial role in both smell (olfaction) and vision, as we now discuss. 


Smell and Vision Depend on GPCRs That Regulate Ion Channels


Humans can distinguish more than 10,000 distinct smells, which they detect using 
specialized olfactory receptor neurons in the lining of the nose. These cells use 
specific GPCRs called olfactory receptors to recognize odors; the receptors are 
displayed on the surface of the modified cilia that extend from each cell (Figure 
15-36). The receptors act through cAMP. When stimulated by odorant binding,
they activate an olfactory-specific G protein (known as Golf), which in turn
activates adenylyl cyclase. The resulting increase in cAMP opens cyclic-AMP-gated
cation channels, thereby allowing an influx of Na+, which depolarizes the olfactory
receptor neuron and initiates a nerve impulse that travels along its axon to
the brain.

Figure 15-35 CaM-kinase II as a frequency decoder of Ca2+ oscillations.
(A) At low frequencies of Ca2+ spikes, the enzyme becomes inactive after each spike,
as the autophosphorylation induced by Ca2+/calmodulin binding does not maintain
the enzyme’s activity long enough for the enzyme to remain active until the next
Ca2+ spike arrives. (B) At higher spike frequencies, however, the enzyme fails to
inactivate completely between Ca2+ spikes, so its activity ratchets up with each spike.
If the spike frequency is high enough, this progressive increase in enzyme activity
will continue until the enzyme is autophosphorylated on all subunits and is
therefore maximally activated. Although not shown, once enough of its subunits are
autophosphorylated, the enzyme can be maintained in a highly active state even with
a relatively low frequency of Ca2+ spikes (a form of cell memory). The binding of
Ca2+/calmodulin to the enzyme is enhanced by the CaM-kinase II autophosphorylation
(an additional form of positive feedback), helping to generate a more switchlike
response to repeated Ca2+ spikes. (From P.I. Hanson, T. Meyer, L. Stryer, and
H. Schulman, Neuron 12:943-956, 1994. With permission from Elsevier.)

There are about 1000 different olfactory receptors in a mouse and about 350 in 
a human, each encoded by a different gene and each recognizing a different set 
of odorants. Each olfactory receptor neuron produces only one of these receptors; 
the neuron responds to a specific set of odorants by means of the specific recep- 
tor it displays, and each odorant activates its own characteristic set of olfactory 
receptor neurons. The same receptor also helps direct the elongating axon of each 
developing olfactory neuron to the specific target neurons that it will connect to 
in the brain. A different set of GPCRs acts in a similar way in some vertebrates to 
mediate responses to pheromones, chemical signals detected in a different part of 
the nose that are used in communication between members of the same species. 
Humans, however, are thought to lack functional pheromone receptors. 
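
The combinatorial logic of this arrangement can be sketched in a few lines. The example below is purely illustrative: the receptor names, the odorant names, and the tuning of each receptor are hypothetical, and real receptors respond in a graded rather than an all-or-none way. It shows only how a small number of receptor types, each expressed by its own class of neuron, can give every odorant a distinct set of activated neurons.

```python
# Hypothetical receptors and odorants, for illustration only.
RECEPTOR_TUNING = {
    "OR_A": {"odorant_1", "odorant_2"},
    "OR_B": {"odorant_2", "odorant_3"},
    "OR_C": {"odorant_1", "odorant_3"},
    "OR_D": {"odorant_4"},
}

def activated_neuron_types(odorant):
    """Neuron types (one receptor each) whose receptor recognizes the odorant."""
    return frozenset(
        receptor for receptor, ligands in RECEPTOR_TUNING.items() if odorant in ligands
    )

odorants = ("odorant_1", "odorant_2", "odorant_3", "odorant_4")
codes = {odorant: activated_neuron_types(odorant) for odorant in odorants}

for odorant, code in codes.items():
    print(odorant, "->", sorted(code))

# Upper bound on distinct non-empty activation patterns for n receptor types.
n_types = len(RECEPTOR_TUNING)
print("possible patterns:", 2 ** n_types - 1)
```

Because the number of possible activation patterns grows as 2^n - 1 with the number n of receptor types, a few hundred receptors can in principle encode far more odors than there are receptors.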

Vertebrate vision employs a similarly elaborate, highly sensitive, signal-detec- 
tion process. Cyclic-nucleotide-gated ion channels are also involved, but the 
crucial cyclic nucleotide is cyclic GMP (Figure 15-37) rather than cAMP. As with 
cAMP, a continuous rapid synthesis (by guanylyl cyclase) and rapid degradation 
(by cyclic GMP phosphodiesterase) controls the concentration of cyclic GMP in 
the cytosol. 

In visual transduction responses, which are the fastest G-protein-mediated 
responses known in vertebrates, the receptor activation stimulated by light causes 
a fall rather than a rise in the level of the cyclic nucleotide. The pathway has been 
especially well studied in rod photoreceptors (rods) in the vertebrate retina. Rods 
are responsible for noncolor vision in dim light, whereas cone photoreceptors 
(cones) are responsible for color vision in bright light. A rod photoreceptor is a 
highly specialized cell with outer and inner segments, a cell body, and a synap- 
tic region where the rod passes a chemical signal to a retinal nerve cell (Figure 
15-38). This nerve cell relays the signal to another nerve cell in the retina, which 
in turn relays it to the brain. 

The phototransduction apparatus is in the outer segment of the rod, which 
contains a stack of discs, each formed by a closed sac of membrane that is densely 
packed with photosensitive rhodopsin molecules. The plasma membrane sur- 
rounding the outer segment contains cyclic-GMP-gated cation channels. Cyclic 
GMP bound to these channels keeps them open in the dark. Paradoxically, light 
causes a hyperpolarization (which inhibits synaptic signaling) rather than a 
depolarization of the plasma membrane (which would stimulate synaptic signal- 
ing). Hyperpolarization (that is, the membrane potential moves to a more nega- 
tive value—discussed in Chapter 11) results because the light-induced activation 
of rhodopsin molecules in the disc membrane decreases the cyclic GMP con- 
centration and closes the cation channels in the surrounding plasma membrane 
(Figure 15-39). 





Figure 15-36 Olfactory receptor 
neurons. (A) A section of olfactory 
epithelium in the nose. Olfactory receptor 
neurons possess modified cilia, which 
project from the surface of the epithelium 
and contain the olfactory receptors, as well 
as the signal transduction machinery. The 
axon, which extends from the opposite 
end of the receptor neuron, conveys 
electrical signals to the brain when an 
odorant activates the cell to produce an 
action potential. In rodents, at least, the 
basal cells act as stem cells, producing 
new receptor neurons throughout life, to 
replace the neurons that die. (B) A scanning 
electron micrograph of the cilia on the 
surface of an olfactory neuron. (B, from E.E. Morrison and R.M. Costanzo,
J. Comp. Neurol. 297:1-13, 1990. With permission from Wiley-Liss.)





Figure 15-37 Cyclic GMP.

Figure 15-38 A rod photoreceptor cell. There are about 1000 discs in
the outer segment. The disc membranes are not connected to the plasma
membrane. The inner and outer segments are specialized parts of a primary
cilium (discussed in Chapter 16). A primary cilium extends from the surface of
most vertebrate cells, where it serves as a signaling organelle.


Rhodopsin is a member of the GPCR family, but the activating extracellular
signal is not a molecule but a photon of light. Each rhodopsin molecule contains
a covalently attached chromophore, 11-cis retinal, which isomerizes almost
instantaneously to all-trans retinal when it absorbs a single photon. The
isomerization alters the shape of the retinal, forcing a conformational change in the
protein (opsin). The activated rhodopsin molecule then alters the conformation of
the G protein transducin (Gt), causing the transducin α subunit to activate cyclic
GMP phosphodiesterase. The phosphodiesterase then hydrolyzes cyclic GMP,
so that cyclic GMP levels in the cytosol fall. This drop in cyclic GMP concentration
decreases the amount of cyclic GMP bound to the plasma membrane cation
channels, allowing more of these cyclic-GMP-sensitive channels to close. In this
way, the signal quickly passes from the disc membrane to the plasma membrane,
and a light signal is converted into an electrical one, through a hyperpolarization
of the rod cell plasma membrane.

Rods use several negative feedback loops to allow the cells to revert quickly to a 
resting, dark state in the aftermath of a flash of light, a requirement for perceiving
the shortness of the flash. A rhodopsin-specific protein kinase called rhodopsin 
kinase (RK) phosphorylates the cytosolic tail of activated rhodopsin on multiple 
serines, partially inhibiting the ability of the rhodopsin to activate transducin. An 
inhibitory protein called arrestin (discussed later) then binds to the phosphory- 
lated rhodopsin, further inhibiting rhodopsin’s activity. Mice or humans with a 
mutation that inactivates the gene encoding RK have a prolonged light response. 

At the same time as arrestin shuts off rhodopsin, an RGS protein (discussed
earlier) binds to activated transducin, stimulating the transducin to hydrolyze its
bound GTP to GDP, which returns transducin to its inactive state. In addition, the
cation channels that close in response to light are permeable to Ca2+, as well as
to Na+, so that when they close, the normal influx of Ca2+ is inhibited, causing
the Ca2+ concentration in the cytosol to fall. The decrease in Ca2+ concentration
stimulates guanylyl cyclase to replenish the cyclic GMP, rapidly returning its level
to where it was before the light was switched on. A specific Ca2+-sensitive protein
mediates the activation of guanylyl cyclase in response to the fall in Ca2+ levels. In
contrast to calmodulin, this protein is inactive when Ca2+ is bound to it and active
when it is Ca2+-free. It therefore stimulates the cyclase when Ca2+ levels fall
following a light response.

Figure 15-39 The response of a rod photoreceptor cell to light. Rhodopsin
molecules in the outer-segment discs absorb photons. Photon absorption closes
cation channels in the plasma membrane, which hyperpolarizes the membrane
and reduces the rate of neurotransmitter release from the synaptic region. Because
the neurotransmitter inhibits many of the postsynaptic retinal neurons, illumination
serves to free the neurons from inhibition and thus, in effect, excites them. The neural
connections of the retina lie between the light source and the outer segment, and so
the light must pass through the synapses and rod cell nucleus to reach the light
sensors.

Negative feedback mechanisms do more than just return the rod to its resting 
state after a transient light flash; they also help the rod to adapt, stepping down 
the response when the rod is exposed to light continuously. Adaptation, as we 
discussed earlier, allows the receptor cell to function as a sensitive detector of 
changes in stimulus intensity over an enormously wide range of baseline levels 
of stimulation. It is why we can see faint stars in a dark sky, or a camera flash in 
bright sunlight. 

The various trimeric G proteins we have discussed in this chapter fall into four 
major families, as summarized in Table 15-3. 


Nitric Oxide Is a Gaseous Signaling Mediator That Passes 
Between Cells 


Signaling molecules like cyclic nucleotides and calcium are hydrophilic small 
molecules that generally act within the cell where they are produced. Some sig- 
naling molecules, however, are hydrophobic enough, small enough, or both, to 
pass readily across the plasma membrane and carry signals to nearby cells. An 
important and remarkable example is the gas nitric oxide (NO), which acts as a 
signal molecule in many tissues of both animals and plants. 

TABLE 15-3

Activates adenylyl cyclase; activates Ca2+ channels
Activates adenylyl cyclase in olfactory sensory neurons
Inhibits adenylyl cyclase
Activates K+ channels
Activates K+ channels; inactivates Ca2+ channels
Activates phospholipase C-β
Activates cyclic GMP phosphodiesterase in vertebrate rod photoreceptors
Activates phospholipase C-β
Activates Rho family monomeric GTPases (via Rho-GEF) to regulate the actin cytoskeleton

*Families are determined by amino acid sequence relatedness of the α subunits. Only selected
examples are included. About 20 α subunits and at least 6 β subunits and 11 γ subunits have
been described in humans.

In mammals, one of NO’s many functions is to relax smooth muscle in the walls
of blood vessels. The neurotransmitter acetylcholine stimulates NO synthesis by
activating a GPCR on the membranes of the endothelial cells that line the interior
of the vessel. The activated receptor triggers IP3 synthesis and Ca2+ release (see
Figure 15-29), leading to stimulation of an enzyme that synthesizes NO. Because
dissolved NO passes readily across membranes, it diffuses out of the cell where it
is produced and into neighboring smooth muscle cells, where it causes muscle
relaxation and thereby vessel dilation (Figure 15-40). It acts only locally because
it has a short half-life (about 5-10 seconds) in the extracellular space before
oxygen and water convert it to nitrates and nitrites.
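
The consequence of such a short half-life can be made concrete with a small calculation. The sketch below assumes simple first-order (exponential) decay, which is a simplification introduced here for illustration, and uses the two ends of the 5-10 second range quoted above. The function names are hypothetical.

```python
import math

def fraction_remaining(t_s, half_life_s):
    """Fraction of NO left after t_s seconds, assuming first-order decay."""
    return 0.5 ** (t_s / half_life_s)

def time_to_fraction(fraction, half_life_s):
    """Seconds until only `fraction` of the initial NO remains."""
    return half_life_s * math.log2(1.0 / fraction)

for half_life in (5.0, 10.0):   # the range quoted in the text
    left = fraction_remaining(30.0, half_life)
    t99 = time_to_fraction(0.01, half_life)
    print(f"half-life {half_life:4.1f} s: {left:.3f} left after 30 s, "
          f"99% gone after {t99:.0f} s")
```

Under this assumption, 99% of the NO has been converted after about 6.6 half-lives, that is, within roughly half a minute to a minute of its release, which is consistent with a signal that reaches only nearby cells.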

The effect of NO on blood vessels provides an explanation for the mechanism of 
action of nitroglycerine, which has been used for about 100 years to treat patients 
with angina (pain resulting from inadequate blood flow to the heart muscle). The 
nitroglycerine is converted to NO, which relaxes blood vessels. This reduces the 
workload on the heart and, as a consequence, reduces the oxygen requirement of 
the heart muscle. 

NO is made by the deamination of the amino acid arginine, catalyzed by 
enzymes called NO synthases (NOS) (see Figure 15-40). The NOS in endothelial 
cells is called eNOS, while that in nerve and muscle cells is called nNOS. Both 
eNOS and nNOS are stimulated by Ca2+. Macrophages, by contrast, make yet
another NOS, called inducible NOS (iNOS), that is constitutively active but syn- 
thesized only when the cells are activated, usually in response to an infection. 

In some target cells, including smooth muscle cells, NO binds reversibly to 
iron in the active site of guanylyl cyclase, stimulating synthesis of cyclic GMP. NO 
can increase cyclic GMP in the cytosol within seconds, because the normal rate of 
turnover of cyclic GMP is high: rapid degradation to GMP by a phosphodiesterase 
constantly balances the production of cyclic GMP by guanylyl cyclase. The drug 
Viagra® and its relatives inhibit the cyclic GMP phosphodiesterase in the penis, 
thereby increasing the amount of time that cyclic GMP levels remain elevated in 
the smooth muscle cells of penile blood vessels after NO production is induced 
by local nerve terminals. The cyclic GMP, in turn, keeps the blood vessels relaxed 
and thereby the penis erect. NO can also signal cells independently of cyclic 
GMP. It can, for example, alter the activity of an intracellular protein by covalently 
nitrosylating thiol (-SH) groups on specific cysteines in the protein. 
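
The link between high turnover and a fast response, made above for cyclic GMP, can be sketched with a one-variable model. The code assumes that cyclic GMP is made at a constant rate and degraded in proportion to its concentration (dC/dt = synthesis - k_deg * C), so that the resting level is synthesis / k_deg. The model, the rate constants, and the fivefold stimulation are illustrative assumptions, not values from the text.

```python
import math

def cgmp_after_step(t_s, resting, fold_increase, k_deg):
    """cGMP level t_s seconds after synthesis jumps by fold_increase.

    Assumed model: dC/dt = synthesis - k_deg * C, with resting = synthesis / k_deg.
    """
    new_steady = resting * fold_increase
    return new_steady + (resting - new_steady) * math.exp(-k_deg * t_s)

def half_time(k_deg):
    """Seconds needed to move halfway to the new steady state."""
    return math.log(2) / k_deg

# Two hypothetical cells with the same resting level but different turnover.
for label, k_deg in (("slow turnover", 0.1), ("fast turnover", 2.0)):
    level = cgmp_after_step(1.0, resting=1.0, fold_increase=5.0, k_deg=k_deg)
    print(f"{label}: half-time {half_time(k_deg):.2f} s, level after 1 s = {level:.2f}")
```

In this sketch both cells settle at the same final level, but the one with the faster degradation rate gets there sooner, because the approach to a new steady state is set by the degradation rate constant alone.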


Figure 15-40 The role of nitric oxide (NO) in smooth muscle relaxation in a blood vessel
wall. (A) Simplified cross section of a blood vessel, showing the endothelial cells lining the
lumen and the smooth muscle cells around them. (B) The neurotransmitter acetylcholine
stimulates blood vessel dilation by activating a GPCR (the muscarinic acetylcholine
receptor) on the surface of endothelial cells. This receptor activates a G protein, Gq, thereby
stimulating IP3 synthesis and Ca2+ release by the mechanisms illustrated in Figure 15-29. Ca2+
activates nitric oxide synthase, causing the endothelial cells to produce NO from arginine. The
NO diffuses out of the endothelial cells and into the neighboring smooth muscle cells, where
it activates guanylyl cyclase to produce cyclic GMP. The cyclic GMP triggers a response that
causes the smooth muscle cells to relax, increasing blood flow through the vessel.



Second Messengers and Enzymatic Cascades Amplify Signals 


Despite the differences in molecular details, the different intracellular signaling 
pathways that GPCRs trigger share certain features and obey similar general prin- 
ciples. They depend on relay chains of intracellular signaling proteins and sec- 
ond messengers. These relay chains provide numerous opportunities for amplify- 
ing the responses to extracellular signals. In the visual transduction cascade, for 
example, a single activated rhodopsin molecule catalyzes the activation of hun- 
dreds of molecules of transducin at a rate of about 1000 transducin molecules per 
second. Each activated transducin molecule activates a molecule of cyclic GMP 
phosphodiesterase, each of which hydrolyzes about 4000 molecules of cyclic 
GMP per second. This catalytic cascade lasts for about 1 second and results in 
the hydrolysis of more than 10^5 cyclic GMP molecules for a single quantum of
light absorbed, and the resulting drop in the concentration of cyclic GMP in turn 
transiently closes hundreds of cation channels in the plasma membrane (Figure 
15-41). As a result, a rod cell can respond to even a single photon of light in a way 
that is highly reproducible in its timing and magnitude. 
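
The amplification figures quoted above can be multiplied out as a rough check. The sketch assumes that every activated transducin activates one phosphodiesterase, as the text states, and that each phosphodiesterase works at its full rate for the whole of the roughly 1-second cascade. The second assumption overestimates the total, because transducin molecules are activated progressively. The number of transducins, given only as "hundreds" in the text, is an assumed input.

```python
def cgmp_hydrolyzed(n_transducin, pde_rate_per_s=4000.0, duration_s=1.0):
    """Upper-bound count of cGMP molecules hydrolyzed per absorbed photon."""
    n_pde = n_transducin          # one phosphodiesterase per activated transducin
    return n_pde * pde_rate_per_s * duration_s

for n in (100, 300, 1000):        # assumed values spanning "hundreds"
    total = cgmp_hydrolyzed(n)
    print(f"{n:5d} transducins -> up to {total:,.0f} cGMP molecules hydrolyzed")
```

Even the smallest assumed input gives several hundred thousand molecules, which is consistent with the statement that a single absorbed photon leads to the hydrolysis of more than 100,000 cyclic GMP molecules.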

Likewise, when an extracellular signal molecule binds to a receptor that indi- 
rectly activates adenylyl cyclase via Gs, each receptor protein may activate many 
molecules of Gs protein, each of which can activate a cyclase molecule. Each 
cyclase molecule, in turn, can catalyze the conversion of a large number of ATP 
molecules to cAMP molecules. A similar amplification operates in the IP3 signaling
pathway. In these ways, a nanomolar (10^-9 M) change in the concentration of
an extracellular signal can induce micromolar (10^-6 M) changes in the concentration
of a second messenger such as cAMP or Ca2+. Because these messengers
function as allosteric effectors to activate specific enzymes or ion channels, a sin- 
gle extracellular signal molecule can alter many thousands of protein molecules 
within the target cell. 

Any such amplifying cascade of stimulatory signals requires counterbalancing 
mechanisms at every step of the cascade to restore the system to its resting state 
when stimulation ceases. As emphasized earlier, the response to stimulation can 
be rapid only if the inactivating mechanisms are also rapid. Cells therefore have 
efficient mechanisms for rapidly degrading (and resynthesizing) cyclic nucleotides
and for buffering and removing cytosolic Ca2+, as well as for inactivating the
responding enzymes and ion channels once they have been activated. This is not 
only essential for turning a response off, but is also important for defining the rest- 
ing state from which a response begins. 

Each protein in the signaling relay chain can be a separate target for regula- 
tion, including the receptor itself, as we discuss next. 


GPCR Desensitization Depends on Receptor Phosphorylation 


As discussed earlier, when target cells are exposed to a high concentration of a 
stimulating ligand for a prolonged period, they can become desensitized, or 
adapted, in several different ways. An important class of adaptation mechanisms 
depends on alteration of the quantity or condition of the receptor molecules 
themselves. 

For GPCRs, there are three general modes of adaptation (see Figure 15-20): 
(1) In receptor sequestration, they are temporarily moved to the interior of the 
cell (internalized) so that they no longer have access to their ligand. (2) In recep- 
tor down-regulation, they are destroyed in lysosomes after internalization. (3) In 
receptor inactivation, they become altered so that they can no longer interact with 
G proteins. 

In each case, the desensitization of the GPCRs depends on their phosphory- 
lation by PKA, PKC, or a member of the family of GPCR kinases (GRKs), which 
includes the rhodopsin-specific kinase RK involved in rod photoreceptor desen- 
sitization discussed earlier. The GRKs phosphorylate multiple serines and thre- 
onines on a GPCR, but they do so only after ligand binding has activated the 
receptor, because it is the activated receptor that allosterically activates the GRK.

Figure 15-41 Amplification in the light-induced catalytic cascade in vertebrate
rods. The red arrows indicate the steps where amplification occurs, with the
thickness of the arrow roughly indicating the magnitude of the amplification.


Figure 15-42 The roles of GPCR kinases (GRKs) and arrestins in GPCR desensitization.
A GRK phosphorylates only activated receptors because it is the activated GPCR that activates the
GRK. The binding of an arrestin to the phosphorylated receptor prevents the receptor from binding
to its G protein and also directs its endocytosis (not shown). Mice that are deficient in one form
of arrestin fail to desensitize in response to morphine, for example, attesting to the importance of
arrestins for desensitization.


As with rhodopsin, once a receptor has been phosphorylated by a GRK, it binds 
with high affinity to a member of the arrestin family of proteins (Figure 15-42). 

The bound arrestin can contribute to the desensitization process in at least 
two ways. First, it prevents the activated receptor from interacting with G pro- 
teins. Second, it serves as an adaptor protein to help couple the receptor to the 
clathrin-dependent endocytosis machinery (discussed in Chapter 13), inducing 
receptor-mediated endocytosis. The fate of the internalized GPCR-arrestin com- 
plex depends on other proteins in the complex. In some cases, the receptor is 
dephosphorylated and recycled back to the plasma membrane for reuse. In others,
it is ubiquitylated, endocytosed, and degraded in lysosomes (discussed later).

Receptor endocytosis does not necessarily stop the receptor from signaling. In 
some cases, the bound arrestin recruits other signaling proteins to relay the signal 
onward from the internalized GPCRs along new pathways. 
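
The order of events in GRK- and arrestin-dependent desensitization can be summarized as a small state model. The sketch below is a simplification for illustration only: the class and function names are hypothetical, each step is treated as all-or-none, and phosphorylation by PKA or PKC, endocytosis, and arrestin-dependent signaling are left out.

```python
from dataclasses import dataclass

@dataclass
class Gpcr:
    ligand_bound: bool = False
    phosphorylated: bool = False
    arrestin_bound: bool = False

    def can_activate_g_protein(self) -> bool:
        # Only a ligand-activated receptor without bound arrestin can signal.
        return self.ligand_bound and not self.arrestin_bound

def grk_step(receptor: Gpcr) -> None:
    # The GRK is activated by the activated receptor, so idle receptors are ignored.
    if receptor.ligand_bound:
        receptor.phosphorylated = True

def arrestin_step(receptor: Gpcr) -> None:
    # Arrestin binds with high affinity only to the phosphorylated receptor.
    if receptor.phosphorylated:
        receptor.arrestin_bound = True

idle = Gpcr()
stimulated = Gpcr(ligand_bound=True)
for receptor in (idle, stimulated):
    grk_step(receptor)
    arrestin_step(receptor)

print("idle:", idle)
print("stimulated:", stimulated)
```

In this model the unstimulated receptor is left untouched, whereas the stimulated receptor ends up phosphorylated, bound by arrestin, and unable to activate its G protein, which is the selectivity the text attributes to GRKs.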


Summary 


GPCRs can indirectly activate or inactivate either plasma-membrane-bound
enzymes or ion channels via G proteins. When an activated receptor stimulates a G
protein, the G protein undergoes a conformational change that activates its α subunit,
thereby triggering release of a βγ complex. Either component can then directly
regulate the activity of target proteins in the plasma membrane. Some GPCRs either
activate or inactivate adenylyl cyclase, thereby altering the intracellular concentration
of the second messenger cyclic AMP. Others activate a phosphoinositide-specific
phospholipase C (PLCβ), which generates two second messengers. One is inositol
1,4,5-trisphosphate (IP3), which releases Ca2+ from the ER and thereby increases
the concentration of Ca2+ in the cytosol. The other is diacylglycerol, which remains
in the plasma membrane and helps activate protein kinase C (PKC). An increase in
cytosolic cyclic AMP or Ca2+ levels affects cells mainly by stimulating cAMP-dependent
protein kinase (PKA) and Ca2+/calmodulin-dependent protein kinases (CaM-kinases),
respectively.

PKC, PKA, and CaM-kinases phosphorylate specific target proteins and thereby 
alter the activity of the proteins. Each type of cell has its own characteristic set of 
target proteins that is regulated in these ways, enabling the cell to make its own 
distinctive response to the second messengers. The intracellular signaling cascades 
activated by GPCRs greatly amplify the responses, so that many thousands of target 
protein molecules are changed for each molecule of extracellular signaling ligand 
bound to its receptor. The responses mediated by GPCRs are rapidly turned off when 
the extracellular signal is removed, and activated GPCRs are inactivated by
phosphorylation and association with arrestins.


SIGNALING THROUGH ENZYME-COUPLED RECEPTORS


Like GPCRs, enzyme-coupled receptors are transmembrane proteins with their 
ligand-binding domain on the outer surface of the plasma membrane. Instead of 
having a cytosolic domain that associates with a trimeric G protein, however, their 
cytosolic domain either has intrinsic enzyme activity or associates directly with an 
enzyme. Whereas a GPCR has seven transmembrane segments, each subunit of 
an enzyme-coupled receptor typically has only one. GPCRs and enzyme-coupled 
receptors often activate some of the same signaling pathways. In this section, we 
describe some of the important features of signaling by enzyme-coupled recep- 
tors, with an emphasis on the most common class of these proteins, the receptor 
tyrosine kinases. 


Activated Receptor Tyrosine Kinases (RTKs) Phosphorylate 
Themselves 


Many extracellular signal proteins act through receptor tyrosine kinases (RTKs). 
These include many secreted and cell-surface-bound proteins that control cell 
behavior in developing and adult animals. Some of these signal proteins and their 
RTKs are listed in Table 15-4. 

There are about 60 human RTKs, which can be classified into about 20 struc- 
tural subfamilies, each dedicated to its complementary family of protein ligands. 
Figure 15-43 shows the basic structural features of a number of the families 
that operate in mammals. In all cases, the binding of the signal protein to the 
ligand-binding domain on the extracellular side of the receptor activates the tyro- 
sine kinase domain on the cytosolic side. This leads to phosphorylation of tyro- 
sine side chains on the cytosolic part of the receptor, creating phosphotyrosine 
docking sites for various intracellular signaling proteins that relay the signal. 

How does the binding of an extracellular ligand activate the kinase domain on
the other side of the plasma membrane? For a GPCR, ligand binding is thought to
change the relative orientation of several of the transmembrane α helices, thereby
shifting the position of the cytoplasmic loops relative to one another. It is unlikely,
however, that a conformational change could propagate across the lipid bilayer
through a single transmembrane α helix. Instead, for most RTKs, ligand binding
causes the receptors to dimerize, bringing the two cytoplasmic kinase domains
together and thereby promoting their activation (Figure 15-44).

TABLE 15-4
Signal protein family | Receptors | Some representative responses
Epidermal growth factor (EGF) | EGF receptors | Stimulates cell survival, growth, proliferation, or differentiation of various cell types; acts as inductive signal in development
Insulin | Insulin receptor | Stimulates carbohydrate utilization and protein synthesis
Insulin-like growth factor (IGF1) | IGF receptor-1 | Stimulates cell growth and survival in many cell types
Nerve growth factor (NGF) | Trk receptors | Stimulates survival and growth of some neurons
Platelet-derived growth factor (PDGF) | PDGF receptors | Stimulates survival, growth, proliferation, and migration of various cell types
Macrophage-colony-stimulating factor (MCSF) | MCSF receptor | Stimulates monocyte/macrophage proliferation and differentiation
Fibroblast growth factor (FGF) | FGF receptors | Stimulates proliferation of various cell types; inhibits differentiation of some precursor cells; acts as inductive signal in development
Vascular endothelial growth factor (VEGF) | VEGF receptors | Stimulates angiogenesis
Ephrin | Eph receptors | Stimulates angiogenesis; guides cell and axon migration

Dimerization stimulates kinase activity by a variety of mechanisms. In many 
cases, such as the insulin receptor, dimerization simply brings the kinase domains 
close to each other in an orientation that allows them to phosphorylate each other 
on specific tyrosines in the kinase active sites, thereby promoting conformational 
changes that fully activate both kinase domains. In other cases, such as the recep- 
tor for epidermal growth factor (EGF), the kinase is not activated by phosphory- 
lation but by conformational changes brought about by interactions between the 
two kinase domains outside their active sites (Figure 15-45).

Figure 15-44 Activation of RTKs by dimerization. In the absence of extracellular signals, most RTKs exist as monomers
in which the internal kinase domain is inactive. Binding of ligand brings two monomers together to form a dimer. In most
cases, the close proximity in the dimer leads the two kinase domains to phosphorylate each other, which has two effects.
First, phosphorylation at some tyrosines in the kinase domains promotes the complete activation of the domains. Second,
phosphorylation at tyrosines in other parts of the receptors generates docking sites for intracellular signaling proteins, resulting
in the formation of large signaling complexes that can then broadcast signals along multiple signaling pathways.

Mechanisms of dimerization vary widely among different RTK family members. In some cases, as shown here, the ligand
itself is a dimer and brings two receptors together by binding them simultaneously. In other cases, a monomeric ligand can
interact with two receptors simultaneously to bring them together, or two ligands can bind independently on two receptors to
promote dimerization. In some RTKs, notably those in the insulin receptor family, the receptor is always a dimer (see Figure
15-43), and ligand binding causes a conformational change that brings the two internal kinase domains closer together.
Although many RTKs are activated by transautophosphorylation as shown here, there are some important exceptions, including
the EGF receptor illustrated in Figure 15-45.


Phosphorylated Tyrosines on RTKs Serve as Docking Sites for
Intracellular Signaling Proteins 


Once the kinase domains of an RTK dimer are activated, they phosphorylate mul- 
tiple additional sites in the cytosolic parts of the receptors, typically in disordered 
regions outside the kinase domain (see Figure 15-44). This phosphorylation cre- 
ates high-affinity docking sites for intracellular signaling proteins. Each signal- 
ing protein binds to a particular phosphorylated site on the activated receptors 
because it contains a specific phosphotyrosine-binding domain that recognizes 
surrounding features of the polypeptide chain in addition to the phosphotyrosine. 

Once bound to the activated RTK, a signaling protein may become phosphor- 
ylated on tyrosines and thereby activated. In many cases, however, the binding 
alone may be sufficient to activate the docked signaling protein, by either induc- 
ing a conformational change in the protein or simply bringing it near the protein 
that is next in the signaling pathway. Thus, receptor phosphorylation serves as a 
switch to trigger the assembly of an intracellular signaling complex, which can 
then relay the signal onward, often along multiple routes, to various destinations 
in the cell. Because different RTKs bind different combinations of these signaling 
proteins, they activate different responses. 

Some RTKs use additional docking proteins to enlarge the signaling complex 
at activated receptors. Insulin and IGF1 receptor signaling, for example, depend 
on a specialized docking protein called insulin receptor substrate 1 (IRS1). IRS1 
associates with phosphorylated tyrosines on the activated receptor and is then 
phosphorylated at multiple sites, thereby creating many more docking sites than 
could be accommodated on the receptor alone (see Figure 15-11). 


Proteins with SH2 Domains Bind to Phosphorylated Tyrosines 


A whole menagerie of intracellular signaling proteins can bind to the phospho- 
tyrosines on activated RTKs (or on docking proteins such as IRS1). They help to 
relay the signal onward, mainly through chains of protein-protein interactions 
mediated by modular interaction domains, as discussed earlier. Some of the 
docked proteins are enzymes, such as phospholipase C-γ (PLCγ), which functions 
in the same way as phospholipase C-β, activating the inositol phospholipid 
signaling pathway discussed earlier in connection with GPCRs (see Figures 15-28 
and 15-29). Through this pathway, RTKs can increase cytosolic Ca²⁺ levels and 
activate PKC. Another enzyme that docks on these receptors is the cytoplasmic 
tyrosine kinase Src, which phosphorylates other signaling proteins on tyrosines. 
Yet another is phosphoinositide 3-kinase (PI 3-kinase), which phosphorylates lip- 
ids rather than proteins; as we discuss later, the phosphorylated lipids then serve 
as docking sites to attract various signaling proteins to the plasma membrane. 
The intracellular signaling proteins that bind to phosphotyrosines have varied 
structures and functions. However, they usually share highly conserved 
phosphotyrosine-binding domains. These can be either SH2 domains (for Src homology 
region) or, less commonly, PTB domains (for phosphotyrosine-binding). By 
recognizing specific phosphorylated tyrosines, these small interaction domains enable 
the proteins that contain them to bind to activated RTKs, as well as to many other 
intracellular signaling proteins that have been transiently phosphorylated on 
tyrosines (Figure 15-46). As discussed previously, many signaling proteins also 
contain other interaction domains that allow them to interact specifically with 
other proteins as part of the signaling process. These domains include the SH3 
domain, which binds to proline-rich motifs in intracellular proteins (see Figure 
15-11). 

Figure 15-45 Activation of the EGF receptor kinase. In the absence of ligand, 
the EGF receptor exists primarily as an inactive monomer. EGF binding results in a 
conformational change that promotes dimerization of the external domains. The 
receptor kinase domain, unlike that of many RTKs, is not activated by 
transautophosphorylation. Instead, dimerization orients the internal kinase domains 
into an asymmetric dimer, in which one kinase domain (the “activator”) pushes 
against the other kinase domain (the “receiver”), thereby causing an activating 
conformational change in the receiver. The active receiver domain then 
phosphorylates multiple tyrosines in the C-terminal tails of both receptors, 
generating docking sites for intracellular signaling proteins (see Figure 15-44). 

Not all proteins that bind to activated RTKs via SH2 domains help to relay 
the signal onward. Some act to decrease the signaling process, providing nega- 
tive feedback. One example is the c-Cbl protein, which can dock on some acti- 
vated receptors and catalyze their ubiquitylation, covalently adding one or more 
ubiquitin molecules to specific sites on the receptor. This promotes the endo- 
cytosis and degradation of the receptors in lysosomes—an example of receptor 
down-regulation (see Figure 15-20). Endocytic proteins that contain ubiquitin- 
interaction motifs (UIMs) recognize the ubiquitylated RTKs and direct them into 
clathrin-coated vesicles and, ultimately, into lysosomes (discussed in Chapter 
13). Mutations that inactivate c-Cbl-dependent RTK down-regulation cause pro- 
longed RTK signaling and thereby promote the development of cancer. 

As is the case for GPCRs, ligand-induced endocytosis of RTKs does not always 
decrease signaling. In some cases, RTKs are endocytosed with their bound signal- 
ing proteins and continue to signal from endosomes or other intracellular com- 
partments. This mechanism, for example, allows nerve growth factor (NGF) to 
bind to its specific RTK (called TrkA) at the end of a long nerve cell axon and signal 
to the cell body of the same cell a long distance away. Here, signaling endocytic 
vesicles containing TrkA, with NGF bound on the lumenal side and signaling 
proteins docked on the cytosolic side, are transported along the axon to the cell body, 
where they signal the cell to survive. 

[Figure 15-46 panel labels: (A) PDGF receptor in the plasma membrane, with 
cytosolic phosphotyrosines 740 and 751 binding PI 3-kinase (regulatory subunit), 
771 binding GTPase-activating protein (GAP), and 1009 and 1021 binding 
phospholipase C-γ (PLCγ); SH2 and SH3 domains are marked. (B) Binding site for 
phosphotyrosine; binding site for amino acid side chain.] 

Figure 15-46 The binding of SH2-containing intracellular signaling 
proteins to an activated RTK. (A) This drawing of a receptor for 
platelet-derived growth factor (PDGF) shows five phosphotyrosine docking sites, 
three in the kinase insert region and two on the C-terminal tail, to which the 
three signaling proteins shown bind as indicated. The numbers on the right 
indicate the positions of the tyrosines in the polypeptide chain. These binding 
sites have been identified by using recombinant DNA technology to mutate specific 
tyrosines in the receptor. Mutation of tyrosines 1009 and 1021, for example, 
prevents the binding and activation of PLCγ, so that receptor activation no longer 
stimulates the inositol phospholipid signaling pathway. The locations of the SH2 
(red) and SH3 (blue) domains in the three signaling proteins are indicated. 
(Additional phosphotyrosine docking sites on this receptor are not shown, 
including those that bind the cytoplasmic tyrosine kinase Src and two adaptor 
proteins.) It is unclear how many signaling proteins can bind simultaneously to a 
single RTK. (B) The three-dimensional structure of an SH2 domain, as determined 
by x-ray crystallography. The binding pocket for phosphotyrosine is shaded in 
orange on the right, and a pocket for binding a specific amino acid side chain 
(isoleucine, in this case) is shaded in yellow on the left. The RTK polypeptide 
segment that binds the SH2 domain is shown in yellow (see also Figure 3-40). 
(C) The SH2 domain is a compact, “plug-in” module, which can be inserted almost 
anywhere in a protein without disturbing the protein’s folding or function 
(discussed in Chapter 3). Because each domain has distinct sites for recognizing 
phosphotyrosine and for recognizing a particular amino acid side chain, different 
SH2 domains recognize phosphotyrosine in the context of different flanking amino 
acid sequences. (B, based on data from G. Waksman et al., Cell 72:779-790, 1993. 
With permission from Elsevier. PDB code: 2SRC.) 
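The mutational analysis described for the PDGF receptor in Figure 15-46A can be sketched as a simple lookup. The Python sketch below is illustrative only: it records just the five tyrosines and three SH2-containing proteins named in the figure, and the function name `recruited_proteins` and the rule that one intact site suffices for docking are assumptions introduced here (the text states only that mutating both tyrosines 1009 and 1021 prevents PLCγ binding).

```python
# Docking sites on the PDGF receptor as listed for Figure 15-46A.
# Keys are tyrosine positions; values are the SH2-containing protein that docks there.
DOCKING_SITES = {
    740: "PI 3-kinase (regulatory subunit)",
    751: "PI 3-kinase (regulatory subunit)",
    771: "GTPase-activating protein (GAP)",
    1009: "phospholipase C-gamma",
    1021: "phospholipase C-gamma",
}


def recruited_proteins(mutated_tyrosines=()):
    """Return the proteins that can still dock after the given tyrosines are mutated.

    Assumption (not from the text): a single intact site is enough for docking.
    """
    mutated = set(mutated_tyrosines)
    return {
        protein
        for tyrosine, protein in DOCKING_SITES.items()
        if tyrosine not in mutated
    }


wild_type = recruited_proteins()
plc_mutant = recruited_proteins([1009, 1021])
# Proteins lost from the receptor when both C-terminal tyrosines are mutated.
print(sorted(wild_type - plc_mutant))
```

In this toy model, the double mutant loses only phospholipase C-γ, while PI 3-kinase and GAP still dock, which mirrors the site-specific loss of signaling described in the caption.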

Some signaling proteins are composed almost entirely of SH2 and SH3 
domains and function as adaptors to couple tyrosine-phosphorylated proteins to 
other proteins that do not have their own SH2 domains (see Figure 15-11). Adap- 
tor proteins of this type help to couple activated RTKs to the important signaling 
protein Ras, a monomeric GTPase that, in turn, can activate various downstream 
signaling pathways, as we now discuss. 


The GTPase Ras Mediates Signaling by Most RTKs 


The Ras superfamily consists of various families of monomeric GTPases, but only 
the Ras and Rho families relay signals from cell-surface receptors (Table 15-5). By 
interacting with different intracellular signaling proteins, a single Ras or Rho fam- 
ily member can coordinately spread the signal along several distinct downstream 
signaling pathways, thereby acting as a signaling hub. 

There are three major, closely related Ras proteins in humans: H-, K-, and 
N-Ras (see Table 15-5). Although they have subtly different functions, they are 
thought to work in the same way, and we will refer to them simply as Ras. Like 
many monomeric GTPases, Ras contains one or more covalently attached lipid 
groups that help anchor the protein to the cytoplasmic face of the membrane, 
from where it relays signals to other parts of the cell. Ras is often required, for 
example, when RTKs signal to the nucleus to stimulate cell proliferation or dif- 
ferentiation, both of which require changes in gene expression. If Ras function 
is inhibited by various experimental approaches, the cell proliferation or differ- 
entiation responses normally induced by the activated RTKs do not occur. Con- 
versely, 30% of human tumors express hyperactive mutant forms of Ras, which 
contribute to the uncontrolled proliferation of the cancer cells. 

Like other GTP-binding proteins, Ras functions as a molecular switch, cycling 
between two distinct conformational states—active when GTP is bound and 
inactive when GDP is bound (Movie 15.7). As discussed earlier for monomeric 
GTPases in general, two classes of signaling proteins regulate Ras activity by 
influencing its transition between active and inactive states (see Figure 15-8). 


TABLE 15-5 

Family   Family members        Functions 
Ras      H-Ras, K-Ras, N-Ras   Relay signals from RTKs 
         Rheb                  Activates mTOR to stimulate cell growth 
         Rap                   Activated by a cyclic-AMP-dependent GEF; 
                               influences cell adhesion by activating integrins 
Rho*     Rho, Rac, Cdc42       Relay signals from surface receptors to the 
                               cytoskeleton and elsewhere 
ARF*     ARF1-ARF6             Regulate assembly of protein coats on intracellular 
                               vesicles 
Rab*     Rab1-60               Regulate intracellular vesicle traffic 
Ran*     Ran                   Regulates mitotic spindle assembly and nuclear 
                               transport of RNAs and proteins 

*The Rho family is discussed in Chapter 16, the ARF and Rab proteins in Chapter 13, and Ran 
in Chapters 12 and 17. The three-dimensional structure of Ras is shown in Figure 3-67. 
Ras guanine nucleotide exchange factors (Ras-GEFs) stimulate the dissociation 
of GDP and the subsequent uptake of GTP from the cytosol, thereby activating 
Ras. Ras GTPase-activating proteins (Ras-GAPs) increase the rate of hydrolysis of 
bound GTP by Ras, thereby inactivating Ras. Hyperactive mutant forms of Ras are 
resistant to GAP-mediated GTPase stimulation and are locked permanently in the 
GTP-bound active state, which is why they promote the development of cancer. 
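The opposing actions of Ras-GEFs and Ras-GAPs can be illustrated with a minimal two-state model. The sketch below assumes that Ras switches from the GDP-bound to the GTP-bound state at an effective rate `k_gef` and back at an effective rate `k_gap`, so that the steady-state active fraction is k_gef / (k_gef + k_gap). The function name and all rate constants are hypothetical values chosen only to show the trend; they are not measurements.

```python
def ras_gtp_fraction(k_gef, k_gap):
    """Steady-state fraction of Ras in the GTP-bound (active) state.

    Two-state model: GDP-bound -> GTP-bound at rate k_gef (nucleotide exchange),
    GTP-bound -> GDP-bound at rate k_gap (GTP hydrolysis).
    """
    return k_gef / (k_gef + k_gap)


# Hypothetical rate constants in arbitrary units, for illustration only.
resting = ras_gtp_fraction(k_gef=0.1, k_gap=10.0)          # little GEF activity
stimulated = ras_gtp_fraction(k_gef=10.0, k_gap=10.0)      # a Ras-GEF has been recruited
gap_resistant = ras_gtp_fraction(k_gef=0.1, k_gap=0.001)   # mutant Ras that GAPs cannot stimulate

print(f"resting {resting:.2f}, stimulated {stimulated:.2f}, "
      f"GAP-resistant mutant {gap_resistant:.2f}")
```

Under these assumed numbers, a GAP-resistant mutant is almost entirely GTP-bound even without any GEF stimulation, which is the behavior the text attributes to hyperactive Ras.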

But how do RTKs normally activate Ras? In principle, they could either activate 
a Ras-GEF or inhibit a Ras-GAP. Even though some GAPs bind directly (via their 
SH2 domains) to activated RTKs (see Figure 15-46A), it is the indirect coupling of 
the receptor to a Ras-GEF that drives Ras into its active state. The loss of function 
of a Ras-GEF has a similar effect to the loss of function of Ras itself. Activation of 
the other Ras superfamily proteins, including those of the Rho family, also occurs 
through the activation of GEFs. The particular GEF determines in which mem- 
brane the GTPase is activated and, by acting as a scaffold, it can also determine 
which downstream proteins the GTPase activates. 

The GEF that mediates Ras activation by RTKs was discovered by genetic 
studies of eye development in Drosophila, where an RTK called Sevenless (Sev) 
is required for the formation of a photoreceptor cell called R7. Genetic screens 
for components of this signaling pathway led to the discovery of a Ras-GEF called 
Son-of-sevenless (Sos). Further genetic screens uncovered another protein, now 
called Grb2, which is an adaptor protein that links the Sev receptor to the Sos pro- 
tein; the SH2 domain of the Grb2 adaptor binds to the activated receptor, while its 
two SH3 domains bind to Sos. Sos then promotes Ras activation. Biochemical and 
cell biological studies have shown that Grb2 and Sos also link activated RTKs to 
Ras in mammalian cells, revealing that this is a highly conserved mechanism in 
RTK signaling (Figure 15-47). Once activated, Ras activates various other signal- 
ing proteins to relay the signal downstream, as we discuss next. 


Ras Activates a MAP Kinase Signaling Module 


Both the tyrosine phosphorylations and the activation of Ras triggered by 
activated RTKs are usually short-lived (Figure 15-48). Tyrosine-specific protein 
phosphatases quickly reverse the phosphorylations, and Ras-GAPs induce activated 
Ras to inactivate itself by hydrolyzing its bound GTP to GDP. To stimulate cells to 
proliferate or differentiate, these short-lived signaling events must be converted 
into longer-lasting ones that can sustain the signal and relay it downstream to the 
nucleus to alter the pattern of gene expression. One of the key mechanisms used 
for this purpose is a system of proteins called the mitogen-activated protein kinase 
module (MAP kinase module) (Figure 15-49). The three components of this 
system form a functional signaling module that has been remarkably well conserved 
during evolution and is used, with variations, in many different signaling contexts. 

Figure 15-47 How an RTK activates Ras. Grb2 recognizes a specific 
phosphorylated tyrosine on the activated receptor by means of an SH2 domain 
and recruits Sos by means of two SH3 domains. Sos stimulates the inactive Ras 
protein to replace its bound GDP by GTP, which activates Ras to relay the signal 
downstream. 

Figure 15-48 Transient activation of Ras revealed by single-molecule 
fluorescence resonance energy transfer (FRET). (A) Schematic drawing of the 
experimental strategy. Cells of a human cancer cell line are genetically engineered 
to express a Ras protein that is covalently linked to yellow fluorescent protein 
(YFP). GTP that is labeled with a red fluorescent dye is microinjected into some of 
the cells. The cells are then stimulated with the extracellular signal protein EGF, 
and single fluorescent molecules of Ras-YFP at the inner surface of the plasma 
membrane are followed by video fluorescence microscopy in individual cells. When 
a fluorescent Ras-YFP molecule becomes activated, it exchanges unlabeled GDP for 
fluorescently labeled GTP; the energy emitted by the YFP now activates the 
fluorescent GTP to emit red light (called fluorescence resonance energy transfer, 
or FRET; see Figure 9-26). Thus, the activation of single Ras molecules can be 
followed by the emission of red fluorescence from a previously yellow-green 
fluorescent spot at the plasma membrane. As shown in (B), activated Ras molecules 
can be detected after about 30 seconds of EGF stimulation. The red signal peaks at 
about 3 minutes and then decreases to baseline by 6 minutes. As Ras-GAP is found 
to be recruited to the same spots at the plasma membrane as Ras, it presumably 
plays a major part in rapidly shutting off the Ras signal. (Modified from 
H. Murakoshi et al., Proc. Natl Acad. Sci. USA 101:7317-7322, 2004. With 
permission from National Academy of Sciences.) 

The three components are all protein kinases. The final kinase in the series is 
called simply MAP kinase (MAPK). The next one upstream from this is MAP kinase 
kinase (MAPKK): it phosphorylates and thereby activates MAP kinase. Next above 
that, receiving an activating signal directly from Ras, is MAP kinase kinase kinase 
(MAPKKK): it phosphorylates and thereby activates MAPKK. In the mammalian 
Ras-MAP-kinase signaling pathway, these three kinases are known by shorter 
names: Raf (= MAPKKK), Mek (= MAPKK), and Erk (= MAPK). 
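The order of activation in the three-kinase series can be made concrete with a toy simulation. The sketch below is not a quantitative model of the pathway: it assumes that each kinase is converted to its active form at a rate proportional to the active fraction of the component just upstream and is inactivated at a constant rate, and it drives the series with a brief square pulse of active Ras. The function name, the rate constants `k_act` and `k_deact`, and the pulse length are all hypothetical.

```python
def simulate_cascade(pulse_steps=300, n_steps=3000, dt=0.01, k_act=1.0, k_deact=0.5):
    """Euler integration of a Raf -> Mek -> Erk series driven by a pulse of active Ras.

    Returns (peak_time, peak_value, final_value), each a dict keyed by kinase name.
    """
    names = ("Raf", "Mek", "Erk")
    active = [0.0, 0.0, 0.0]              # active fraction of each kinase
    peak_value = dict.fromkeys(names, 0.0)
    peak_time = dict.fromkeys(names, 0.0)
    for step in range(n_steps):
        ras = 1.0 if step < pulse_steps else 0.0   # Ras is active only during the pulse
        upstream = (ras, active[0], active[1])     # Ras -> Raf, Raf -> Mek, Mek -> Erk
        active = [
            a + dt * (k_act * u * (1.0 - a) - k_deact * a)
            for a, u in zip(active, upstream)
        ]
        for name, value in zip(names, active):
            if value > peak_value[name]:
                peak_value[name] = value
                peak_time[name] = (step + 1) * dt
    return peak_time, peak_value, dict(zip(names, active))


peak_time, peak_value, final_value = simulate_cascade()
for name in ("Raf", "Mek", "Erk"):
    print(f"{name}: peak {peak_value[name]:.2f} at t = {peak_time[name]:.2f}")
```

With these assumed parameters the Ras pulse ends at t = 3.0, and each kinase peaks no earlier than the one above it, so Erk activity outlasts the Ras signal that triggered it.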

Once activated, the MAP kinase relays the signal downstream by phosphory- 
lating various proteins in the cell, including transcription regulators and other 
protein kinases (see Figure 15-49). Erk, for example, enters the nucleus and phos- 
phorylates one or more components of a transcription regulatory complex. This 
activates the transcription of a set of immediate early genes, so named because 
they turn on within minutes after an RTK receives an extracellular signal, even 
if protein synthesis is experimentally blocked with drugs. Some of these genes 
encode other transcription regulators that turn on other genes, a process that 
requires both protein synthesis and more time. In this way, the Ras-MAP-kinase 
signaling pathway conveys signals from the cell surface to the nucleus and alters 
the pattern of gene expression. Among the genes activated by this pathway are 
some that stimulate cell proliferation, such as the genes encoding G1 cyclins 
(discussed in Chapter 17). 

Extracellular signals usually activate MAP kinases only transiently, and the 
period during which the kinase remains active influences the response. When 
EGF activates its receptors in a neural precursor cell line, for example, Erk MAP 
kinase activity peaks at 5 minutes and rapidly declines, and the cells later go on to 
divide. By contrast, when NGF activates its receptors on the same cells, Erk activ- 
ity remains high for many hours, and the cells stop proliferating and differentiate 
into neurons. 

Figure 15-49 The MAP kinase module activated by Ras. The three-component 
module begins with a MAP kinase kinase kinase called Raf. Ras recruits Raf to the 
plasma membrane and helps activate it. Raf then activates the MAP kinase kinase 
Mek, which then activates the MAP kinase Erk. Erk in turn phosphorylates a variety 
of downstream proteins, including other protein kinases, as well as transcription 
regulators in the nucleus. The resulting changes in protein activities and gene 
expression cause complex changes in cell behavior. 

Many factors influence the duration and other features of the signaling 
response, including positive and negative feedback loops, which can combine to 
give responses that are either graded or switchlike and either brief or long lasting. 
In an example illustrated earlier, in Figure 15-19, MAP kinase activates a complex 
positive feedback loop to produce an all-or-none, irreversible response when frog 
oocytes are stimulated to mature by a brief exposure to the extracellular signal 
molecule progesterone. In many cells, MAP kinases activate a negative feedback 
loop by increasing the concentration of a protein phosphatase that removes the 
phosphate from MAP kinase. The increase in the phosphatase results from both 
an increase in the transcription of the phosphatase gene and the stabilization of 
the enzyme against degradation. In the Ras-MAP-kinase pathway shown in 
Figure 15-49, Erk also phosphorylates and inactivates Raf, providing another 
negative feedback loop that helps shut off the MAP kinase module. 
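The effect of a phosphatase-based negative feedback loop on the time course of MAP kinase activity can be sketched with two coupled rate equations. This is a toy model under stated assumptions: Erk is activated at a constant rate by a sustained stimulus, active Erk slowly raises the level of a phosphatase, and that phosphatase speeds up Erk inactivation. The function name and every rate constant are hypothetical.

```python
def simulate_erk(feedback_strength, n_steps=20000, dt=0.01):
    """Euler integration of Erk activity under a constant stimulus.

    Returns (peak, final): the highest and the last active fraction of Erk.
    """
    k_on, k_off = 1.0, 0.1       # Erk activation and basal inactivation (hypothetical)
    k_syn, k_deg = 0.1, 0.05     # phosphatase induction by Erk and its turnover (hypothetical)
    erk, phosphatase, peak = 0.0, 0.0, 0.0
    for _ in range(n_steps):
        d_erk = k_on * (1.0 - erk) - (k_off + feedback_strength * phosphatase) * erk
        d_phosphatase = k_syn * erk - k_deg * phosphatase
        erk += dt * d_erk
        phosphatase += dt * d_phosphatase
        peak = max(peak, erk)
    return peak, erk


peak_without, final_without = simulate_erk(feedback_strength=0.0)
peak_with, final_with = simulate_erk(feedback_strength=2.0)
print(f"no feedback:   peak {peak_without:.2f}, final {final_without:.2f}")
print(f"with feedback: peak {peak_with:.2f}, final {final_with:.2f}")
```

Without feedback the assumed model simply rises to a plateau; with feedback, Erk activity overshoots and then settles at a lower level even though the stimulus persists, which is one way a sustained input can give a largely transient output.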


Scaffold Proteins Help Prevent Cross-talk Between 
Parallel MAP Kinase Modules 


Three-component MAP kinase signaling modules operate in all eukaryotic cells, 
with different modules mediating different responses in the same cell. In bud- 
ding yeast, for example, one such module mediates the response to mating 
pheromone, another the response to starvation, and yet another the response to 
osmotic shock. Some of these MAP kinase modules use one or more of the same 
kinases and yet manage to activate different effector proteins and hence different 
responses. As discussed earlier, one way in which cells avoid cross-talk between 
the different parallel signaling pathways and ensure that each response is specific 
is to use scaffold proteins (see Figure 15-10A). In budding yeast cells, such scaf- 
folds bind all or some of the kinases in each MAP kinase module to form a com- 
plex and thereby help to ensure response specificity (Figure 15-50). 
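The logic of scaffold-based insulation can be expressed as a small data-structure sketch. It is only a cartoon of the two yeast modules described for Figure 15-50: kinase labels A, B, and C follow that figure, while the label "MAP kinase of module 2", the function names, and the contrast case in which kinases diffuse freely are assumptions made for illustration.

```python
# Each scaffold holds its own set of kinases, listed in relay order.
SCAFFOLDS = {
    "scaffold 1": {
        "input": "mating factor",
        "relay": ["kinase A", "kinase B", "kinase C"],
        "response": "mating response",
    },
    "scaffold 2": {
        "input": "high osmolarity",
        # The kinase domain of scaffold 2 supplies the MAPKK activity.
        "relay": ["kinase A", "scaffold 2 kinase domain", "MAP kinase of module 2"],
        "response": "glycerol synthesis",
    },
}


def respond(signal):
    """Responses triggered when each signal is relayed only along its own scaffold."""
    return [m["response"] for m in SCAFFOLDS.values() if m["input"] == signal]


def respond_without_scaffolds(signal):
    """Hypothetical contrast: kinases diffuse freely, so a shared kinase links the modules."""
    activated = set()
    for module in SCAFFOLDS.values():
        if module["input"] == signal:
            activated.update(module["relay"])
    return sorted(
        m["response"] for m in SCAFFOLDS.values() if activated & set(m["relay"])
    )


print(respond("mating factor"))
print(respond_without_scaffolds("mating factor"))
```

Because kinase A appears in both modules, the free-diffusion case triggers both responses in this sketch, whereas confining each relay to its scaffold yields only the response that matches the signal.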

Mammalian cells also use this scaffold strategy to prevent cross-talk between 
different MAP kinase modules. At least five parallel MAP kinase modules can 
operate in a mammalian cell. These modules make use of at least 12 MAP kinases, 
7 MAPKKs, and 7 MAPKKKs. Two of these modules (terminating in MAP kinases 
called JNK and p38) are activated by different kinds of cell stresses, such as ultra- 
violet (UV) irradiation, heat shock, and osmotic stress, as well as by inflammatory 
cytokines; others mainly mediate responses to signals from other cells. 

Although the scaffold strategy provides precision and avoids cross-talk, it 
reduces the opportunities for amplification and spreading of the signal to different 
parts of the cell, which require at least some of the components to be diffusible. It 
is unclear to what extent the individual components of MAP kinase modules can 
dissociate from the scaffold during the activation process to permit amplification. 




Figure 15-50 The organization of two 
MAP kinase modules by scaffold 
proteins in budding yeast. Budding 
yeast have at least six three-component 
MAP kinase modules involved in a variety 
of biological processes, including the 

two responses illustrated here—a mating 
response and the response to high 
osmolarity. (A) The mating response is 
triggered when a mating factor secreted 
by a yeast of opposite mating type binds 
to a GPCR. This activates a G protein, the 
βγ complex of which indirectly activates 
the MAPKKK (kinase A), which then relays 
the response onward. Once activated, the 
MAP kinase (kinase C) phosphorylates 
and thereby activates several proteins that 
mediate the mating response, in which the 
yeast cell stops dividing and prepares for 
fusion. The three kinases in this module 
are bound to scaffold protein 1. (B) In a 
second response, a yeast cell exposed to a 
high-osmolarity environment is induced to 
synthesize glycerol to increase its internal 
osmolarity. This response is mediated by 
an osmolarity-sensing receptor protein and 
a different MAP kinase module bound to 

a second scaffold protein. (Note that the 
kinase domain of scaffold 2 provides the 
MAPKK activity of this module.) Although 
both pathways use the same MAPKKK 
(kinase A, green), there is no cross-talk 
between them because the kinases in each 
module are bound to different scaffold 
proteins, and the osmosensor is bound to 
the same scaffold protein as the particular 
kinase it activates. 




Rho Family GTPases Functionally Couple Cell-Surface Receptors 
to the Cytoskeleton 


Besides the Ras proteins, the other class of Ras superfamily GTPases that relays 
signals from cell-surface receptors is the large Rho family (see Table 15-5). Rho 
family monomeric GTPases regulate both the actin and microtubule cytoskele- 
tons, controlling cell shape, polarity, motility, and adhesion (discussed in Chapter 
16); they also regulate cell-cycle progression, gene transcription, and membrane 
transport. They play a key part in the guidance of cell migration and nerve axon 
outgrowth, mediating cytoskeletal responses to the activation of a special class of 
guidance receptors. We focus on this aspect of Rho family function here. 

The three best-characterized family members are Rho itself, Rac, and Cdc42, 
each of which affects multiple downstream target proteins. In the same way as for 
Ras, GEFs activate and GAPs inactivate the Rho family GTPases; there are more 
than 80 Rho-GEFs and more than 70 Rho-GAPs in humans. Some of the GEFs and 
GAPs are specific for one particular family member, whereas others are less spe- 
cific. Unlike Ras, which is membrane-associated even when inactive (with GDP 
bound), inactive Rho family GTPases are often bound to guanine nucleotide disso- 
ciation inhibitors (GDIs) in the cytosol, which prevent the GTPases from interact- 
ing with their Rho-GEFs at the plasma membrane. 

Signaling by extracellular signaling proteins of the ephrin family provides an 
example of how RTKs can activate a Rho GTPase. Ephrins bind and thereby acti- 
vate members of the Eph family of RTKs (see Figure 15-43). One member of the 
Eph family is found on the surface of motor neurons and helps guide the migrat- 
ing tip of the axon (called a growth cone) to its muscle target. The binding of a 
cell-surface ephrin protein activates the Eph receptor, causing the growth cones to 
collapse, thereby repelling them from inappropriate regions and keeping them on 
track. The response depends on a Rho-GEF called ephexin, which is stably asso- 
ciated with the cytosolic tail of the Eph receptor. When ephrin binding activates 
the Eph receptor, the receptor activates a cytoplasmic tyrosine kinase that phos- 
phorylates ephexin on a tyrosine, enhancing the ability of ephexin to activate the 
Rho protein RhoA. The activated RhoA (RhoA-GTP) then regulates various down- 
stream target proteins, including some effector proteins that control the actin 
cytoskeleton, causing the growth cone to collapse (Figure 15-51). 

Having considered how RTKs use GEFs and monomeric GTPases to relay sig- 
nals into the cell, we now consider a second major strategy that RTKs use that 
depends on a quite different intracellular relay mechanism. 
Figure 15-51 Growth cone collapse 
mediated by Rho family GTPases. 
The binding of ephrin A1 proteins on an 
adjacent cell activates EphA4 RTKs on 
the growth cone. Phosphotyrosines on 
the activated Eph receptors recruit and 
activate a cytoplasmic tyrosine kinase to 
phosphorylate the receptor-associated 
Rho-GEF ephexin on a tyrosine. This 
enhances the ability of the ephexin to 
activate RhoA. RhoA then induces the 
growth cone to collapse by stimulating the 
myosin-dependent contraction of the actin 
cytoskeleton. 




PI 3-Kinase Produces Lipid Docking Sites in the Plasma 
Membrane 


As mentioned earlier, one of the proteins that binds to the intracellular tail of RTK 
molecules is the plasma-membrane-bound enzyme phosphoinositide 3-kinase 
(PI 3-kinase). This kinase principally phosphorylates inositol phospholipids 
rather than proteins, and both RTKs and GPCRs can activate it. It plays a central 
part in promoting cell survival and growth. 

Phosphatidylinositol (PI) is unique among membrane lipids because it can 
undergo reversible phosphorylation at multiple sites on its inositol head group 
to generate a variety of phosphorylated PI lipids called phosphoinositides. When 
activated, PI 3-kinase catalyzes phosphorylation at the 3 position of the inositol 
ring to generate several phosphoinositides (Figure 15-52). The production of 
PI(3,4,5)P3 matters most because it can serve as a docking site for various 
intracellular signaling proteins, which assemble into signaling complexes that relay the 
signal into the cell from the cytosolic face of the plasma membrane (see Figure 
15-10C). 

Notice the difference between this use of phosphoinositides and their use 
described earlier, in which PI(4,5)P2 is cleaved by PLCβ (in the case of GPCRs) or 
PLCγ (in the case of RTKs) to generate soluble IP3 and membrane-bound 
diacylglycerol (see Figures 15-28 and 15-29). By contrast, PI(3,4,5)P3 is not cleaved by 
either PLC. It is made from PI(4,5)P2 and then remains in the plasma membrane 
until specific phosphoinositide phosphatases dephosphorylate it. Prominent 
among these is the PTEN phosphatase, which dephosphorylates the 3 position 
of the inositol ring. Mutations in PTEN are found in many cancers: by prolonging 
signaling by PI 3-kinase, they promote uncontrolled cell growth. 
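The balance between PI 3-kinase and PTEN can be illustrated with a one-variable sketch. It assumes, purely for illustration, that once PI 3-kinase is switched off the remaining PI(3,4,5)P3 is removed by first-order dephosphorylation at a rate `k_pten`, so that the signal decays exponentially with a half-life of ln 2 / k_pten. The function names and rate values are hypothetical.

```python
import math


def pip3_remaining(t, k_pten, initial=1.0):
    """Relative PI(3,4,5)P3 level at time t after PI 3-kinase is switched off."""
    return initial * math.exp(-k_pten * t)


def pip3_half_life(k_pten):
    """Time for the PI(3,4,5)P3 level to fall by half under first-order removal."""
    return math.log(2.0) / k_pten


# Hypothetical rates in arbitrary units: normal PTEN versus a mutant with 10-fold less activity.
normal_rate, mutant_rate = 0.5, 0.05
print(f"half-life, normal PTEN: {pip3_half_life(normal_rate):.1f}")
print(f"half-life, mutant PTEN: {pip3_half_life(mutant_rate):.1f}")
print(f"level at t = 10, normal: {pip3_remaining(10.0, normal_rate):.3f}")
print(f"level at t = 10, mutant: {pip3_remaining(10.0, mutant_rate):.3f}")
```

In this sketch a tenfold drop in PTEN activity lengthens the lifetime of the lipid docking sites tenfold, which is the sense in which loss of PTEN prolongs PI 3-kinase signaling.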

There are various types of PI 3-kinases. Those activated by RTKs and GPCRs 
belong to class I. These are heterodimers composed of a common catalytic sub- 
unit and different regulatory subunits. RTKs activate class Ia PI 3-kinases, in which 
the regulatory subunit is an adaptor protein that binds to two phosphotyrosines 
on activated RTKs through its two SH2 domains (see Figure 15-46A). GPCRs 
activate class Ib PI 3-kinases, which have a regulatory subunit that binds to the βγ 
complex of an activated trimeric G protein (Gq) when GPCRs are activated by 
their extracellular ligand. The direct binding of activated Ras can also activate the 
common class I catalytic subunit. 
Figure 15-52 The generation of 
phosphoinositide docking sites by PI 
3-kinase. PI 3-kinase phosphorylates the 
inositol ring on carbon atom 3 to generate 
the phosphoinositides shown at the bottom 
of the figure (diverting them away from the 
pathway leading to IP3 and diacylglycerol; 
see Figure 15-28). The most important 
phosphorylation (indicated in red) is of 
PI(4,5)P2 to PI(3,4,5)P3, which can serve 
as a docking site for signaling proteins with 
PI(3,4,5)P3-binding PH domains. Other 
inositol phospholipid kinases (not shown) 
catalyze the phosphorylations indicated by 
the green arrows. 
Intracellular signaling proteins bind to PI(3,4,5)P3 produced by activated PI 
3-kinase via a specific interaction domain, such as a pleckstrin homology (PH) 
domain, first identified in the platelet protein pleckstrin. PH domains function 
mainly as protein-protein interaction domains, and it is only a small subset of 
them that bind to PI(3,4,5)P3; at least some of these also recognize a specific mem- 
brane-bound protein as well as the PI(3,4,5)P3, which greatly increases the speci- 
ficity of the binding and helps to explain why the signaling proteins with PI(3,4,5) 
P3-binding PH domains do not all dock at all PI(3,4,5)P3 sites. PH domains occur 
in about 200 human proteins, including the Ras-GEF Sos discussed earlier (see 
Figure 15-11). 

One especially important PH-domain-containing protein is the serine/threo- 
nine protein kinase Akt. The PI-3-kinase-Akt signaling pathway is the major path- 
way activated by the hormone insulin. It also plays a key part in promoting the 
survival and growth of many cell types in both invertebrates and vertebrates, as 
we now discuss. 


The PI-3-Kinase-Akt Signaling Pathway Stimulates Animal Cells to 
Survive and Grow 


As discussed earlier, extracellular signals are usually required for animal cells to 
grow and divide, as well as to survive (see Figure 15-4). Members of the insulin- 
like growth factor (IGF) family of signal proteins, for example, stimulate many 
types of animal cells to survive and grow. They bind to specific RTKs (see Figure 
15-43), which activate PI 3-kinase to produce PI(3,4,5)P3. The PI(3,4,5)P3 recruits 
two protein kinases to the plasma membrane via their PH domains—Akt (also 
called protein kinase B, or PKB) and phosphoinositide-dependent protein kinase 
1 (PDK1), and this leads to the activation of Akt (Figure 15-53). Once activated, 
Akt phosphorylates various target proteins at the plasma membrane, as well as in 
the cytosol and nucleus. The effect on most of the known targets is to inactivate 
them; but the targets are such that these actions of Akt all conspire to enhance cell 
survival and growth, as illustrated for one cell survival pathway in Figure 15-53. 

Figure 15-53 One way in which signaling through PI 3-kinase promotes cell 
survival. An extracellular survival signal activates an RTK, which recruits and 
activates PI 3-kinase. The PI 3-kinase produces PI(3,4,5)P3, which serves as a 
docking site for two serine/threonine kinases with PH domains (Akt and the 
phosphoinositide-dependent kinase PDK1) and brings them into proximity at the 
plasma membrane. The Akt is phosphorylated on a serine by a third kinase (usually 
mTOR in complex 2), which alters the conformation of the Akt so that it can be 
phosphorylated on a threonine by PDK1, which activates the Akt. The activated Akt 
now dissociates from the plasma membrane and phosphorylates various target 
proteins, including the Bad protein. When unphosphorylated, Bad holds one or more 
apoptosis-inhibitory proteins (of the Bcl2 family, discussed in Chapter 18) in an 
inactive state. Once phosphorylated, Bad releases the inhibitory proteins, which 
now can block apoptosis and thereby promote cell survival. As shown, the 
phosphorylated Bad binds to a ubiquitous cytosolic protein called 14-3-3, which 
keeps Bad out of action. 

The control of cell growth by the PI-3-kinase-Akt pathway depends in part on 
a large protein kinase called TOR (named as the target of rapamycin, a bacterial 
toxin that inactivates the kinase and is used clinically as both an immunosup- 
pressant and anticancer drug). TOR was originally identified in yeasts in genetic 
screens for rapamycin resistance; in mammalian cells, it is called mTOR, which 
exists in cells in two functionally distinct multiprotein complexes. mTOR complex 
1 contains the protein raptor; this complex is sensitive to rapamycin, and it stimu- 
lates cell growth—both by promoting ribosome production and protein synthesis 
and by inhibiting protein degradation. Complex 1 also promotes both cell growth 
and cell survival by stimulating nutrient uptake and metabolism. mTOR complex 
2 contains the protein rictor and is insensitive to rapamycin; it helps to activate Akt 
(see Figure 15-53), and it regulates the actin cytoskeleton via Rho family GTPases. 

The mTOR in complex 1 integrates inputs from various sources, including 
extracellular signal proteins referred to as growth factors and nutrients such as 
amino acids, both of which help activate mTOR and promote cell growth. The 
growth factors activate mTOR mainly via the PI-3-kinase-Akt pathway. Akt acti- 
vates mTOR in complex 1 indirectly by phosphorylating, and thereby inhibiting, a 
GAP called Tsc2. Tsc2 acts on a monomeric Ras-related GTPase called Rheb (see 
Table 15-5). Rheb in its active form (Rheb-GTP) activates mTOR in complex 1. The 
net result is that Akt activates mTOR and thereby promotes cell growth (Figure 
15-54). We discuss how mTOR stimulates ribosome production and protein syn- 
thesis in Chapter 17 (see Figure 17-64). 
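
The chain of inhibitory steps just described (Akt inhibits Tsc2, Tsc2 keeps Rheb inactive, and Rheb-GTP activates mTOR in complex 1) can be easier to follow as explicit logic. The following is a minimal Boolean sketch, not a quantitative model: each protein is reduced to an on/off state, the nutrient input is ignored, and the names `PathwayState` and `mtorc1_pathway` are hypothetical labels chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathwayState:
    """On/off snapshot of the pathway (a deliberate oversimplification)."""
    akt_active: bool
    tsc2_active: bool
    rheb_gtp: bool
    mtorc1_active: bool


def mtorc1_pathway(growth_factor_present: bool) -> PathwayState:
    # Growth factors activate Akt via PI 3-kinase (upstream steps collapsed here).
    akt_active = growth_factor_present
    # Akt phosphorylates Tsc2 and thereby inhibits it.
    tsc2_active = not akt_active
    # Tsc2 is a GAP acting on Rheb: while Tsc2 is active, Rheb stays inactive.
    rheb_gtp = not tsc2_active
    # Rheb in its active form (Rheb-GTP) activates mTOR in complex 1.
    mtorc1_active = rheb_gtp
    return PathwayState(akt_active, tsc2_active, rheb_gtp, mtorc1_active)


if __name__ == "__main__":
    for signal in (False, True):
        print(signal, mtorc1_pathway(signal))
```

Because two inhibitory steps sit in series, the net effect of Akt on mTOR is positive, which is the point the sketch is meant to make visible.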


RTKs and GPCRs Activate Overlapping Signaling Pathways 


As mentioned earlier, RTKs and GPCRs activate some of the same intracellular 
signaling pathways. Both, for example, can activate the inositol phospholipid 
pathway triggered by phospholipase C. Moreover, even when they activate differ- 
ent pathways, the different pathways can converge on the same target proteins. 
Figure 15-55 illustrates both of these types of signaling overlaps: it summarizes 
five parallel intracellular signaling pathways that we have discussed so far—one 
triggered by GPCRs, two triggered by RTKs, and two triggered by both kinds of 
receptors. Interactions among these pathways allow different extracellular signal 
molecules to modulate and coordinate each other’s effects. 

Figure 15-54 Activation of mTOR by
the PI-3-kinase-Akt signaling pathway.
(A) In the absence of extracellular growth
factors, Tsc2 (a Rheb-GAP) keeps Rheb
inactive; mTOR in complex 1 is inactive,
and there is no cell growth. (B) In the
presence of growth factors, activated Akt
phosphorylates and inhibits Tsc2, thereby
promoting the activation of Rheb. Activated
Rheb (Rheb-GTP) helps activate mTOR
in complex 1, which in turn stimulates cell
growth. Figure 15-53 shows how growth
factors (or survival signals) activate Akt.
The Erk MAP kinase (see Figure 15-49)
can also phosphorylate and inhibit Tsc2
and thereby activate mTOR. Thus, both
the PI-3-kinase-Akt and Ras-MAP-kinase
signaling pathways converge on mTOR in
complex 1 to stimulate cell growth.

Tsc2 is short for tuberous sclerosis 
protein 2, and it is one component of a 
heterodimer composed of Tsc1 and Tsc2 
(not shown); these proteins are so called 
because mutations in either gene encoding 
them cause the genetic disease tuberous 
sclerosis, which is associated with benign 
tumors that contain abnormally large cells. 


Some Enzyme-Coupled Receptors Associate with Cytoplasmic
Tyrosine Kinases 


Many cell-surface receptors depend on tyrosine phosphorylation for their activ- 
ity and yet lack a tyrosine kinase domain. These receptors act through cytoplas- 
mic tyrosine kinases, which are associated with the receptors and phosphorylate 
various target proteins, often including the receptors themselves, when the recep- 
tors bind their ligand. These tyrosine-kinase-associated receptors thus function 
in much the same way as RTKs, except that their kinase domain is encoded by 
a separate gene and is noncovalently associated with the receptor polypeptide 
chain. A variety of receptor classes belong in this category, including the receptors 
for antigen and interleukins on lymphocytes (discussed in Chapter 24), integrins 
(discussed in Chapter 19), and receptors for various cytokines and some hor- 
mones. As with RTKs, many of these receptors are either preformed dimers or are 
cross-linked into dimers by ligand binding. 

Some of these receptors depend on members of the largest family of mam- 
malian cytoplasmic tyrosine kinases, the Src family (see Figures 3-10 and 3-64), 
which includes Src, Yes, Fgr, Fyn, Lck, Lyn, Hck, and Blk. These protein kinases 
all contain SH2 and SH3 domains and are located on the cytoplasmic side of the 
plasma membrane, held there partly by their interaction with transmembrane 
receptor proteins and partly by covalently attached lipid chains. Different family 
members are associated with different receptors and phosphorylate overlapping 
but distinct sets of target proteins. Lyn, Fyn, and Lck, for example, are each asso- 
ciated with different sets of receptors on lymphocytes. In each case, the kinase is 
activated when an extracellular ligand binds to the appropriate receptor protein. 
Src itself, as well as several other family members, can also bind to activated RTKs; 
in these cases, the receptor and cytoplasmic kinases mutually stimulate each oth- 
er’s catalytic activity, thereby strengthening and prolonging the signal (see Figure 
15-51). There are even some G proteins (Gs and Gi) that can activate Src, which
is one way that the activation of GPCRs can lead to tyrosine phosphorylation of 
intracellular signaling proteins and effector proteins. 


Figure 15-55 Five parallel intracellular 
signaling pathways activated by GPCRs, 
RTKs, or both. In this simplified example, 
the five kinases (shaded yellow) at the end
of each signaling pathway phosphorylate
target proteins (shaded red), many of
which are phosphorylated by more than
one of the kinases. The phospholipase C
activated by the two types of receptors is
different: GPCRs activate PLCβ, whereas
RTKs activate PLCγ (not shown). Although
not shown, some GPCRs can also activate
Ras, but they do so independently of Grb2,
via a Ras-GEF that is activated by Ca2+
and diacylglycerol. 

Another type of cytoplasmic tyrosine kinase associates with integrins, the
main receptors that cells use to bind to the extracellular matrix (discussed in 
Chapter 19). The binding of matrix components to integrins activates intracellular 
signaling pathways that influence the behavior of the cell. When integrins cluster 
at sites of matrix contact, they help trigger the assembly of cell-matrix junctions 
called focal adhesions. Among the many proteins recruited into these junctions is 
the cytoplasmic tyrosine kinase called focal adhesion kinase (FAK), which binds 
to the cytosolic tail of one of the integrin subunits with the assistance of other 
proteins. The clustered FAK molecules phosphorylate each other, creating phos- 
photyrosine docking sites where the Src kinase can bind. Src and FAK then phos- 
phorylate each other and other proteins that assemble in the junction, including 
many of the signaling proteins used by RTKs. In this way, the two tyrosine kinases 
signal to the cell that it has adhered to a suitable substratum, where the cell can 
now survive, grow, divide, migrate, and so on. 

The largest and most diverse class of receptors that rely on cytoplasmic tyro- 
sine kinases to relay signals into the cell is the class of cytokine receptors, which we 
consider next. 


Cytokine Receptors Activate the JAK-STAT Signaling Pathway 


The large family of cytokine receptors includes receptors for many kinds of local 
mediators (collectively called cytokines), as well as receptors for some hormones, 
such as growth hormone and prolactin (Movie 15.8). These receptors are stably 
associated with cytoplasmic tyrosine kinases called Janus kinases (JAKs) (after 
the two-faced Roman god), which phosphorylate and activate transcription regu- 
lators called STATs (signal transducers and activators of transcription). STAT pro- 
teins are located in the cytosol and are referred to as latent transcription regulators 
because they migrate into the nucleus and regulate gene transcription only after 
they are activated. 

Although many intracellular signaling pathways lead from cell-surface recep- 
tors to the nucleus, where they alter gene transcription (see Figure 15-55), the 
JAK-STAT signaling pathway provides one of the more direct routes. Cytokine 
receptors are dimers or trimers and are stably associated with one or two of 
the four known JAKs (JAK1, JAK2, JAK3, and Tyk2). Cytokine binding alters the 
arrangement so as to bring two JAKs into close proximity so that they phosphor- 
ylate each other, thereby increasing the activity of their tyrosine kinase domains. 
The JAKs then phosphorylate tyrosines on the cytoplasmic tails of cytokine recep- 
tors, creating phosphotyrosine docking sites for STATs (Figure 15-56). Some 
adaptor proteins can also bind to some of these sites and couple cytokine recep- 
tors to the Ras-MAP-kinase signaling pathway discussed earlier, but these will not 
be discussed here. 

There are at least six STATs in mammals. Each has an SH2 domain that per- 
forms two functions. First, it mediates the binding of the STAT protein to a phos- 
photyrosine docking site on an activated cytokine receptor. Once bound, the JAKs 
phosphorylate the STAT on tyrosines, causing the STAT to dissociate from the 
receptor. Second, the SH2 domain on the released STAT now mediates its binding 
to a phosphotyrosine on another STAT molecule, forming either a STAT homodi- 
mer or a heterodimer. The STAT dimer then translocates to the nucleus, where, 
in combination with other transcription regulatory proteins, it binds to a specific 
cis-regulatory sequence in various genes and stimulates their transcription (see 
Figure 15-56). In response to the hormone prolactin, for example, which stimu- 
lates breast cells to produce milk, activated STAT5 stimulates the transcription of 
genes that encode milk proteins. Table 15-6 lists some of the more than 30 cyto- 
kines and hormones that activate the JAK-STAT pathway by binding to cytokine 
receptors. 

Negative feedback regulates the responses mediated by the JAK-STAT path- 
way. In addition to activating genes that encode proteins mediating the cyto- 
kine-induced response, the STAT dimers can also activate genes that encode 
inhibitory proteins that help shut off the response. Some of these proteins bind to
and inactivate phosphorylated JAKs and their associated phosphorylated receptors;
others bind to phosphorylated STAT dimers and prevent them from binding
to their DNA targets. Such negative feedback mechanisms, however, are not
enough on their own to turn off the response. Inactivation of the activated JAKs
and STATs requires dephosphorylation of their phosphotyrosines.

Figure 15-56 The JAK-STAT signaling pathway.


Protein Tyrosine Phosphatases Reverse Tyrosine Phosphorylations 


In all signaling pathways that use tyrosine phosphorylation, the tyrosine phos- 
phorylations are reversed by protein tyrosine phosphatases. These phosphatases 
are as important in the signaling process as the protein tyrosine kinases that add 
the phosphates. Whereas only a few types of serine/threonine protein phosphatase
catalytic subunits are responsible for removing phosphate groups from phos-
phorylated serines and threonines on proteins, there are about 100 protein tyro-
sine phosphatases encoded in the human genome, including some dual-specific-
ity phosphatases that also dephosphorylate serines and threonines.

TABLE 15-6

Signal protein | Receptor-associated JAKs | STATs activated | Some responses
Interferon-γ (IFNγ) | JAK1 and JAK2 | STAT1 | Activates macrophages
Interferon-α (IFNα) | Tyk2 and JAK2 | STAT1 and STAT2 | Increases cell resistance to viral infection
Erythropoietin | JAK2 | STAT5 | Stimulates production of erythrocytes
Prolactin | JAK1 and JAK2 | STAT5 | Stimulates milk production
Growth hormone | JAK2 | STAT1 and STAT5 | Stimulates growth by inducing IGF1 production
Granulocyte-Macrophage-Colony-Stimulating Factor (GMCSF) | JAK2 | STAT5 | Stimulates production of granulocytes and macrophages

Like tyrosine kinases, the tyrosine phosphatases occur in both cytoplasmic 
and transmembrane forms. Unlike serine/threonine protein phosphatases, which 
generally have broad specificity, most tyrosine phosphatases display exquisite 
specificity for their substrates, removing phosphate groups from only selected 
phosphotyrosines on a subset of proteins. Together, these phosphatases ensure 
that tyrosine phosphorylations are short-lived and that the level of tyrosine phos- 
phorylation in resting cells is very low. They do not, however, simply continuously 
reverse the effects of protein tyrosine kinases; they are often regulated to act only 
at the appropriate time and place. 

Having discussed the crucial role of tyrosine phosphorylation and dephos- 
phorylation in the intracellular signaling pathways activated by many enzyme- 
coupled receptors, we now turn to a class of enzyme-coupled receptors that rely 
on serine and threonine phosphorylation. These receptor serine/threonine kinases 
activate an even more direct signaling pathway to the nucleus than does the JAK- 
STAT pathway. They directly phosphorylate latent transcription regulators called 
Smads, which then translocate into the nucleus to control gene transcription. 


Signal Proteins of the TGFβ Superfamily Act Through Receptor
Serine/Threonine Kinases and Smads 


The transforming growth factor-β (TGFβ) superfamily consists of a large num-
ber (33 in humans) of structurally related, secreted, dimeric proteins. They act
either as hormones or, more commonly, as local mediators to regulate a wide 
range of biological functions in all animals. During development, they regulate 
pattern formation and influence various cell behaviors, including proliferation, 
specification and differentiation, extracellular matrix production, and cell death. 
In adults, they are involved in tissue repair and in immune regulation, as well as 
in many other processes. The superfamily consists of the TGFβ/activin family and
the larger bone morphogenetic protein (BMP) family. 

All of these proteins act through enzyme-coupled receptors that are single- 
pass transmembrane proteins with a serine/threonine kinase domain on the 
cytosolic side of the plasma membrane. There are two classes of these receptor 
serine/threonine kinases, type I and type II, which are structurally similar
homodimers. Each member of the TGFβ superfamily binds to a characteristic
combination of type-I and type-II receptor dimers, bringing the kinase domains 
together so that the type-II receptor can phosphorylate and activate the type-I 
receptor, forming an active tetrameric receptor complex. 

Once activated, the receptor complex uses a strategy for rapidly relaying the 
signal to the nucleus that is very similar to the JAK-STAT strategy used by cyto- 
kine receptors. The activated type-I receptor directly binds and phosphorylates 
a latent transcription regulator of the Smad family (named after the first two 
proteins identified, Sma in C. elegans and Mad in Drosophila). Activated TGFβ/
activin receptors phosphorylate Smad2 or Smad3, while activated BMP receptors
phosphorylate Smad1, Smad5, or Smad8. Once one of these receptor-activated 
Smads (R-Smads) has been phosphorylated, it dissociates from the receptor and 
binds to Smad4 (called a co-Smad), which can form a complex with any of the five 
R-Smads. The Smad complex then translocates into the nucleus, where it associ- 
ates with other transcription regulators and controls the transcription of specific 
target genes (Figure 15-57). Because the partner proteins in the nucleus vary 
depending on the cell type and state of the cell, the genes affected vary. 

Activated TGFβ receptors and their bound ligand are endocytosed by two dis-
tinct routes, one leading to further activation and the other leading to inactiva-
tion. The activation route depends on clathrin-coated vesicles and leads to early
endosomes (discussed in Chapter 13), where most of the Smad activation occurs. 
An anchoring protein called SARA (for Smad anchor for receptor activation) has 
an important role in this pathway; it is concentrated in early endosomes and binds
to both activated TGFβ receptors and Smads, increasing the efficiency of recep-
tor-mediated Smad phosphorylation. The inactivation route depends on caveolae
(discussed in Chapter 13) and leads to receptor ubiquitylation and degradation in 
proteasomes. 

During the signaling response, the Smads shuttle continuously between 
the cytoplasm and the nucleus: they are dephosphorylated in the nucleus and 
exported to the cytoplasm, where they can be rephosphorylated by activated 
receptors. In this way, the effect exerted on the target genes reflects both the con- 
centration of the extracellular signal and the time the signal continues to act on 
the cell-surface receptors (often several hours). Cells exposed to a morphogen at 
high concentration, or for a long time, or both, will switch on one set of genes, 
whereas cells receiving a lower or more transient exposure will switch on another 
set. 

As in other signaling systems, negative feedback regulates the Smad path- 
way. Among the target genes activated by Smad complexes are those that encode 
inhibitory Smads, either Smad6 or Smad7. Smad7 (and possibly Smad6) binds 
to the cytosolic tail of the activated receptor and inhibits its signaling ability in 
at least three ways: (1) it competes with R-Smads for binding sites on the recep- 
tor, decreasing R-Smad phosphorylation; (2) it recruits a ubiquitin ligase called 
Smurf, which ubiquitylates the receptor, leading to receptor internalization and 
degradation (it is because Smurfs also ubiquitylate and promote the degradation 
of Smads that they are called Smad ubiquitylation regulatory factors, or Smurfs); 
and (3) it recruits a protein phosphatase that dephosphorylates and inactivates 
the receptor. In addition, the inhibitory Smads bind to the co-Smad, Smad4, and 
inhibit it, either by preventing its binding to R-Smads or by promoting its ubiqui- 
tylation and degradation. 

Although receptor serine/threonine kinases operate mainly through the Smad 
pathway just described, they can also stimulate other intracellular signaling pro- 
teins such as MAP kinases and PI 3-kinase. Conversely, signaling proteins in other 
pathways can phosphorylate Smads and thereby influence signaling along the 
Smad pathway. 


Summary 


There are various classes of enzyme-coupled receptors, the most common of which 
are receptor tyrosine kinases (RTKs), tyrosine-kinase-associated receptors, and 
receptor serine/threonine kinases. 


Figure 15-57 The Smad-dependent 
signaling pathway activated by TGFβ.
The TGFβ dimer promotes the assembly of
a tetrameric receptor complex containing
two copies each of the type-I and
type-II receptors. The type-II receptors
phosphorylate specific sites on the type-I
receptors, thereby activating their kinase
domains and leading to phosphorylation
of R-Smads such as Smad2 and Smad3.
Smads open up to expose a dimerization 
surface when they are phosphorylated, 
leading to the formation of a trimeric Smad 
complex containing two R-Smads and the 
co-Smad, Smad4. The phosphorylated 
Smad complex enters the nucleus and 
collaborates with other transcription 
regulators to control the transcription of 
specific target genes. 

Ligand binding to RTKs causes their dimerization, which leads to activation
of their kinase domains. These activated kinase domains phosphorylate multiple 
tyrosines on the receptors, producing a set of phosphotyrosines that serve as dock- 
ing sites for a set of intracellular signaling proteins, which bind via their SH2 (or 
PTB) domains. One such signaling protein serves as an adaptor to couple some 
activated receptors to a Ras-GEF (Sos), which activates the monomeric GTPase Ras; 
Ras, in turn, activates a three-component MAP kinase signaling module, which 
relays the signal to the nucleus by phosphorylating transcription regulatory pro- 
teins. Another important signaling protein that can dock on activated RTKs is PI 
3-kinase, which phosphorylates specific phosphoinositides to produce lipid docking 
sites in the plasma membrane for signaling proteins with phosphoinositide-bind- 
ing PH domains, including the serine/threonine protein kinase Akt (PKB), which 
plays a key part in the control of cell survival and cell growth. Many receptor classes, 
including some RTKs, activate Rho family monomeric GTPases, which functionally 
couple the receptors to the cytoskeleton. 

Tyrosine-kinase-associated receptors depend on various cytoplasmic tyrosine 
kinases for their action. These kinases include members of the Src family, which 
associate with many kinds of receptors, and the focal adhesion kinase (FAK), which 
associates with integrins at focal adhesions. The cytoplasmic tyrosine kinases then 
phosphorylate a variety of signaling proteins to relay the signal onward. The larg- 
est family of receptors in this class is the cytokine receptor family. When stimulated 
by ligand binding, these receptors activate JAK cytoplasmic tyrosine kinases, which 
phosphorylate STATs. The STATs then dimerize, translocate to the nucleus, and acti- 
vate the transcription of specific genes. Receptor serine/threonine kinases, which 
are activated by signal proteins of the TGFβ superfamily, act similarly: they directly
phosphorylate and activate Smads, which then oligomerize with another Smad, 
translocate to the nucleus, and regulate gene transcription. 


ALTERNATIVE SIGNALING ROUTES IN GENE 
REGULATION 


Major changes in the behavior of a cell tend to depend on changes in the expres- 
sion of numerous genes. Thus, many extracellular signaling molecules carry out 
their effects, in whole or in part, by initiating signaling pathways that change 
the activities of transcription regulators. There are numerous examples of gene 
regulation in both GPCR and enzyme-coupled receptor pathways (see Figures 
15-27 and 15-49). In this section, we describe some of the less common signaling 
mechanisms by which gene expression can be controlled. We begin with several 
pathways that depend on regulated proteolysis to control the activity and location 
of latent transcription regulators. We then turn to a class of extracellular signal 
molecules that do not employ cell-surface receptors but enter the cell and inter- 
act directly with transcription regulators to perform their functions. Finally, we 
briefly discuss some of the mechanisms by which gene expression is controlled by 
the circadian rhythm: the daily cycle of light and dark. 


The Receptor Notch Is a Latent Transcription Regulatory Protein 


Signaling through the Notch receptor protein is used widely in animal devel- 
opment. As discussed in Chapter 22, it has a general role in controlling cell fate 
choices and regulating pattern formation during the development of most tissues, 
as well as in the continual renewal of tissues such as the lining of the gut. It is best 
known, however, for its role in the production of Drosophila neural cells, which 
usually arise as isolated single cells within an epithelial sheet of precursor cells. 
During this process, when a precursor cell commits to becoming a neural cell, it 
signals to its immediate neighbors not to do the same; the inhibited cells develop 
into epidermal cells instead. This process, called lateral inhibition, depends on a 
contact-dependent signaling mechanism that is activated by a single-pass trans- 
membrane signal protein called Delta, displayed on the surface of the future 
neural cell. By binding to the Notch receptor protein on a neighboring cell, Delta 
signals to the neighbor not to become neural (Figure 15-58). When this signaling
process is defective, a huge excess of neural cells is produced at the expense of 
epidermal cells, which is lethal. 
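
Lateral inhibition can be pictured as a competition between neighboring cells. The sketch below is a toy two-cell model, not a description of the actual molecular kinetics. It assumes that Notch activation in a cell rises with the Delta displayed by its neighbor and that active Notch suppresses the cell's own Delta; that feedback, the functional forms, and every parameter value are assumptions made for illustration, and the function name `simulate_lateral_inhibition` is hypothetical.

```python
def simulate_lateral_inhibition(delta_a=0.51, delta_b=0.50, steps=10000, dt=0.01):
    """Euler integration of a two-cell Notch-Delta toy model.

    Returns the final Delta levels (delta_a, delta_b) in arbitrary units.
    All constants are illustrative assumptions, not measured values.
    """
    a, b, k, h = 0.01, 100.0, 2, 2

    def notch_drive(neighbor_delta):
        # Notch activation rises with the Delta displayed by the neighbor.
        return neighbor_delta**k / (a + neighbor_delta**k)

    def delta_drive(own_notch):
        # Assumed feedback: active Notch suppresses the cell's own Delta.
        return 1.0 / (1.0 + b * own_notch**h)

    notch_a = notch_b = 0.0
    for _ in range(steps):
        d_notch_a = notch_drive(delta_b) - notch_a
        d_notch_b = notch_drive(delta_a) - notch_b
        d_delta_a = delta_drive(notch_a) - delta_a
        d_delta_b = delta_drive(notch_b) - delta_b
        notch_a += dt * d_notch_a
        notch_b += dt * d_notch_b
        delta_a += dt * d_delta_a
        delta_b += dt * d_delta_b
    return delta_a, delta_b


if __name__ == "__main__":
    final_a, final_b = simulate_lateral_inhibition()
    print(f"cell A Delta: {final_a:.3f}, cell B Delta: {final_b:.3f}")
```

Under these assumptions, the cell that starts with slightly more Delta ends with high Delta, and its neighbor ends with very little. This mirrors the outcome in which one cell becomes neural and inhibits its neighbor from doing the same.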

Notch is a single-pass transmembrane protein that requires proteolytic pro- 
cessing to function. It acts as a latent transcription regulator and provides the sim- 
plest and most direct signaling pathway known from a cell-surface receptor to the 
nucleus. When activated by the binding of Delta on another cell, a plasma-mem- 
brane-bound protease cleaves off the cytoplasmic tail of Notch, and the released 
tail translocates into the nucleus to activate the transcription of a set of Notch- 
response genes. The Notch tail fragment acts by binding to a DNA-binding pro- 
tein, converting it from a transcriptional repressor into a transcriptional activator. 

The Notch receptor undergoes three successive proteolytic cleavage steps, but 
only the last two depend on Delta binding. As part of its normal biosynthesis, it is 
cleaved in the Golgi apparatus to form a heterodimer, which is then transported 
to the cell surface as the mature receptor. The binding of Delta to Notch induces 
a second cleavage in the extracellular domain, mediated by an extracellular pro- 
tease. A final cleavage quickly follows, cutting free the cytoplasmic tail of the acti- 
vated receptor (Figure 15-59). Note that, unlike most receptors, the activation of 
Notch is irreversible; once activated by ligand binding, the protein cannot be used 
again. 

This final cleavage of the Notch tail occurs just within the transmembrane seg-
ment, and it is mediated by a protease complex called γ-secretase, which is also
responsible for the intramembrane cleavage of various other proteins. One of its 
essential subunits is Presenilin, so called because mutations in the gene encod- 
ing it are a frequent cause of early-onset, familial Alzheimer’s disease, a form of 
presenile dementia. The protease complex is thought to contribute to this and 
other forms of Alzheimer’s disease by generating extracellular peptide fragments 
from a transmembrane neuronal protein; the fragments accumulate in excessive 
amounts and form aggregates of misfolded protein called amyloid plaques, which 
may injure nerve cells and contribute to their degeneration and loss. 

Both Notch and Delta are glycoproteins, and their interaction is regulated by 
the glycosylation of Notch. The Fringe family of glycosyl transferases, in particular, 
adds extra sugars to the O-linked oligosaccharide (discussed in Chapter 13) on 
Notch, which alters the specificity of Notch for its ligands. This has provided the 
first example of the modulation of ligand-receptor signaling by differential recep- 
tor glycosylation. 


Wnt Proteins Bind to Frizzled Receptors and Inhibit the 
Degradation of β-Catenin


Wnt proteins are secreted signal molecules that act as local mediators and mor- 
phogens to control many aspects of development in all animals that have been 
studied. They were discovered independently in flies and in mice: in Drosophila, 
the Wingless (Wg) gene originally came to light because of its role as a morphogen
in wing development, while in mice, the Int1 gene was found because it promoted
the development of breast tumors when activated by the integration of a virus next
to it. Both of these genes encode Wnt proteins. Wnts are unusual as secreted pro-
teins in that they have a fatty acid chain covalently attached to their N-terminus,
which increases their binding to cell surfaces. There are 19 Wnts in humans, each
having distinct, but often overlapping, functions.

Figure 15-58 Lateral inhibition mediated
by Notch and Delta during neural cell
development in Drosophila. When
individual cells in the epithelium begin
to develop as neural cells, they signal to
their neighbors not to do the same. This
inhibitory, contact-dependent signaling
is mediated by the ligand Delta, which
appears on the surface of the future neural
cell and binds to Notch receptor proteins
on the neighboring cells. In many tissues,
all the cells in a cluster initially express
both Delta and Notch, and a competition
occurs, with one cell emerging as winner,
expressing Delta strongly and inhibiting
its neighbors from doing likewise. In other
cases, additional factors interact with Delta
or Notch to make some cells susceptible
to the lateral inhibition signal and others
unresponsive to it.

Figure 15-59 The processing and activation of Notch by proteolytic cleavage. The numbered red arrowheads indicate the
sites of proteolytic cleavage. The first proteolytic processing step occurs within the trans Golgi network to generate the mature
heterodimeric Notch receptor that is then displayed on the cell surface. The binding to Delta on a neighboring cell triggers the
next two proteolytic steps: the complex of Delta and the Notch fragment to which it is bound is endocytosed by the Delta-
expressing cell, exposing the extracellular cleavage site in the transmembrane Notch subunit. Note that Notch and Delta interact
through their repeated EGF-like domains. The released Notch tail migrates into the nucleus, where it binds to the Rbpsuh
protein, which it converts from a transcriptional repressor to a transcriptional activator.

Wnts can activate at least two types of intracellular signaling pathways. Our 
primary focus here is the Wnt/β-catenin pathway (also known as the canonical
Wnt pathway), which is centered on the latent transcription regulator β-catenin.
A second pathway, called the planar polarity pathway, coordinates the polariza- 
tion of cells in the plane of a developing epithelium and depends on Rho fam- 
ily GTPases. Both of these pathways begin with the binding of Wnts to Frizzled 
family cell-surface receptors, which are seven-pass transmembrane proteins that
resemble GPCRs in structure but do not generally work through the activation 
of G proteins. Instead, when activated by Wnt binding, Frizzled proteins recruit 
the scaffold protein Dishevelled, which helps relay the signal to other signaling 
molecules. 

The Wnt/β-catenin pathway acts by regulating the proteolysis of the multi-
functional protein β-catenin (or Armadillo in flies). A portion of the cell’s β-cat-
enin is located at cell-cell junctions and thereby contributes to the control of cell-
cell adhesion (discussed in Chapter 19), while the remaining β-catenin is rapidly
degraded in the cytoplasm. Degradation depends on a large protein degradation
complex, which binds β-catenin and keeps it out of the nucleus while promot-
ing its degradation. The complex contains at least four other proteins: a protein
kinase called casein kinase 1 (CK1) phosphorylates the β-catenin on a serine,
priming it for further phosphorylation by another protein kinase called glycogen 
synthase kinase 3 (GSK3); this final phosphorylation marks the protein for ubiq- 
uitylation and rapid degradation in proteasomes. Two scaffold proteins called 
axin and Adenomatous polyposis coli (APC) hold the protein complex together 
(Figure 15-60A). APC gets its name from the finding that the gene encoding it 
is often mutated in a type of benign tumor (adenoma) of the colon; the tumor 
projects into the lumen as a polyp and can eventually become malignant. (This 
APC should not be confused with the anaphase-promoting complex, or APC/C, 
that plays a central part in selective protein degradation during the cell cycle—see 
Figure 17-15A.) 

Wnt proteins regulate β-catenin proteolysis by binding to both a Frizzled pro- 
tein and a co-receptor that is related to the low-density lipoprotein (LDL) receptor 
(discussed in Chapter 13) and is therefore called an LDL-receptor-related protein 
(LRP). In a poorly understood process, the activated receptor complex recruits 
the Dishevelled scaffold and promotes the phosphorylation of the LRP receptor 
by the two protein kinases, GSK3 and CK1. Axin is brought to the receptor com- 
plex and inactivated, thereby disrupting the β-catenin degradation complex in 
the cytoplasm. In this way, the phosphorylation and degradation of β-catenin are 
prevented, enabling unphosphorylated β-catenin to accumulate and translocate 
to the nucleus, where it alters the pattern of gene transcription (Figure 15-60B). 

Figure 15-60 The Wnt/β-catenin 
signaling pathway. (A) In the absence of 
a Wnt signal, β-catenin that is not bound 
to cell-cell adherens junctions (not shown) 
interacts with a degradation complex 
containing APC, axin, GSK3, and CK1. In 
this complex, β-catenin is phosphorylated 
by CK1 and then by GSK3, triggering 
its ubiquitylation and degradation in 
proteasomes. Wnt-responsive genes are 
kept inactive by the Groucho co-repressor 
protein bound to the transcription regulator 
LEF1/TCF. (B) Wnt binding to Frizzled 
and LRP clusters the two co-receptors 
together, and the cytosolic tail of LRP is 
phosphorylated by GSK3 and then by CK1. 
Axin binds to the phosphorylated LRP and 
is inactivated and/or degraded, resulting in 
disassembly of the degradation complex. 
The phosphorylation of β-catenin is 
thereby prevented, and unphosphorylated 
β-catenin accumulates and translocates 
to the nucleus, where it binds to LEF1/ 
TCF, displaces the co-repressor Groucho, 
and acts as a coactivator to stimulate the 
transcription of Wnt target genes. The 
scaffold protein Dishevelled is required for 
the signaling pathway to operate; it binds 
to Frizzled and becomes phosphorylated 
(not shown), but its precise role is 
unknown. 

In the absence of Wnt signaling, Wnt-responsive genes are kept silent by an 
inhibitory complex of transcription regulatory proteins. The complex includes 
proteins of the LEF1/TCF family bound to a co-repressor protein of the Groucho 
family (see Figure 15-60A). In response to a Wnt signal, β-catenin enters the 
nucleus and binds to the LEF1/TCF proteins, displacing Groucho. The β-catenin 
now functions as a coactivator, inducing the transcription of the Wnt target genes 
(see Figure 15-60B). Thus, as in the case of Notch signaling, Wnt/β-catenin signal- 
ing triggers a switch from transcriptional repression to transcriptional activation. 

Among the genes activated by β-catenin is Myc, which encodes a protein 
(Myc) that is an important regulator of cell growth and proliferation (discussed 
in Chapter 17). Mutations of the Apc gene occur in 80% of human colon cancers 
(discussed in Chapter 20). These mutations inhibit the protein’s ability to bind 
β-catenin, so that β-catenin accumulates in the nucleus and stimulates the tran- 
scription of c-Myc and other Wnt target genes, even in the absence of Wnt signal- 
ing. The resulting uncontrolled cell growth and proliferation promote the devel- 
opment of cancer. 
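
The logic of regulated proteolysis can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that cytosolic β-catenin is made at a constant rate and destroyed with first-order kinetics, so that its steady-state level is the ratio of the two rates. The function names and all rate values are hypothetical and are not measurements; the point is only that weakening the degradation complex, whether by a Wnt signal or by an Apc mutation, raises the steady-state level in proportion.

```python
def steady_state_level(k_syn, k_deg):
    """Steady state of dB/dt = k_syn - k_deg * B (assumed first-order model)."""
    return k_syn / k_deg


def simulate_level(k_syn, k_deg, hours, dt=0.01, b0=0.0):
    """Integrate dB/dt = k_syn - k_deg * B with simple Euler steps."""
    level = b0
    for _ in range(int(hours / dt)):
        level += dt * (k_syn - k_deg * level)
    return level


# Hypothetical rates (arbitrary units per hour), chosen only for illustration.
K_SYN = 1.0
K_DEG_COMPLEX_ACTIVE = 2.0    # no Wnt: degradation complex intact
K_DEG_COMPLEX_DISABLED = 0.2  # Wnt present, or APC unable to bind beta-catenin

without_wnt = steady_state_level(K_SYN, K_DEG_COMPLEX_ACTIVE)
with_wnt = steady_state_level(K_SYN, K_DEG_COMPLEX_DISABLED)
print(f"steady-state level without Wnt: {without_wnt:.2f}")
print(f"steady-state level with Wnt:    {with_wnt:.2f}")
```

In this simplified picture the transcriptional response follows from a change in protein stability alone; no new synthesis of β-catenin is required.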

Various secreted inhibitory proteins regulate Wnt signaling in development. 
Some bind to the LRP receptors and promote their down-regulation, whereas 
others compete with Frizzled receptors for secreted Wnts. In Drosophila at least, 
Wnts activate negative feedback loops, in which Wnt target genes encode proteins 
that help shut the response off; some of these proteins inhibit Dishevelled, and 
others are secreted inhibitors. 


Hedgehog Proteins Bind to Patched, Relieving Its Inhibition of 
Smoothened 


Hedgehog proteins and Wnt proteins act in similar ways. Both are secreted sig- 
nal molecules, which act as local mediators and morphogens in many develop- 
ing invertebrate and vertebrate tissues. Both proteins are modified by covalently 
attached lipids, depend on secreted or cell-surface-bound heparan sulfate proteo- 
glycans (discussed in Chapter 19) for their action, and activate latent transcription 
regulators by inhibiting their degradation. They both trigger a switch from tran- 
scriptional repression to transcriptional activation, and excessive signaling along 
either pathway in adult cells can lead to cancer. They even use some of the same 
intracellular signaling proteins and sometimes collaborate to mediate a response. 

The Hedgehog proteins were discovered in Drosophila, where this protein 
family has only one member. Mutation of the Hedgehog gene produces a larva 
covered with spiky processes (denticles), like a hedgehog. At least three genes 
encode Hedgehog proteins in vertebrates—Sonic, Desert, and Indian hedgehog. 
The active forms of all Hedgehog proteins are covalently coupled to cholesterol, as 
well as to a fatty acid chain. The cholesterol is added during an unusual process- 
ing step, in which a precursor protein cleaves itself to produce a smaller, choles- 
terol-containing signal protein. Most of what we know about the Hedgehog sig- 
naling pathway came initially from genetic studies in flies, and it is the fly pathway 
that we summarize here. 

The effects of Hedgehog are mediated by a latent transcription regulator called 
Cubitus interruptus (Ci), the regulation of which is reminiscent of the regulation 
of β-catenin by Wnts. In the absence of a Hedgehog signal, Ci is ubiquitylated and 
proteolytically cleaved in proteasomes. Instead of being completely degraded, 
however, Ci is processed to form a smaller fragment, which accumulates in the 
nucleus, where it acts as a transcriptional repressor, helping to keep Hedge- 
hog-responsive genes silent. The proteolytic processing of the Ci protein depends 
on its phosphorylation by three protein kinases—PKA and two kinases also used 
in the Wnt pathway, namely GSK3 and CK1. As in the Wnt pathway, the proteo- 
lytic processing occurs in a multiprotein complex. The complex includes the pro- 
tein kinase Fused and a scaffold protein Costal2, which stably associates with Ci, 
recruits the three other kinases, and binds the complex to microtubules, thereby 
keeping unprocessed Ci out of the nucleus (Figure 15-61A). 


Hedgehog functions by blocking the proteolytic processing of Ci, thereby 
changing it into a transcriptional activator. It does this by a convoluted signal- 
ing process that depends on three transmembrane proteins: Patched, iHog, and 
Smoothened. Patched is predicted to cross the plasma membrane 12 times, and, 
although much of it is in intracellular vesicles, some is on the cell surface where 
it can bind the Hedgehog protein. iHog is also on the cell surface and is thought 
to serve as a co-receptor for Hedgehog. Smoothened is a seven-pass transmem- 
brane protein with a structure very similar to a GPCR, but it does not seem to act 
as a Hedgehog receptor or even as an activator of G proteins; it is controlled by 
Patched and iHog. 

In the absence of a Hedgehog signal, Patched employs an unknown mech- 
anism to keep Smoothened sequestered and inactive in intracellular vesicles 
(see Figure 15-61A). The binding of Hedgehog to iHog and Patched inhibits the 
activity of Patched and induces its endocytosis and degradation. The result is that 
Smoothened is liberated from inhibition and translocates to the plasma mem- 
brane, where it recruits the protein complex containing Ci, Fused, and Costal2. 
Costal2 is no longer able to bind the other three kinases, and so Ci is no longer 
cleaved and can now enter the nucleus and activate the transcription of Hedge- 
hog target genes (Figure 15-61B). Among the genes activated by Ci is Patched 
itself; the resulting increase in Patched protein on the cell surface inhibits further 
Hedgehog signaling, providing another example of negative feedback. 

Figure 15-61 Hedgehog signaling 
in Drosophila. (A) In the absence of 
Hedgehog, most Patched is in intracellular 
vesicles (not shown), where it keeps 
Smoothened inactive and sequestered. The 
Ci protein is bound in a cytosolic protein 
degradation complex, which includes the 
protein kinase Fused and the scaffold 
protein Costal2. Costal2 recruits three 
other protein kinases (PKA, GSK3, and 
CK1; not shown), which phosphorylate 
Ci. Phosphorylated Ci is ubiquitylated and 
then cleaved in proteasomes (not shown) 
to form a transcriptional repressor, which 
accumulates in the nucleus to help keep 
Hedgehog target genes inactive. 
(B) Hedgehog binding to iHog and Patched 
removes the inhibition of Smoothened by 
Patched. Smoothened is phosphorylated 
by PKA and CK1 and translocates to the 
plasma membrane, where it recruits the 
complex containing Fused, Costal2, and 
Ci. Costal2 releases unprocessed Ci, which 
accumulates in the nucleus and activates 
the transcription of Hedgehog target genes. 
Many details in the pathway are poorly 
understood, including the role of Fused. 

Many gaps remain in our understanding of the Hedgehog signaling pathway. It 
is not known, for example, how Patched keeps Smoothened inactive and intracel- 
lular. As the structure of Patched resembles a transmembrane transporter protein, 
it has been proposed that it may transport a small molecule into the cell that keeps 
Smoothened sequestered in vesicles. 

Even less is known about the more complex Hedgehog pathway in vertebrate 
cells. In addition to there being at least three types of vertebrate Hedgehog pro- 
teins, there are three Ci-like transcription regulator proteins (Gli1, Gli2, and Gli3) 
downstream of Smoothened. Gli2 and Gli3 are most similar to Ci in structure and 
function, and Gli3 has been shown to undergo proteolytic processing like Ci and 
to act as either a transcriptional repressor or a transcriptional activator. Moreover, 
in vertebrates, Smoothened, upon activation, becomes localized to the surface of 
the primary cilium (discussed in Chapter 16), where the Gli proteins are also con- 
centrated, thereby increasing the speed and efficiency of signaling. 

Hedgehog signaling can promote cell proliferation, and excessive Hedgehog 
signaling can lead to cancer. Inactivating mutations in one of the two human 
Patched genes, for example, which lead to excessive Hedgehog signaling, occur 
frequently in basal cell carcinoma of the skin, the most common form of cancer in 
Caucasians. A small molecule called cyclopamine, made by a meadow lily, is being 
used to treat cancers associated with excessive Hedgehog signaling. It blocks 
Hedgehog signaling by binding tightly to Smoothened and inhibiting its activity. 
It was originally identified because it causes severe developmental defects in the 
progeny of sheep grazing on such lilies; these include the presence of a single cen- 
tral eye (a condition called cyclopia), which is also seen in mice that are deficient 
in Hedgehog signaling. 


Many Stressful and Inflammatory Stimuli Act Through 
an NFκB-Dependent Signaling Pathway 


The NFκB proteins are latent transcription regulators that are present in most 
animal cells and are central to many stressful, inflammatory, and innate immune 
responses. These responses occur as a reaction to infection or injury and help pro- 
tect stressed multicellular organisms and their cells (discussed in Chapter 24). An 
excessive or inappropriate inflammatory response in animals can also damage 
tissue and cause severe pain, and chronic inflammation can lead to cancer; as 
in the case of Wnt and Hedgehog signaling, excessive NFκB signaling is found 
in a number of human cancers. NFκB proteins also have important roles during 
normal animal development: the Drosophila NFκB family member Dorsal, for 
example, has a crucial role in specifying the dorsal-ventral axis of the developing 
fly embryo (discussed in Chapter 22). 

Various cell-surface receptors activate the NFκB signaling pathway in animal 
cells. Toll receptors in Drosophila and Toll-like receptors in vertebrates, for exam- 
ple, recognize pathogens and activate this pathway in triggering innate immune 
responses (discussed in Chapter 24). The receptors for tumor necrosis factor α 
(TNFα) and interleukin-1 (IL1), which are vertebrate cytokines especially impor- 
tant in inducing inflammatory responses, also activate this signaling pathway. The 
Toll, Toll-like, and IL1 receptors belong to the same family of proteins, whereas 
TNF receptors belong to a different family; all of them, however, act in similar ways 
to activate NFκB. When activated, they trigger a multiprotein ubiquitylation and 
phosphorylation cascade that releases NFκB from an inhibitory protein complex, 
so that it can translocate to the nucleus and turn on the transcription of hundreds 
of genes that participate in inflammatory and innate immune responses. 

There are five NFκB proteins in mammals (RelA, RelB, c-Rel, NFκB1, and 
NFκB2), and they form a variety of homodimers and heterodimers, each of which 
activates its own characteristic set of genes. Inhibitory proteins called IκB bind 
tightly to the dimers and hold them in an inactive state within the cytoplasm of 
unstimulated cells. There are three major IκB proteins in mammals (IκB α, β, and 
ε), and the signals that release NFκB dimers do so by triggering a signaling path- 
way that leads to the phosphorylation, ubiquitylation, and consequent degrada- 
tion of the IκB proteins (Figure 15-62). 

Figure 15-62 The activation of the 
NFκB pathway by TNFα. Both TNFα 
and its receptors are trimers. The binding 
of TNFα causes a rearrangement of the 
clustered cytosolic tails of the receptors, 
which now recruit various signaling 
proteins, resulting in the activation of 
a protein kinase that phosphorylates 
and activates IκB kinase kinase (IKK). 
IKK is a heterotrimer composed of two 
kinase subunits (IKKα and IKKβ) and a 
regulatory subunit called NEMO. IKKβ then 
phosphorylates IκB on two serines, which 
marks the protein for ubiquitylation and 
degradation in proteasomes. The released 
NFκB translocates into the nucleus, where, 
in collaboration with coactivator proteins, 
it stimulates the transcription of its target 
genes. 

Among the genes activated by the released NFκB is the gene that encodes 
IκBα. This activation leads to increased synthesis of IκBα protein, which binds 
to NFκB and inactivates it, creating a negative feedback loop (Figure 15-63A). 
Experiments on TNFα-induced responses, as well as computer modeling stud- 
ies of the responses, indicate that the negative feedback produces two types of 
NFκB responses, depending on the duration of the TNFα stimulus; importantly, 
the two types of responses induce different patterns of gene expression (Figure 
15-63B, C, and D). The negative feedback through IκBα is required for both types 
of responses: in cells deficient in IκBα, even a short exposure to TNFα induces a 
sustained activation of NFκB, without oscillations, and all of the NFκB-responsive 
genes are activated. 
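
The qualitative behavior described above can be reproduced with a deliberately simple model. The sketch below is not the published model; it assumes two hypothetical equations in which active NFκB (N) is switched on by the stimulus and switched off by IκBα (I), and in which IκBα is made in response to NFκB only after a fixed delay. All parameter values, the time units, and the function names are arbitrary choices made for illustration.

```python
def simulate_nfkb(stimulus_duration, feedback=True, total_time=80.0, dt=0.01,
                  delay=3.0, k_inhibit=20.0, k_synth=1.0, k_decay=1.0):
    """Return the time course of the active NF-kB fraction (toy model).

    Assumed equations (illustrative, not taken from the text):
        dN/dt = S(t) * (1 - N) - k_inhibit * I * N
        dI/dt = k_synth * N(t - delay) - k_decay * I
    N is the active NF-kB fraction, I is the IkB-alpha level, and S(t) is 1
    while TNF-alpha is present and 0 otherwise. Time units are arbitrary.
    """
    steps = int(round(total_time / dt))
    lag = int(round(delay / dt))
    n = [0.0] * (steps + 1)
    i_level = 0.0
    for step in range(steps):
        t = step * dt
        stimulus = 1.0 if t < stimulus_duration else 0.0
        # IkB-alpha synthesis responds to the NF-kB activity of `delay` ago.
        delayed_n = n[step - lag] if step >= lag else 0.0
        synthesis = k_synth * delayed_n if feedback else 0.0
        dn = stimulus * (1.0 - n[step]) - k_inhibit * i_level * n[step]
        di = synthesis - k_decay * i_level
        n[step + 1] = n[step] + dt * dn
        i_level += dt * di
    return n


def count_peaks(series, min_height=0.22):
    """Count local maxima that rise above min_height."""
    return sum(
        1
        for k in range(1, len(series) - 1)
        if series[k - 1] < series[k] >= series[k + 1] and series[k] > min_height
    )


short = simulate_nfkb(stimulus_duration=1.0)           # brief TNF-alpha exposure
sustained = simulate_nfkb(stimulus_duration=80.0)      # TNF-alpha present throughout
deficient = simulate_nfkb(stimulus_duration=1.0, feedback=False)  # no IkB-alpha feedback

print("peaks after short exposure:    ", count_peaks(short))
print("peaks during sustained exposure:", count_peaks(sustained))
print("final activity without feedback:", round(deficient[-1], 2))
```

With these assumed parameters, a short stimulus gives a single pulse that is terminated by newly made IκBα, a sustained stimulus gives oscillations that damp toward a steady level, and removing the feedback leaves NFκB active after even a brief stimulus. The model says nothing about why different genes respond to the two patterns.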

Figure 15-63 Negative feedback in the NFκB signaling pathway induces oscillations in NFκB activation. (A) Drawing 
showing how activated NFκB stimulates the transcription of the IκBα gene, the protein product of which acts back in 
the cytoplasm to sequester and inhibit NFκB there; if the stimulus is persistent, the newly made IκBα protein will then be 
ubiquitylated and degraded, liberating active NFκB again so that it can return to the nucleus and activate transcription (see 
Figure 15-62). (B) A short exposure to TNFα produces a single, short pulse of NFκB activation, beginning within minutes and 
ending by 1 hour. This response turns on the transcription of gene A but not gene B. (C) A sustained exposure to TNFα for 
the entire 6 hours of the experiment produces oscillations in NFκB activation that damp down over time. This response turns 
on the transcription of both genes; gene B turns on only after several hours, indicating that gene B transcription requires 
prolonged activation of NFκB, for reasons that are not understood. (D) These time-lapse confocal fluorescence micrographs 
from a different study of TNFα stimulation show the oscillations of NFκB in a cultured cell, as indicated by its periodic movement 
into the nucleus (N) of a fusion protein composed of NFκB fused to a red fluorescent protein. In the cell at the center of the 
micrographs, NFκB is active and in the nucleus at 6, 60, 210, 380, and 480 minutes, but it is exclusively in the cytoplasm at 
0, 120, 300, 410, and 510 minutes. (A-C, based on data from A. Hoffmann et al., Science 298:1241-1245, 2002, and adapted 
from A.Y. Ting and D. Endy, Science 298:1189-1190, 2002; D, from D.E. Nelson et al., Science 306:704-708, 2004. All with 
permission from AAAS.) 

Thus far, we have focused on the mechanisms by which extracellular signal 
molecules use cell-surface receptors to initiate changes in gene expression. We 
now turn to a class of extracellular signals that bypasses the plasma membrane 
entirely and controls, in the most direct way possible, transcription regulatory 
proteins inside the cell. 


Nuclear Receptors Are Ligand-Modulated Transcription Regulators 


Various small, hydrophobic signal molecules diffuse directly across the plasma 
membrane of target cells and bind to intracellular receptors that are transcription 
regulators. These signal molecules include steroid hormones, thyroid hormones, 
retinoids, and vitamin D. Although they differ greatly from one another in both 
chemical structure (Figure 15-64) and function, they all act by a similar mech- 
anism. They bind to their respective intracellular receptor proteins and alter the 
ability of these proteins to control the transcription of specific genes. Thus, these 
proteins serve both as intracellular receptors and as intracellular effectors for the 
signal. 

The receptors are all structurally related, being part of the very large nuclear 
receptor superfamily. Many family members have been identified by DNA 
sequencing only, and their ligand is not yet known; they are therefore referred 
to as orphan nuclear receptors, and they make up large fractions of the nuclear 
receptors encoded in the genomes of humans, Drosophila, and the nematode 
C. elegans. Some mammalian nuclear receptors are regulated by intracellular 
metabolites rather than by secreted signal molecules; the peroxisome prolifera- 
tion-activated receptors (PPARs), for example, bind intracellular lipid metabolites 
and regulate the transcription of genes involved in lipid metabolism and fat-cell 
differentiation. It seems likely that the nuclear receptors for hormones evolved 
from such receptors for intracellular metabolites, which would help explain their 
intracellular location. 

Steroid hormones—which include cortisol, the steroid sex hormones, vita- 
min D (in vertebrates), and the molting hormone ecdysone (in insects)—are all 
made from cholesterol. Cortisol is produced in the cortex of the adrenal glands 
and influences the metabolism of many types of cells. The steroid sex hormones 
are made in the testes and ovaries and are responsible for the secondary sex 
characteristics that distinguish males from females. Vitamin D is synthesized in 
the skin in response to sunlight; after it has been converted to its active form in 
the liver or kidneys, it regulates Ca²⁺ metabolism, promoting Ca²⁺ uptake in the 
gut and reducing its excretion in the kidneys. The thyroid hormones, which are 
made from the amino acid tyrosine, act to increase the metabolic rate of many 
cell types, while the retinoids, such as retinoic acid, are made from vitamin A and 
have important roles as local mediators in vertebrate development. Although all 
of these signal molecules are relatively insoluble in water, they are made soluble 
for transport in the bloodstream and other extracellular fluids by binding to spe- 
cific carrier proteins, from which they dissociate before entering a target cell (see 
Figure 15-3B). 

The nuclear receptors bind to specific DNA sequences adjacent to the genes 
that the ligand regulates. Some of the receptors, such as those for cortisol, are 
located primarily in the cytosol and enter the nucleus only after ligand bind- 
ing; others, such as the thyroid and retinoid receptors, are bound to DNA in the 
nucleus even in the absence of ligand. In either case, the inactive receptors are 
usually bound to inhibitory protein complexes. Ligand binding alters the con- 
formation of the receptor protein, causing the inhibitory complex to dissociate, 
while also causing the receptor to bind coactivator proteins that stimulate gene 
transcription (Figure 15-65). In other cases, however, ligand binding to a nuclear 
receptor inhibits transcription: some thyroid hormone receptors, for example, act 
as transcriptional activators in the absence of their hormone and become tran- 
scriptional repressors when hormone binds. 

Thus far, we have focused on the control of gene expression by extracellular 
signal molecules produced by other cells. We now turn to gene regulation by a 
more global environmental signal: the cycle of light and darkness that results from 
the Earth’s rotation. 


Circadian Clocks Contain Negative Feedback Loops That Control 
Gene Expression 


Life on Earth evolved in the presence of a daily cycle of day and night, and many 
present-day organisms (ranging from archaea to plants and humans) possess an 
internal rhythm that dictates different behaviors at different times of day. These 
behaviors range from the cyclical change in metabolic enzyme activities of a bac- 
terium to the elaborate sleep-wake cycles of humans. The internal oscillators that 
control such diurnal rhythms are called circadian clocks. 

Figure 15-64 Some signal molecules 
that bind to intracellular receptors. Note 
that all of them are small and hydrophobic. 
The active, hydroxylated form of vitamin D3 
is shown. Estradiol and testosterone are 
steroid sex hormones. 

Having a circadian clock enables an organism to anticipate the regular daily 
changes in its environment and take appropriate action in advance. Of course, 
the internal clock cannot be perfectly accurate, and so it must be capable of being 
reset by external cues such as the light of day. Thus, circadian clocks keep running 
even when the environmental cues (changes in light and dark) are removed, but 
the period of this free-running rhythm is generally a little less or more than 24 
hours. External signals indicating the time of day cause small adjustments in 
the running of the clock, so as to keep the organism in synchrony with its envi- 
ronment. Following more drastic shifts, circadian cycles become gradually reset 
(entrained) by the new cycle of light and dark, as anyone who has experienced jet 
lag can attest. 
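
The difference between a free-running and an entrained clock can be illustrated with simple arithmetic. The sketch below assumes a hypothetical clock whose free-running period is 24.5 hours and a daily light cue that removes a fixed fraction of the accumulated timing error; the period, the correction fraction, and the function name are illustrative assumptions, not measured values.

```python
def phase_errors(days, free_running_period_h=24.5, correction_gain=0.0,
                 initial_error_h=0.0):
    """Return the clock's timing error (hours) at the end of each day."""
    errors = []
    error = initial_error_h
    for _ in range(days):
        error += free_running_period_h - 24.0  # daily drift of the free-running clock
        error -= correction_gain * error       # adjustment made by the light cue
        errors.append(error)
    return errors


free_running = phase_errors(10)                      # constant conditions, no cues
entrained = phase_errors(10, correction_gain=0.5)    # daily light cue
jet_lag = phase_errors(5, correction_gain=0.5, initial_error_h=8.0)

print("error after 10 days without cues:", free_running[-1])
print("error after 10 days with cues:   ", round(entrained[-1], 3))
print("recovery from an 8-hour shift:   ", [round(e, 2) for e in jet_lag])
```

Without cues the error grows by half an hour every day, whereas with a daily correction it stays bounded; after a large shift the error shrinks over several days rather than at once, which is the gradual resetting described above.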

We might expect that the circadian clock would be a complex multicellular 
device, with different groups of cells responsible for different parts of the oscil- 
lation mechanism. Remarkably, however, in almost all multicellular organisms, 
including humans, the timekeepers are individual cells. Thus, a clock that oper- 
ates in each member of a specialized group of brain cells (the SCN cells in the 
suprachiasmatic nucleus of the hypothalamus) controls our diurnal cycles of 
sleeping and waking, body temperature, and hormone release. Even if these cells 
are removed from the brain and dispersed in a culture dish, they will continue to 
oscillate individually, showing a cyclic pattern of gene expression with a period of 
approximately 24 hours. In the intact body, the SCN cells receive neural cues from 
the retina, entraining the SCN cells to the daily cycle of light and dark; they also 
send information about the time of day to another brain area, the pineal gland, 
which relays the time signal to the rest of the body by releasing the hormone mela- 
tonin in time with the clock. 

Although the SCN cells have a central role as timekeepers in mammals, almost 
all the other cells in the mammalian body have an internal circadian rhythm, 
which has the ability to reset in response to light. Similarly, in Drosophila, many 
different types of cells have a similar circadian clock, which continues to cycle 
when they have been dissected away from the rest of the fly and can be reset by 
externally imposed light and dark cycles. 

Figure 15-65 The activation of 
nuclear receptors. All nuclear receptors 
bind to DNA as either homodimers or 
heterodimers, but for simplicity we show 
them as monomers. (A) The receptors all 
have a related structure, which includes 
three major domains, as shown. An inactive 
receptor is bound to inhibitory proteins. 
(B) Typically, the binding of ligand to the 
receptor causes the ligand-binding domain 
of the receptor to clamp shut around the 
ligand, the inhibitory proteins to dissociate, 
and coactivator proteins to bind to the 
receptor’s transcription-activating domain, 
thereby increasing gene transcription. 
In other cases, ligand binding has the 
opposite effect, causing co-repressor 
proteins to bind to the receptor, thereby 
decreasing transcription (not shown). 
(C) The structure of the ligand-binding 
domain of the retinoic acid receptor is 
shown in the absence (left) and presence 
(middle) of ligand (shown in red). When 
ligand binds, the blue α helix acts as a 
lid that snaps shut, trapping the ligand in 
place. The shift in the conformation of the 
receptor upon ligand binding also creates a 
binding site for a small α helix (orange) on 
the surface of coactivator proteins. (PDB 
codes: 1LBD, 2ZYO, and 2ZXZ.) 

The working of circadian clocks, therefore, is a fundamental problem in cell 
biology. Although we do not yet understand all the details, studies in a wide vari- 
ety of organisms have revealed the basic principles and molecular components. 
The key principle is that circadian clocks generally depend on negative feedback 
loops. As discussed earlier, oscillations in the activity of an intracellular signaling 
protein can occur if that protein inhibits its own activity with a long delay (see 
Figure 15-18C and D). In Drosophila and many other animals, including humans, 
the heart of the circadian clock is a delayed negative feedback loop based on tran- 
scription regulators: accumulation of certain gene products switches off the tran- 
scription of their own genes, but with a delay, so that the cell oscillates between a 
state in which the products are present and transcription is switched off, and one 
in which the products are absent and transcription is switched on (Figure 15-66). 
The negative feedback underlying circadian rhythms does not have to be based on 
transcription regulators. In some cell types, the circadian clock is constructed of 
proteins that govern their own activities through post-translational mechanisms, 
as we discuss next. 
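
Why the delay matters can be seen in a minimal sketch of a gene product that represses its own synthesis. The model below is a generic delayed negative feedback equation, not a description of any real clock; the production rate, the steepness of repression, the delay, and the function names are all hypothetical, and time is in arbitrary units.

```python
def simulate_feedback(delay, total_time=100.0, dt=0.01,
                      max_rate=2.0, hill=4, decay=1.0):
    """Euler integration of dx/dt = max_rate / (1 + x(t - delay)**hill) - decay * x.

    x is the level of a product that represses its own synthesis; the
    repression acts on the level that was present `delay` time units earlier.
    """
    steps = int(round(total_time / dt))
    lag = int(round(delay / dt))
    x = [0.0] * (steps + 1)
    for step in range(steps):
        delayed = x[step - lag] if step >= lag else x[0]
        production = max_rate / (1.0 + delayed ** hill)
        x[step + 1] = x[step] + dt * (production - decay * x[step])
    return x


def late_swing(series, fraction=0.3):
    """Peak-to-trough range over the final part of the run."""
    tail = series[int(len(series) * (1.0 - fraction)):]
    return max(tail) - min(tail)


with_delay = simulate_feedback(delay=3.0)
without_delay = simulate_feedback(delay=0.0)
print("late swing with delayed feedback:  ", round(late_swing(with_delay), 2))
print("late swing with immediate feedback:", round(late_swing(without_delay), 6))
```

Under these assumptions the same feedback loop settles to a constant level when repression is immediate but keeps oscillating when repression lags behind synthesis, which is the behavior the delayed steps in a circadian clock are thought to provide.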





Three Proteins in a Test Tube Can Reconstitute a Cyanobacterial 
Circadian Clock 


The best understood circadian clock is found in the photosynthetic cyanobacte- 
rium, Synechococcus elongatus. The core oscillator in this organism is remarkably 
simple, being composed of just three proteins—KaiA, KaiB, and KaiC. The cen- 
tral player is KaiC, a multifunctional enzyme that catalyzes its own phosphoryla- 
tion and dephosphorylation in a 24-hour cycle: it gradually phosphorylates itself 
sequentially at two sites during the day and dephosphorylates itself during the 
night. This timing depends on interactions with the two other Kai proteins: KaiA 
binds to unphosphorylated KaiC and stimulates KaiC autophosphorylation, 
first at one site and then, with a delay, at the other. The second phosphorylation 
promotes the binding of the third protein, KaiB, which blocks the stimulatory 
effect of KaiA and thereby allows KaiC to dephosphorylate itself, bringing KaiC 
back to its dephosphorylated state. This clock depends on a negative feedback 
loop: KaiC drives its own phosphorylation until, after a delay, it recruits an inhib- 
itor, KaiB, that stimulates KaiC to dephosphorylate itself. Amazingly, when the 
three Kai proteins are purified and incubated in a test tube with ATP, KaiC phos- 
phorylation and dephosphorylation occur with roughly 24-hour timing over a 
period of several days (Figure 15-67). 

Circadian oscillations in KaiC phosphorylation lead to parallel rhythms in the 
expression of large numbers of genes involved in controlling metabolic activities 
and cell division (see Figure 15-67). As a result, many aspects of cell behavior are 
synchronized with the circadian cycle. 

Even in continuous darkness, cyanobacterial cells generate free-running oscil- 
lations of KaiC phosphorylation with roughly 24-hour periods. As in other circa- 
dian clocks, the cyanobacterial clock is entrained by the environmental light/dark 


Figure 15-66 Simplified outline of the 
mechanism of the circadian clock in 
Drosophila cells. A central feature of 
the clock is the periodic accumulation 
and decay of two transcription regulatory 
proteins, Tim (short for timeless, based 
on the phenotype of a gene mutation) 
and Per (short for period). The mRNAs 
encoding these proteins rise gradually 
during the day and are translated in the 
cytosol, where the two proteins associate 
to form a heterodimer. After a time delay, 
the heterodimer dissociates and Tim and 
Per are transported into the nucleus, where 
Per represses the Tim and Per genes, 
resulting in negative feedback that causes 
the levels of Tim and Per to fall. In addition 
to this transcriptional feedback, the clock 
depends on numerous other proteins. For 
example, the controlled degradation of Per 
indicated in the diagram imposes delays 
in the accumulation of Tim and Per, which 
are crucial to the functioning of the clock. 
Steps at which specific delays are imposed 
are shown in red. 

Entrainment (or resetting) of the clock 
occurs in response to new light-dark 
cycles. Although most Drosophila cells 
do not have true photoreceptors, light is 
sensed by intracellular flavoproteins, also 
called cryptochromes. In the presence of 
light, these proteins associate with the 
Tim protein and cause its degradation, 
thereby resetting the clock. (Adapted from 
J.C. Dunlap, Science 311:184-186, 2006.) 

cycle. Light is thought to affect the circadian clock indirectly: the activities of Kai 
proteins are influenced by changes in intracellular redox potential, which occur 
as a result of increased photosynthetic activity during the day. 


Summary 


Some signaling pathways that are especially important in animal development 
depend on proteolysis to control the activity and location of latent transcription reg- 
ulatory proteins. Notch receptors are themselves such proteins, which are activated 
by cleavage when Delta on another cell binds to them; the cleaved cytosolic tail 
of Notch migrates into the nucleus, where it stimulates the transcription of Notch- 
responsive genes. In the Wnt/β-catenin signaling pathway, by contrast, the prote- 
olysis of the latent transcription regulatory protein β-catenin is inhibited when a 
secreted Wnt protein binds to both a Frizzled and LRP receptor protein; as a result, 
β-catenin accumulates in the nucleus and activates the transcription of Wnt target 
genes. 

Hedgehog signaling in flies works much like Wnt signaling. In the absence of 
a signal, a bifunctional, cytoplasmic transcription regulator, Ci, is proteolyti- 
cally cleaved to form a transcriptional repressor that keeps Hedgehog target genes 
silenced. The binding of Hedgehog to its receptors (Patched and iHog) inhibits the 
proteolytic processing of Ci; as a result, the intact Ci protein accumulates in the 
nucleus and activates the transcription of Hedgehog-responsive genes. In Notch, 
Wnt, and Hedgehog signaling, the extracellular signal triggers a switch from tran- 
scriptional repression to transcriptional activation. 

Signaling through the latent transcription regulator NFκB also depends on 
proteolysis. NFκB proteins are normally held in an inactive state by inhibitory IκB 
proteins in the cytoplasm. A variety of extracellular stimuli, including proinflam- 
matory cytokines, trigger the phosphorylation and ubiquitylation of IκB, marking 
it for degradation; this enables NFκB to translocate to the nucleus and activate 

Figure 15-67 The core circadian 
oscillator of cyanobacteria. (A) KaiC is 
a combined kinase and phosphatase that 
phosphorylates and dephosphorylates itself 
on two adjacent sites. In the absence of 
other proteins, the phosphatase activity 
is dominant, and the protein is mostly 
unphosphorylated. The binding of KaiA 
to KaiC suppresses the phosphatase 
activity and promotes the kinase activity, 
leading to KaiC phosphorylation, first 
at site 1 and then at site 2, resulting 
in diphosphorylated KaiC. KaiC then 
dephosphorylates itself slowly at site 1, 
even in the presence of KaiA, so that KaiC 
is phosphorylated only at site 2. This form 
of KaiC interacts with KaiB, which blocks 
the stimulatory effects of KaiA, thereby 
reducing the rate of KaiC phosphorylation 
and allowing dephosphorylation to occur. 
Diphosphorylated KaiC increases in 
abundance during the day and peaks 
around dusk. It activates other proteins 
that phosphorylate a transcription 
regulator (RpaA), which then stimulates 
expression of some genes (the dusk genes 
that peak in early evening) and inhibits 
expression of other genes (the dawn genes 
that peak in the morning). When KaiC 
dephosphorylation gradually occurs during 
the night, these effects are reversed: dusk 
genes are turned off and dawn genes are 
turned on. 

(B) In this experiment, the three Kai 
proteins were purified and mixed in a test 
tube with ATP (which is required for KaiC 
kinase activity). Every two hours over the 
next 3 days, the KaiC protein was analyzed 
by polyacrylamide gel electrophoresis, 
in which the phosphorylated form of 
KaiC migrates more slowly (upper band, 
P-KaiC) than the nonphosphorylated 
form (lower band, NP-KaiC). The three 
different phosphorylated forms of KaiC 
are not distinguished by this method. The 
phosphorylation of KaiC oscillates with a 
roughly 24-hour period. (C) The amount 
of phosphorylated and unphosphorylated 
KaiC in the experiment in B is plotted on 
this graph, along with the amount of total 
protein. (B and C, from M. Nakajima et 
al., Science 308:414-415, 2005. With 
permission from AAAS.) 

the transcription of its target genes. NFκB also activates the transcription of the gene 
that encodes IκBα, creating a negative feedback loop, which can produce prolonged 
oscillations in NFκB activity with sustained extracellular signaling. 

Some small, hydrophobic signal molecules, including steroid and thyroid hor- 
mones, diffuse across the plasma membrane of the target cell and activate intracel- 
lular receptor proteins that directly regulate the transcription of specific genes. 

In many cell types, gene expression is governed by circadian clocks, in which 
delayed negative feedback produces 24-hour oscillations in the activities of tran- 
scription regulators, anticipating the cell's changing needs during the day and night. 


SIGNALING IN PLANTS 


In plants, as in animals, cells are in constant communication with one another. 
Plant cells communicate to coordinate their activities in response to the changing 
conditions of light, dark, and temperature, which guide the plant’s cycle of growth, 
flowering, and fruiting. Plant cells also communicate to coordinate activities in 
their roots, stems, and leaves. In this final section, we consider how plant cells sig- 
nal to one another and how they respond to light. Less is known about the recep- 
tors and intracellular signaling mechanisms involved in cell communication in 
plants than is known in animals, and we will concentrate mainly on how the recep- 
tors and intracellular signaling mechanisms differ from those used by animals. 


Multicellularity and Cell Communication Evolved Independently in 
Plants and Animals 


Although plants and animals are both eukaryotes, they have evolved separately 
for more than a billion years. Their last common ancestor is thought to have been 
a unicellular eukaryote that had mitochondria but no chloroplasts; the plant lin- 
eage acquired chloroplasts after plants and animals diverged. The earliest fossils 
of multicellular animals and plants date from almost 600 million years ago. Thus, 
it seems that plants and animals evolved multicellularity independently, each 
starting from a different unicellular eukaryote, some time between 1.6 and 0.6 bil- 
lion years ago (Figure 15-68). 

If multicellularity evolved independently in plants and animals, the molecules 
and mechanisms used for cell communication will have evolved separately and 
would be expected to be different. There should be some degree of resemblance, 
however, because the genes in both plants and animals diverged from those con- 
tained by their last common unicellular ancestor. Thus, whereas both plants and 
animals use nitric oxide, cyclic GMP, Ca²⁺, and Rho family GTPases for signal- 
ing, there are no homologs of the nuclear receptor family, Ras, JAK, STAT, TGFβ, 

Figure 15-68 The proposed divergence 
of plant and animal lineages from a 
common unicellular eukaryotic ancestor. 
The plant lineage acquired chloroplasts 
after the two lineages diverged. Both 
lineages independently gave rise to 
multicellular organisms — plants and 
animals. (Paintings courtesy of John Innes 
Foundation.) 







Notch, Wnt, or Hedgehog encoded by the completely sequenced genome of Ara- 
bidopsis thaliana, the small flowering plant. Similarly, plants do not seem to use 
cyclic AMP for intracellular signaling. Nevertheless, the general strategies under- 
lying signaling are frequently very similar in plants and animals. Both, for exam- 
ple, use enzyme-coupled cell-surface receptors, as we now discuss. 


Receptor Serine/Threonine Kinases Are the Largest Class 
of Cell-Surface Receptors in Plants 


Most cell-surface receptors in plants are enzyme-coupled. However, whereas 
the largest class of enzyme-coupled receptors in animals is the receptor tyro- 
sine kinase (RTK) class, this type of receptor is extremely rare in plants. Instead, 
plants rely largely on a great diversity of transmembrane receptor serine/threonine 
kinases, which have a typical serine/threonine kinase cytoplasmic domain and an 
extracellular ligand-binding domain. The most abundant types of these receptors 
have a tandem array of extracellular leucine-rich repeat structures and are there- 
fore called leucine-rich repeat (LRR) receptor kinases. 

There are about 175 LRR receptor kinases encoded by the Arabidopsis genome. 
These include a protein called Bri1, which forms part of a cell-surface steroid 
hormone receptor. Plants synthesize a class of steroids that are called brassino- 
steroids because they were originally identified in the mustard family Brassica- 
ceae, which includes Arabidopsis. These signal molecules regulate the growth and 
differentiation of plants throughout their life cycle. Binding of a brassinosteroid 
to a Bri1 cell-surface receptor kinase initiates an intracellular signaling cascade 
that uses a GSK3 protein kinase and a protein phosphatase to regulate the phos- 
phorylation and degradation of specific transcription regulatory proteins in the 
nucleus, and thereby specific gene transcription. Mutant plants that are deficient 
in the Bri1 receptor kinase are insensitive to brassinosteroids and are therefore 
dwarfs. 

The LRR receptor kinases are only one of many classes of transmembrane 
receptor serine/threonine kinases in plants. There are at least six additional fami- 
lies, each with its own characteristic set of extracellular domains. The lectin recep- 
tor kinases, for example, have extracellular domains that bind carbohydrate signal 
molecules. The Arabidopsis genome encodes over 300 receptor serine/threonine 
kinases, which makes them the largest family of receptors known in plants. Many 
are involved in defense responses against pathogens. 


Ethylene Blocks the Degradation of Specific Transcription 
Regulatory Proteins in the Nucleus 


Various plant growth regulators (also called plant hormones) help to coordinate 
plant development. They include ethylene, auxin, cytokinins, gibberellins, and 
abscisic acid, as well as brassinosteroids. Growth regulators are all small mole- 
cules made by most plant cells. They diffuse readily through cell walls and can 
either act locally or be transported to influence cells further away. Each growth 
regulator can have multiple effects. The specific effect depends on environmental 
conditions, the nutritional state of the plant, the responsiveness of the target cells, 
and which other growth regulators are acting. 

Ethylene is an important example. This small gas molecule (Figure 15-69A) 
can influence plant development in various ways; it can, for example, promote 
fruit ripening, leaf abscission, and plant senescence. It also functions as a stress 
signal in response to wounding, infection, flooding, and so on. When the shoot of 
a germinating seedling, for instance, encounters an obstacle, ethylene promotes 
a complex response that allows the seedling to safely bypass the obstacle (Figure 
15-69B and C). 

Plants have various ethylene receptors, which are located in the endoplasmic 
reticulum and are all structurally related. They are dimeric, multipass transmem- 
brane proteins, with a copper-containing ethylene-binding domain and a domain 
that interacts with a cytoplasmic protein called CTR1, which is closely related 

Figure 15-69 The ethylene-mediated 
triple response that occurs when the 
growing shoot of a germinating seedling 
encounters an obstacle underground. 
(A) The structure of ethylene. (B) In the 
absence of obstacles, the shoot grows 
upward and is long and thin. (C) If the 
shoot encounters an obstacle, such as 
a piece of gravel in the soil, the seedling 
responds to the encounter in three ways. 
First, it thickens its stem, which can then 
exert more force on the obstacle. Second, 
it shields the tip of the shoot (at top) by 
increasing the curvature of a specialized 
hook structure. Third, it reduces the shoot’s 
tendency to grow away from the direction 
of gravity, so as to avoid the obstacle. 
(Courtesy of Melanie Webb.) 

in sequence to the Raf MAP kinase kinase kinase discussed earlier (see Figure 
15-49). Surprisingly, it is the empty receptors that are active and keep CTR1 active. 
By an unknown signaling mechanism, active CTR1 stimulates the ubiquitylation 
and degradation in proteasomes of a nuclear transcription regulator called EIN3, 
which is required for the transcription of ethylene-responsive genes. In this way, 
the empty but active receptors keep ethylene-response genes off. Ethylene bind- 
ing inactivates the receptors, altering their conformation so that they no longer 
activate CTR1. The EIN3 protein is no longer ubiquitylated and degraded and can 
now activate the transcription of the large number of ethylene-responsive genes 
(Figure 15-70). 
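
Because the pathway runs through two successive inhibitory steps, its logic is easy to misread. The sketch below restates the description above as a chain of on/off switches. It is a deliberate simplification that assumes each component is either fully active or fully inactive; the function name and the `ctr1_functional` parameter are hypothetical and are introduced only to show what this simplified logic predicts when CTR1 is missing.

```python
def ethylene_genes_on(ethylene_bound, ctr1_functional=True):
    """Boolean restatement of the ethylene pathway described in the text.

    Simplifying assumption: every step is treated as a strict on/off switch.
    """
    # Empty receptors are the active ones; ethylene binding inactivates them.
    receptor_active = not ethylene_bound
    # Active receptors keep CTR1 active (if a functional CTR1 is present).
    ctr1_active = receptor_active and ctr1_functional
    # Active CTR1 leads to ubiquitylation and degradation of EIN3.
    ein3_present = not ctr1_active
    # EIN3 is required for transcription of ethylene-responsive genes.
    return ein3_present


if __name__ == "__main__":
    for bound in (False, True):
        print("ethylene bound:", bound, "-> genes on:", ethylene_genes_on(bound))
```

In this simplified scheme, removing CTR1 (`ctr1_functional=False`) leaves the genes on whether or not ethylene is present, which is the behavior expected when a negative regulator is lost.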


Regulated Positioning of Auxin Transporters Patterns Plant Growth 


The plant hormone auxin, which is generally indole-3-acetic acid (Figure 
15-71A), binds to receptor proteins in the nucleus. It helps plants grow toward 
light, grow upward rather than branch out, and grow their roots downward. It also 
regulates organ initiation and positioning and helps plants flower and bear fruit. 
Like ethylene (and like some of the animal signal molecules we have described 
in this chapter), auxin influences gene expression by controlling the degradation 
of transcription regulators. It works by stimulating the ubiquitylation and degra- 
dation of repressor proteins that block the transcription of auxin target genes in 
unstimulated cells (Figure 15-71B and C). 

Auxin is unique in the way that it is transported. Unlike animal hormones, 
which are usually secreted by a specific endocrine organ and transported to target 

Figure 15-70 The ethylene signaling pathway. (A) In the absence of ethylene, the receptors and 
CTR1 are active, causing the ubiquitylation and destruction of EIN3, the transcription regulatory 
protein in the nucleus that is responsible for the transcription of ethylene-responsive genes. (B) The 
binding of ethylene inactivates the receptors and disrupts the activation of CTR1. The EIN3 protein 
is not degraded and can therefore activate the transcription of ethylene-responsive genes. 

cells via the circulatory system, auxin has its own transport system. Specific 
plasma-membrane-bound influx transporter proteins and efflux transporter proteins 
move auxin into and out of plant cells, respectively. The efflux transporters can be 
distributed asymmetrically in the plasma membrane to make the efflux of auxin 
directional. A row of cells with their auxin efflux transporters confined to the basal 
plasma membrane, for example, will transport auxin from the top of the plant to 
the bottom. 
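
How an asymmetric placement of efflux transporters turns into directional flow can be seen in a very small numerical sketch. It assumes a single vertical file of cells in which, at every time step, each cell exports a fixed fraction of its auxin through its basal membrane into the cell below. The function names, the number of cells, and the `efflux_fraction` value are all hypothetical, and influx transporters, diffusion, synthesis, and degradation are ignored.

```python
def transport_step(cells, efflux_fraction=0.2):
    """Move a fixed fraction of each cell's auxin into the cell beneath it.

    cells[0] is the top of the file; the bottom cell has nowhere to export to.
    """
    moved = [efflux_fraction * amount for amount in cells[:-1]]
    updated = list(cells)
    for i, m in enumerate(moved):
        updated[i] -= m        # leaves through the basal membrane
        updated[i + 1] += m    # taken up by the next cell down
    return updated


def simulate_auxin_file(n_cells=5, steps=100, efflux_fraction=0.2):
    """Start with all auxin in the top cell and apply repeated transport steps."""
    cells = [1.0] + [0.0] * (n_cells - 1)
    for _ in range(steps):
        cells = transport_step(cells, efflux_fraction)
    return cells


if __name__ == "__main__":
    print(simulate_auxin_file())
```

Under these assumptions the total amount of auxin is conserved while nearly all of it ends up in the bottom cell, mirroring the top-to-bottom transport described above.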

In some regions of the plant, the localization of the auxin transporters, and 
therefore the direction of auxin flow, is highly dynamic and regulated. A cell can 
rapidly redistribute transporters by controlling the traffic of vesicles containing 
them. The auxin efflux transporters, for example, normally recycle between intra- 
cellular vesicles and the plasma membrane. A cell can redistribute these trans- 
porters on its surface by inhibiting their endocytosis in one domain of the plasma 
membrane, causing the transporters to accumulate there. One example occurs in 
the root, where gravity influences the direction of growth. The auxin efflux trans- 
porters are normally distributed symmetrically in the cap cells of the root. Within 
minutes of a change in the direction of the gravity vector, however, the efflux trans- 
porters redistribute to one side of the cells, so that auxin is pumped out toward the 
side of the root pointing downward. Because auxin inhibits root-cell elongation, 
this redirection of auxin transport causes the root tip to reorient, so that it grows 
downward again (Figure 15-72). 


Phytochromes Detect Red Light, and Cryptochromes Detect Blue 
Light 


Plant development is greatly influenced by environmental conditions. Unlike 
animals, plants cannot move when conditions become unfavorable; they have 
to adapt or they die. The most important environmental influence on plants is 

Figure 15-72 Auxin transport and root gravitropism. (A-C) Roots respond to a 90° change 
in the gravity vector and adjust their direction of growth so that they grow downward again. The 
cells that respond to gravity are in the center of the root cap, while it is the epidermal cells further 
back (on the lower side) that decrease their rate of elongation to restore downward growth. (D) The 
gravity-responsive cells in the root cap redistribute their auxin efflux transporters in response to 
the displacement of the root. This redirects the auxin flux mainly to the lower part of the displaced 
root, where it inhibits the elongation of the epidermal cells. The resulting asymmetrical distribution 
of auxin in the Arabidopsis root tip shown here is assessed indirectly, using an auxin-responsive 
reporter gene that encodes a protein fused to green fluorescent protein (GFP); the epidermal cells 
on the downward side of the root are green, whereas those on the upper side are not, reflecting 
the asymmetrical distribution of auxin. The distribution of auxin efflux transporters in the plasma 
membrane of cells in different regions of the root (shown as gray rectangles) is indicated in red, and 
the direction of auxin efflux is indicated by a green arrow. (The fluorescence photograph in D is from 
T. Paciorek et al., Nature 435:1251-1256, 2005. With permission from Macmillan Publishers Ltd.) 


light, which is their energy source and has a major role throughout their entire life 
cycle—from germination, through seedling development, to flowering and senes- 
cence. Plants have thus evolved a large set of light-sensitive proteins to monitor 
the quantity, quality, direction, and duration of light. These are usually referred 
to as photoreceptors. However, because the term photoreceptor is also used for 
light-sensitive cells in the animal retina (see Figure 15-38), we shall use the term 
photoprotein instead. 

All photoproteins sense light by means of a covalently attached light-absorb- 
ing chromophore, which changes its shape in response to light and then induces a 
change in the protein’s conformation. The best-known plant photoproteins are the 
phytochromes, which are present in all plants and in some algae but are absent in 
animals. These are dimeric, cytoplasmic serine/threonine kinases, which respond 
differentially and reversibly to red and far-red light: whereas red light usually acti- 
vates the kinase activity of the phytochrome, far-red light inactivates it. When acti- 
vated by red light, the phytochrome is thought to phosphorylate itself and then 
to phosphorylate one or more other proteins in the cell. In some light responses, 
the activated phytochrome translocates into the nucleus, where it activates tran- 
scription regulators to alter gene transcription (Figure 15-73). In other cases, the 
activated phytochrome activates a latent transcription regulator in the cytoplasm, 
which then translocates into the nucleus to regulate gene transcription. In still 
other cases, the photoprotein triggers signaling pathways in the cytosol that alter 
the cell’s behavior without involving the nucleus. 

Plants sense blue light using photoproteins of two other sorts, phototropin 
and cryptochromes. Phototropin is associated with the plasma membrane and is 
partly responsible for phototropism, the tendency of plants to grow toward light. 
Phototropism occurs by directional cell elongation, which is stimulated by auxin, 
but the links between phototropin and auxin are unknown. 

Cryptochromes are flavoproteins that are sensitive to blue light. They are 
structurally related to blue-light-sensitive enzymes called photolyases, which 
are involved in the repair of ultraviolet-induced DNA damage in all organisms, 
except most mammals. Unlike phytochromes, cryptochromes are also found in 
animals, where they have an important role in circadian clocks (see Figure 15-66). 
Although cryptochromes are thought to have evolved from the photolyases, they 
do not have a role in DNA repair. 


Summary 


Plants and animals are thought to have evolved multicellularity and cell com- 
munication mechanisms independently, each starting from a different unicellular 
eukaryote, which in turn evolved from a common unicellular eukaryotic ancestor. 
Not surprisingly, therefore, the mechanisms used to signal between cells in animals 
and in plants have both similarities and differences. Whereas animals rely heav- 
ily on GPCRs and RTKs, plants rely mainly on enzyme-coupled receptors of the 
receptor serine/threonine kinase type, especially ones with extracellular leucine- 
rich repeats. Various plant hormones, or growth regulators, including ethylene 
and auxin, help coordinate plant development. Ethylene acts through intracellular 
receptors to stop the degradation of specific nuclear transcription regulators, which 
can then activate the transcription of ethylene-responsive genes. The receptors for 
some other plant hormones, including auxin, also regulate the degradation of spe- 
cific transcription regulators, although the details vary. Auxin signaling is unusual 
in that it has its own highly regulated transport system, in which the dynamic posi- 
tioning of plasma-membrane-bound auxin transporters controls the direction of 
auxin flow and thereby the direction of plant growth. Light has an important role 
in regulating plant development. These light responses are mediated by a variety of 
light-sensitive photoproteins, including phytochromes, which are responsive to red 
light, and cryptochromes and phototropin, which are sensitive to blue light. 

Figure 15-73 One way in which 
phytochromes mediate a light response 
in plant cells. When activated by red 
light, the phytochrome, which is a dimeric 
protein kinase, phosphorylates itself and 
then moves into the nucleus, where it 
activates transcription regulatory proteins 
to stimulate the transcription of red-light- 
responsive genes. 


WHAT WE DON’T KNOW 


• How does a cell integrate the 
information received from its many 
different cell-surface receptors to 
make all-or-none decisions? 


• Much of what we know about cell 
signaling comes from biochemical 
studies of isolated proteins in test 
tubes. What is the precise quantitative 
behavior of intracellular signaling 
networks in an intact cell, or in an 
intact animal, where countless other 
signals and cell components might 
influence signaling specificity and 
strength? 


• How do intracellular signaling 
circuits generate specific and dynamic 
signaling patterns such as oscillations 
and waves, and how are these 
patterns sensed and interpreted by 
the cell? 


• Scaffold proteins and activated 
receptor tyrosine kinases nucleate 
the assembly of large intracellular 
signaling complexes. What is the 
dynamic behavior of these complexes, 
and how does this behavior influence 
downstream signaling? 


PROBLEMS 


Which statements are true? Explain why or why not. 


15-1 All second messengers are water-soluble and dif- 
fuse freely through the cytosol. 


15-2 In the regulation of molecular switches, protein 
kinases and guanine nucleotide exchange factors (GEFs) 
always turn proteins on, whereas protein phosphatases 
and GTPase-activating proteins (GAPs) always turn pro- 
teins off. 


15-3 Most intracellular signaling pathways provide 
numerous opportunities for amplifying the responses to 
extracellular signals. 


15-4 Binding of extracellular ligands to receptor tyro- 
sine kinases (RTKs) activates the intracellular catalytic 
domain by propagating a conformational change across 
the lipid bilayer through a single transmembrane α helix. 


15-5 Protein tyrosine phosphatases display exquisite 
specificity for their substrates, unlike most serine/thre- 
onine protein phosphatases, which have rather broad 
specificity. 


15-6 Even though plants and animals independently 
evolved multicellularity, they use virtually all the same sig- 
naling proteins and second messengers for cell-cell com- 
munication. 


Discuss the following problems. 


15-7 Suppose that the circulating concentration of hormone 
is $10^{-10}$ M and the $K_d$ for binding to its receptor is $10^{-8}$ 
M. What fraction of the receptors will have hormone bound? 
If a meaningful physiological response occurs when 50% of 
the receptors have bound a hormone molecule, how much 
will the concentration of hormone have to rise to elicit a 
response? The fraction of receptors (R) bound to hormone 
(H) to form a receptor-hormone complex (R-H) is 
$[\mathrm{R\text{-}H}]/([\mathrm{R}] + [\mathrm{R\text{-}H}]) = [\mathrm{R\text{-}H}]/[\mathrm{R}]_{\mathrm{TOT}} = [\mathrm{H}]/([\mathrm{H}] + K_d)$. 
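
The binding relationship given in this problem, fraction bound = [H]/([H] + Kd), is simple enough to evaluate directly. The sketch below is one way to set up the arithmetic; the function names are hypothetical, and concentrations are assumed to be expressed in molar units throughout.

```python
def fraction_bound(hormone_conc, kd):
    """Fraction of receptors occupied: [H] / ([H] + Kd)."""
    return hormone_conc / (hormone_conc + kd)


def concentration_for_fraction(fraction, kd):
    """Invert the same relationship: [H] = f * Kd / (1 - f), for 0 <= f < 1."""
    return fraction * kd / (1.0 - fraction)


if __name__ == "__main__":
    kd = 1e-8            # M, from the problem statement
    circulating = 1e-10  # M, from the problem statement

    occupied = fraction_bound(circulating, kd)
    needed = concentration_for_fraction(0.5, kd)

    print("fraction bound at circulating level:", occupied)
    print("concentration for 50% occupancy (M):", needed)
    print("fold increase required:", needed / circulating)
```

Rearranging the formula shows that half of the receptors are occupied exactly when [H] equals Kd, so the required rise depends only on how far the circulating concentration lies below Kd.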


15-8 Cells communicate in ways that resemble human 
communication. Decide which of the following forms of 
human communication are analogous to autocrine, para- 
crine, endocrine, and synaptic signaling by cells. 

A. A telephone conversation 

B. Talking to people at a cocktail party 

C. A radio announcement 

D. Talking to yourself 


15-9 Why do signaling responses that involve changes 
in proteins already present in the cell occur in millisec- 
onds to seconds, whereas responses that require changes 
in gene expression require minutes to hours? 


15-10 How is it that different cells can respond in differ- 
ent ways to exactly the same signaling molecule even when 
they have identical receptors? 


15-11 Why do you suppose that phosphorylation/ 
dephosphorylation, as opposed to allosteric binding of 
small molecules, for example, has evolved to play such a 
prominent role in switching proteins on and off in signal- 
ing pathways? 


15-12 Consider a signaling pathway that proceeds 
through three protein kinases that are sequentially acti- 
vated by phosphorylation. In one case, the kinases are 
held in a signaling complex by a scaffolding protein; in 
the other, the kinases are freely diffusible (Figure Q15-1). 
Discuss the properties of these two types of organization 
in terms of signal amplification, speed, and potential for 
cross-talk between signaling pathways. 


Figure Q15-1 A kinase 
cascade organized by 
a scaffolding protein 

or composed of freely 
diffusing components 
(Problem 15-12). 


15-13 Describe three ways in which a gradual increase in 
an extracellular signal can be sharpened by the target cell 
to produce an abrupt or nearly all-or-none response. 


15-14 Activation (“maturation”) of frog oocytes is sig- 
naled through a MAP kinase signaling module. An increase 
in the hormone progesterone triggers the module by stimulating 
the translation of the mRNA encoding Mos, which is the frog’s 
MAP kinase kinase kinase (Figure Q15-2). Maturation is 
easy to score visually by the presence of a white spot in 
the middle of the brown surface of the oocyte (see Figure 
Q15-2). To determine the dose-response curve for pro- 
gesterone-induced activation of MAP kinase, you place 16 
oocytes in each of six plastic dishes and add various con- 
centrations of progesterone. After an overnight incubation, 
you crush the oocytes, prepare an extract, and determine 
the state of MAP kinase phosphorylation (hence, activa- 
tion) by SDS polyacrylamide-gel electrophoresis (Figure 
Q15-3A). This analysis shows a graded response of MAP 
kinase to increasing concentrations of progesterone. 


Figure Q15-2 Progesterone-induced 
MAP kinase activation, leading to oocyte 
maturation (Problem 15-14). (Courtesy 
of Helfrid Hochegger.) 

Figure Q15-3 Activation 
of frog oocytes 
(Problem 15-14). 

Before you crushed the oocytes, you noticed that not all 
oocytes in individual dishes had white spots. Had some 
oocytes undergone partial activation and not yet reached 
the white-spot stage? To answer this question, you repeat 
the experiment, but this time you analyze MAP kinase acti- 
vation in individual oocytes. You are surprised to find that 
each oocyte has either a fully activated or a completely 
inactive MAP kinase (Figure Q15-3B). How can an all-or- 
none response in individual oocytes give rise to a graded 
response in the population? 


15-15 Propose specific types of mutations in the gene for 
the regulatory subunit of cyclic-AMP-dependent protein 
kinase (PKA) that could lead to either a permanently active 
PKA or a permanently inactive PKA. 


15-16 Phosphorylase kinase integrates signals from the 
cyclic-AMP-dependent and Ca²⁺-dependent signaling 
pathways that control glycogen breakdown in liver and 
muscle cells (Figure Q15-4). Phosphorylase kinase is com- 
posed of four subunits. One is the protein kinase that cata- 
lyzes the addition of phosphate to glycogen phosphorylase 
to activate it for glycogen breakdown. The other three sub- 
units are regulatory proteins that control the activity of the 

Figure Q15-4 Integration of cyclic-AMP-dependent and Ca²⁺- 
dependent signaling pathways by phosphorylase kinase in liver and 
muscle cells (Problem 15-16). 


catalytic subunit. Two contain sites for phosphorylation by 
PKA, which is activated by cyclic AMP. The remaining sub- 
unit is calmodulin, which binds Ca²⁺ when the cytosolic 
Ca²⁺ concentration rises. The regulatory subunits control 
the equilibrium between the active and inactive confor- 
mations of the catalytic subunit, with each phosphate and 
Ca²⁺ nudging the equilibrium toward the active confor- 
mation. How does this arrangement allow phosphorylase 
kinase to serve its role as an integrator protein for the mul- 
tiple pathways that stimulate glycogen breakdown? 


15-17 The Wnt planar polarity signaling pathway nor- 
mally ensures that each wing cell in Drosophila has a sin- 
gle hair. Overexpression of the Frizzled gene from a heat- 
shock promoter (hs-Fz) causes multiple hairs to grow from 
many cells (Figure Q15-5A). This phenotype is suppressed 
if hs-Fz is combined with a heterozygous deletion (DshΔ) 
of the Dishevelled gene (Figure Q15-5B). Do these results 
allow you to order the action of Frizzled and Dishevelled 
in the signaling pathway? If so, what is the order? Explain 
your reasoning. 


(A) hs-Fz/+, +/+     (B) hs-Fz/+, DshΔ/+ 


Figure Q15-5 Pattern of hair growth on wing cells in genetically 
different Drosophila (Problem 15-17). (From C.G. Winter et al., Cell 
105:81-91, 2001. With permission from Elsevier.) 


The Cytoskeleton 


For cells to function properly, they must organize themselves in space and inter- 
act mechanically with each other and with their environment. They have to be 
correctly shaped, physically robust, and properly structured internally. Many 
have to change their shape and move from place to place. All cells have to be able 
to rearrange their internal components as they grow, divide, and adapt to chang- 
ing circumstances. These spatial and mechanical functions depend on a remark- 
able system of filaments called the cytoskeleton (Figure 16-1). 

The cytoskeleton’s varied functions depend on the behavior of three families 
of protein filaments—actin filaments, microtubules, and intermediate filaments. 
Each type of filament has distinct mechanical properties, dynamics, and biolog- 
ical roles, but all share certain fundamental features. Just as we require our lig- 
aments, bones, and muscles to work together, so all three cytoskeletal filament 
systems must normally function collectively to give a cell its strength, its shape, 
and its ability to move. 

In this chapter, we describe the function and conservation of the three main 
filament systems. We explain the basic principles underlying filament assembly 
and disassembly, and how other proteins interact with the filaments to alter their 
dynamics, enabling the cell to establish and maintain internal order, to shape and 
remodel its surface, and to move organelles in a directed manner from one place 
to another. Finally, we discuss how the integration and regulation of the cytoskel- 
eton allows a cell to move to new locations. 


FUNCTION AND ORIGIN OF THE CYTOSKELETON 


The three major cytoskeletal filaments are responsible for different aspects of the 
cell’s spatial organization and mechanical properties. Actin filaments determine 
the shape of the cell’s surface and are necessary for whole-cell locomotion; they 
also drive the pinching of one cell into two. Microtubules determine the positions 
of membrane-enclosed organelles, direct intracellular transport, and form the 
mitotic spindle that segregates chromosomes during cell division. Intermediate 
filaments provide mechanical strength. All of these cytoskeletal filaments interact 
with hundreds of accessory proteins that regulate and link the filaments to other 
cell components, as well as to each other. The accessory proteins are essential for 
the controlled assembly of the cytoskeletal filaments in particular locations, and 
they include the motor proteins, remarkable molecular machines that convert the 
energy of ATP hydrolysis into mechanical force that can either move organelles 
along the filaments or move the filaments themselves. 

In this section, we discuss the general features of the proteins that make up 
the filaments of the cytoskeleton. We focus on their ability to form intrinsically 


Figure 16-1 The cytoskeleton. (A) A cell in culture has been fixed and 
labeled to show its cytoplasmic arrays of microtubules (green) and actin 
filaments (red). (B) This dividing cell has been labeled to show its spindle 
microtubules (green) and surrounding cage of intermediate filaments (red). 
The DNA in both cells is labeled in blue. (A, courtesy of Albert Tousson; 

B, courtesy of Conly Rieder.) 
polarized and self-organized structures that are highly dynamic, allowing the cell 
to rapidly modify cytoskeletal structure and function under different conditions. 


Cytoskeletal Filaments Adapt to Form Dynamic or Stable 
Structures 


Cytoskeletal systems are dynamic and adaptable, organized more like ant trails 
than interstate highways. A single trail of ants may persist for many hours, extend- 
ing from the ant nest to a delectable picnic site, but the individual ants within the 
trail are anything but static. If the ant scouts find a new and better source of food, 
or if the picnickers clean up and leave, the dynamic structure adapts with aston- 
ishing rapidity. In a similar way, large-scale cytoskeletal structures can change or 
persist, according to need, lasting for lengths of time ranging from less than a min- 
ute up to the cell’s lifetime. But the individual macromolecular components that 
make up these structures are in a constant state of flux. Thus, like the alteration of 
an ant trail, a structural rearrangement in a cell requires little extra energy when 
conditions change. 

Regulation of the dynamic behavior and assembly of cytoskeletal filaments 
allows eukaryotic cells to build an enormous range of structures from the three 
basic filament systems. The micrographs in Panel 16-1 illustrate some of these 
structures. Actin filaments underlie the plasma membrane of animal cells, pro- 
viding strength and shape to its thin lipid bilayer. They also form many types of 
cell-surface projections. Some of these are dynamic structures, such as the lamel- 
lipodia and filopodia that cells use to explore territory and move around. More 
stable arrays allow cells to brace themselves against an underlying substratum 
and enable muscle to contract. The regular bundles of stereocilia on the surface 
of hair cells in the inner ear contain stable bundles of actin filaments that tilt as 
rigid rods in response to sound, and similarly organized microvilli on the surface 
of intestinal epithelial cells vastly increase the apical cell-surface area to enhance 
nutrient absorption. In plants, actin filaments drive the rapid streaming of cyto- 
plasm inside cells. 

Microtubules, which are frequently found in a cytoplasmic array that extends 
to the cell periphery, can quickly rearrange themselves to form a bipolar mitotic 
spindle during cell division. They can also form cilia, which function as motile 
whips or sensory devices on the surface of the cell, or tightly aligned bundles that 
serve as tracks for the transport of materials down long neuronal axons. In plant 
cells, organized arrays of microtubules help to direct the pattern of cell wall syn- 
thesis, and in many protozoans they form the framework upon which the entire 
cell is built. 

Intermediate filaments line the inner face of the nuclear envelope, forming 
a protective cage for the cell’s DNA; in the cytosol, they are twisted into strong 
cables that can hold epithelial cell sheets together or help nerve cells to extend 
long and robust axons, and they allow us to form tough appendages such as hair 
and fingernails. 

An important and dramatic example of rapid reorganization of the cytoskele- 
ton occurs during cell division, as shown in Figure 16-2 for a fibroblast growing 
in a tissue-culture dish. After the chromosomes have replicated, the interphase 
microtubule array that spreads throughout the cytoplasm is reconfigured into the 
bipolar mitotic spindle, which segregates the two copies of each chromosome into 
daughter nuclei. At the same time, the specialized actin structures that enable the 
fibroblast to crawl across the surface of the dish rearrange so that the cell stops 
moving and assumes a more spherical shape. Actin and its associated motor pro- 
tein myosin then form a belt around the middle of the cell, the contractile ring, 
which constricts like a tiny muscle to pinch the cell in two. When division is com- 
plete, the cytoskeletons of the two daughter fibroblasts reassemble into their 
interphase structures to convert the two rounded-up daughter cells into smaller 
versions of the flattened, crawling mother cell. 

Many cells require rapid cytoskeletal rearrangements for their normal func- 
tioning during interphase as well. For example, the neutrophil, a type of white 
Actin filaments (also known as microfilaments) are helical polymers of 
the protein actin. They are flexible structures with a diameter of 8 nm 
that organize into a variety of linear bundles, two-dimensional 
networks, and three-dimensional gels. Although actin filaments are 
dispersed throughout the cell, they are most highly concentrated in the 
cortex, just beneath the plasma membrane. (i) Single actin filament; 

(ii) microvilli; (iii) stress fibers (red) terminating in focal adhesions 
(green); (iv) striated muscle. 


MICROTUBULES


Microtubules are long, hollow cylinders made of the protein tubulin. 
With an outer diameter of 25 nm, they are much more rigid than actin 
filaments. Microtubules are long and straight and frequently have one 
end attached to a microtubule-organizing center (MTOC) called a 
centrosome. (i) Single microtubule; (ii) cross section at the base of three 
cilia showing triplet microtubules; (iii) interphase microtubule array 
(green) and organelles (red); (iv) ciliated protozoan. 
Intermediate filaments are ropelike fibers with a diameter of about 
10 nm; they are made of intermediate filament proteins, which constitute 
a large and heterogeneous family. One type of intermediate filament 
forms a meshwork called the nuclear lamina just beneath the inner 
nuclear membrane. Other types extend across the cytoplasm, giving cells 
mechanical strength. In an epithelial tissue, they span the cytoplasm from 
one cell-cell junction to another, thereby strengthening the entire 
epithelium. (i) Individual intermediate filaments; (ii) Intermediate 
filaments (blue) in neurons and (iii) epithelial cell; (iv) nuclear lamina. 

blood cell, chases and engulfs bacterial and fungal cells that accidentally gain 
access to the normally sterile parts of the body, as through a cut in the skin. Like 
most crawling cells, neutrophils advance by extending a protrusive structure filled 
with newly polymerized actin filaments. When the elusive bacterial prey moves in 
a different direction, the neutrophil is poised to reorganize its polarized protru- 
sive structures within seconds (Figure 16-3). 


The Cytoskeleton Determines Cellular Organization and Polarity 


In cells that have achieved a stable, differentiated morphology—such as mature 
neurons or epithelial cells—the dynamic elements of the cytoskeleton must also 
provide stable, large-scale structures for cellular organization. On specialized epi- 
thelial cells that line organs such as the intestine and the lung, cytoskeletal-based 
cell-surface protrusions including microvilli and cilia are able to maintain a con- 
stant location, length, and diameter over the entire lifetime of the cell. For the 
actin bundles at the cores of microvilli on intestinal epithelial cells, this is only a 
few days. But the actin bundles at the cores of stereocilia on the hair cells of the 
inner ear must maintain their stable organization for the entire lifetime of the ani- 
mal, since these cells do not turn over. Nonetheless, the individual actin filaments 





Figure 16-2 Diagram of changes in 
cytoskeletal organization associated 
with cell division. The crawling fibroblast 
drawn here has a polarized, dynamic 

actin cytoskeleton (shown in red) that 
assembles lamellipodia and filopodia to 
push its leading edge toward the right. The 
polarization of the actin cytoskeleton is 
assisted by the microtubule cytoskeleton 
(green), consisting of long microtubules 
that emanate from a single microtubule- 
organizing center located in front of 

the nucleus. When the cell divides, the 
polarized microtubule array rearranges 

to form a bipolar mitotic spindle, which 

is responsible for aligning and then 
segregating the duplicated chromosomes 
(brown). The actin filaments form a 
contractile ring at the center of the cell 

that pinches the cell in two after the 
chromosome segregation. After cell 
division is complete, the two daughter 
cells reorganize both the microtubule and 
actin cytoskeletons into smaller versions of 
those that were present in the mother cell, 
enabling them to crawl their separate ways. 


Figure 16-3 A neutrophil in pursuit of 
bacteria. In this preparation of human 
blood, a clump of bacteria (white arrow) is 
about to be captured by a neutrophil. As 
the bacteria move, the neutrophil quickly 
reassembles the dense actin network at 
its leading edge (highlighted in red) to 
push toward the location of the bacteria 
(Movie 16.1). Rapid disassembly and 
reassembly of the actin cytoskeleton in 
this cell enables it to change its orientation 
and direction of movement within a few 
minutes. (From a video recorded by 

David Rogers.) 
remain strikingly dynamic and are continuously remodeled and replaced every 48 
hours, even within stable cell-surface structures that persist for decades. 

Besides forming stable, specialized cell-surface protrusions, the cytoskele- 
ton is also responsible for large-scale cellular polarity, enabling cells to tell the 
difference between top and bottom, or front and back. The large-scale polarity 
information conveyed by cytoskeletal organization is often maintained over the 
lifetime of the cell. Polarized epithelial cells use organized arrays of microtubules, 
actin filaments, and intermediate filaments to maintain the critical differences 
between the apical surface and the basolateral surface. They also must maintain 
strong adhesive contacts with one another to enable this single layer of cells to 
serve as an effective physical barrier (Figure 16-4). 


Filaments Assemble from Protein Subunits That Impart Specific 
Physical and Dynamic Properties 


Cytoskeletal filaments can reach from one end of the cell to the other, spanning 
tens or even hundreds of micrometers. Yet the individual protein molecules that 
form the filaments are only a few nanometers in size. The cell builds the filaments 
by assembling large numbers of the small subunits, like building a skyscraper out 
of bricks. Because these subunits are small, they can diffuse rapidly in the cyto- 
sol, whereas the assembled filaments cannot. In this way, cells can undergo rapid 
structural reorganizations, disassembling filaments at one site and reassembling 
them at another site far away. 
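
The speed advantage of small subunits can be put in rough numbers. The sketch below is illustrative only: it uses the standard estimate for three-dimensional diffusion, t ≈ x²/(6D), and the diffusion coefficient is an assumed order-of-magnitude value for a small globular protein in cytosol, not a figure taken from this chapter.

```python
def diffusion_time_s(distance_um: float, diff_coeff_um2_per_s: float) -> float:
    """Rough time to diffuse a given distance in 3D: t = x^2 / (6 D)."""
    return distance_um ** 2 / (6.0 * diff_coeff_um2_per_s)


# Assumption for illustration: D of a free subunit is about 10 um^2/s.
ASSUMED_SUBUNIT_D = 10.0

# Distances match the "tens or even hundreds of micrometers" quoted in the text.
for distance in (10.0, 100.0):
    t = diffusion_time_s(distance, ASSUMED_SUBUNIT_D)
    print(f"{distance:>5.0f} um: about {t:.1f} s")
```

With this assumed value a free subunit crosses 10 µm in a second or two. Because the time grows with the square of the distance, a 100 µm span takes a hundred times longer, a few minutes.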

Actin filaments and microtubules are built from subunits that are compact 
and globular—actin subunits for actin filaments and tubulin subunits for microtu- 
bules—whereas intermediate filaments are made up of smaller subunits that are 
themselves elongated and fibrous. All three major types of cytoskeletal filaments 
form as helical assemblies of subunits (see Figure 3-22) that self-associate, using 
a combination of end-to-end and side-to-side protein contacts. Differences in 
the structures of the subunits and the strengths of the attractive forces between 
Figure 16-4 Organization of the 
cytoskeleton in polarized epithelial cells. 
All the components of the cytoskeleton 
cooperate to produce the characteristic 
shapes of specialized cells, including the 
epithelial cells that line the small intestine, 
diagrammed here. At the apical (upper) 
surface, facing the intestinal lumen, 
bundled actin filaments (red) form microvilli 
that increase the cell surface area available 
for absorbing nutrients from food. Below 
the microvilli, a circumferential band of 
actin filaments is connected to cell-cell 
adherens junctions that anchor the cells to 
each other. Intermediate filaments (blue) 
are anchored to other kinds of adhesive 
structures, including desmosomes and 
hemidesmosomes, that connect the 
epithelial cells into a sturdy sheet and 
attach them to the underlying extracellular 
matrix; these structures are discussed 

in Chapter 19. Microtubules (green) run 
vertically from the top of the cell to the 
bottom and provide a global coordinate 
system that enables the cell to direct newly 
synthesized components to their proper 
locations. 
them produce important differences in the stability and mechanical properties 
of each type of filament. Whereas covalent linkages between their subunits hold 
together the backbones of many biological polymers—including DNA, RNA, and 
proteins—it is weak noncovalent interactions that hold together the three types of 
cytoskeletal polymers. Consequently, their assembly and disassembly can occur 
rapidly, without covalent bonds being formed or broken. 

The subunits of actin filaments and microtubules are asymmetrical and bind 
to one another head-to-tail so that they all point in one direction. This subunit 
polarity gives the filaments structural polarity along their length, and makes the 
two ends of each polymer behave differently. In addition, actin and tubulin sub- 
units are both enzymes that catalyze the hydrolysis of a nucleoside triphosphate— 
ATP and GTP, respectively. As we discuss later, the energy derived from nucleo- 
tide hydrolysis enables the filaments to remodel rapidly. By controlling when and 
where actin and microtubules assemble, the cell harnesses the polar and dynamic 
properties of these filaments to generate force in a specific direction, to move the 
leading edge of a migrating cell forward, for example, or to pull chromosomes 
apart during cell division. In contrast, the subunits of intermediate filaments are 
symmetrical, and thus do not form polarized filaments with two different ends. 
Intermediate filament subunits also do not catalyze the hydrolysis of nucleotides. 
Nevertheless, intermediate filaments can be disassembled rapidly when required. 
In mitosis, for example, kinases phosphorylate the subunits, leading to their 
dissociation. 

Cytoskeletal filaments in living cells are not built by simply stringing sub- 
units together in single file. A thousand tubulin subunits lined up end-to-end, 
for example, would span the diameter of a small eukaryotic cell, but a filament 
formed in this way would lack the strength to avoid breakage by ambient thermal 
energy, unless each subunit in the filament was bound extremely tightly to its two 
neighbors. Such tight binding would limit the rate at which the filaments could 
disassemble, making the cytoskeleton a static and less useful structure. To provide 
both strength and adaptability, microtubules are built of 13 protofilaments—lin- 
ear strings of subunits joined end-to-end—that associate with one another later- 
ally to form a hollow cylinder. The addition or loss of a subunit at the end of one 
protofilament makes or breaks a small number of bonds. In contrast, loss of a sub- 
unit from the middle of the filament requires breaking many more bonds, while 
breaking it in two requires breaking bonds in multiple protofilaments all at the 
same time (Figure 16-5). The greater energy required to break multiple noncova- 
lent bonds simultaneously allows microtubules to resist thermal breakage, while 
allowing rapid subunit addition and loss at the filament ends. Helical actin fila- 
ments are much thinner and therefore require much less energy to break. How- 
ever, multiple actin filaments are often bundled together inside cells, providing 
mechanical strength, while allowing dynamic behavior of filament ends. 
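
The argument above can be made semi-quantitative with a Boltzmann factor. As a minimal sketch, assume that the chance of thermal energy breaking n bonds at the same moment scales as exp(-nE/kT), that every subunit contact counts as one bond of equal energy E, and that E is 5 kT. All three are simplifying assumptions made here for illustration. The bond counts follow the text: three contacts for a subunit leaving a microtubule end (the maximum given in Figure 16-5) and thirteen for a fracture across all protofilaments.

```python
import math


def boltzmann_factor(n_bonds: int, bond_energy_kT: float) -> float:
    """Relative likelihood of breaking n_bonds simultaneously: exp(-n * E / kT)."""
    return math.exp(-n_bonds * bond_energy_kT)


# Hypothetical bond energy, in units of kT (not a measured value).
ASSUMED_BOND_ENERGY_KT = 5.0

single_strand_break = boltzmann_factor(1, ASSUMED_BOND_ENERGY_KT)
end_subunit_loss = boltzmann_factor(3, ASSUMED_BOND_ENERGY_KT)
mid_filament_fracture = boltzmann_factor(13, ASSUMED_BOND_ENERGY_KT)

print(f"single-strand break : {single_strand_break:.2e}")
print(f"end subunit loss    : {end_subunit_loss:.2e}")
print(f"mid-filament break  : {mid_filament_fracture:.2e}")
print(f"end loss vs fracture: {end_subunit_loss / mid_filament_fracture:.2e}")
```

Whatever value is chosen for E, the ratio between end loss and fracture grows exponentially with the number of extra bonds, which is why a multistranded filament can have dynamic ends yet resist breaking in the middle.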

As with other specific protein-protein interactions, many hydrophobic interac- 
tions and noncovalent bonds hold the subunits in a cytoskeletal filament together 
(see Figure 3-4). The locations and types of subunit-subunit contacts differ for 
the different filaments. Intermediate filaments, for example, assemble by forming 
strong lateral contacts between α-helical coiled-coils, which extend over most of 
the length of each elongated fibrous subunit. Because the individual subunits are 
staggered in the filament, intermediate filaments form strong, ropelike structures 
that tolerate stretching and bending to a greater extent than do either actin fila- 
ments or microtubules (Figure 16-6). 


Accessory Proteins and Motors Regulate Cytoskeletal Filaments 


The cell regulates the length and stability of its cytoskeletal filaments, as well as 
their number and geometry. It does so largely by regulating their attachments 
to one another and to other components of the cell, so that the filaments can 
form a wide variety of higher-order structures. Direct covalent modification of 
the filament subunits regulates some filament properties, but most of the regu- 
lation is performed by hundreds of accessory proteins that determine the spatial 
distribution and the dynamic behavior of the filaments, converting information 
received through signaling pathways into cytoskeletal action. These accessory 
proteins bind to the filaments or their subunits to determine the sites of assembly 
of new filaments, to regulate the partitioning of polymer proteins between fila- 
ment and subunit forms, to change the kinetics of filament assembly and disas- 
sembly, to harness energy to generate force, and to link filaments to one another 
or to other cell structures such as organelles and the plasma membrane. In these 
processes, the accessory proteins bring cytoskeletal structure under the control 
of extracellular and intracellular signals, including those that trigger the dramatic 
transformations of the cytoskeleton that occur during each cell cycle. Acting 
together, the accessory proteins enable a eukaryotic cell to maintain a highly orga- 
nized but flexible internal structure and, in many cases, to move. 
Figure 16-5 The thermal stability of 
cytoskeletal filaments with dynamic 
ends. A protofilament consisting of a single 
strand of subunits is thermally unstable, 
since breakage of a single bond between 
subunits is sufficient to break the filament. 
In contrast, formation of a cytoskeletal 
filament from more than one protofilament 
allows the ends to be dynamic, while 
enabling the filaments themselves to 

be resistant to thermal breakage. In a 
microtubule, for example, removing a single 
subunit dimer from the end of the filament 
requires breaking noncovalent bonds with a 
maximum of three other subunits, whereas 
fracturing the filament in the middle requires 
breaking noncovalent bonds in all thirteen 
protofilaments. 


Figure 16-6 Flexibility and stretch in 

an intermediate filament. Intermediate 
filaments are formed from elongated fibrous 
subunits with strong lateral contacts, 
resulting in resistance to stretching 
forces. When a tiny mechanical probe is 
dragged across an intermediate filament, 
the filament is stretched over three times 
its length before it breaks, as illustrated 
by the fluorescently labeled filaments in 
the photomicrographs. This technique 

is termed atomic force microscopy (see 
Figure 9-33). (Adapted from L. Kreplak et 
al., J. Mol. Biol. 354:569-577, 2005. With 
permission from Elsevier.) 
Among the most fascinating proteins that associate with the cytoskeleton are 
the motor proteins. These proteins bind to a polarized cytoskeletal filament and 
use the energy derived from repeated cycles of ATP hydrolysis to move along it. 
Dozens of different motor proteins coexist in every eukaryotic cell. They differ in 
the type of filament they bind to (either actin or microtubules), the direction in 
which they move along the filament, and the “cargo” they carry. Many motor pro- 
teins carry membrane-enclosed organelles—such as mitochondria, Golgi stacks, 
or secretory vesicles—to their appropriate locations in the cell. Other motor pro- 
teins cause cytoskeletal filaments to exert tension or to slide against each other, 
generating the force that drives such phenomena as muscle contraction, ciliary 
beating, and cell division. 

Cytoskeletal motor proteins that move unidirectionally along an oriented 
polymer track are reminiscent of some other proteins and protein complexes dis- 
cussed elsewhere in this book, such as DNA and RNA polymerases, helicases, and 
ribosomes. All of these proteins have the ability to use chemical energy to propel 
themselves along a linear track, with the direction of sliding dependent on the 
structural polarity of the track. All of them generate motion by coupling nucleo- 
side triphosphate hydrolysis to a large-scale conformational change (see Figure 
3-75). 


Bacterial Cell Organization and Division Depend on Homologs of 
Eukaryotic Cytoskeletal Proteins 


While eukaryotic cells are typically large and morphologically complex, bacterial 
cells are usually only a few micrometers long and assume simple shapes such 
as spheres or rods. Bacteria also lack elaborate networks of intracellular mem- 
brane-enclosed organelles. Historically, biologists assumed that a cytoskeleton 
was not necessary in such simple cells. We now know, however, that bacteria con- 
tain homologs of all three of the eukaryotic cytoskeletal filaments. Furthermore, 
bacterial actins and tubulins are more diverse than their eukaryotic versions, both 
in the types of assemblies they form and in the functions they carry out. 

Nearly all bacteria and many archaea contain a homolog of tubulin called FtsZ, 
which can polymerize into filaments and assemble into a ring (called the Z-ring) 
at the site where the septum forms during cell division (Figure 16-7). Although 
the Z-ring persists for many minutes, the individual filaments within it are highly 
dynamic, with an average filament half-life of about thirty seconds. As the bacte- 
rium divides, the Z-ring becomes smaller until it has completely disassembled. 
FtsZ filaments in the Z-ring are thought to generate a bending force that drives 
the membrane invagination necessary to complete cell division. The Z-ring may 
also serve as a site for localization of enzymes required for building the septum 
between the two daughter cells. 
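
To see how a ring that lasts many minutes can be built from filaments with a half-life of about thirty seconds, it helps to work out how quickly the original filaments are replaced. The sketch below assumes simple first-order (exponential) turnover, which is an idealization introduced here rather than a claim from the text; only the thirty-second half-life comes from the chapter.

```python
def fraction_remaining(elapsed_s: float, half_life_s: float = 30.0) -> float:
    """Fraction of the original filaments still present after elapsed_s seconds,
    assuming exponential turnover with the given half-life."""
    return 0.5 ** (elapsed_s / half_life_s)


# Sample a few time points across a ring lifetime of several minutes.
for elapsed in (30.0, 60.0, 300.0):
    print(f"{elapsed:>5.0f} s: {fraction_remaining(elapsed):.4f} of original filaments remain")
```

Under this assumption fewer than one in a thousand of the starting filaments survives five minutes, so a persistent Z-ring is almost entirely rebuilt from new filaments many times over.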

Many bacteria also contain homologs of actin. Two of these, MreB and Mbl, 
are found primarily in rod-shaped or spiral-shaped cells where they assemble 
to form dynamic patches that move circumferentially along the length of the 
cell (Figure 16-8A). These proteins contribute to cell shape by serving as a scaf- 
fold to direct the synthesis of the peptidoglycan cell wall, in much the same way 
that microtubules help organize the synthesis of the cellulose cell wall in higher 


Figure 16-7 The bacterial FtsZ protein, a tubulin homolog in 
prokaryotes. (A) A band of FtsZ protein forms a ring in a dividing bacterial 
cell. This ring has been labeled by fusing the FtsZ protein to green fluorescent 
protein (GFP), which allows it to be observed in living E. coli cells with a 
fluorescence microscope. (B) FtsZ filaments and circles, formed in vitro, as 
visualized using electron microscopy. (C) Dividing chloroplasts (red) from a red 
alga also cleave using a protein ring made from FtsZ (yellow). (A, from X. Ma, 
D.W. Ehrhardt and W. Margolin, Proc. Natl Acad. Sci. USA 93:12998-13008, 
1996; B, from H.P. Erickson et al., Proc. Natl Acad. Sci. USA 93:519-528, 
1996. Both with permission from National Academy of Sciences; C, from 

S. Miyagishima et al., Plant Cell 13:2257-2268, 2001, with permission from 
American Society of Plant Biologists.) 
Figure 16-8 Actin homologs in bacteria determine cell shape. (A) The MreB protein forms abundant patches made up 

of many short, interwoven linear or helical filaments that are seen to move circumferentially along the length of the bacterium 
and are associated with sites of cell wall synthesis. (B) The common soil bacterium Bacillus subtilis normally forms cells with a 
regular rodlike shape when viewed by scanning electron microscopy (left). In contrast, B. subtilis cells lacking the actin homolog 
MreB or Mbl grow in distorted or twisted shapes and eventually die (center and right). (A, from P. Vats and L. Rothfield, Proc. 
Natl Acad. Sci. USA 104:17795-17800, 2007. With permission from National Academy of Sciences; B, from A. Chastanet and 
R. Carballido-Lopez, Front. Biosci. 4S:1582-1606, 2012. With permission from Frontiers in Bioscience.) 


plant cells (see Figure 19-65). As with FtsZ, MreB and Mbl filaments are highly 
dynamic, with half-lives of a few minutes, and nucleotide hydrolysis accompanies 
the polymerization process. Mutations disrupting MreB or Mbl expression cause 
extreme abnormalities in cell shape and defects in chromosome segregation (Fig- 
ure 16-8B). 

Relatives of MreB and Mbl have more specialized roles. A particularly intriguing 
bacterial actin homolog is ParM, which is encoded by a gene on certain bacterial 
plasmids that also carry genes responsible for antibiotic resistance and cause the 
spread of multidrug resistance in epidemics. Bacterial plasmids typically encode 
all the gene products that are necessary for their own segregation, presumably 
as a strategy to ensure their inheritance and propagation in bacterial hosts fol- 
lowing plasmid replication. ParM assembles into filaments that associate at each 
end with a copy of the plasmid, and growth of the ParM filament pushes the rep- 
licated plasmid copies apart (Figure 16-9). This spindle-like structure apparently 
arises from the selective stabilization of filaments that bind to specialized proteins 
recruited to the origins of replication on the plasmids. A distant relative of both 
tubulin and FtsZ, called TubZ, has a similar function in other bacterial species. 

Thus, self-association of nucleotide-binding proteins into dynamic filaments 
is used in all cells, and the actin and tubulin families are very ancient, predating 
the split between the eukaryotic and bacterial kingdoms. 

At least one bacterial species, Caulobacter crescentus, appears to harbor a pro- 
tein with significant structural similarity to the third major class of cytoskeletal 
filaments found in animal cells, the intermediate filaments. A protein called cres- 
centin forms a filamentous structure that influences the unusual crescent shape 
of this species; when the gene encoding crescentin is deleted, the Caulobacter 
cells grow as straight rods (Figure 16-10). 
Figure 16-9 Role of the actin homolog 
ParM in plasmid segregation in bacteria. 
(A) Some bacterial drug-resistance 
plasmids (orange) encode an actin 
homolog, ParM, that will spontaneously 
nucleate to form small, dynamic filaments 
(green) throughout the bacterial cytoplasm. 
A second plasmid-encoded protein 

called ParR (blue) binds to specific DNA 
sequences in the plasmid and also 
stabilizes the dynamic ends of the ParM 
filaments. When the plasmid duplicates, 
both ends of the ParM filaments become 
stabilized, and the growing ParM filaments 
push the duplicated plasmids to opposite 
ends of the cell. (B) In these bacterial 

cells harboring a drug-resistance plasmid, 
the plasmids are labeled in red and the 
ParM protein in green. Left, a short ParM 
filament bundle connects the two daughter 
plasmids shortly after their duplication. 
Right, the fully assembled ParM filament 
has pushed the duplicated plasmids to the 
cell poles. (A, adapted from E.C. Garner, 
C.S. Campbell and R.D. Mullins, Science 
306:1021-1025, 2004; B, from J. Møller- 
Jensen et al., Mol. Cell 12:1477-1487, 
2003. With permission from Elsevier.) 
Summary 


The cytoplasm of eukaryotic cells is spatially organized by a network of protein fil- 
aments known as the cytoskeleton. This network contains three principal types of 
filaments: actin filaments, microtubules, and intermediate filaments. All three types 
of filaments form as helical assemblies of subunits that self-associate using a combi- 
nation of end-to-end and side-to-side protein contacts. Differences in the structure 
of the subunits and the manner of their self-assembly give the filaments different 
mechanical properties. Subunit assembly and disassembly constantly remodel all 
three types of cytoskeletal filaments. Actin and tubulin (the subunits of actin fila- 
ments and microtubules, respectively) bind and hydrolyze nucleoside triphosphates 
(ATP and GTP, respectively), and assemble head-to-tail to generate polarized filaments
capable of generating force. In living cells, accessory proteins modulate the
dynamics and organization of cytoskeletal filaments, resulting in complex events 
such as cell division or migration, and generating elaborate cellular architecture 
to form polarized tissues such as epithelia. Bacterial cells also contain homologs of 
actin, tubulin, and intermediate filaments that form dynamic structures that help 
control cell shape and division. 


ACTIN AND ACTIN-BINDING PROTEINS 


The actin cytoskeleton performs a wide range of functions in diverse cell types. 
Each actin subunit, sometimes called globular or G-actin, is a 375-amino-acid 
polypeptide carrying a tightly associated molecule of ATP or ADP (Figure 
16-11A). Actin is extraordinarily well conserved among eukaryotes. The amino 
acid sequences of actins from different eukaryotic species are usually about 90% 
identical. Small variations in actin amino acid sequence can cause significant 
functional differences: In vertebrates, for example, there are three isoforms of 
actin, termed α, β, and γ, that differ slightly in their amino acid sequences and
have distinct functions. α-Actin is expressed only in muscle cells, while β- and
γ-actins are found together in almost all non-muscle cells.


Actin Subunits Assemble Head-to-Tail to Create Flexible,
Polar Filaments 


Actin subunits assemble head-to-tail to form a tight, right-handed helix, forming 
a structure about 8 nm wide called filamentous or F-actin (Figure 16-11B and C). 
Because the asymmetrical actin subunits of a filament all point in the same direc- 
tion, filaments are polar and have structurally different ends: a slower-growing 
minus end and a faster-growing plus end. The minus end is also referred to as the 
“pointed end” and the plus end as the “barbed end,” because of the “arrowhead”
appearance of the complex formed between actin filaments and the motor pro- 
tein myosin (Figure 16-12). Within the filament, the subunits are positioned with 
their nucleotide-binding cleft directed toward the minus end. 

Individual actin filaments are quite flexible. The stiffness of a filament can be 
characterized by its persistence length, the minimum filament length at which ran- 
dom thermal fluctuations are likely to cause it to bend. The persistence length 
of an actin filament is only a few tens of micrometers. In a living cell, however, 


Figure 16-10 Caulobacter and 
crescentin. The sickle-shaped bacterium 
Caulobacter crescentus expresses a 
protein, crescentin, with a series of coiled- 
coil domains similar in size and organization 
to the domains of eukaryotic intermediate 
filaments. (A) The crescentin protein forms 
a fiber (labeled in red) that runs down 

the inner side of the curving bacterial cell 
wall. (B) When the gene is disrupted, the 
bacteria grow as straight rods (bottom). 
(From N. Ausmees, J.R. Kuhn and 

C. Jacobs-Wagner, Cell 115:705-718, 
2003. With permission from Elsevier.) 
accessory proteins cross-link and bundle the filaments together, making large-scale
actin structures that are much more rigid than an individual actin filament.
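
One common way to make the idea of persistence length quantitative is the worm-like chain picture, sketched here only as an aid. The symbols (the persistence length L_p, the bending rigidity κ, and the unit tangent vector t at contour position s) and the example value of 15 µm are assumptions introduced for illustration; they are not taken from the text.

```latex
\langle \mathbf{t}(0) \cdot \mathbf{t}(s) \rangle = e^{-s/L_p},
\qquad
L_p = \frac{\kappa}{k_B T}
\quad \text{(filament free to bend in three dimensions)}
```

With an assumed L_p of 15 µm, a 1 µm stretch of filament keeps its direction almost perfectly (e^{-1/15} ≈ 0.94), whereas over 30 µm the directional correlation falls to e^{-2} ≈ 0.14. A stiffer bundle corresponds to a larger κ and therefore a longer L_p.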


Nucleation Is the Rate-Limiting Step in the Formation of Actin 
Filaments 


The regulation of actin filament formation is an important mechanism by which 
cells control their shape and movement. Small oligomers of actin subunits can 
assemble spontaneously, but they are unstable and disassemble readily because 
each monomer is bound to only one or two other monomers. For a new actin fila- 
ment to form, subunits must assemble into an initial aggregate, or nucleus, that is 
stabilized by multiple subunit-subunit contacts and can then elongate rapidly by 
addition of more subunits. This process is called filament nucleation. 

Many features of actin nucleation and polymerization have been studied with 
purified actin in a test tube (Figure 16-13). The instability of smaller actin aggre- 
gates creates a kinetic barrier to nucleation. When polymerization is initiated, this 
barrier results in a lag phase during which no filaments are observed. During this 
lag phase, however, a few of the small, unstable aggregates succeed in making the 
transition to a more stable form that resembles an actin filament. This leads to a 
Figure 16-11 The structures of an actin
monomer and actin filament. (A) The 
actin monomer has a nucleotide (either 
ATP or ADP) bound in a deep cleft in the 
center of the molecule. (B) Arrangement 
of monomers in a filament consisting 

of two protofilaments, held together by 
lateral contacts, which wind around each 
other as two parallel strands of a helix, 
with a twist repeating every 37 nm. All the 
subunits within the filament have the same 
orientation. (C) Electron micrograph of 
negatively stained actin filament. 

(C, courtesy of Roger Craig.) 


Figure 16-12 Structural polarity of the 
actin filament. (A) This electron micrograph 
shows an actin filament polymerized from a 
short actin filament seed that was decorated 
with myosin motor domains, resulting in an 
arrowhead pattern. The filament has grown 
much faster at the barbed (plus) end than at 
the pointed (minus) end. (B) Enlarged image 
and model showing the arrowhead pattern. 
(A, courtesy of Tom Pollard; B, adapted 
from M. Whittaker, B.O. Carragher and 

K.A. Milligan, Ultramicro. 54:245-260, 1995.) 
Figure 16-13 The time course of actin polymerization in a test tube. (A) Polymerization of pure actin subunits into filaments
occurs after a lag phase. (B) Polymerization occurs more rapidly in the presence of preformed fragments of actin filaments, 
which act as nuclei for filament growth. As indicated, the % free subunits after polymerization reflects the critical concentration 
(Cc), at which there is no net change in polymer. Actin polymerization is often studied by observing the change in the light
emission from a fluorescent probe, called pyrene, that has been covalently attached to the actin. Pyrene-actin fluoresces more 
brightly when it is incorporated into actin filaments. 


phase of rapid filament elongation during which subunits are added quickly to 
the ends of the nucleated filaments (Figure 16-13A). Finally, as the concentration 
of actin monomers declines, the system approaches a steady state at which the 
rate of addition of new subunits to the filament ends exactly balances the rate 
of subunit dissociation. The concentration of free subunits left in solution at this 
point is called the critical concentration, Cc. As explained in Panel 16-2, the value
of the critical concentration is equal to the rate constant for subunit loss divided
by the rate constant for subunit addition; that is, Cc = koff/kon, which is equal
to the dissociation constant, Kd, and the inverse of the equilibrium constant, K
(see Figure 3-44). In a test tube, the Cc for actin polymerization (that is, the free
actin monomer concentration at which the fraction of actin in the polymer stops
increasing) is about 0.2 µM. Inside the cell, the concentration of unpolymerized
actin is much higher than this, and the cell has evolved mechanisms to prevent 
most of its monomeric actin from assembling into filaments, as we discuss later. 

The lag phase in filament growth is eliminated if preexisting seeds (such as 
fragments of actin filaments that have been chemically cross-linked) are added to 
the solution at the beginning of the polymerization reaction (Figure 16-13B). The 
cell takes great advantage of this nucleation requirement: it uses special proteins 
to catalyze filament nucleation at specific sites, thereby determining the location 
at which new actin filaments are assembled. 
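
The lag phase and its removal by seeds can be reproduced with a very small kinetic model. The sketch below is a toy, not the model behind Figure 16-13: it assumes that a nucleus is a trimer, that nucleation is a single slow step proportional to the cube of the free monomer concentration, and that seeds contribute growing ends but negligible mass. All rate constants (`k_on`, `k_off`, `k_nuc`) are hypothetical; `k_off / k_on` is set to 0.2 µM only so that the plateau matches the critical concentration quoted above.

```python
"""Toy nucleation-elongation model of actin polymerization (illustrative only)."""


def simulate(total_uM, seeds_uM=0.0, k_on=10.0, k_off=2.0,
             k_nuc=1e-8, nucleus_size=3, dt=0.05, t_end=5000.0):
    """Integrate the model with Euler steps; return [(time_s, polymer_uM), ...].

    Every rate constant here is a hypothetical value. Seeds add growing
    filament ends; their own mass is neglected.
    """
    free = total_uM      # free monomer (uM)
    ends = seeds_uM      # growing filaments (uM)
    polymer = 0.0        # monomer incorporated into filaments (uM)
    trace = []
    for step in range(int(t_end / dt) + 1):
        if step % 20 == 0:                          # record every 20th step
            trace.append((step * dt, polymer))
        nucleation = k_nuc * free ** nucleus_size   # new filaments per second
        elongation = ends * (k_on * free - k_off)   # net subunits added per second
        if polymer <= 0.0 and elongation < 0.0:
            elongation = 0.0                        # nothing left to depolymerize
        gained = (elongation + nucleus_size * nucleation) * dt
        ends += nucleation * dt
        polymer += gained
        free -= gained
    return trace


def half_time(trace, total_uM, c_crit=0.2):
    """First recorded time at which polymer reaches half of its expected plateau."""
    target = 0.5 * (total_uM - c_crit)
    for t, polymer in trace:
        if polymer >= target:
            return t
    return None


if __name__ == "__main__":
    for label, seeds in (("no seeds", 0.0), ("with seeds", 0.01)):
        trace = simulate(4.0, seeds_uM=seeds)
        print(label,
              "| half-time (s):", half_time(trace, 4.0),
              "| free monomer at end (uM):", round(4.0 - trace[-1][1], 3))
```

In this sketch the unseeded reaction spends a long time with almost no polymer because filament ends must first accumulate, while the seeded reaction starts elongating immediately. Both runs level off when the free monomer concentration approaches `k_off / k_on`.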


Actin Filaments Have Two Distinct Ends That Grow at Different 
Rates 


Due to the uniform orientation of asymmetric actin subunits in the filament, the 
structures at its two ends are different. This orientation makes the two ends of 
each polymer different in ways that have a profound effect on filament growth 
rates. The kinetic rate constants for actin subunit association and dissociation
(kon and koff, respectively) are much greater at the plus end than the minus end.
This can be seen when an excess of purified actin monomers is allowed to assem- 
ble onto polarity-marked filaments—the plus end of the filament elongates up to 
ten times faster (see Figure 16-12). If filaments are rapidly diluted so that the free 
subunit concentration drops below the critical concentration, the plus end also 
depolymerizes faster. 

It is important to note, however, that the two ends of an actin filament have the 
same net affinity for actin subunits, if all of the subunits are in the same nucleotide 
state. Addition of a subunit to either end of a filament of n subunits results in a
filament of n + 1 subunits. Thus, the free-energy difference, and therefore the
equilibrium constant (and the critical concentration), must be the same for addition
of subunits at either end of the polymer. In this case, the ratio of the rate con- 
stants, koff/kon, must be identical at the two ends, even though the absolute values 
of these rate constants are very different at each end (see Panel 16-2). 
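
The argument can be written out in a few lines. The superscripts + and − (for the plus and minus ends) are notation introduced here for illustration, and the factor of ten simply reuses the "up to ten times faster" elongation mentioned above.

```latex
\begin{aligned}
\Delta G^{\circ}_{+} = \Delta G^{\circ}_{-}
\;&\Longrightarrow\;
K_{+} = K_{-}
\;\Longrightarrow\;
\frac{k_{\mathrm{off}}^{+}}{k_{\mathrm{on}}^{+}}
= \frac{k_{\mathrm{off}}^{-}}{k_{\mathrm{on}}^{-}}
= C_c \\
\text{so if } k_{\mathrm{on}}^{+} = 10\,k_{\mathrm{on}}^{-}
\;&\text{ then }\;
k_{\mathrm{off}}^{+} = 10\,k_{\mathrm{off}}^{-}
\end{aligned}
```

This holds only while all subunits are in the same nucleotide state, which is the condition stated at the start of the paragraph.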

The cell takes advantage of actin filament dynamics and polarity to do mechan- 
ical work. Filament elongation proceeds spontaneously when the free-energy 
change (ΔG) for addition of the soluble subunit is less than zero. This is the case
when the concentration of subunits in solution exceeds the critical concentra- 
tion. A cell can couple an energetically unfavorable process to this spontaneous 
process; thus, the cell can use free energy released during spontaneous filament 
polymerization to move an attached load. For example, by orienting the fast-grow- 
ing plus ends of actin filaments toward its leading edge, a motile cell can push its 
plasma membrane forward, as we discuss later. 
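
The link between the sign of ΔG and the critical concentration can be made explicit with a short worked equation. This is a sketch that assumes ideal dilute-solution behavior, with concentrations expressed relative to a 1 M standard state; the free subunit concentration of 2 µM in the example is hypothetical.

```latex
\begin{aligned}
\Delta G &= \Delta G^{\circ} + RT \ln \frac{1}{C},
\qquad \Delta G^{\circ} = -RT \ln K = RT \ln C_c \\
\Delta G &= RT \ln \frac{C_c}{C}
\end{aligned}
```

ΔG is therefore negative exactly when C exceeds Cc. For example, with Cc = 0.2 µM and an assumed C = 2 µM, ΔG = RT ln(0.1) ≈ −2.3 RT, or roughly −5.9 kJ/mol per mole of subunits added at 37°C.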


ATP Hydrolysis Within Actin Filaments Leads to Treadmilling at 
Steady State 


Thus far in our discussion of actin filament dynamics, we have ignored the critical 
fact that actin can catalyze the hydrolysis of the nucleoside triphosphate ATP. For 
free actin subunits, this hydrolysis proceeds very slowly; however, it is acceler- 
ated when the subunits are incorporated into filaments. Shortly after ATP hydro- 
lysis occurs, the free phosphate group is released from each subunit, but the ADP 
remains trapped in the filament structure. Thus, two different types of filament 
structures can exist, one with the “T form” of the nucleotide bound (ATP), and one 
with the “D form” bound (ADP). 

When the nucleotide is hydrolyzed, much of the free energy released by cleav- 
age of the phosphate-phosphate bond is stored in the polymer. This makes the 
free-energy change for dissociation of a subunit from the D-form polymer more 
negative than the free-energy change for dissociation of a subunit from the T-form 
polymer. Consequently, the ratio of koff/kon for the D-form polymer, which is
numerically equal to its critical concentration [Cc(D)], is larger than the
corresponding ratio for the T-form polymer. Thus, Cc(D) is greater than Cc(T). At
certain concentrations of free subunits, D-form polymers will therefore shrink while
T-form polymers grow. 

In living cells, most soluble actin subunits are in the T form, as the free con- 
centration of ATP is about tenfold higher than that of ADP. However, the longer the 
time that subunits have been in the actin filament, the more likely they are to have 
hydrolyzed their ATP. Whether the subunit at each end of a filament is in the T or 
the D form depends on the rate of this hydrolysis compared with the rate of sub- 
unit addition. If the concentration of actin monomers is greater than the critical 
concentration for both the T-form and D-form polymer, then subunits will add to 
the polymer at both ends before the nucleotides in the previously added subunits 
are hydrolyzed; as a result, the tips of the actin filament will remain in the T form. 
On the other hand, if the subunit concentration is less than the critical concentra- 
tions for both the T-form and D-form polymer, then hydrolysis may occur before 
the next subunit is added and both ends of the filament will be in the D form and 
will shrink. At intermediate concentrations of actin subunits, it is possible for the 
rate of subunit addition to be faster than nucleotide hydrolysis at the plus end, 
but slower than nucleotide hydrolysis at the minus end. In this case, the plus end 
of the filament remains in the T conformation, while the minus end adopts the D 
conformation. The filament then undergoes a net addition of subunits at the plus 
end, while simultaneously losing subunits from the minus end. This leads to the 
remarkable property of filament treadmilling (Figure 16-14; see Panel 16-2). 

At a particular intermediate subunit concentration, the filament growth at the 
plus end exactly balances the filament shrinkage at the minus end. Under these 
conditions, the subunits cycle rapidly between the free and filamentous states, 
ON RATES AND OFF RATES


A linear polymer of protein molecules, such as an actin filament or a
microtubule, assembles (polymerizes) and disassembles (depolymerizes) by the
addition and removal of subunits at the ends of the polymer. The rate of
addition of these subunits (called monomers) is given by the rate constant kon,
which has units of M⁻¹ sec⁻¹. The rate of loss is given by koff (units of sec⁻¹).

[Diagram: a polymer (with n subunits) and a free subunit.]


NUCLEATION

A helical polymer is stabilized by multiple contacts between adjacent subunits.
In the case of actin, two actin molecules bind relatively weakly to each other,
but addition of a third actin monomer to form a trimer makes the entire group
more stable. Further monomer addition can take place onto this trimer, which
therefore acts as a nucleus for polymerization. For tubulin, the nucleus is
larger and has a more complicated structure (possibly a ring of 13 or more
tubulin molecules), but the principle is the same.

The assembly of a nucleus is relatively slow, which explains the lag phase
seen during polymerization. The lag phase can be reduced or abolished
entirely by adding premade nuclei, such as fragments of already polymerized
microtubules or actin filaments.


THE CRITICAL CONCENTRATION

The number of monomers that add to the polymer (actin filament or microtubule)
per second will be proportional to the concentration of the free subunit
(kon C), but the subunits will leave the polymer end at a constant rate (koff)
that does not depend on C. As the polymer grows, subunits are used up, and C is
observed to drop until it reaches a constant value, called the critical
concentration (Cc). At this concentration, the rate of subunit addition equals
the rate of subunit loss.

At this equilibrium,

    kon C = koff

so that

    Cc = koff / kon = Kd

(where Kd is the dissociation constant; see Figure 3-44).


TIME COURSE OF POLYMERIZATION

The assembly of a protein into a long helical polymer such as a cytoskeletal
filament or a bacterial flagellum typically shows the following time course:

The lag phase corresponds to time taken for nucleation.

The growth phase occurs as monomers add to the exposed ends of the
growing filament, causing filament elongation.

The equilibrium phase, or steady state, is reached when the growth of the
polymer due to monomer addition precisely balances the shrinkage of the
polymer due to disassembly back to monomers.


PLUS AND MINUS ENDS

The two ends of an actin filament or microtubule polymerize at different
rates. The fast-growing end is called the plus end, whereas the slow-growing
end is called the minus end. The difference in the rates of growth at the two
ends is made possible by changes in the conformation of each subunit as it
enters the polymer. This conformational change affects the rates at which
subunits add to the two ends.

Even though kon and koff will have different values for the plus and minus
ends of the polymer, their ratio koff/kon, and hence Cc, must be the same at
both ends for a simple polymerization reaction (no ATP or GTP hydrolysis).
This is because exactly the same subunit interactions are broken when a
subunit is lost at either end, and the final state of the subunit after
dissociation is identical. Therefore, the ΔG for subunit loss, which
determines the equilibrium constant for its association with the end, is
identical at both ends: if the plus end grows four times faster than the minus
end, it must also shrink four times faster. Thus, for C > Cc, both ends grow;
for C < Cc, both ends shrink.

The nucleoside triphosphate hydrolysis that accompanies actin and tubulin
polymerization removes this constraint.


NUCLEOTIDE HYDROLYSIS

Each actin molecule carries a tightly bound ATP molecule that is hydrolyzed to a
tightly bound ADP molecule soon after its assembly into the polymer. Similarly, each
tubulin molecule carries a tightly bound GTP that is converted to a tightly bound
GDP molecule soon after the molecule assembles into the polymer.

Hydrolysis of the bound nucleotide reduces the binding affinity of the subunit for
neighboring subunits and makes it more likely to dissociate from each end of the
filament (see Figure 16-44 for a possible mechanism). It is usually the T form
that adds to the filament and the D form that leaves.

Considering events at the plus end only, a subunit in the T form adds with rate
constant kon(T) and leaves with koff(T), and a subunit in the D form adds with
kon(D) and leaves with koff(D). As before, the polymer will grow until C = Cc.
For illustrative purposes, we can ignore kon(D) and koff(T) since they are
usually very small, so that polymer growth ceases when

    kon(T) C = koff(D)

or

    Cc = koff(D) / kon(T)

This is a steady state and not a true equilibrium, because the ATP or GTP that is
hydrolyzed must be replenished by a nucleotide exchange reaction of the free
subunit (D form to T form).


ATP CAPS AND GTP CAPS

The rate of addition of subunits to a growing actin filament or microtubule
can be faster than the rate at which their bound nucleotide is hydrolyzed. Under
such conditions, the end has a “cap” of subunits containing the nucleoside
triphosphate: an ATP cap on an actin filament or a GTP cap on a microtubule.

Dynamic instability and treadmilling are two behaviors observed in cytoskeletal
polymers. Both are associated with nucleoside triphosphate hydrolysis. Dynamic
instability is believed to predominate in microtubules, whereas treadmilling may
predominate in actin filaments.


TREADMILLING

One consequence of the nucleotide hydrolysis that accompanies polymer
formation is to change the critical concentration at the two ends of the
polymer. Since koff(D) and kon(T) refer to different reactions, their ratio
koff(D)/kon(T) need not be the same at both ends of the polymer, so that:

    Cc (minus end) > Cc (plus end)

Thus, if both ends of a polymer are exposed, polymerization proceeds
until the concentration of free monomer reaches a value that
is above Cc for the plus end but below Cc for the minus end. At this
steady state, subunits undergo a net assembly at the plus end and a
net disassembly at the minus end at an identical rate. The polymer
maintains a constant length, even though there is a net flux of subunits
through the polymer, known as treadmilling.


DYNAMIC INSTABILITY

Microtubules depolymerize about 100 times faster from an end containing
GDP-tubulin than from one containing GTP-tubulin. A GTP cap favors growth,
but if it is lost, then depolymerization ensues.

Individual microtubules can therefore alternate between a period of slow growth and
a period of rapid disassembly, a phenomenon called dynamic instability.


while the total length of the filament remains unchanged. This “steady-state tread- 
milling” requires a constant consumption of energy in the form of ATP hydrolysis. 
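
The balance point described above can be computed directly once each end is given its own effective rate constants. The sketch below assumes, as a simplification, that each end behaves as a simple polymerizing end with one effective on-rate and one effective off-rate (T-like at the plus end, D-like at the minus end). The numerical values in `PLUS_END` and `MINUS_END` are hypothetical and were chosen only so that the minus end has the higher critical concentration; the function names are likewise illustrative.

```python
"""Steady-state treadmilling from end-specific rate constants (illustrative)."""

# (k_on in 1/(uM*s), k_off in 1/s). Hypothetical values, chosen so that the
# minus end has the higher critical concentration.
PLUS_END = (12.0, 1.4)
MINUS_END = (1.3, 0.8)


def critical_concentration(end):
    """Critical concentration (uM) of one end: k_off / k_on."""
    k_on, k_off = end
    return k_off / k_on


def net_rate(c_uM, end):
    """Net subunits added per second at one end (negative means shrinking)."""
    k_on, k_off = end
    return k_on * c_uM - k_off


def treadmilling_concentration(plus, minus):
    """Free-subunit concentration at which gain at one end equals loss at the other.

    Obtained by setting net_rate(c, plus) + net_rate(c, minus) = 0.
    """
    return (plus[1] + minus[1]) / (plus[0] + minus[0])


if __name__ == "__main__":
    c = treadmilling_concentration(PLUS_END, MINUS_END)
    print("Cc plus end (uM): ", round(critical_concentration(PLUS_END), 3))
    print("Cc minus end (uM):", round(critical_concentration(MINUS_END), 3))
    print("steady-state free subunit (uM):", round(c, 3))
    print("flux through filament (subunits/s):", round(net_rate(c, PLUS_END), 3))
```

With these assumed numbers the steady-state concentration falls between the two critical concentrations, so the plus end gains subunits at the same rate at which the minus end loses them, and the filament length stays constant.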


The Functions of Actin Filaments Are Inhibited by Both Polymer- 
stabilizing and Polymer-destabilizing Chemicals 


Chemical compounds that stabilize or destabilize actin filaments are important 
tools in studies of the filaments’ dynamic behavior and function in cells. The cyto- 
chalasins are fungal products that prevent actin polymerization by binding to the 
plus end of actin filaments. Latrunculin prevents actin polymerization by binding 
to actin subunits. The phalloidins are toxins isolated from the Amanita mushroom 
that bind tightly all along the side of actin filaments and stabilize them against 
depolymerization. All of these compounds cause dramatic changes in the actin 
cytoskeleton and are toxic to cells, indicating that the function of actin filaments 
depends on a dynamic equilibrium between filaments and actin monomers 
(Table 16-1). 


Actin-Binding Proteins Influence Filament Dynamics and 
Organization 


In a test tube, polymerization of actin is controlled simply by its concentration, as 
described above, and by pH and the concentrations of salts and ATP. Within a cell, 
however, actin behavior is also regulated by numerous accessory proteins that 
bind actin monomers or filaments (summarized in Panel 16-3). At steady state 


TABLE 16-1

Chemical | Effect on filaments | Mechanism | Original source

Actin

Latrunculin | Depolymerizes | Binds actin subunits |
Cytochalasin B | Depolymerizes | Caps filament plus ends | Fungi
Phalloidin | Stabilizes | Binds along filaments | Amanita mushroom

Microtubules

Taxol® (paclitaxel) | Stabilizes | Binds along filaments | Yew tree
 | Depolymerizes | Binds tubulin subunits | Synthetic





Figure 16-14 Treadmilling of an actin
filament, made possible by the ATP
hydrolysis that follows subunit addition.
(A) Explanation for the different critical
concentrations (Cc) at the plus and minus
ends. Subunits with bound ATP (T-form
subunits) polymerize at both ends of
a growing filament, and then undergo
nucleotide hydrolysis within the filament. As
the filament grows, elongation is faster than
hydrolysis at the plus end in this example,
and the terminal subunits at this end are
therefore always in the T form. However,
hydrolysis is faster than elongation at the
minus end, and so terminal subunits at
this end are in the D form. (B) Treadmilling
occurs at intermediate concentrations of
free subunits. The critical concentration for
polymerization on a filament end in the
T form is lower than for a filament end
in the D form. If the actual subunit
concentration is somewhere between these
two values, the plus end grows while the
minus end shrinks, resulting in treadmilling.


PANEL 16-3 : Actin Filaments 
Some of the major accessory proteins of the actin cytoskeleton. Except for the myosin motor proteins, an example of each
major type is shown. Each of these is discussed in the text. However, most cells contain more than a hundred different 
actin-binding proteins, and it is likely that there are important types of actin-associated proteins that are not yet recognized. 
in vitro, when the monomer concentration is 0.2 µM, filament half-life, a measure
of how long an individual actin monomer spends in a filament as it treadmills, is 
approximately 30 minutes. In a non-muscle vertebrate cell, actin half-life in fila- 
ments is only 30 seconds, demonstrating that cellular factors modify the dynamic 
behavior of actin filaments. Actin-binding proteins dramatically alter actin fila- 
ment dynamics and organization through spatial and temporal control of mono- 
mer availability, filament nucleation, elongation, and depolymerization. In the 
following sections, we describe the ways in which these accessory proteins modify 
actin function in the cell. 


Monomer Availability Controls Actin Filament Assembly 


In most non-muscle vertebrate cells, approximately 50% of the actin is in 
filaments and 50% is soluble—and yet the soluble monomer concentration is 
50-200 µM, well above the critical concentration. Why does so little of the actin
polymerize into filaments? The reason is that the cell contains proteins that bind 
to the actin monomers and make polymerization much less favorable (an action 
similar to that of the drug latrunculin). A small protein called thymosin is the most 
abundant of these proteins. Actin monomers bound to thymosin are in a locked 
state, where they cannot associate with either the plus or minus ends of actin fila- 
ments and can neither hydrolyze nor exchange their bound nucleotide. 
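
A rough calculation shows how a sequestering protein can hold a large soluble pool even though free actin stays near the critical concentration. The sketch assumes simple 1:1 equilibrium binding between actin and thymosin and assumes that filaments pin the free actin concentration at the critical concentration. The thymosin concentration (600 µM) and dissociation constant (1 µM) are hypothetical values picked for illustration, not figures from the text.

```python
"""Buffering of the soluble actin pool by a sequestering protein (illustrative)."""


def sequestered_actin(free_actin_uM, thymosin_total_uM, kd_uM):
    """Actin held in 1:1 complexes with thymosin at binding equilibrium.

    Follows from Kd = [A][T]/[AT] with [T] + [AT] = thymosin_total.
    """
    return thymosin_total_uM * free_actin_uM / (kd_uM + free_actin_uM)


def unpolymerized_pool(c_crit_uM, thymosin_total_uM, kd_uM):
    """Free plus thymosin-bound actin when free actin sits at the critical concentration."""
    return c_crit_uM + sequestered_actin(c_crit_uM, thymosin_total_uM, kd_uM)


if __name__ == "__main__":
    # Hypothetical inputs: Cc = 0.2 uM, 600 uM thymosin, Kd = 1 uM.
    pool = unpolymerized_pool(0.2, 600.0, 1.0)
    print("unpolymerized actin (uM):", round(pool, 1))
```

Under these assumptions almost all of the unpolymerized actin is thymosin-bound, and the total lands within the 50-200 µM range quoted above while the free monomer concentration remains at 0.2 µM.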

How do cells recruit actin monomers from this buffered storage pool and use 
them for polymerization? The answer depends on another monomer-binding pro- 
tein called profilin. Profilin binds to the face of the actin monomer opposite the 
ATP-binding cleft, blocking the side of the monomer that would normally associ- 
ate with the filament minus end, while leaving exposed the site on the monomer 
that binds to the plus end (Figure 16-15). When the profilin-actin complex binds 
a free plus end, a conformational change in actin reduces its affinity for profilin 
and the profilin falls off, leaving the actin filament one subunit longer. Profilin 
competes with thymosin for binding to individual actin monomers. Thus, by reg- 
ulating the local activity of profilin, cells can control the movement of actin sub- 
units from the sequestered thymosin-bound pool onto filament plus ends. 

Several mechanisms regulate profilin activity, including profilin phosphoryla- 
tion and profilin binding to inositol phospholipids. These mechanisms can define 
the sites where profilin acts. For example, profilin is required for filament assem- 
bly at the plasma membrane, where it is recruited by an interaction with acidic 
membrane phospholipids. At this location, extracellular signals can activate pro- 
filin to produce local actin polymerization and the extension of actin-rich motile 
structures such as filopodia and lamellipodia. 


Actin-Nucleating Factors Accelerate Polymerization and Generate 
Branched or Straight Filaments 


In addition to the availability of active actin subunits, a second prerequisite for 
cellular actin polymerization is filament nucleation. Proteins that contain actin 
monomer binding motifs linked in tandem mediate the simplest mechanism of 
filament nucleation. These actin-nucleating proteins bring several actin subunits 
together to form a seed. In most cases, actin nucleation is catalyzed by one of two 
different types of factors: the Arp 2/3 complex or the formins. The first of these is
a complex of proteins that includes two actin-related proteins, or ARPs, each of 
which is about 45% identical to actin. The Arp 2/3 complex nucleates actin fila- 
ment growth from the minus end, allowing rapid elongation at the plus end (Fig- 
ure 16-16A and B). The complex can attach to the side of another actin filament 
while remaining bound to the minus end of the filament that it has nucleated, 
thereby building individual filaments into a treelike web (Figure 16-16C and D). 
Formins are dimeric proteins that nucleate the growth of straight, unbranched 
filaments that can be cross-linked by other proteins to form parallel bundles. Each 
formin subunit has a binding site for monomeric actin, and the formin dimer 
appears to nucleate actin filament polymerization by capturing two monomers. 
Figure 16-15 Effects of thymosin and profilin on actin polymerization. An actin monomer bound to thymosin is sterically
prevented from binding to and elongating the plus end of an actin filament (left). An actin monomer bound to profilin, on the
other hand, is capable of elongating a filament (right). Thymosin and profilin cannot both bind to a single actin monomer at the
same time. In a cell in which most of the actin monomer is bound to thymosin, the activation of a small amount of profilin can
produce rapid filament assembly. As indicated (bottom), profilin binds to actin monomers that are transiently released from
the thymosin-bound monomer pool, shuttles them onto the plus ends of actin filaments, and is then released and recycled for
further rounds of filament elongation.


As the newly nucleated filament grows, the formin dimer remains associated with 
the rapidly growing plus end while still allowing the addition of new subunits at 
that end (Figure 16-17). This mechanism of filament assembly is clearly differ- 
ent from that used by the Arp 2/3 complex, which remains stably bound to the 
filament minus end, preventing subunit addition or loss at that end. Formin-de- 
pendent actin filament growth is strongly enhanced by the association of actin 
monomers with profilin (Figure 16-18). 

Like profilin activation, actin filament nucleation by Arp 2/3 complexes and 
formins occurs primarily at the plasma membrane, and the highest density of 
actin filaments in most cells is at the cell periphery. The layer just beneath the 
plasma membrane is called the cell cortex, and the actin filaments in this region 
determine the shape and movement of the cell surface, allowing the cell to change 
its shape and stiffness rapidly in response to changes in its external environment. 


Actin-Filament-Binding Proteins Alter Filament Dynamics 


Actin filament behavior is regulated by two major classes of binding proteins: 
those that bind along the side of a filament and those that bind to the ends (see 
Panel 16-3). Side-binding proteins include tropomyosin, an elongated protein 
that binds simultaneously to six or seven adjacent actin subunits along each of the 
two grooves of the helical actin filament. In addition to stabilizing and stiffening 
the filament, the binding of tropomyosin can prevent the actin filament from
interacting with other proteins; this aspect of tropomyosin function is important 
in the control of muscle contraction, as we discuss later. 

An actin filament that stops growing and is not specifically stabilized in the cell 
will depolymerize rapidly, particularly at its plus end, once the actin molecules 
Figure 16-16 Nucleation and actin web formation by the Arp 2/3 complex. (A) The structures of Arp2 and Arp3 compared to the structure
of actin. Although the face of the molecule equivalent to the plus end (top) in both Arp2 and Arp3 is very similar to the plus end of actin itself,
differences on the sides and minus end prevent these actin-related proteins from forming filaments on their own or coassembling into filaments
with actin. (B) A model for actin filament nucleation by the Arp 2/3 complex. In the absence of an activating factor, Arp2 and Arp3 are held by
their accessory proteins in an orientation that prevents them from nucleating a new actin filament. When an activating factor (indicated by the blue
triangle) binds the complex, Arp2 and Arp3 are brought together into a new configuration that resembles the plus end of an actin filament. Actin
subunits can then assemble onto this structure, bypassing the rate-limiting step of filament nucleation. (C) The Arp 2/3 complex nucleates filaments
most efficiently when it is bound to the side of a preexisting actin filament. The result is a filament branch that grows at a 70° angle relative to the
original filament. Repeated rounds of branching nucleation result in a treelike web of actin filaments. (D) Top, electron micrographs of branched actin
filaments formed by mixing purified actin subunits with purified Arp 2/3 complexes. Bottom, reconstructed image of a branch where the crystal
structures of actin (pink) and the Arp 2/3 complex have been fitted to the electron density. The mother filament runs from top to bottom, and the
daughter filament branches off to the right where the Arp 2/3 complex binds to three actin subunits in the mother filament. (D, top, from R.D. Mullins
et al., Proc. Natl Acad. Sci. USA 95:6181-6186, 1998, with permission from National Academy of Sciences; bottom, from N. Volkmann et al.,
Science 293:2456-2459, 2001, with permission from AAAS.)
have hydrolyzed their ATP. The binding of plus-end capping protein (also called
CapZ for its location in the muscle Z band) stabilizes an actin filament at its plus 
end by rendering it inactive, greatly reducing the rates of filament growth and 
depolymerization (Figure 16-19). At the minus end, an actin filament may be 
capped by the Arp 2/3 complex that was responsible for its nucleation, although 
many minus ends in a typical cell are released from the Arp 2/3 complex and are 
uncapped. 
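
The effect of plus-end capping described above and plotted in Figure 16-19 can be illustrated with a minimal kinetic sketch. It assumes the simplest model, in which each end grows at a rate of k_on times the free monomer concentration minus k_off. The rate constants below are illustrative assumptions of a plausible order of magnitude, not values taken from this chapter, and the names `FilamentEnd` and `filament_rate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilamentEnd:
    """One filament end in a simple two-rate-constant model (assumed values)."""
    k_on: float   # subunits added per µM of free monomer per second
    k_off: float  # subunits lost per second

    def net_rate(self, monomer_uM: float) -> float:
        """Net subunits gained per second at this end."""
        return self.k_on * monomer_uM - self.k_off

    @property
    def critical_concentration(self) -> float:
        """Monomer concentration (µM) at which this end neither grows nor shrinks."""
        return self.k_off / self.k_on


# Illustrative constants only: the plus end is assumed faster than the minus end.
PLUS = FilamentEnd(k_on=11.6, k_off=1.4)
MINUS = FilamentEnd(k_on=1.3, k_off=0.8)


def filament_rate(monomer_uM: float, plus_capped: bool = False) -> float:
    """Net growth rate of a whole filament; a capped plus end contributes nothing."""
    rate = MINUS.net_rate(monomer_uM)
    if not plus_capped:
        rate += PLUS.net_rate(monomer_uM)
    return rate


for c in (0.05, 0.5, 2.0):
    print(f"{c:4.2f} uM  uncapped {filament_rate(c):+6.2f}/s  "
          f"plus-capped {filament_rate(c, plus_capped=True):+6.2f}/s")
```

Under these assumptions the capped filament grows more slowly at high monomer concentrations and shrinks more slowly at low ones, and its critical concentration is that of the minus end alone.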

Tropomodulin, best known for its function in the capping of exceptionally 
long-lived actin filaments in muscle, binds tightly to the minus ends of actin fil- 
aments that have been coated and thereby stabilized by tropomyosin. It can also 
transiently cap pure actin filaments and significantly reduce their elongation and 
depolymerization rates. A large family of tropomodulin proteins regulates actin 
filament length and stability in many cell types. 

For maximum effect, proteins that bind the side of actin filaments coat the fil- 
ament completely, and must therefore be present in high amounts. In contrast, 
end-binding proteins can affect filament dynamics even when they are present at 
very low levels. Since subunit addition and loss occur primarily at filament ends, 
one molecule of an end-binding protein per actin filament (roughly one molecule 
per 200-500 actin subunits) can be enough to transform the architecture of an 
actin filament network. 
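
The contrast between side-binding and end-binding proteins comes down to simple counting, sketched below. The filament length of 350 subunits is an assumption chosen to fall within the 200-500 range quoted above, and the sketch further assumes that each of the two grooves runs past half of the subunits and that one tropomyosin covers seven of them. The function names are hypothetical.

```python
# Back-of-envelope stoichiometry for one actin filament.
# Assumptions (illustrative only): the filament has 350 subunits, each of its
# two grooves runs past half of them, and one tropomyosin covers 7 subunits.

def side_binders_to_coat(subunits: int, span: int = 7, grooves: int = 2) -> int:
    """Molecules needed to coat every groove of the filament end to end."""
    per_groove = subunits // grooves             # subunits lining one groove
    per_groove_binders = -(-per_groove // span)  # ceiling division
    return per_groove_binders * grooves


def end_binders_to_cap(ends: int = 1) -> int:
    """One capping molecule blocks one filament end."""
    return ends


filament_subunits = 350
coat = side_binders_to_coat(filament_subunits)
cap = end_binders_to_cap()
print(f"side-binding molecules needed to coat the filament: {coat}")
print(f"end-binding molecules needed to cap one end: {cap}")
```

With these numbers, coating the filament takes dozens of tropomyosin molecules, whereas a single capping molecule is enough to shut down one end.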


Severing Proteins Regulate Actin Filament Depolymerization 


Another important mechanism of actin filament regulation depends on proteins 
that break an actin filament into many smaller filaments, thereby generating a 
large number of new filament ends. The fate of these new ends depends on the 
presence of other accessory proteins. Under some conditions, newly formed ends 
nucleate filament elongation, thereby accelerating the assembly of new filament 
structures. Under other conditions, severing promotes the depolymerization of 
old filaments, speeding up the depolymerization rate by tenfold or more. In addi- 
tion, severing changes the physical and mechanical properties of the cytoplasm: 
stiff, large bundles and gels become more fluid. 
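
Why severing can speed depolymerization so strongly follows from counting ends. The sketch below assumes, as a deliberate simplification, that subunits are lost only from filament ends, that every new end is uncapped, and that each end loses subunits at the same rate. The function names are hypothetical.

```python
def filament_ends(cuts: int) -> int:
    """Number of filament ends after one filament is severed `cuts` times."""
    if cuts < 0:
        raise ValueError("cuts must be non-negative")
    fragments = cuts + 1
    return 2 * fragments


def relative_loss_rate(cuts: int) -> float:
    """Bulk subunit loss relative to the intact filament.

    Assumes loss occurs only at ends and at the same rate at every end.
    """
    return filament_ends(cuts) / filament_ends(0)


for cuts in (0, 1, 9):
    print(f"{cuts} cuts -> {filament_ends(cuts)} ends, "
          f"{relative_loss_rate(cuts):.0f}x the loss rate of the intact filament")
```

Under these assumptions nine cuts are enough for a tenfold speed-up. In a cell the outcome also depends on which ends are capped, as the text notes.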

One class of actin-severing proteins is the gelsolin superfamily. These proteins 
are activated by high levels of cytosolic Ca²⁺. Gelsolin interacts with the side of the
actin filament and contains subdomains that bind to two different sites: one that 
is exposed on the surface of the filament and one that is hidden between adjacent 


Figure 16-18 Profilin and formins. Some members of the formin protein 
family have unstructured domains or “whiskers” that contain several binding 
sites for profilin or the profilin-actin complex. These flexible domains serve 
as a staging area for addition of actin to the growing plus end of the actin 
filament when formin is bound. Under some conditions, this can enhance 
the rate of actin filament elongation so that filament growth is faster than that 
expected for a diffusion-controlled reaction, and faster in the presence of 
formin and profilin than the rate for pure actin alone (see also Figure 3-78). 

Figure 16-17 Actin elongation mediated
by formins. Formin proteins (green) form
a dimeric complex that can nucleate the
formation of a new actin filament (red) and
remain associated with the rapidly growing
plus end as it elongates. The formin protein
maintains its binding to one of the two
actin subunits exposed at the plus end as it
allows each new subunit to assemble. Only
part of the large dimeric formin molecule
is shown here. Other regions regulate its
activity and link it to particular structures
in the cell. Many formins are indirectly
connected to the cell plasma membrane
and aid the insertional polymerization of
the actin filament directly beneath the
membrane surface.

subunits. According to one model, gelsolin binds the side of an actin filament until
a thermal fluctuation creates a small gap between neighboring subunits, at which 
point gelsolin inserts itself into the gap to break the filament. After the severing 
event, gelsolin remains attached to the actin filament and caps the new plus end. 

Another important actin-filament destabilizing protein, found in all eukary- 
otic cells, is cofilin. Also called actin depolymerizing factor, cofilin binds along the 
length of the actin filament, forcing the filament to twist a little more tightly (Fig- 
ure 16-20). This mechanical stress weakens the contacts between actin subunits 
in the filament, making the filament brittle and more easily severed by thermal 
motions, generating filament ends that undergo rapid disassembly. As a result, 
most of the actin filaments inside cells are shorter lived than are filaments formed 
from pure actin in a test tube. 

Cofilin binds preferentially to ADP-containing actin filaments rather than to 
ATP-containing filaments. Since ATP hydrolysis is usually slower than filament 
assembly, the newest actin filaments in the cell still contain mostly ATP and are 
resistant to depolymerization by cofilin. Cofilin therefore tends to dismantle the 
older filaments in the cell. As we will discuss later, the cofilin-mediated disassem- 
bly of old but not new actin filaments is critical for the polarized, directed growth 
of the actin network that is responsible for unidirectional cell crawling and the 
intracellular motility of pathogens. Actin filaments can be protected from cofilin 
by tropomyosin binding. Thus, the dynamics of actin in different subcellular loca- 
tions depends on the balance of stabilizing and destabilizing accessory proteins. 

Figure 16-20 Twisting of an actin filament induced by cofilin. (A) Three-dimensional
reconstruction from cryoelectron micrographs of filaments made of pure actin. The bracket shows 
the span of two twists of the actin helix. (B) Reconstruction of an actin filament coated with cofilin, 
which binds in a 1:1 stoichiometry to actin subunits all along the filament. Cofilin is a small protein 
(14 kD) compared to actin (43 kD), and so the filament appears only slightly thicker. The energy of 
cofilin binding serves to deform the actin filament, twisting it more tightly and reducing the distance 
spanned by each twist of the helix. (From A. McGough et al., J. Cell Biol. 138:771-781, 1997. With
permission from the authors.) 


Figure 16-19 Filament capping and
its effects on filament dynamics. A
population of uncapped filaments adds
and loses subunits at both the plus and
minus ends, resulting in rapid growth or
shrinkage, depending on the concentration
of available free monomers (green line).
In the presence of a protein that caps the
plus end (red line), only the minus end is
able to add or lose subunits; consequently,
filament growth will be slower at all
monomer concentrations above the critical
concentration, and filament shrinkage will
be slower at all monomer concentrations
below the critical concentration. In addition,
the critical concentration for the population
shifts to that of the filament minus end.

Higher-Order Actin Filament Arrays Influence Cellular Mechanical
Properties and Signaling


Actin filaments in animal cells are organized into several types of arrays: dendritic 
networks, bundles, and weblike (gel-like) networks (Figure 16-21). Different 
structures are initiated by the action of distinct nucleating proteins: the actin fila- 
ments of dendritic networks are nucleated by the Arp 2/3 complex, while bundles 
are made of the long, straight filaments produced by formins. The proteins nucle- 
ating the filaments in the gel-like networks are not yet well defined. 

The structural organization of different actin networks depends on specialized 
accessory proteins. As explained earlier, Arp 2/3 organizes filaments into den- 
dritic networks by attaching filament minus ends to the side of other filaments. 
Other actin filament structures are assembled and maintained by two classes of 
proteins: bundling proteins, which cross-link actin filaments into a parallel array, 
and gel-forming proteins, which hold two actin filaments together at a large angle 
to each other, thereby creating a looser meshwork. Both bundling and gel-form- 
ing proteins generally have two similar actin-filament-binding sites, which can 
either be part of a single polypeptide chain or contributed by each of two polypep- 
tide chains held together in a dimer (Figure 16-22). The spacing and arrangement 
of these two filament-binding domains determine the type of actin structure that 
a given cross-linking protein forms. 

Each type of bundling protein also determines which other molecules can 
interact with the cross-linked actin filaments. Myosin II is the motor protein that 
enables stress fibers and other contractile arrays to contract. The very close pack- 
ing of actin filaments caused by the small monomeric bundling protein fimbrin 
apparently excludes myosin, and thus the parallel actin filaments held together 
by fimbrin are not contractile. On the other hand, α-actinin cross-links oppositely

Figure 16-21 Actin arrays in a cell.
A fibroblast crawling in a tissue-culture
dish is shown with four areas enlarged to
show the arrangement of actin filaments.
The actin filaments are shown in red, with
arrowheads pointing toward the minus
end. Stress fibers are contractile and exert
tension. The actin cortex underlies the
plasma membrane and consists of gel-
like networks or dendritic actin networks
that enable membrane protrusion at
lamellipodia. Filopodia are spike-like
projections of the plasma membrane that 
allow a cell to explore its environment. 


Figure 16-22 The modular structures of
four actin-cross-linking proteins. Each of
the proteins shown has two actin-binding
sites (red) that are related in sequence.
Fimbrin has two directly adjacent actin-
binding sites, so that it holds its two actin
filaments very close together (14 nm apart),
aligned with the same polarity (see Figure
16-23A). The two actin-binding sites in
α-actinin are separated by a spacer around
30 nm long, so that it forms more loosely
packed actin bundles (see Figure 16-23A).
Filamin has two actin-binding sites with a
V-shaped linkage between them, so that it
cross-links actin filaments into a network
with the filaments oriented almost at right
angles to one another (see Figure 16-24).
Spectrin is a tetramer of two α and two
β subunits, and the tetramer has two actin-
binding sites spaced about 200 nm apart
(see Figure 10-38).

polarized actin filaments into loose bundles, allowing the binding of myosin and
formation of contractile actin bundles (Figure 16-23). Because of the very differ-
ent spacing and orientation of the actin filaments, bundling by fimbrin automat-
ically discourages bundling by α-actinin, and vice versa, so that the two types of
bundling protein are mutually exclusive. 

The bundling proteins that we have discussed so far have straight, stiff connec- 
tions between their two actin-filament-binding domains. Other actin cross-link- 
ing proteins have either a flexible or a stiff, bent connection between their two 
binding domains, allowing them to form actin filament webs or gels, rather than 
actin bundles. Filamin (see Figure 16-22) promotes the formation of a loose 
and highly viscous gel by clamping together two actin filaments roughly at right 
angles (Figure 16-24A). Cells require the actin gels formed by filamin to extend 
the thin, sheetlike membrane projections called lamellipodia that help them to 
crawl across solid surfaces. In humans, mutations in the filamin A gene cause 
defects in nerve-cell migration during early embryonic development. Cells in the 
periventricular region of the brain fail to migrate to the cortex and instead form 
nodules, causing a syndrome called periventricular heterotopia (Figure 16-24B). 
Interestingly, in addition to binding actin, filamins have been reported to interact 
with a large number of cellular proteins of great functional diversity, including 
membrane receptors for signaling molecules, and filamin mutations can also lead 
to defects in development of bone, the cardiovascular system, and other organs. 
Thus, filamins may also function as signaling scaffolds by connecting and coordi- 
nating a wide variety of cellular processes with the actin cytoskeleton. 

A very different, well-studied web-forming protein is spectrin, which was first 
identified in red blood cells. Spectrin is a long, flexible protein made out of four 
elongated polypeptide chains (two α subunits and two β subunits), arranged so
that the two actin-filament-binding sites are about 200 nm apart (compared with
14 nm for fimbrin and about 30 nm for α-actinin; see Figure 16-23). In the red
blood cell, spectrin is concentrated just beneath the plasma membrane, where it 
forms a two-dimensional weblike network held together by short actin filaments 
whose precise lengths are tightly regulated by capping proteins at each end; spec- 
trin links this web to the plasma membrane because it has separate binding sites 
for peripheral membrane proteins, which are themselves positioned near the lipid 
bilayer by integral membrane proteins (see Figure 10-38). The resulting network 
creates a strong, yet flexible cell cortex that provides mechanical support for the 
overlying plasma membrane, allowing the red blood cell to spring back to its origi- 
nal shape after squeezing through a capillary. Close relatives of spectrin are found 
in the cortex of most other vertebrate cell types, where they also help to shape and 
stiffen the surface membrane. A particularly striking example of spectrin’s role 





Figure 16-23 The formation of two
types of actin filament bundles.
(A) Fimbrin cross-links actin filaments into
tight bundles, which exclude the motor
protein myosin II from participating in the
assembly. In contrast, α-actinin, which is
a homodimer, cross-links actin filaments
into loose bundles, which allow myosin
(not shown) to incorporate into the bundle.
Fimbrin and α-actinin tend to exclude
one another because of the very different
spacing of the actin filament bundles that
they form. (B) Electron micrograph of
purified α-actinin molecules. (B, courtesy
of John Heuser.)

Figure 16-24 Filamin cross-links actin
filaments into a three-dimensional
network and is required for normal
neuronal migration. (A) Each filamin
homodimer is about 160 nm long when
fully extended and forms a flexible, high-
angle link between two adjacent actin
filaments. A set of actin filaments cross-
linked by filamin forms a mechanically
strong web or gel. (B) Magnetic resonance
imaging of a normal human brain (left)
and of a patient with periventricular
heterotopia (right) caused by mutation
in the filamin A gene. In contrast to the
smooth ventricular surface in the normal
brain, a rough zone of cortical neurons
(arrowheads) is seen along the lateral walls
of the ventricles, representing neurons
that have failed to migrate to the cortex
during brain development. Remarkably,
although many neurons are not in the
right place, the intelligence of affected
individuals is frequently normal or only
mildly compromised, and the major clinical
syndrome is epilepsy that often starts in the
second decade of life. (B, adapted from
Y. Feng and C.A. Walsh, Nat. Cell Biol.
6:1034-1038, 2004. With permission
from Macmillan Publishers Ltd.)

in promoting mechanical stability is the long, thin axon of neurons in the nematode
worm Caenorhabditis elegans, where spectrin is required to keep them from
breaking during the twisting motions the worms make during crawling. 

The connections of the cortical actin cytoskeleton to the plasma membrane 
are only partially understood. Members of the ERM family (named for its first 
three members, ezrin, radixin, and moesin), help organize membrane domains 
through their ability to interact with transmembrane proteins and the underlying 
cytoskeleton. In so doing, they not only provide structural links to strengthen the 
cell cortex, but also regulate the activities of signal transduction pathways. Moesin 
also increases cortical stiffness to promote cell rounding during mitosis. Measure- 
ments by atomic force microscopy indicate that the cell cortex remains soft during 
mitosis when moesin is depleted. ERM proteins are thought to bind to and orga- 
nize the cortical actin cytoskeleton in a variety of contexts, thereby affecting the 
shape and stiffness of the membrane as well as the localization and activity of 
signaling molecules. 


Bacteria Can Hijack the Host Actin Cytoskeleton 


The importance of accessory proteins in actin-based motility and force produc- 
tion is illustrated beautifully by studies of certain bacteria and viruses that use 
components of the host-cell actin cytoskeleton to move through the cytoplasm. 
The cytoplasm of mammalian cells is extremely viscous, containing organelles 
and cytoskeletal elements that inhibit diffusion of large particles like bacteria or 
viruses. To move around in a cell and invade neighboring cells, several patho- 
gens, including Listeria monocytogenes (which causes a rare but serious form of 
food poisoning), overcome this problem by recruiting and activating the Arp 2/3 
complex at their surface. The Arp 2/3 complex nucleates the assembly of actin 
filaments that generate a substantial force and push the bacterium through the 

Figure 16-25 The actin-based movement of Listeria monocytogenes. (A) Fluorescence micrograph of an infected cell
that has been stained to reveal bacteria in red and actin filaments in green. Note the cometlike tail of actin filaments behind
each moving bacterium. Regions of overlap between red and green fluorescence appear yellow. (B) Listeria motility can be
reconstituted in a test tube with ATP and just four purified proteins: actin, Arp 2/3 complex, capping protein, and cofilin. This
micrograph shows the dense actin tails behind bacteria (black). (C) The ActA protein on the bacterial surface activates the 
Arp 2/3 complex to nucleate new filament assembly along the sides of existing filaments. Filaments grow at their plus end 
until capped by capping protein. Actin is recycled through the action of cofilin, which enhances depolymerization at the minus 
ends of the filaments. By this mechanism, polymerization is focused at the rear surface of the bacterium, propelling it forward 
(see Movie 23.7). (A, courtesy of Julie Theriot and Tim Mitchison; B, from T.P. Loisel et al., Nature 401:613-616, 1999. With 
permission from Macmillan Publishers Ltd.) 


cytoplasm at rates of up to 1 μm/sec, leaving behind a long actin “comet tail”
(Figure 16-25; see also Figures 23-28 and 23-29). This motility can be reconsti- 
tuted in a test tube by adding the bacteria to a mixture of pure actin, Arp 2/3 com- 
plex, cofilin, and capping protein, illustrating how actin polymerization dynamics 
generate movement through spatial regulation of filament assembly and disas- 
sembly. As we shall see, actin-based movement of this sort also underlies mem- 
brane protrusion at the leading edge of motile cells. 


Summary 


Actin is a highly conserved cytoskeletal protein that is present in high concentra- 
tions in nearly all eukaryotic cells. Nucleation presents a kinetic barrier to actin 
polymerization, but once formed, actin filaments undergo dynamic behavior due 
to hydrolysis of the bound nucleotide ATP. Actin filaments are polarized and can 
undergo treadmilling when a filament assembles at the plus end while simultane- 
ously depolymerizing at the minus end. In cells, actin filament dynamics are reg- 
ulated at every step, and the varied forms and functions of actin depend on a ver- 
satile repertoire of accessory proteins. Approximately half of the actin is kept in a 
monomeric form through association with sequestering proteins such as thymosin. 
Nucleation factors such as the Arp 2/3 complex and formins promote formation 
of branched and parallel filaments, respectively. Interplay between proteins that 
bind or cap actin filaments and those that promote filament severing or depolym- 
erization can slow or accelerate the kinetics of filament assembly and disassembly. 
Another class of accessory proteins assembles the filaments into larger ordered struc- 
tures by cross-linking them to one another in geometrically defined ways. Connec- 
tions between these actin arrays and the plasma membrane of cells give an animal 
cell mechanical strength and permit the elaboration of cortical cellular structures 
such as lamellipodia, filopodia, and microvilli. By inducing actin filament polym- 
erization at their surface, intracellular pathogens can hijack the host-cell cytoskele- 
ton and move around inside the cell. 

Figure 16-26 Myosin II. (A) The two globular heads and long tail of a myosin II molecule shadowed with platinum can be seen
in this electron micrograph. (B) A myosin II molecule is composed of two heavy chains (each about 2000 amino acids long;
green) and four light chains (blue). The light chains are of two distinct types, and one copy of each type is present on each
myosin head. Dimerization occurs when the two α helices of the heavy chains wrap around each other to form a coiled-coil,
driven by the association of regularly spaced hydrophobic amino acids (see Figure 3-9). The coiled-coil arrangement makes an
extended rod in solution, and this part of the molecule forms the tail. (A, courtesy of David Shotton.)


MYOSIN AND ACTIN 


A crucial feature of the actin cytoskeleton is that it can form contractile structures 
that cross-link and slide actin filaments relative to one another through the action 
of myosin motor proteins. In addition to driving muscle contraction, actin-myo- 
sin assemblies perform important functions in non-muscle cells. 


Actin-Based Motor Proteins Are Members of the Myosin 
Superfamily 


The first motor protein to be identified was skeletal muscle myosin, which gen- 
erates the force for muscle contraction. This protein, now called myosin II, is an 
elongated protein formed from two heavy chains and two copies of each of two 
light chains. Each heavy chain has a globular head domain at its N-terminus that 
contains the force-generating machinery, followed by a very long amino acid 
sequence that forms an extended coiled-coil that mediates heavy-chain dimeriza- 
tion (Figure 16-26). The two light chains bind close to the N-terminal head 
domain, while the long coiled-coil tail bundles itself with the tails of other myosin 
molecules. These tail-tail interactions form large, bipolar “thick filaments” that 
have several hundred myosin heads, oriented in opposite directions at the two 
ends of the thick filament (Figure 16-27). 

Figure 16-27 The myosin II bipolar thick filament in muscle. (A) Electron micrograph of a myosin II thick filament isolated from frog muscle. Note
the central bare zone, which is free of head domains. (B) Schematic diagram, not drawn to scale. The myosin II molecules aggregate by means of
their tail regions, with their heads projecting to the outside of the filament. The bare zone in the center of the filament consists entirely of myosin II
tails. (C) A small section of a myosin II filament as reconstructed from electron micrographs. An individual myosin molecule is highlighted in green.
The cytoplasmic myosin II filaments in non-muscle cells are much smaller, although similarly organized (see Figure 16-39). (A, courtesy of Murray 
Stewart; C, based on R.A. Crowther, R. Padrón and R. Craig, J. Mol. Biol. 184:429-439, 1985. With permission from Academic Press.) 

Each myosin head binds and hydrolyzes ATP, using the energy of ATP hydrolysis
to walk toward the plus end of an actin filament (Figure 16-28). The opposing
orientation of the heads in the thick filament makes the filament efficient at slid- 
ing pairs of oppositely oriented actin filaments toward each other, shortening the 
muscle. In skeletal muscle, in which carefully arranged actin filaments are aligned 
in “thin filament” arrays surrounding the myosin thick filaments, the ATP-driven 
sliding of actin filaments results in a powerful contraction. Cardiac and smooth 
muscle contain myosin II molecules that are similarly arranged, although differ- 
ent genes encode them. 


Myosin Generates Force by Coupling ATP Hydrolysis to 
Conformational Changes 


Motor proteins use structural changes in their ATP-binding sites to produce cyclic 
interactions with a cytoskeletal filament. Each cycle of ATP binding, hydrolysis, 
and release propels them forward in a single direction to a new binding site along 
the filament. For myosin II, each step of the movement along actin is generated by 
the swinging of an 8.5-nm-long α helix, or lever arm, which is structurally stabilized
by the binding of light chains. At the base of this lever arm next to the head,
there is a pistonlike helix that connects movements at the ATP-binding cleft in the 
head to small rotations of the so-called converter domain. A small change at this 
point can swing the helix like a long lever, causing the far end of the helix to move 
by about 5.0 nm. 
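
The lever-arm figures quoted above imply a modest rotation, as a short geometric sketch shows. It assumes, as a simplification not stated in the text, that the lever arm behaves as a rigid rod pivoting at its base, so that the straight-line displacement of its tip is the chord 2L sin(θ/2). The function names are hypothetical.

```python
import math


def tip_displacement_nm(lever_nm: float, swing_deg: float) -> float:
    """Chord travelled by the tip of a rigid lever that rotates by swing_deg."""
    return 2.0 * lever_nm * math.sin(math.radians(swing_deg) / 2.0)


def swing_angle_deg(lever_nm: float, displacement_nm: float) -> float:
    """Rotation needed for the lever tip to move displacement_nm in a straight line."""
    if not 0.0 <= displacement_nm <= 2.0 * lever_nm:
        raise ValueError("displacement cannot exceed twice the lever length")
    return math.degrees(2.0 * math.asin(displacement_nm / (2.0 * lever_nm)))


# Values from the text: an 8.5-nm lever arm whose far end moves about 5 nm.
angle = swing_angle_deg(8.5, 5.0)
print(f"rotation implied by the rigid-lever model: {angle:.1f} degrees")
```

In this model a rotation of roughly 34 degrees at the converter domain is enough to move the far end of the helix by 5 nm.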

These changes in the conformation of the myosin are coupled to changes in 
its binding affinity for actin, allowing the myosin head to release its grip on the 
actin filament at one point and snatch hold of it again at another. The full mech- 
anochemical cycle of nucleotide binding, nucleotide hydrolysis, and phosphate 
release (which causes the “power stroke”) produces a single step of movement 
(Figure 16-29). At low ATP concentrations, the interval between the force-pro- 
ducing step and the binding of the next ATP is long enough that single steps can 
be observed (Figure 16-30). 
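
Figure 16-29 notes that a myosin II head stays bound to actin for only about 5% of its cycle, which is why many heads must cooperate to keep hold of a filament. The sketch below estimates how many. It assumes that the heads cycle independently of one another, which is a simplification, and the function names are hypothetical.

```python
import math


def p_any_bound(duty_ratio: float, heads: int) -> float:
    """Probability that at least one of `heads` independent heads is bound."""
    if not 0.0 < duty_ratio < 1.0:
        raise ValueError("duty_ratio must lie strictly between 0 and 1")
    return 1.0 - (1.0 - duty_ratio) ** heads


def heads_for_coverage(duty_ratio: float, target: float) -> int:
    """Smallest number of independent heads giving p_any_bound >= target."""
    if not 0.0 < duty_ratio < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("duty_ratio and target must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - duty_ratio))


DUTY_RATIO = 0.05  # fraction of the cycle spent bound, from Figure 16-29
for n in (1, 10, 100):
    print(f"{n:3d} heads: P(at least one bound) = {p_any_bound(DUTY_RATIO, n):.3f}")
print("heads needed for 95% coverage:", heads_for_coverage(DUTY_RATIO, 0.95))
```

On these assumptions a single head is almost always detached, whereas an assembly of several hundred heads, as in a thick filament, has some heads attached at essentially all times.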


Sliding of Myosin II Along Actin Filaments Causes Muscles to
Contract


Muscle contraction is the most familiar and best-understood form of movement in
animals. In vertebrates, running, walking, swimming, and flying all depend on the
rapid contraction of skeletal muscle on its scaffolding of bone, while involuntary


Figure 16-28 Direct evidence for the 
motor activity of the myosin head. In this 
experiment, purified myosin heads were 
attached to a glass slide, and then actin 
filaments labeled with fluorescent phalloidin 
were added and allowed to bind to the 
myosin heads. (A) When ATP was added,
the actin filaments began to glide along
the surface, owing to the many individual
steps taken by each of the dozens of
myosin heads bound to each filament. The
video frames shown in this sequence were
recorded about 0.6 second apart; the two
actin filaments shown (one red and one
green) were moving in opposite directions
at a rate of about 4 μm/sec. (B) Diagram
of the experiment. The large red arrows 
indicate the direction of actin filament 
movement (Movie 16.2). (A, courtesy of 
James Spudich.) 


[Figure 16-29 labels: actin filament, myosin head, myosin thick filament]

ATTACHED At the start of the cycle shown in this figure, 
a myosin head lacking a bound nucleotide is locked 
tightly onto an actin filament in a rigor configuration (so 
named because it is responsible for rigor mortis, the 
rigidity of death). In an actively contracting muscle, this 
state is very short-lived, being rapidly terminated by the 
binding of a molecule of ATP. 


RELEASED A molecule of ATP binds to the large cleft on 
the “back” of the head (that is, on the side furthest from 
the actin filament) and immediately causes a slight 
change in the conformation of the actin-binding site, 
reducing the affinity of the head for actin and allowing 
it to move along the filament. (The space drawn here 
between the head and actin emphasizes this change, 
although in reality the head probably remains very close 
to the actin.) 


COCKED The cleft closes like a clam shell around the 
ATP molecule, triggering a movement in the lever arm 
that causes the head to be displaced along the filament 
by a distance of about 5 nm. Hydrolysis of ATP occurs, but 
the ADP and inorganic phosphate (Pi) remain tightly
bound to the protein. 


FORCE-GENERATING Weak binding of the myosin head
to a new site on the actin filament causes release of the
inorganic phosphate produced by ATP hydrolysis,
concomitantly with the tight binding of the head to actin.
This release triggers the power stroke, the
force-generating change in shape during which the head
regains its original conformation. In the course of the
power stroke, the head loses its bound ADP, thereby
returning to the start of a new cycle.


ATTACHED At the end of the cycle, the myosin head is 
again locked tightly to the actin filament in a rigor 
configuration. Note that the head has moved to a new 
position on the actin filament. 

Figure 16-29 The cycle of structural changes used by myosin II to walk along an actin filament. In the myosin II cycle, the
head remains bound to the actin filament for only about 5% of the entire cycle time, allowing many myosins to work together to
move a single actin filament (Movie 16.3). (Based on I. Rayment et al., Science 261:50-58, 1993.)


movements such as heart pumping and gut peristalsis depend on the contrac-
tion of cardiac muscle and smooth muscle, respectively. All these forms of muscle
contraction depend on the ATP-driven sliding of highly organized arrays of actin
filaments against arrays of myosin II filaments.


Skeletal muscle was a relatively late evolutionary development, and muscle 
cells are highly specialized for rapid and efficient contraction. The long, thin 
muscle fibers of skeletal muscle are actually huge single cells that form during 
development by the fusion of many separate cells. The large muscle cell retains 
the many nuclei of the contributing cells. These nuclei lie just beneath the plasma 
membrane (Figure 16-31). The bulk of the cytoplasm inside is made up of myofi- 
brils, which is the name given to the basic contractile elements of the muscle cell. 
A myofibril is a cylindrical structure 1–2 µm in diameter that is often as long as 
the muscle cell itself. It consists of a long, repeated chain of tiny contractile units 
called sarcomeres, each about 2.2 µm long, which give the vertebrate myofibril its 
striated appearance (Figure 16-32). 

Each sarcomere is formed from a miniature, precisely ordered array of parallel 
and partly overlapping thin and thick filaments. The thin filaments are composed 
of actin and associated proteins, and they are attached at their plus ends to a Z 
disc at each end of the sarcomere. The capped minus ends of the actin filaments 
extend in toward the middle of the sarcomere, where they overlap with thick fil- 
aments, the bipolar assemblies formed from specific muscle isoforms of myosin 
II (see Figure 16-27). When this region of overlap is examined in cross section by 
electron microscopy, the myosin filaments are arranged in a regular hexagonal 
lattice, with the actin filaments evenly spaced between them (Figure 16-33). Car- 
diac muscle and smooth muscle also contain sarcomeres, although the organiza- 
tion is not as regular as that in skeletal muscle. 


Figure 16-31 Skeletal muscle cells (also called muscle fibers). (A) These 
huge multinucleated cells form by the fusion of many muscle cell precursors, 
called myoblasts. Here, a single muscle cell is depicted. In an adult human, 
a muscle cell is typically 50 µm in diameter and can be up to several 
centimeters long. (B) Fluorescence micrograph of rat muscle, showing the 
peripherally located nuclei (blue) in these giant cells. Myofibrils are stained 
red. (B, courtesy of Nancy L. Kedersha.) 




Figure 16-30 The force of a single myosin molecule moving along an 
actin filament measured using an optical trap. (A) Schematic of the 
experiment, showing an actin filament with beads attached at both ends and held 
in place by focused beams of light called optical tweezers (Movie 16.4). 
The tweezers trap and move the bead, and can also be used to measure the 
force exerted on the bead through the filament. In this experiment, the filament 
was positioned over another bead to which myosin II motors were attached, 
and the optical tweezers were used to determine the effects of myosin binding 
on movement of the actin filament. (B) These traces show filament movement 
in two separate experiments. Initially, when the actin filament is unattached to myosin, 
thermal motion of the filament produces noisy fluctuations in filament position. When 
a single myosin binds to the actin filament, thermal motion decreases abruptly and a 
roughly 10-nm displacement results from movement of the filament by the motor. The 
motor then releases the filament. Because the ATP concentration is very low in this 
experiment, the myosin remains attached to the actin filament for much longer than 
it would in a muscle cell. (Adapted from C. Ruegg et al., Physiology 17:213-218, 
2002. With permission from the American Physiological Society.) 

Figure 16-32 Skeletal muscle myofibrils. (A) Low-magnification electron 
micrograph of a longitudinal section through a skeletal muscle cell of a 
rabbit, showing the regular pattern of cross-striations. The cell contains 
many myofibrils aligned in parallel (see Figure 16-31). (B) Detail of the 
skeletal muscle shown in (A), showing portions of two adjacent myofibrils 
and the definition of a sarcomere (black arrow). (C) Schematic diagram of 
a single sarcomere, showing the origin of the dark and light bands seen in 
the electron micrographs. The Z discs, at each end of the sarcomere, are 
attachment sites for the plus ends of actin filaments (thin filaments); the 
M line, or midline, is the location of proteins that link adjacent myosin II 
filaments (thick filaments) to one another. (D) When the sarcomere contracts, 
the actin and myosin filaments slide past one another without shortening. 
(A and B, courtesy of Roger Craig.) 


Sarcomere shortening is caused by the myosin filaments sliding past the actin 
thin filaments, with no change in the length of either type of filament (see Figure 
16-32C and D). Bipolar thick filaments walk toward the plus ends of two sets of 
thin filaments of opposite orientations, driven by dozens of independent myosin 
heads that are positioned to interact with each thin filament. Because there is no 
coordination among the movements of the myosin heads, it is critical that they 
remain tightly bound to the actin filament for only a small fraction of each ATPase 
cycle so that they do not hold one another back. Each myosin thick filament has 
about 300 heads (294 in frog muscle), and each head cycles about five times per 
second in the course of a rapid contraction, sliding the myosin and actin filaments 
past one another at rates of up to 15 µm/sec and enabling the sarcomere 
to shorten by 10% of its length in less than one-fiftieth of a second. The rapid 
synchronized shortening of the thousands of sarcomeres lying end-to-end in each 
myofibril enables skeletal muscle to contract rapidly enough for running and fly- 
ing, or for playing the piano. 
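
The numbers quoted above can be checked with a little arithmetic. The following Python sketch is an illustration only, not a model of muscle mechanics. It takes the sarcomere length, sliding rate, and head count from the text and the roughly 5% bound fraction from the legend to Figure 16-29. The function names are hypothetical, and three simplifications are our own assumptions: that the two halves of a sarcomere shorten simultaneously, that the myosin heads act independently, and that a muscle cell 3 cm long is a representative example.

```python
# Back-of-envelope check of the sarcomere numbers quoted in the text.
# All function names are illustrative; nothing here is a standard API.

SARCOMERE_LENGTH_UM = 2.2        # from the text
SLIDING_RATE_UM_PER_S = 15.0     # from the text (upper limit)
HEADS_PER_THICK_FILAMENT = 300   # from the text
BOUND_FRACTION = 0.05            # Figure 16-29 legend: ~5% of the cycle


def shortening_time_s(fraction=0.10, both_halves=True):
    """Time for one sarcomere to shorten by `fraction` of its length."""
    distance_um = fraction * SARCOMERE_LENGTH_UM
    # Assumption: thin filaments slide in from both Z discs at once,
    # so the sarcomere shortens at twice the filament sliding rate.
    rate = SLIDING_RATE_UM_PER_S * (2 if both_halves else 1)
    return distance_um / rate


def expected_bound_heads(n_heads=HEADS_PER_THICK_FILAMENT):
    """Average number of heads attached to actin at any instant."""
    return n_heads * BOUND_FRACTION


def prob_no_head_bound(n_heads):
    """Chance that none of n_heads is attached (assumes independence)."""
    return (1.0 - BOUND_FRACTION) ** n_heads


def sarcomeres_in_series(cell_length_cm):
    """Number of sarcomeres lying end to end in a myofibril."""
    return cell_length_cm * 1e4 / SARCOMERE_LENGTH_UM  # 1 cm = 10^4 um


if __name__ == "__main__":
    print("shortening time (s):", shortening_time_s())
    print("heads bound on average:", expected_bound_heads())
    print("P(no head bound, 150 heads):", prob_no_head_bound(150))
    print("sarcomeres in a 3 cm myofibril:", round(sarcomeres_in_series(3.0)))
```

With these inputs, a 10% shortening takes well under one-fiftieth of a second whether or not the two halves of the sarcomere are counted separately, and only about 15 of the roughly 300 heads on a thick filament are attached at any instant. Even so, the chance that a set of 150 heads has no head attached at all is very small, which is why the filament does not slip backward.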

Accessory proteins produce the remarkable uniformity in filament organiza- 
tion, length, and spacing in the sarcomere (Figure 16-34). The actin filament plus 
ends are anchored in the Z disc, which is built from CapZ and α-actinin; the Z disc 
caps the filaments (preventing depolymerization), while holding them together in 
a regularly spaced bundle. The precise length of each thin filament is influenced 
by a protein of enormous size, called nebulin, which consists almost entirely of a 
repeating 35-amino-acid actin-binding motif. Nebulin stretches from the Z disc 
toward the minus end of each thin filament, which is capped and stabilized by tro- 
pomodulin. Although there is some slow exchange of actin subunits at both ends 
of the muscle thin filament, such that the components of the thin filament turn 
over with a half-life of several days, the actin filaments in sarcomeres are remark- 
ably stable compared with those found in most other cell types, whose dynamic 
actin filaments turn over with half-lives of a few minutes or less. 

Figure 16-33 Electron micrographs of an insect flight muscle viewed in 
cross section. The myosin and actin filaments are packed together with almost 
crystalline regularity. Unlike their vertebrate counterparts, these myosin filaments 
have a hollow center, as seen in the enlargement on the right. The geometry 
of the hexagonal lattice is slightly different in vertebrate muscle. (From J. Auber, 
J. de Microsc. 8:197-232, 1969. With permission from Société Française de 
Microscopie Électronique.) 

Opposing pairs of an even longer template protein, called titin, position the 
thick filaments midway between the Z discs. Titin acts as a molecular spring, with 
a long series of immunoglobulin-like domains that can unfold one by one as stress 
is applied to the protein. A springlike unfolding and refolding of these domains 
keeps the thick filaments poised in the middle of the sarcomere and allows the 
muscle fiber to recover after being overstretched. In C. elegans, whose sarcomeres 
are longer than those in vertebrates, titin is longer as well, suggesting that it serves 
also as a molecular ruler, determining in this case the overall length of each 
sarcomere. 

Figure 16-34 Organization of accessory proteins in a sarcomere. Each giant titin 
molecule extends from the Z disc to the M line, a distance of over 1 µm. Part of 
each titin molecule is closely associated with a myosin thick filament (which 
switches polarity at the M line); the rest of the titin molecule is elastic and changes 
length as the sarcomere contracts and relaxes. Each nebulin molecule is exactly 
the length of a thin filament. The actin filaments are also coated with tropomyosin 
and troponin (not shown; see Figure 16-36) and are capped at both ends. 
Tropomodulin caps the minus end of the actin filaments, and CapZ anchors the 
plus end at the Z disc, which also contains α-actinin (not shown). 


A Sudden Rise in Cytosolic Ca²⁺ Concentration Initiates Muscle Contraction 


The force-generating molecular interaction between myosin thick filaments and 
actin thin filaments takes place only when a signal passes to the skeletal muscle 
from the nerve that stimulates it. Immediately upon arrival of the signal, the mus- 
cle cell needs to be able to contract very rapidly, with all the sarcomeres shorten- 
ing simultaneously. Two major features of the muscle cell make extremely rapid 
contraction possible. First, as previously discussed, the individual myosin motor 
heads in each thick filament spend only a small fraction of the ATP cycle time 
bound to the filament and actively generating force, so many myosin heads can 
act in rapid succession on the same thin filament without interfering with one 
another. Second, a specialized membrane system relays the incoming signal rap- 
idly throughout the entire cell. The signal from the nerve triggers an action poten- 
tial in the muscle cell plasma membrane (discussed in Chapter 11), and this elec- 
trical excitation spreads swiftly into a series of membranous folds—the transverse 
tubules, or T tubules—that extend inward from the plasma membrane around 
each myofibril. The signal is then relayed across a small gap to the sarcoplasmic 
reticulum, an adjacent weblike sheath of modified endoplasmic reticulum that 
surrounds each myofibril like a net stocking (Figure 16-35A and B). 

When the incoming action potential activates a Ca²⁺ channel in the T-tubule 
membrane, Ca²⁺ influx triggers the opening of Ca²⁺-release channels in the 
sarcoplasmic reticulum (Figure 16-35C). Ca²⁺ flooding into the cytosol then initiates 
the contraction of each myofibril. Because the signal from the muscle cell plasma 
membrane is passed within milliseconds (via the T tubules and sarcoplasmic 
reticulum) to every sarcomere in the cell, all of the myofibrils in the cell contract 
at once. The increase in Ca²⁺ concentration is transient because the Ca²⁺ is rapidly 
pumped back into the sarcoplasmic reticulum by an abundant, ATP-dependent 
Ca²⁺-pump (also called a Ca²⁺-ATPase) in its membrane (see Figure 11-13). 
Typically, the cytoplasmic Ca²⁺ concentration is restored to resting levels within 30 
msec, allowing the myofibrils to relax. Thus, muscle contraction depends on two 
processes that consume enormous amounts of ATP: filament sliding, driven by 
the ATPase of the myosin motor domain, and Ca²⁺ pumping, driven by the 
Ca²⁺-pump. 









Figure 16-35 T tubules and the sarcoplasmic reticulum. (A) Drawing of 
the two membrane systems that relay the signal to contract from the muscle cell 
plasma membrane to all of the myofibrils in the cell. (B) Electron micrograph showing 
a cross section of a T tubule. Note the position of the large Ca²⁺-release channels 
in the sarcoplasmic reticulum membrane that connect to the adjacent T-tubule 
membrane. (C) Schematic diagram showing how a Ca²⁺-release channel in 
the sarcoplasmic reticulum membrane is thought to be opened by the activation of a 
voltage-gated Ca²⁺ channel (Movie 16.5). (B, courtesy of Clara Franzini-Armstrong.) 

The Ca²⁺-dependence of vertebrate skeletal muscle contraction, and hence 
its dependence on commands transmitted via nerves, is due entirely to a set of 
specialized accessory proteins that are closely associated with the actin thin 
filaments. One of these accessory proteins is a muscle form of tropomyosin, the 
elongated protein that binds along the groove of the actin filament helix. The other is 
troponin, a complex of three polypeptides, troponins T, I, and C (named for their 
tropomyosin-binding, inhibitory, and Ca²⁺-binding activities, respectively). 
Troponin I binds to actin as well as to troponin T. In a resting muscle, the troponin I-T 
complex pulls the tropomyosin out of its normal binding groove into a position 
along the actin filament that interferes with the binding of myosin heads, thereby 
preventing any force-generating interaction. When the level of Ca²⁺ is raised, 
troponin C, which binds up to four molecules of Ca²⁺, causes troponin I to release 
its hold on actin. This allows the tropomyosin molecules to slip back into their 
normal position so that the myosin heads can walk along the actin filaments 
(Figure 16-36). Troponin C is closely related to the ubiquitous Ca²⁺-binding 
protein calmodulin (see Figure 15-33); it can be thought of as a specialized form of 
calmodulin that has acquired binding sites for troponin I and troponin T, thereby 
ensuring that the myofibril responds extremely rapidly to an increase in Ca²⁺ 
concentration. 

In smooth muscle cells, so called because they lack the regular striations of 
skeletal muscle, contraction is also triggered by an influx of calcium ions, but the 
regulatory mechanism is different. Smooth muscle forms the contractile portion 
of the stomach, intestine, and uterus, as well as the walls of arteries and many 
other structures requiring slow and sustained contractions. Smooth muscle is 
composed of sheets of highly elongated spindle-shaped cells, each with a single 
nucleus. Smooth muscle cells do not express the troponins. Instead, elevated 
intracellular Ca²⁺ levels regulate contraction by a mechanism that depends on 
calmodulin (Figure 16-37). Ca²⁺-bound calmodulin activates myosin light-chain 
kinase (MLCK), thereby inducing the phosphorylation of smooth muscle myosin 
on one of its two light chains. When the light chain is phosphorylated, the myosin 
head can interact with actin filaments and cause contraction; when it is 
dephosphorylated, the myosin head tends to dissociate from actin and becomes inactive. 

Figure 16-36 The control of skeletal muscle contraction by troponin. (A) A skeletal-muscle-cell thin filament, showing the 
positions of tropomyosin and troponin along the actin filament. Each tropomyosin molecule has seven evenly spaced regions 
with similar amino acid sequences, each of which is thought to bind to an actin subunit in the filament. (B) Reconstructed 
cryoelectron microscopy image of an actin filament showing the relative position of a superimposed tropomyosin strand in the 
presence (dark purple) or absence (light purple) of calcium. (A, adapted from G.N. Phillips, J.P. Fillers and C. Cohen, J. Mol. Biol. 
192:111-131, 1986. With permission from Academic Press; B, adapted from C. Xu et al., Biophys. J. 77:985-992, 1999. With 
permission from Elsevier.) 

The phosphorylation events that regulate contraction in smooth muscle cells 
occur relatively slowly, so that maximum contraction often requires nearly a 
second (compared with the few milliseconds required for contraction of a skeletal 
muscle cell). But rapid activation of contraction is not important in smooth 
muscle: its myosin II hydrolyzes ATP about 10 times more slowly than skeletal 
muscle myosin, producing a slow cycle of myosin conformational changes that 
results in slow contraction. 

Figure 16-37 Smooth muscle contraction. (A) Upon muscle stimulation 
by activation of cell-surface receptors, Ca²⁺ released into the cytoplasm from 
the sarcoplasmic reticulum (SR) binds to calmodulin (see Figure 15-29). Ca²⁺-bound 
calmodulin then binds myosin light-chain kinase (MLCK), which phosphorylates 
myosin light chain, stimulating myosin activity. Non-muscle myosin is regulated 
by the same mechanism (see Figure 16-39). (B) Smooth muscle cells in a cross 
section of cat intestinal wall. The outer layer of smooth muscle is oriented with 
the long axis of its cells extending parallel along the length of the intestine, and upon 
contraction will shorten the intestine. The inner layer is oriented circularly around 
the intestine and when contracted will cause the intestine to become narrower. 
Contraction of both layers squeezes material through the intestine, much like 
squeezing toothpaste out of a tube. (C) A model for the contractile apparatus 
in a smooth muscle cell, with bundles of contractile filaments containing actin and 
myosin (red) oriented obliquely to the long axis of the cell. Their contraction greatly 
shortens the cell. Only a few of the many bundles are shown. (B, courtesy of 
Gwen V. Childs.) 


Heart Muscle Is a Precisely Engineered Machine 


The heart is the most heavily worked muscle in the body, contracting about 3 billion 
(3 × 10⁹) times during the course of a human lifetime (Movie 16.6). Heart cells 
express several specific isoforms of cardiac muscle myosin and cardiac muscle 
actin. Even subtle changes in these cardiac-specific contractile proteins—changes 
that would not cause any noticeable consequences in other tissues—can cause 
serious heart disease (Figure 16-38). 
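
The figure of about 3 billion contractions is easy to verify. The short Python sketch below is only an order-of-magnitude check. The resting rate of 70 beats per minute and the lifespan of 80 years are our own illustrative assumptions, not values given in the text, and the function name is hypothetical.

```python
# Order-of-magnitude check of the lifetime number of heartbeats.
# The default rate and lifespan are assumed values for illustration.

MINUTES_PER_YEAR = 60 * 24 * 365


def lifetime_heartbeats(beats_per_minute=70, lifespan_years=80):
    """Total number of contractions over a lifetime at a constant rate."""
    return beats_per_minute * MINUTES_PER_YEAR * lifespan_years


if __name__ == "__main__":
    print(f"{lifetime_heartbeats():.2e}")  # prints 2.94e+09
```

Under these assumptions the total comes to about 2.9 × 10⁹ beats, in line with the rounded value of 3 × 10⁹.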

The normal cardiac contractile apparatus is such a highly tuned machine 
that a tiny abnormality anywhere in the works can be enough to gradually wear 
it down over years of repetitive motion. Familial hypertrophic cardiomyopathy is 
a common cause of sudden death in young athletes. It is a genetically dominant 
inherited condition that affects about two out of every thousand people, and it is 
associated with heart enlargement, abnormally small coronary vessels, and dis- 
turbances in heart rhythm (cardiac arrhythmias). The cause of this condition is 
either any one of over 40 subtle point mutations in the genes encoding cardiac β 
myosin heavy chain (almost all causing changes in or near the motor domain) or 
one of about a dozen mutations in other genes encoding contractile proteins, 
including myosin light chains, cardiac troponin, and tropomyosin. Minor missense 
mutations in the cardiac actin gene cause another type of heart condition, 
called dilated cardiomyopathy, which can also result in early heart failure. 


Actin and Myosin Perform a Variety of Functions in Non-Muscle Cells 


Most non-muscle cells contain small amounts of contractile actin-myosin II bundles 
that form transiently under specific conditions and are much less well organized 
than muscle fibers. Non-muscle contractile bundles are regulated by myosin 
phosphorylation rather than the troponin mechanism (Figure 16-39). These 
contractile bundles function to provide mechanical support to cells, for example, 
by assembling into cortical stress fibers that connect the cell to the extracellular 
matrix through focal adhesions or by forming a circumferential belt in an epithelial 
cell, connecting it to adjacent cells through adherens junctions (discussed in 
Chapter 19). As described in Chapter 17, actin and myosin II in the contractile 
ring generate the force for cytokinesis, the final stage in cell division. Finally, as 
discussed later, contractile bundles also contribute to the adhesion and forward 
motion of migrating cells. 

Figure 16-38 Effect on the heart of a subtle mutation in cardiac myosin. Left, 
normal heart from a 6-day-old mouse pup. Right, heart from a pup with a point 
mutation in both copies of its cardiac myosin gene, changing Arg403 to Gln. 
The arrows indicate the atria. In the heart from the pup with the cardiac myosin 
mutation, both atria are greatly enlarged (hypertrophic), and the mice die within a 
few weeks of birth. (From D. Fatkin et al., J. Clin. Invest. 103:147-153, 1999. With 
permission from The American Society for Clinical Investigation.) 

Figure 16-39 Light-chain phosphorylation and the regulation of the assembly of myosin II into thick filaments. (A) The controlled 
phosphorylation by the enzyme myosin light-chain kinase (MLCK) of one of the two light chains (the so-called regulatory light chain, shown in light 
blue) on non-muscle myosin II in a test tube has at least two effects: it causes a change in the conformation of the myosin head, exposing its 
actin-binding site, and it releases the myosin tail from a “sticky patch” on the myosin head, thereby allowing the myosin molecules to assemble into short, 
bipolar, thick filaments. Smooth muscle is regulated by the same mechanism (see Figure 16-37). (B) Electron micrograph of negatively stained short 
filaments of myosin II that have been induced to assemble in a test tube by phosphorylation of their light chains. These myosin II filaments are much 
smaller than those found in skeletal muscle cells (see Figure 16-27). (B, courtesy of John Kendrick-Jones.) 

Non-muscle cells also express a large family of other myosin proteins, which 
have diverse structures and functions in the cell. Following the discovery of con- 
ventional muscle myosin, a second member of the family was found in the fresh- 
water amoeba Acanthamoeba castellanii. This protein had a different tail structure 
and seemed to function as a monomer, and so it was named myosin I (for one- 
headed). Conventional muscle myosin was renamed myosin II (for two-headed). 
Subsequently, many other myosin types were discovered. The heavy chains gen- 
erally start with a recognizable myosin motor domain at the N-terminus and 
then diverge widely with a variety of C-terminal tail domains (Figure 16-40). The 
myosin family includes a number of one-headed and two-headed varieties that 
are about equally related to myosin I and myosin II, and the nomenclature now 
reflects their approximate order of discovery (myosin III through at least myosin 
XVIII). Sequence comparisons among diverse eukaryotes indicate that there are 
at least 37 distinct myosin families in the superfamily. All of the myosins except 
one move toward the plus end of an actin filament, although they do so at differ- 
ent speeds. The exception is myosin VI, which moves toward the minus end. The 
myosin tails (and the tails of motor proteins generally) have apparently diversified 
during evolution to permit the proteins to bind other subunits and to interact with 
different cargoes. 

Some myosins are found only in plants, and some are found only in vertebrates. 
Most, however, are found in all eukaryotes, suggesting that myosins arose early in 
eukaryotic evolution. The human genome includes about 40 myosin genes. Nine 
of the human myosins are expressed primarily or exclusively in the hair cells of 
the inner ear, and mutations in five of them are known to cause hereditary deaf- 
ness. These extremely specialized myosins are important for the construction and 
function of the complex and beautiful bundles of actin found in stereocilia that 
project from the apical surface of these cells (see Figure 9-51); these cellular pro- 
trusions tilt in response to sound and convert sound waves into electrical signals. 

The functions of most of the myosins remain to be determined, but several are 
well characterized. The myosin I proteins often contain either a second 
actin-binding site or a membrane-binding site in their tails, and they are generally involved 
in intracellular organization, including the protrusion of actin-rich structures at 
the cell surface, such as microvilli (see Panel 16-1 and Figure 16-4), and 
endocytosis. Myosin V is a two-headed myosin with a large step size (Figure 16-41A) 
and is involved in organelle transport along actin filaments. In contrast to 
myosin II motors, which work in ensembles and are attached only transiently to actin 
filaments so as not to interfere with one another, myosin V moves continuously, 
or processively, along actin filaments without letting go. Myosin V functions are 
well studied in the yeast Saccharomyces cerevisiae, which undergoes a stereotypical 
pattern of growth and division called budding. Actin cables in the mother 
cell point toward the bud, where actin is found in patches that concentrate where 
cell wall growth is taking place. Myosin V motors carry a wide range of cargoes 
(including mRNA, endoplasmic reticulum, and secretory vesicles) along the actin 
cables and into the bud. In addition, myosin V mediates the correct partitioning of 
organelles such as peroxisomes and mitochondria between mother and daughter 
cells (see Figure 16-41B). 

Figure 16-40 Myosin superfamily members. Comparison of the domain 
structure of the heavy chains of some myosin types. All myosins share similar 
motor domains (shown in dark green), but their C-terminal tails (light green) and 
N-terminal extensions (light blue) are very diverse. On the right are depictions of 
the molecular structure for these family members. Many myosins form dimers, with 
two motor domains per molecule, but a few (such as I, III, and XIV) seem to function 
as monomers, with just one motor domain. Myosin VI, despite its overall structural 
similarity to other family members, is unique in moving toward the minus end (instead of 
the plus end) of an actin filament. The small insertion within its motor head domain, 
not found in other myosins, is probably responsible for this change in direction. 


Summary 


Using their neck domain as a lever arm, myosins convert ATP hydrolysis into 
mechanical work to move along actin filaments in a stepwise fashion. Skeletal 
muscle is made up of myofibrils containing thousands of sarcomeres assembled 
from highly ordered arrays of actin and myosin II filaments, together with many 
accessory proteins. Muscle contraction is stimulated by calcium, which causes the 
actin-filament-associated protein tropomyosin to move, uncovering myosin bind- 
ing sites and allowing the filaments to slide past one another. Smooth muscle and 
non-muscle cells have less well-ordered contractile bundles of actin and myosin, 
which are regulated by myosin light-chain phosphorylation. Myosin V transports 
cargo by walking along actin filaments. 


MICROTUBULES 


Microtubules are structurally more complex than actin filaments, but they are 
also highly dynamic and play comparably diverse and important roles in the cell. 
Microtubules are polymers of the protein tubulin. The tubulin subunit is itself a 
heterodimer formed from two closely related globular proteins called α-tubulin 
and β-tubulin, each comprising 445-450 amino acids, which are tightly bound 
together by noncovalent bonds (Figure 16-42A). These two tubulin proteins are 
found only in this heterodimer, and each α or β monomer has a binding site for 
one molecule of GTP. The GTP that is bound to α-tubulin is physically trapped at 
the dimer interface and is never hydrolyzed or exchanged; it can therefore be 
considered to be an integral part of the tubulin heterodimer structure. The nucleotide 
on the β-tubulin, in contrast, may be in either the GTP or the GDP form and is 
exchangeable within the soluble (unpolymerized) tubulin dimer. 

Tubulin is found in all eukaryotic cells, and it exists in multiple isoforms. Yeast 
and human tubulins are 75% identical in amino acid sequence. In mammals, 
there are at least six forms of α-tubulin and a similar number of β-tubulins, each 
encoded by a different gene. The different forms of tubulin are very similar, and 
they generally copolymerize into mixed microtubules in the test tube. However, 
they can have distinct locations in cells and tissues and perform subtly different 
functions. As a striking example, mutations in a particular human β-tubulin gene 
give rise to a paralytic eye-movement disorder due to loss of ocular nerve 
function. Numerous human neurological diseases have been linked to specific 
mutations in different tubulin genes. 

Figure 16-41 Myosin V carries cargo along actin filaments. (A) The lever arm of 
myosin V is long, allowing it to take a bigger step along an actin filament than myosin II 
(see Figure 16-29). (B) Myosin V transports cargo and organelles along actin cables, in 
this example moving a mitochondrion into the growing bud of a yeast cell. 


Microtubules Are Hollow Tubes Made of Protofilaments 


A microtubule is a hollow cylindrical structure built from 13 parallel protofilaments, 
each composed of αβ-tubulin heterodimers stacked head to tail and then 
folded into a tube (Figure 16-42B-D). Microtubule assembly generates two new 
types of protein-protein contacts. Along the longitudinal axis of the microtubule, 
the “top” of one β-tubulin molecule forms an interface with the “bottom” of the 
α-tubulin molecule in the adjacent heterodimer. This interface is very similar to 
the interface holding the α and β monomers together in the dimer subunit, and 
the binding energy is high. Perpendicular to these interactions, neighboring 
protofilaments form lateral contacts. In this dimension, the main lateral contacts are 
between monomers of the same type (α-α and β-β). As longitudinal and lateral 
contacts are repeated during assembly, a slight stagger in lateral contacts gives 
rise to the helical microtubule lattice. Because multiple contacts within the lattice 
hold most of the subunits in a microtubule in place, the addition and loss of 
subunits occurs almost exclusively at the microtubule ends (see Figure 16-5). These 
multiple contacts among subunits make microtubules stiff and difficult to bend. 
The persistence length of a microtubule is several millimeters, making 
microtubules the stiffest and straightest structural elements found in most animal cells. 
The subunits in each protofilament in a microtubule all point in the same 
direction, and the protofilaments themselves are aligned in parallel (see Figure 
16-42). Therefore, the microtubule lattice itself has a distinct structural polarity, 
with α-tubulins exposed at the minus end and β-tubulins exposed at the plus 
end. As for actin filaments, the regular, parallel orientation of their subunits gives 
microtubules structural and dynamic polarity (Figure 16-43), with plus ends 
growing and shrinking more rapidly. 

Figure 16-42 The structure of a microtubule and its subunit. (A) The subunit of each protofilament is a tubulin heterodimer, formed from a tightly 
linked pair of α- and β-tubulin monomers. The GTP molecule in the α-tubulin monomer is so tightly bound that it can be considered an integral part 
of the protein. The GTP molecule in the β-tubulin monomer, however, is less tightly bound and has an important role in filament dynamics. Both 
nucleotides are shown in red. (B) One tubulin subunit (αβ-heterodimer) and one protofilament are shown schematically. Each protofilament consists 
of many adjacent subunits with the same orientation. (C) The microtubule is a stiff hollow tube formed from 13 protofilaments aligned in parallel. 
(D) A short segment of a microtubule viewed in an electron microscope. (E) Electron micrograph of a cross section of a microtubule showing a ring 
of 13 distinct protofilaments. (D, courtesy of Richard Wade; E, courtesy of Richard Linck.) 

Figure 16-43 The preferential growth of microtubules at the plus end. 
Microtubules grow faster at one end than at the other. A stable bundle 
of microtubules obtained from the core of a cilium (called an axoneme) 
was incubated for a short time with tubulin subunits under polymerizing 
conditions. Microtubules grew fastest from the plus end of the microtubule 
bundle, the end at the top in this micrograph. (Courtesy of Gary Borisy.) 


Microtubules Undergo Dynamic Instability 


Microtubule dynamics, like those of actin filaments, are profoundly influenced 
by the binding and hydrolysis of nucleotide—GTP in this case. GTP hydrolysis 
occurs only within the β-tubulin subunit of the tubulin dimer. It proceeds very 
slowly in free tubulin subunits but is accelerated when they are incorporated into 
microtubules. Following GTP hydrolysis, the free phosphate group is released and 
the GDP remains bound to β-tubulin within the microtubule lattice. Thus, as in 
the case of actin filaments, two different types of microtubule structures can exist, 
one with the “T form” of the nucleotide bound (GTP) and one with the “D form” 
bound (GDP). The energy of nucleotide hydrolysis is stored as elastic strain in 
the polymer lattice, making the free-energy change for dissociation of a subunit 
from the D-form polymer more negative than the free-energy change for dissociation 
of a subunit from the T-form polymer. In consequence, the ratio of k_off/k_on 
for GDP-tubulin (its critical concentration [C_c(D)]) is much higher than that of 
GTP-tubulin. Thus, under physiological conditions, GTP-tubulin tends to polym- 
erize and GDP-tubulin to depolymerize. 
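
The relationship between the rate constants and the critical concentration can be sketched numerically. The short Python example below is only an illustration: the rate constants are hypothetical placeholders, not measured values, and the on-rate is assumed to be the same for both nucleotide forms. Only the qualitative ordering (a much larger off-rate for the D form) is taken from the text.

```python
def critical_concentration(k_off, k_on):
    """Return the critical concentration Cc = k_off / k_on.

    k_off: subunit dissociation rate constant (per second)
    k_on:  subunit association rate constant (per micromolar per second)
    The result is in micromolar.
    """
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    return k_off / k_on


def net_growth_rate(conc, k_on, k_off):
    """Net subunits gained per second at one end: k_on * C - k_off."""
    return k_on * conc - k_off


# Hypothetical rate constants, chosen only for illustration.
K_ON = 5.0        # per uM per s, assumed equal for T and D forms
K_OFF_T = 0.5     # per s, T form (GTP-tubulin)
K_OFF_D = 500.0   # per s, D form (GDP-tubulin)

cc_t = critical_concentration(K_OFF_T, K_ON)
cc_d = critical_concentration(K_OFF_D, K_ON)

# A free tubulin concentration between the two critical concentrations.
free_tubulin = 10.0  # uM

print(f"Cc(T) = {cc_t} uM, Cc(D) = {cc_d} uM")
print("T-form end:", net_growth_rate(free_tubulin, K_ON, K_OFF_T), "subunits/s")
print("D-form end:", net_growth_rate(free_tubulin, K_ON, K_OFF_D), "subunits/s")
```

With these assumed numbers the T-form end has a positive net rate and the D-form end a negative one, which is the situation described above: GTP-tubulin tends to polymerize and GDP-tubulin to depolymerize.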

Whether the tubulin subunits at the very end of a microtubule are in the T or 
the D form depends on the relative rates of GTP hydrolysis and tubulin addition. 
If the rate of subunit addition is high—and thus the filament is growing rapidly— 
then it is likely that a new subunit will be added to the polymer before the nucle- 
otide in the previously added subunit has been hydrolyzed. In this case, the tip 
of the polymer remains in the T form, forming a GTP cap. However, if the rate of 
subunit addition is low, hydrolysis may occur before the next subunit is added, 
and the tip of the filament will then be in the D form. If GTP-tubulin subunits 
assemble at the end of the microtubule at a rate similar to the rate of GTP hydro- 
lysis, then hydrolysis will sometimes “catch up” with the rate of subunit addition 
and transform the end to a D form. This transformation is sudden and random, 
with a certain probability per unit time that depends on the concentration of free 
GTP-tubulin subunits. 
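
The competition between subunit addition and hydrolysis can be illustrated with a deliberately simple model. The Python sketch below assumes that addition and hydrolysis of the terminal subunit are independent, memoryless events with constant rates, so that the chance that addition happens first is r_add / (r_add + r_hyd). This is a simplification introduced here for illustration, and the rate values are hypothetical.

```python
def prob_addition_before_hydrolysis(k_on, conc, k_hyd):
    """Probability that a new subunit adds before the terminal subunit
    hydrolyzes its GTP.

    Addition occurs at rate k_on * conc and hydrolysis at rate k_hyd.
    For two independent memoryless processes, the probability that the
    first one fires before the second is r_add / (r_add + r_hyd).
    """
    r_add = k_on * conc
    if r_add < 0 or k_hyd < 0 or r_add + k_hyd == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return r_add / (r_add + k_hyd)


# Hypothetical values, chosen only for illustration.
K_ON = 5.0    # per uM per s
K_HYD = 10.0  # per s

for conc in (0.5, 2.0, 10.0, 50.0):  # free GTP-tubulin, uM
    p = prob_addition_before_hydrolysis(K_ON, conc, K_HYD)
    print(f"free GTP-tubulin = {conc:5.1f} uM -> P(tip stays in T form) = {p:.2f}")
```

In this toy model the probability of keeping a T-form tip rises with the free GTP-tubulin concentration, consistent with the statement that the likelihood of losing the cap depends on that concentration.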

Suppose that the concentration of free tubulin is intermediate between the 
critical concentration for a T-form end and the critical concentration for a D-form 
end (that is, above the concentration necessary for T-form assembly, but below 
that for the D form). Now, any end that happens to be in the T form will grow, 
whereas any end that happens to be in the D form will shrink. On a single microtu- 
bule, an end might grow for a certain length of time in a T form, but then suddenly 
change to the D form and begin to shrink rapidly, even while the free subunit 
concentration is held constant. At some later time, it might then regain a T-form 
end and begin to grow again. This rapid interconversion between a growing and 
shrinking state, at a uniform free subunit concentration, is called dynamic insta- 
bility (Figure 16-44A and Figure 16-45; see Panel 16-2). The change from growth 
to shrinkage is called a catastrophe, while the change to growth is called a rescue. 
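
The alternation between growth and shrinkage can be mimicked with a minimal two-state simulation. The Python sketch below is not a model taken from the text: it assumes constant growth and shrinkage speeds and constant catastrophe and rescue frequencies, all of which are hypothetical values, and it assumes that a microtubule that shrinks to zero length regrows from a stable nucleation site.

```python
import random


def simulate_dynamic_instability(t_total=600.0, dt=0.1,
                                 v_grow=0.03, v_shrink=0.3,
                                 f_cat=0.02, f_res=0.05, seed=1):
    """Simulate the length of one microtubule end over time.

    Units: micrometers and seconds. All default parameters are
    illustrative assumptions, not measured values.
    Returns (lengths, catastrophes, rescues).
    """
    rng = random.Random(seed)
    length = 0.0
    growing = True
    lengths = []
    catastrophes = 0
    rescues = 0
    steps = int(round(t_total / dt))
    for _ in range(steps):
        if growing:
            length += v_grow * dt
            # Switch to shrinkage with probability f_cat * dt per step.
            if rng.random() < f_cat * dt:
                growing = False
                catastrophes += 1
        else:
            length -= v_shrink * dt
            if length <= 0.0:
                # Fully depolymerized: regrow from the nucleation site.
                # This is not counted as a rescue.
                length = 0.0
                growing = True
            elif rng.random() < f_res * dt:
                growing = True
                rescues += 1
        lengths.append(length)
    return lengths, catastrophes, rescues


lengths, n_cat, n_res = simulate_dynamic_instability()
print(f"max length: {max(lengths):.2f} um")
print(f"catastrophes: {n_cat}, rescues: {n_res}")
```

Plotting `lengths` against time gives the sawtooth trace typical of dynamic instability, even though every parameter is held constant throughout the run.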

Figure 16-44 Dynamic instability due to the structural differences between a growing and a shrinking microtubule 
end. (A) If the free tubulin concentration in solution is between the critical concentrations of the GTP- and GDP-bound forms, 
a single microtubule end may undergo transitions between a growing state and a shrinking state. A growing microtubule has 
GTP-containing subunits at its end, forming a GTP cap. If nucleotide hydrolysis proceeds more rapidly than subunit addition, the 
cap is lost and the microtubule begins to shrink, an event called a “catastrophe.” But GTP-containing subunits may still add to 
the shrinking end, and if enough add to form a new cap, then microtubule growth resumes, an event called “rescue.” (B) Model 
for the structural consequences of GTP hydrolysis in the microtubule lattice. The addition of GTP-containing tubulin subunits to 
the end of a protofilament causes the end to grow in a linear conformation that can readily pack into the cylindrical wall of the 
microtubule. Hydrolysis of GTP after assembly changes the conformation of the subunits and tends to force the protofilament 
into a curved shape that is less able to pack into the microtubule wall. (C) In an intact microtubule, protofilaments made from 
GDP-containing subunits are forced into a linear conformation by the many lateral bonds within the microtubule wall, given a 
stable cap of GTP-containing subunits. Loss of the GTP cap, however, allows the GDP-containing protofilaments to relax into 
their more curved conformation. This leads to a progressive disruption of the microtubule. Above the drawings of a growing 
and a shrinking microtubule, electron micrographs show actual microtubules in each of these two states. Note particularly the 
curling, disintegrating GDP-containing protofilaments at the end of the shrinking microtubule. (C, from E.M. Mandelkow, 
E. Mandelkow and R.A. Milligan, J. Cell Biol. 114:977-991, 1991. With permission from The Rockefeller University Press.) 

In a population of microtubules, at any instant some of the ends are in the T 
form and some are in the D form, with the ratio depending on the hydrolysis rate 
and the free subunit concentration. In vitro, the structural difference between a 
T-form end and a D-form end is dramatic. Tubulin subunits with GTP bound to 
the β-monomer produce straight protofilaments that make strong and regular lateral 
contacts with one another. But the hydrolysis of GTP to GDP is associated 
with a subtle conformational change in the protein, which makes the protofila- 
ments curved (Figure 16-44B). On a rapidly growing microtubule, the GTP cap 
is thought to constrain the curvature of the protofilaments, and the ends appear 
straight. But when the terminal subunits have hydrolyzed their nucleotides, this 
constraint is removed, and the curved protofilaments spring apart. This cooper- 
ative release of the energy of hydrolysis stored in the microtubule lattice causes 
the curled protofilaments to peel off rapidly, and curved oligomers of GDP-con- 
taining tubulin are seen near the ends of depolymerizing microtubules (Figure 
16-44C). 


Microtubule Functions Are Inhibited by Both Polymer-stabilizing 
and Polymer-destabilizing Drugs 


Chemical compounds that impair polymerization or depolymerization of micro- 
tubules are powerful tools for investigating the roles of these polymers in cells. 
Whereas colchicine and nocodazole interact with tubulin subunits and lead to 
microtubule depolymerization, Taxol binds to and stabilizes microtubules, caus- 
ing a net increase in tubulin polymerization (see Table 16-1). Drugs like these 
have a rapid and profound effect on the organization of the microtubules in living 
cells. Both microtubule-depolymerizing drugs (such as nocodazole) and micro- 
tubule-polymerizing drugs (such as Taxol) preferentially kill dividing cells, since 
microtubule dynamics are crucial for correct function of the mitotic spindle (dis- 
cussed in Chapter 17). Some of these drugs efficiently kill certain types of tumor 
cells in a human patient, although not without toxicity to rapidly dividing normal 
cells, including those in the bone marrow, intestine, and hair follicles. Taxol in 
particular has been widely used to treat cancers of the breast and lung, and it is 
frequently successful in treatment of tumors that are resistant to other chemo- 
therapeutic agents. 


A Protein Complex Containing γ-Tubulin Nucleates Microtubules 


Because formation of a microtubule requires the interaction of many tubulin het- 
erodimers, the concentration of tubulin subunits required for spontaneous nucle- 
ation of microtubules is very high. Microtubule nucleation therefore requires 
help from other factors. While α- and β-tubulins are the regular building blocks of 
microtubules, another type of tubulin, called γ-tubulin, is present in much smaller 
amounts than α- and β-tubulin and is involved in the nucleation of microtubule 
growth in organisms ranging from yeasts to humans. Microtubules are generally 
nucleated from a specific intracellular location known as a microtubule-organizing 
center (MTOC) where γ-tubulin is most enriched. Nucleation in many cases 
depends on the γ-tubulin ring complex (γ-TuRC). Within this complex, two 
accessory proteins bind directly to the γ-tubulin, along with several other proteins 
that help create a spiral ring of γ-tubulin molecules, which serves as a template 
that creates a microtubule with 13 protofilaments (Figure 16-46). 

Figure 16-45 Direct observation of the dynamic instability of microtubules 
in a living cell. Microtubules in a newt lung epithelial cell were observed after 
the cell was injected with a small amount of rhodamine-labeled tubulin. Notice 
the dynamic instability of microtubules at the edge of the cell. Four individual 
microtubules are highlighted for clarity; each of these shows alternating shrinkage 
and growth (Movie 16.7). (Courtesy of Wendy C. Salmon and Clare Waterman-Storer.) 

Figure 16-46 Microtubule nucleation by the γ-tubulin ring complex. (A) Two copies of γ-tubulin associate with a pair 
of accessory proteins to form the γ-tubulin small complex (γ-TuSC). This image was generated by high-resolution electron 
microscopy of individual purified complexes. (B) Seven copies of the γ-TuSC associate to form a spiral structure in which the last 
γ-tubulin lies beneath the first, resulting in 13 exposed γ-tubulin subunits in a circular orientation that matches the orientation of 
the 13 protofilaments in a microtubule. (C) In many cell types, the γ-TuSC spiral associates with additional accessory proteins 
to form the γ-tubulin ring complex (γ-TuRC), which is likely to nucleate the minus end of a microtubule as shown here. Note 
the longitudinal discontinuity between two protofilaments, which results from the spiral orientation of the γ-tubulin subunits. 
Microtubules often have one such “seam” breaking the otherwise uniform helical packing of the protofilaments. (A and B, from 
J.M. Kollman et al., Nature 466:879-883, 2010. With permission from Macmillan Publishers Ltd.) 


Microtubules Emanate from the Centrosome in Animal Cells 


Many animal cells have a single, well-defined MTOC called the centrosome, 
which is located near the nucleus and from which microtubules are nucleated 
at their minus ends, so the plus ends point outward and continuously grow and 
shrink, probing the entire three-dimensional volume of the cell. A centrosome 
typically recruits more than fifty copies of γ-TuRC. In addition, γ-TuRC molecules 
are found in the cytoplasm, and centrosomes are not absolutely required for 
microtubule nucleation, since destroying them with a laser pulse does not prevent 
microtubule nucleation elsewhere in the cell. A variety of proteins have been 
identified that anchor γ-TuRC to the centrosome, but mechanisms that activate 
microtubule nucleation at MTOCs and at other sites in the cell are poorly under- 
stood. 

Embedded in the centrosome are the centrioles, a pair of cylindrical struc- 
tures arranged at right angles to each other in an L-shaped configuration (Figure 
16-47). A centriole consists of a cylindrical array of short, modified microtubules 
arranged into a barrel shape with striking ninefold symmetry (Figure 16-48). 
Together with a large number of accessory proteins, the centrioles organize the 
pericentriolar material, where microtubule nucleation takes place. As described 
in Chapter 17, the centrosome duplicates and splits into two parts before mitosis, 
each containing a duplicated centriole pair. The two centrosomes move to oppo- 
site sides of the nucleus when mitosis begins, and they form the two poles of the 
mitotic spindle (see Panel 17-1). 

Microtubule organization varies widely among different species and cell types. 
In budding yeast, microtubules are nucleated at an MTOC that is embedded in 
the nuclear envelope as a small, multilayered structure called the spindle pole 
body, also found in other fungi and diatoms. Higher-plant cells appear to nucleate 
microtubules at sites distributed all around the nuclear envelope and at the cell 
cortex. Neither fungi nor most plant cells contain centrioles. Despite these differences, 
all these cells seem to use γ-tubulin to nucleate their microtubules. 

Figure 16-47 The centrosome. (A) The centrosome is the major MTOC of animal cells. Located in the cytoplasm next to the 
nucleus, it consists of an amorphous matrix of fibrous proteins to which the γ-tubulin ring complexes that nucleate microtubule 
growth are attached. This matrix is organized by a pair of centrioles, as described in the text. (B) A centrosome with attached 
microtubules. The minus end of each microtubule is embedded in the centrosome, having grown from a γ-tubulin ring complex, 
whereas the plus end of each microtubule is free in the cytoplasm. (C) In a reconstructed image of the MTOC from a C. elegans 
cell, a dense thicket of microtubules can be seen emanating from the centrosome. (C, from E.T. O’Toole et al., J. Cell Biol. 
163:451-456, 2003. With permission from the authors.) 

In cultured animal cells, the aster-like configuration of microtubules is robust, 
with dynamic plus ends pointing outward toward the cell periphery and stable 
minus ends collected near the nucleus. The system of microtubules radiating from 
the centrosome acts as a device to survey the outlying regions of the cell and to 
position the centrosome at its center. Even in an isolated cell fragment lacking the 
centrosome, dynamic microtubules arrange themselves into a star-shaped array 
with the microtubule minus ends clustered at the center by minus-end-binding 
proteins (Figure 16-49). This ability of the microtubule cytoskeleton to find the 
center of the cell establishes a general coordinate system, which is then used to 
position many organelles within the cell. Highly differentiated cells with complex 
morphologies such as neurons, muscles, and epithelial cells must use additional 
measuring mechanisms to establish their more elaborate internal coordinate 
systems. Thus, for example, when an epithelial cell forms cell-cell junctions and 
becomes highly polarized, the microtubule minus ends move to a region near the 
apical plasma membrane. From this asymmetrical location, a microtubule array 
extends along the long axis of the cell, with plus ends directed toward the basal 
surface (see Figure 16-4). 

Figure 16-48 A pair of centrioles in the centrosome. (A) An electron micrograph 
of a thin section of an isolated centrosome showing the mother centriole with its 
distal appendages and the adjacent daughter centriole, which formed through 
a duplication event during S phase (see Figure 17-26). In the centrosome, the 
centriole pair is surrounded by a dense matrix of pericentriolar material from which 
microtubules nucleate. Centrioles also function as basal bodies to nucleate the 
formation of ciliary axonemes (see Figure 16-68). (B) Electron micrograph of a cross 
section through a centriole in the cortex of a protozoan. Each centriole is composed 
of nine sets of triplet microtubules arranged to form a cylinder. (C) Each triplet 
contains one complete microtubule (the A microtubule) fused to two incomplete 
microtubules (the B and C microtubules). (D) The centriolar protein SAS-6 forms a 
coiled-coil dimer. Nine SAS-6 dimers can self-associate to form a ring. Located at 
the hub of the centriole cartwheel structure, the SAS-6 ring is thought to generate the 
ninefold symmetry of the centriole. (A, from M. Paintrand et al., J. Struct. Biol. 
108:107, 1992. With permission from Elsevier; B, courtesy of Richard Linck; 
D, courtesy of Michel Steinmetz.) 

Figure 16-49 A microtubule array can find the center of a cell. After the arm of a 
fish pigment cell is cut off with a needle, the microtubules in the detached cell fragment 
reorganize so that their minus ends end up near the center of the fragment, buried in a 
new microtubule-organizing center. (Diagram labels: melanophore cell with centrosome 
containing centriole pair; severed cell fragment; after 4 hours, cell fragment with 
reorganized microtubules and a new microtubule-organizing center lacking centrioles.) 


Microtubule-Binding Proteins Modulate Filament Dynamics and 
Organization 


Microtubule polymerization dynamics are very different in cells than in solutions 
of pure tubulin. Microtubules in cells exhibit a much higher polymerization rate 
(typically 10-15 µm/min, relative to about 1.5 µm/min with purified tubulin at 
similar concentrations), a greater catastrophe frequency, and extended pauses in 
microtubule growth, a dynamic behavior rarely observed in pure tubulin solu- 
tions. These and other differences arise because microtubule dynamics inside the 
cell are governed by a variety of proteins that bind tubulin dimers or microtu- 
bules, as summarized in Panel 16-4. 
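
To put the two polymerization rates in perspective, the short Python sketch below works through the arithmetic. The rates are the ones quoted above; the 10 µm distance is a hypothetical value chosen only for illustration, and uninterrupted growth at a constant rate is assumed.

```python
# Growth rates quoted in the text, in micrometers per minute.
IN_CELL_RATE_RANGE = (10.0, 15.0)
PURE_TUBULIN_RATE = 1.5


def fold_increase(rate, reference):
    """How many times faster `rate` is than `reference`."""
    if reference <= 0:
        raise ValueError("reference rate must be positive")
    return rate / reference


def minutes_to_grow(distance_um, rate_um_per_min):
    """Time needed to elongate by distance_um at a constant rate."""
    if rate_um_per_min <= 0:
        raise ValueError("rate must be positive")
    return distance_um / rate_um_per_min


DISTANCE_UM = 10.0  # hypothetical distance, for illustration only

for rate in IN_CELL_RATE_RANGE:
    fold = fold_increase(rate, PURE_TUBULIN_RATE)
    minutes = minutes_to_grow(DISTANCE_UM, rate)
    print(f"{rate} um/min: {fold:.1f}x the pure-tubulin rate, "
          f"{minutes:.2f} min to grow {DISTANCE_UM} um")

slow = minutes_to_grow(DISTANCE_UM, PURE_TUBULIN_RATE)
print(f"pure tubulin: {slow:.2f} min to grow {DISTANCE_UM} um")
```

The quoted rates correspond to roughly a 7- to 10-fold faster polymerization in cells than in solutions of pure tubulin.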

Proteins that bind to microtubules are collectively called microtubule-associ- 
ated proteins, or MAPs. Some MAPs can stabilize microtubules against disassem- 
bly. A subset of MAPs can also mediate the interaction of microtubules with other 
cell components. This subset is prominent in neurons, where stabilized micro- 
tubule bundles form the core of the axons and dendrites that extend from the 
cell body (Figure 16-50). These MAPs have at least one domain that binds to the 
microtubule surface and another that projects outward. The length of the project- 
ing domain can determine how closely MAP-coated microtubules pack together, 
as demonstrated in cells engineered to overproduce different MAPs. Cells over- 
expressing MAP2, which has a long projecting domain, form bundles of stable 
microtubules that are kept widely spaced, while cells overexpressing tau, a MAP 
with a much shorter projecting domain, form bundles of more closely packed 
microtubules (Figure 16-51). MAPs are the targets of several protein kinases, and 
phosphorylation of a MAP can control both its activity and localization inside 
cells. 


Microtubule Plus-End-Binding Proteins Modulate Microtubule 
Dynamics and Attachments 


Cells contain numerous proteins that bind the ends of microtubules and thereby 
influence microtubule stability and dynamics. These proteins can influence the 
rate at which a microtubule switches from a growing to a shrinking state (the frequency 
of catastrophes) or from a shrinking to a growing state (the frequency of 
rescues). For example, members of a family of kinesin-related proteins known as 
catastrophe factors (or kinesin-13) bind to microtubule ends and appear to pry 
protofilaments apart, lowering the normal activation-energy barrier that prevents 
a microtubule from springing apart into the curved protofilaments that are characteristic 
of the shrinking state (Figure 16-52). Another protein, called Nezha or 
Patronin, protects microtubule minus ends from the effects of catastrophe factors. 

Figure 16-50 Localization of MAPs in the axon and dendrites of a 
neuron. This immunofluorescence micrograph shows the distribution of 
the proteins tau (green) and MAP2 (orange) in a hippocampal neuron in 
culture. Whereas tau staining is confined to the axon (long and branched in 
this neuron), MAP2 staining is confined to the cell body and its dendrites. 
The antibody used here to detect tau binds only to unphosphorylated tau; 
phosphorylated tau is also present in dendrites. (Courtesy of James 
W. Mandell and Gary A. Banker.) 

PANEL 16-4: Microtubules 
Some of the major accessory proteins of the microtubule cytoskeleton. Except for two classes of motor proteins, 
an example of each major type is shown. Each of these is discussed in the text. However, most cells contain more 
than a hundred different microtubule-binding proteins, and, as for the actin-associated proteins, it is likely 
that there are important types of microtubule-associated proteins that are not yet recognized. 

While very few microtubule minus-end-binding proteins have been charac- 
terized, a large subset of MAPs has been identified that are enriched at microtu- 
bule plus ends. A particularly ubiquitous example is XMAP215, which has close 
homologs in organisms that range from yeast to humans. XMAP215 binds free 
tubulin subunits and delivers them to the plus end, thereby promoting microtubule 
polymerization and simultaneously counteracting catastrophe factor activity 
(see Figure 16-52). The phosphorylation of XMAP215 during mitosis inhibits 
its activity and shifts the balance of its competition with catastrophe factors. This 
shift results in a tenfold increase in the dynamic instability of microtubules during 
mitosis, a transition that is critical for the efficient construction of the mitotic 
spindle (discussed in Chapter 17). 

Figure 16-51 Organization of microtubule bundles by MAPs. (A) MAP2 
binds along the microtubule lattice at one of its ends and extends a long projecting 
arm with a second microtubule-binding domain at the other end. (B) Tau possesses 
a shorter microtubule cross-linking domain. (C) Electron micrograph showing a cross 
section through a microtubule bundle in a cell overexpressing MAP2. The regular 
spacing of the microtubules (MTs) in this bundle results from the constant length of 
the projecting arms of the MAP2. (D) Similar cross section through a 
microtubule bundle in a cell overexpressing tau. Here the microtubules are spaced 
more closely together than they are in (C) because of tau’s relatively short projecting 
arm. (C and D, courtesy of J. Chen et al., Nature 360:674-677, 1992. With 
permission from Macmillan Publishers Ltd.) 

Figure 16-52 The effects of proteins that bind to microtubule ends. The 
transition between microtubule growth and shrinkage is controlled in cells by a 
variety of proteins. Catastrophe factors such as kinesin-13, a member of the 
kinesin motor protein superfamily, bind to microtubule ends and pry them apart, 
thereby promoting depolymerization. On the other hand, a MAP such as XMAP215 
stabilizes the end of a growing microtubule (XMAP stands for Xenopus 
microtubule-associated protein, and the number refers to its molecular mass in kilodaltons). 
XMAP215 binds tubulin dimers and delivers them to the microtubule plus end, thereby 
increasing the microtubule growth rate and suppressing catastrophes. 

In many cells, the minus ends of microtubules are stabilized by association 
with a capping protein or the centrosome, or else they serve as microtubule depo- 
lymerization sites. The plus ends, in contrast, efficiently explore and probe the 
entire cell space. Microtubule-associated proteins called plus-end tracking pro- 
teins (+TIPs) accumulate at these active ends and appear to rocket around the cell 
as passengers at the ends of rapidly growing microtubules, dissociating from the 
ends when the microtubules begin to shrink (Figure 16-53). 

The kinesin-related catastrophe factors and XMAP215 mentioned above 
behave as +TIPs and act to modulate the growth and shrinkage of the microtubule 
end to which they are attached. Other +TIPs control microtubule positioning by 
helping to capture and stabilize the growing microtubule end at specific cellu- 
lar targets, such as the cell cortex or the kinetochore of a mitotic chromosome. 
EB1 and its relatives, small dimeric proteins that are highly conserved in animals, 
plants, and fungi, are key players in this process. EB1 proteins do not actively 
move toward plus ends, but rather recognize a structural feature of the growing 
plus end (see Figure 16-53). Several of the +TIPs depend on EB1 proteins for their 
plus-end accumulation and also interact with each other and with the microtu- 
bule lattice. By attaching to the plus end, these factors allow the cell to harness the 
energy of microtubule polymerization to generate pushing forces that can be used 
for positioning the spindle, chromosomes, or organelles. 


Tubulin-Sequestering and Microtubule-Severing Proteins 
Destabilize Microtubules 


As it does with actin monomers, the cell sequesters unpolymerized tubulin sub- 
units to maintain a pool of active subunits at a level near the critical concentra- 
tion. One molecule of the small protein stathmin (also called Op18) binds to two 
tubulin heterodimers and prevents their addition to the ends of microtubules 
(Figure 16-54). Stathmin thus decreases the effective concentration of tubulin 
subunits that are available for polymerization (an action analogous to that of the 
drug colchicine), and enhances the likelihood that a growing microtubule will 
switch to the shrinking state. Phosphorylation of stathmin inhibits its binding to 
tubulin, and signals that cause stathmin phosphorylation can increase the rate 
of microtubule elongation and suppress dynamic instability. Stathmin has been 
implicated in the regulation of both cell proliferation and cell death. Interestingly, 
mice lacking stathmin develop normally but are less fearful than wild-type mice, 
reflecting a role for stathmin in neurons of the amygdala, where it is normally 
expressed at high levels. 
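
The effect of sequestration on the pool of polymerizable tubulin can be sketched with simple stoichiometry. The Python example below is a rough illustration only: it assumes that every unphosphorylated stathmin molecule is fully occupied by two tubulin heterodimers (ignoring the binding equilibrium) and that phosphorylated stathmin binds none. The concentrations are hypothetical.

```python
def free_tubulin(total_tubulin, total_stathmin, phospho_fraction=0.0):
    """Tubulin heterodimers left available for polymerization.

    total_tubulin:    total unpolymerized tubulin heterodimers (uM)
    total_stathmin:   total stathmin (uM)
    phospho_fraction: fraction of stathmin that is phosphorylated
                      and therefore assumed unable to bind tubulin

    Assumes saturating 1:2 binding (one stathmin, two heterodimers).
    """
    if not 0.0 <= phospho_fraction <= 1.0:
        raise ValueError("phospho_fraction must be between 0 and 1")
    active_stathmin = total_stathmin * (1.0 - phospho_fraction)
    sequestered = min(total_tubulin, 2.0 * active_stathmin)
    return total_tubulin - sequestered


# Hypothetical concentrations, chosen only for illustration.
TUBULIN_UM = 20.0
STATHMIN_UM = 5.0

for fraction in (0.0, 0.5, 1.0):
    available = free_tubulin(TUBULIN_UM, STATHMIN_UM, fraction)
    print(f"phosphorylated fraction {fraction:.1f}: "
          f"{available:.1f} uM tubulin available")
```

In this sketch, increasing stathmin phosphorylation releases tubulin back into the polymerizable pool, in line with the observation that such signals increase the rate of microtubule elongation.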

Severing is another mechanism employed by the cell to destabilize microtu- 
bules. To sever a microtubule, thirteen longitudinal bonds must be broken, one 
for each protofilament. The protein katanin, named after the Japanese word for 
“sword,” accomplishes this demanding task (Figure 16-55). Katanin is made up 
of two subunits: a smaller subunit that hydrolyzes ATP and performs the actual 
severing, and a larger one that directs katanin to the centrosome. Katanin releases 
microtubules from their attachment to a microtubule-organizing center and is 
thought to contribute to the rapid microtubule depolymerization observed at the 
poles of spindles during mitosis. It may also be involved in microtubule release 
and depolymerization in proliferating cells in interphase and in postmitotic cells 
such as neurons. 

Figure 16-53 +TIP proteins found at the growing plus ends of microtubules. 
(A) Frames from a fluorescence time-lapse movie of the edge of a cell expressing 
fluorescently labeled tubulin that incorporates into microtubules (red) as well as the +TIP 
protein EB1 tagged with a different color (green). The same microtubule is marked 
(asterisk) in successive movie frames. When the microtubule is growing (frames 1, 2), 
EB1 is associated with the tip. When the microtubule undergoes a catastrophe and 
begins shrinking, EB1 is lost (frames 3, 4). The labeled EB1 is regained when growth 
of the microtubule is rescued (frame 5). See Movie 16.8. (B) In the fission yeast 
Schizosaccharomyces pombe, the plus ends of the microtubules (green) are associated 
with the homolog of EB1 (red) at the two poles of the rod-shaped cells. (A, courtesy 
of Anna Akhmanova and Ilya Grigoriev; B, courtesy of Takeshi Toda.) 

Figure 16-54 Sequestration of tubulin by stathmin. Structural studies with electron 
microscopy and crystallography suggest that the elongated stathmin protein binds 
along the side of two tubulin heterodimers. (Adapted from M.O. Steinmetz et al., 
EMBO J. 19:572-580, 2000. With permission from John Wiley and Sons.) 


Two Types of Motor Proteins Move Along Microtubules 


Like actin filaments, microtubules also use motor proteins to transport cargo and 
perform a variety of other functions within the cell. There are two major classes 
of microtubule-based motors, kinesins and dyneins. Kinesin-1, also called “con- 
ventional kinesin,” was first purified from squid neurons, where it carries mem- 
brane-enclosed organelles away from the cell body toward the axon terminal by 
walking toward the plus end of microtubules. Kinesin-1 is similar to myosin II in 
having two heavy chains per active motor; these form two globular head motor 
domains that are held together by an elongated coiled-coil tail that is responsi- 
ble for heavy-chain dimerization. One kinesin-1 light chain associates with each 
heavy chain through its tail domain and mediates cargo binding. Like myosin, 
kinesin is a member of a large protein superfamily, for which the motor domain is 
the common element (Figure 16-56). The yeast Saccharomyces cerevisiae has six 
distinct kinesins. The nematode C. elegans has 20 kinesins, and humans have 45. 

There are at least fourteen distinct families in the kinesin superfamily. Most of 
them have the motor domain at the N-terminus of the heavy chain and walk toward 
the plus end of the microtubule. One family has the motor domain at the C-termi- 
nus and walks in the opposite direction, toward the minus end of the microtubule, 
while kinesin-13 has a central motor domain and does not walk at all, but uses the 
energy of ATP hydrolysis to depolymerize microtubule ends, as described above 
(see Figure 16-52). Some kinesin heavy chains are homodimers, and others are 
heterodimers. Most kinesins have a binding site in the tail for another microtu- 
bule; alternatively, they may link the motor to a membrane-enclosed organelle 
via a light chain or an adaptor protein. Many of the kinesin superfamily members 
have specific roles in mitotic spindle formation and in chromosome segregation 
during cell division. 

In kinesin-1, instead of the rocking of a lever arm, small movements at the 
nucleotide-binding site regulate the docking and undocking of the motor head 
domain to a long linker region. This acts to throw the second head forward along 
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Figure 16-55 Microtubule severing by 
katanin. Taxol stabilized, rhodamine- 
labeled microtubules were adsorbed on the 
surface of a glass slide, and purified katanin 
was added along with ATP. (A) There are a 
few breaks in the microtubules 30 seconds 
after the addition of katanin. (B) The same 
field 3 minutes after the addition of katanin. 
The filaments have been severed in many 
places, leaving a series of small fragments 
at the previous locations of the long 
microtubules. (From J.J. Hartman et al., 
Cell 9383:277-287, 1998. With permission 
from Elsevier.) 


Figure 16-56 Kinesin and kinesin- 
related proteins. Structures of four kinesin 
superfamily members. As in the myosin 
superfamily, only the motor domains are 
conserved. Kinesin-1 has the motor domain 
at the N-terminus of the heavy chain. The 
middle domain forms a long coiled-coil, 
mediating dimerization. The C-terminal 
domain forms a tail that attaches to cargo, 
such as a membrane-enclosed organelle. 
Kinesin-5 forms tetramers where two 
dimers associate by their tails. The bipolar 
kinesin-5 tetramer is able to slide two 
microtubules past each other, analogous 
to the activity of the bipolar thick filaments 
formed by myosin II. Kinesin-13 has its 
motor domain located in the middle of the 
heavy chain. It is a member of a family of 
kinesins that have lost typical motor activity 
and instead bind to microtubule ends to 
promote depolymerization (see Figure 
16-52). Kinesin-14 is a C-terminal kinesin 
that includes the Drosophila protein Ncd 
and the yeast protein Kar3. These kinesins 
generally travel in the opposite direction 
from the majority of kinesins, toward the 
minus end instead of the plus end of a 
microtubule. 
Figure 16-57 The mechanochemical cycle of kinesin. Kinesin-1 is a 
dimer of two nucleotide-binding motor domains (heads) that are connected 
through a long coiled-coil tail (see Figure 16-56). The two kinesin motor 
domains work in a coordinated manner; during a kinesin “step,” the rear head 
detaches from its tubulin binding site, passes the partner motor domain, 
and then rebinds to the next available tubulin binding site. Using this “hand- 
over-hand” motion, the kinesin dimer can move for long distances on the 
microtubule without completely letting go of its track. 

At the start of each step, one of the two kinesin motor domain heads, 
the rear or lagging head (dark green), is tightly bound to the microtubule and 
to ATP, while the front or leading head is loosely bound to the microtubule 
with ADP in its binding site. The forward displacement of the rear motor 
domain is driven by the dissociation of ADP and binding of ATP in the leading 
head (between panels 2 and 3 in this drawing). The binding of ATP to this 
motor domain causes a small peptide called the “neck linker” to shift from 
a rearward-pointing to a forward-pointing conformation (the neck linker is 
drawn here as a purple connecting line between the leading motor domain 
and the intertwined coiled-coil). This shift pulls the rear head forward, once 
it has detached from the microtubule with ADP bound [detachment requires 
ATP hydrolysis and phosphate (Pi) release]. The kinesin molecule is now 
poised for the next step, which proceeds by an exact repeat of the process 
shown (Movie 16.9). 


the protofilament to a binding site 8 nm closer to the microtubule plus end, which 
is the distance between tubulin dimers of a protofilament. The nucleotide-hydro- 
lysis cycles in the two heads are closely coordinated, so that this cycle of linker 
docking and undocking allows the two-headed motor to move in a hand-over- 
hand (or head-over-head) stepwise manner (Figure 16-57). 
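
The stepping geometry described above can be sketched in a few lines of Python. This is a minimal illustration rather than a model taken from the text: the function names are hypothetical, the 8-nm step comes from the paragraph above, and the one-ATP-per-step accounting is an assumption commonly made for kinesin-1.

```python
import math

STEP_NM = 8.0  # spacing between tubulin dimers along a protofilament (from the text)


def hand_over_hand(n_steps, step_nm=STEP_NM):
    """Track (rear_head, front_head) positions in nm along one protofilament.

    Illustrative only: each step moves the rear head past its partner to the
    next binding site, so one head travels 2 * step_nm while the midpoint of
    the pair advances by step_nm.
    """
    rear, front = 0.0, step_nm
    history = [(rear, front)]
    for _ in range(n_steps):
        # The old front head becomes the rear head; the old rear head lands
        # one binding site beyond it.
        rear, front = front, front + step_nm
        history.append((rear, front))
    return history


def steps_for_distance(distance_um, step_nm=STEP_NM):
    """Number of steps needed to cover distance_um micrometers."""
    return math.ceil(distance_um * 1000.0 / step_nm)


if __name__ == "__main__":
    print(hand_over_hand(3))
    # Assumption: one ATP is hydrolyzed per step, so this is also an ATP count.
    print(steps_for_distance(1.0), "steps per micrometer")
```

Because the two heads simply exchange the leading role, the sketch never lets both heads leave the track at once, which is the property that allows long runs on a single microtubule.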

The dyneins are a family of minus-end directed microtubule motors unrelated 
to the kinesins. They are composed of one, two, or three heavy chains (that include 
the motor domain) and a large and variable number of associated intermediate, 
light-intermediate, and light chains. The dynein family has two major branches 
(Figure 16-58). The first branch contains the cytoplasmic dyneins, which are 
homodimers of two heavy chains. Cytoplasmic dynein 1 is encoded by a single 
gene in almost all eukaryotic cells, but is missing from flowering plants and some 
algae. It is used for organelle and mRNA trafficking, for positioning the centro- 
some and nucleus during cell migration, and for construction of the microtubule 
spindle in mitosis and meiosis. Cytoplasmic dynein 2 is found only in eukaryotic 
organisms that have cilia and is used to transport material from the tip to the base 
of the cilia, a process called intraflagellar transport. Axonemal dyneins (also called 
ciliary dyneins) comprise the second branch and include monomers, heterodi- 
mers, and heterotrimers, with one, two, or three motor-containing heavy chains, 
respectively. They are highly specialized for the rapid and efficient sliding move- 
ments of microtubules that drive the beating of cilia and flagella (discussed later). 
Figure 16-58 Dyneins. (A) Freeze-etch 
electron micrographs of a molecule of 
cytoplasmic dynein and a molecule of 
ciliary (axonemal) dynein. Like myosin II 
and kinesin-1, cytoplasmic dynein is a 
two-headed molecule. The ciliary dynein 
shown is from a protozoan and has three 
heads; ciliary dynein from animals has two 
heads. Note that the dynein head is very 
large compared with the head of either 
myosin or kinesin. (B) Schematic depiction 
of cytoplasmic dynein showing the two 
heavy chains (blue and gray) that contain 
domains for microtubule (MT) binding and 
ATP hydrolysis, connected by a long stalk. 
Bound to the heavy chain are multiple 
intermediate chains (dark green) and light 
chains (light green) that help to mediate 
many of dynein’s functions. (A, courtesy of 
John Heuser; B, adapted from R. Vale, Cell 
112:467-480, 2003. With permission from 
Cell Press.) 
Dyneins are the largest of the known molecular motors, and they are also 
among the fastest: axonemal dyneins attached to a glass slide can move micro- 
tubules at the rate of 14 µm/sec. The dynein motor is structurally unrelated to 
myosins and kinesins, but still follows the general rule of coupling nucleotide 
hydrolysis to microtubule binding and unbinding as well as to a force-generating 
conformational change (Figure 16-59). 


Microtubules and Motors Move Organelles and Vesicles 


A major function of cytoskeletal motors in interphase cells is the transport and 
positioning of membrane-enclosed organelles (Movie 16.10). Kinesin was orig- 
inally identified as the protein responsible for fast anterograde axonal transport, 
the rapid movement of mitochondria, secretory vesicle precursors, and various 
synapse components down the microtubule highways of the axon to the distant 
nerve terminals. Cytoplasmic dynein was identified as the motor responsible for 
transport in the opposite direction, retrograde axonal transport. Although organ- 
elles in most cells need not cover such long distances, their polarized transport 
is equally necessary. A typical microtubule array in an interphase cell is oriented 
with the minus ends near the center of the cell at the centrosome and the plus 
ends extending to the cell periphery. Thus, centripetal movements of organelles 
or vesicles toward the cell center require the action of minus-end directed cyto- 
plasmic dynein motors, whereas centrifugal movements toward the periphery 
require plus-end directed kinesin motors. Interestingly, in animal cells, nearly all 
minus-end directed transport is driven by the single cytoplasmic dynein 1 motor, 
whereas 15 different kinesins are used for plus-end directed transport. 


Figure 16-59 The power stroke of 
dynein. (A) The organization of the 
domains in each dynein heavy chain. This 
is a huge polypeptide, containing nearly 
4000 amino acids. The number of heavy 
chains in a dynein is equal to its number 
of motor heads. (B) Illustration of dynein 
c, a monomeric axonemal dynein found in 
the unicellular green alga Chlamydomonas 
reinhardtii. The large dynein motor head 
is a planar ring containing a C-terminal 
domain (gray) and six AAA domains, four 
of which retain ATP-binding sequences, 
but only one of which (dark red) has the 
major ATPase activity. Extending from 
the head are a long, coiled-coil stalk with 
the microtubule-binding site at the tip, 
and a tail that attaches to an adjacent 
microtubule in the axoneme. In the ATP- 
bound state, the stalk is detached from 
the microtubule, but ATP hydrolysis 
causes stalk-microtubule attachment (left). 
Subsequent release of ADP and phosphate 
(Pi) then leads to a large conformational 
“power stroke” involving rotation of the 
head and stalk relative to the tail (right). 
Each cycle generates a step of about 8 nm, 
thereby contributing to flagellar beating (see 
Figure 16-65). In the case of cytoplasmic 
dynein, the tail is attached to a cargo such 
as a vesicle, and a single power stroke 
transports the cargo about 8 nm along 
the microtubule toward its minus end (see 
Figure 16-60). (C) Electron micrographs of 
purified monomeric dyneins in two different 
conformations representing different steps 
in the mechanochemical cycle. (C, from 
S.A. Burgess et al., Nature 421:715-718, 
2003. With permission from Macmillan 
Publishers Ltd.) 



A clear example of the effect of microtubules and microtubule motors on 
the behavior of intracellular membranes is their role in organizing the endo- 
plasmic reticulum (ER) and the Golgi apparatus. The network of ER membrane 
tubules aligns with microtubules and extends almost to the edge of the cell 
(Movie 16.11), whereas the Golgi apparatus is located near the centrosome. When 
cells are treated with a drug that depolymerizes microtubules, such as colchicine 
or nocodazole, the ER collapses to the center of the cell, while the Golgi appa- 
ratus fragments and disperses throughout the cytoplasm. In vitro, kinesins can 
tether ER-derived membranes to preformed microtubule tracks and walk toward 
the microtubule plus ends, dragging the ER membranes out into tubular protru- 
sions and forming a membranous web that looks very much like the ER in cells. 
Conversely, dyneins are required for positioning the Golgi apparatus near the cell 
center of animal cells; they do this by moving Golgi vesicles along microtubule 
tracks toward the microtubules’ minus ends at the centrosome. 

The different tails and their associated light chains on specific motor proteins 
allow the motors to attach to their appropriate organelle cargo. Membrane-as- 
sociated motor receptors that are sorted to specific membrane-enclosed com- 
partments interact directly or indirectly with the tails of the appropriate kinesin 
family members. Many viruses take advantage of microtubule motor-based trans- 
port during infection and use kinesin to move from their site of replication and 
assembly to the plasma membrane, from which they are poised to infect neigh- 
boring cells. An outer-membrane protein of Vaccinia virus, for example, contains 
an amino acid motif that mediates binding to kinesin-1 light chain and transport 
along microtubules to the plasma membrane. Interestingly, this motif is present 
in over 450 human proteins, one-third of which are associated with human dis- 
eases. Thus, kinesin transports a diverse set of cargoes involved in a wide range of 
important cellular functions. 

For dynein, a large macromolecular assembly often mediates attachment to 
membranes. Cytoplasmic dynein, itself a huge protein complex, requires associa- 
tion with a second large protein complex called dynactin to translocate organelles 
effectively. The dynactin complex includes a short, actin-like filament that forms 
from the actin-related protein Arp1 (distinct from Arp2 and Arp3, the components 
of the Arp2/3 complex involved in the nucleation of conventional actin filaments) 
(Figure 16-60). A number of other proteins also contribute to dynein cargo bind- 
ing and motor regulation, and their function is especially important in neurons, 
where defects in microtubule-based transport have been linked to neurological 
diseases. A striking example is smooth brain, or lissencephaly, a human disor- 
der in which cells fail to migrate to the cerebral cortex of the developing brain. 
One type of lissencephaly is caused by defects in Lis1, a dynein-binding protein 
required for nuclear migration in several species. In the normal brain, migration 
of the nucleus directs the developing neural cell body toward its correct position 
in the cortex. In the absence of Lis1, however, the nuclei of migrating neurons 
fail to attach to dynein, resulting in nuclear-migration defects. Dynein is required 
continuously for neuronal function, as mutations in a dynactin subunit or in the 
tail region of cytoplasmic dynein lead to neuronal degeneration in humans and 
mice. These effects are associated with decreased retrograde axonal transport and 
provide strong evidence for the importance of robust axonal transport in neuronal 
viability. 

The cell can regulate the activity of motor proteins and thereby cause either 
a change in the positioning of its membrane-enclosed organelles or whole-cell 
movements. Fish melanocytes provide one of the most dramatic examples. These 
giant cells, which are responsible for rapid changes in skin coloration in several 
species of fish, contain large pigment granules that can alter their location in 
response to neuronal or hormonal stimulation (Figure 16-61). The pigment gran- 
ules aggregate or disperse by moving along an extensive network of microtubules 
that are anchored at the centrosome by their minus ends. The tracking of indi- 
vidual pigment granules reveals that the inward movement is rapid and smooth, 
while the outward movement is jerky, with frequent backward steps. Both dynein 
and kinesin microtubule motors are associated with the pigment granules. The 
Figure 16-60 Dynactin mediates the 
attachment of dynein to a membrane- 
enclosed organelle. Dynein requires the 
presence of a large number of accessory 
proteins to associate with membrane- 
enclosed organelles. Dynactin is a large 
complex that includes components that 
bind weakly to microtubules, components 
that bind to dynein itself, and components 
that form a small, actin-like filament made 
of the actin-related protein Arp1. 
jerky outward movements may result from a tug-of-war between the two opposing 
microtubule motor proteins, with the stronger kinesin winning out overall. When 
intracellular cyclic AMP levels decrease, kinesin is inactivated, leaving dynein free 
to drag the pigment granules rapidly toward the cell center, changing the fish’s 
color. In a similar way, the movement of other membrane organelles coated with 
particular motor proteins is controlled by a complex balance of competing signals 
that regulate both motor protein attachment and activity. 
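
The tug-of-war picture can be made concrete with a toy simulation. The sketch below is only an illustration under stated assumptions: the function name, the step-by-step coin-flip model, and the value chosen for `p_outward` are all hypothetical and are not taken from the text. Positive positions point toward the cell periphery (microtubule plus ends) and negative positions toward the cell center (minus ends).

```python
import random


def track_granule(n_steps, p_outward, kinesin_active=True, seed=0):
    """Return (net displacement in steps, number of direction reversals).

    Toy model: at each step kinesin wins with probability p_outward and the
    granule moves outward; otherwise dynein wins and it moves inward. When
    kinesin is inactivated, dynein wins every step.
    """
    rng = random.Random(seed)
    position = 0
    reversals = 0
    last_move = 0
    for _ in range(n_steps):
        if kinesin_active and rng.random() < p_outward:
            move = 1   # kinesin wins this round
        else:
            move = -1  # dynein wins, or kinesin is switched off
        if last_move and move != last_move:
            reversals += 1
        last_move = move
        position += move
    return position, reversals


if __name__ == "__main__":
    # Hypothetical bias of 0.7 toward kinesin: net outward, but jerky.
    print(track_granule(10_000, 0.7, kinesin_active=True))
    # Kinesin inactivated: smooth, uninterrupted inward movement.
    print(track_granule(10_000, 0.7, kinesin_active=False))
```

In this sketch the outward run shows many reversals because both motors are engaged, while switching kinesin off removes the competition and yields a run with no reversals, mirroring the smooth inward and jerky outward movements described above.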


Construction of Complex Microtubule Assemblies Requires 
Microtubule Dynamics and Motor Proteins 


The construction of the mitotic spindle and the neuronal cytoskeleton are import- 
ant and fascinating examples of the power of organization by teams of motor pro- 
teins interacting with dynamic cytoskeletal filaments. As described in Chapter 
17, mitotic spindle assembly depends on reorganization of the interphase array 
of microtubules to form a bipolar array of microtubules, with their minus ends 
focused at the poles and their plus ends overlapping in the center or connecting to 
chromosomes. Spindle assembly depends on the coordinated actions of several 
motor proteins and other factors that modulate polymerization dynamics (see 
Figures 17-23 and 17-25). 

Neurons also contain complex cytoskeletal structures. As they differentiate, 
neurons send out specialized processes that will either receive electrical signals 
(dendrites) or transmit electrical signals (axons) (see Figure 16-50). The beautiful 
and elaborate branching morphology of axons and dendrites enables neurons to 
form tremendously complex signaling networks, interacting with many other cells 
simultaneously and making possible the complicated behavior of the higher ani- 
mals. Both axons and dendrites (collectively called neurites) are filled with bun- 
dles of microtubules that are critical to both their structure and their function. 

In axons, all the microtubules are oriented in the same direction, with their 
minus end pointing back toward the cell body and their plus end pointing toward 
the axon terminals (Figure 16-62). The microtubules do not reach from the cell 


Figure 16-61 Regulated melanosome 
movements in fish pigment cells. 
These giant cells, which are responsible 
for changes in skin coloration in several 
species of fish, contain large pigment 
granules, or melanosomes (brown). The 
melanosomes can change their location 
in the cell in response to a hormonal or 
neuronal stimulus. (A) Schematic view of 
a pigment cell, showing the dispersal and 
aggregation of melanosomes in response 
to an increase or decrease in intracellular 
cyclic AMP (cAMP), respectively. Both 
redistributions of melanosomes occur along 
microtubules. (B) Bright-field images of a 
single cell in a scale of an African cichlid 
fish, showing its melanosomes either 
dispersed throughout the cytoplasm (left) or 
aggregated in the center of the cell (right). 
(B, courtesy of Leah Haimo.) 




Figure 16-62 Microtubule organization 
in a neuron. In a neuron, microtubule 
organization is complex. In the axon, all 
microtubules share the same polarity, with 
the plus ends pointing outward toward 
the axon terminus. No single microtubule 
stretches the entire length of the axon; 
instead, short overlapping segments of 
parallel microtubules make the tracks 
for fast axonal transport. In dendrites, 
the microtubules are of mixed polarity, 
with some plus ends pointing outward 
and some pointing inward. Vesicles can 
associate with both kinesin and dynein 
and move in either direction along the 
microtubules in axons and dendrites, 
depending on which motor is active. 
body all the way to the axon terminals; each is typically only a few micrometers 
in length, but large numbers are staggered in an overlapping array. These aligned 
microtubule tracks act as a highway to transport specific proteins, protein-con- 
taining vesicles, and mRNAs to the axon terminals, where synapses are con- 
structed and maintained. The longest axon in the human body reaches from the 
base of the spinal cord to the foot and is up to a meter in length. By comparison, 
dendrites are generally much shorter than axons. The microtubules in dendrites 
lie parallel to one another but their polarities are mixed, with some pointing their 
plus ends toward the dendrite tip, while others point back toward the cell body, 
reminiscent of the antiparallel microtubule array of the mitotic spindle. 
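
To get a feel for the distances involved, the short calculation below estimates how long a cargo would take to travel the length of an axon at a constant speed. It is a rough sketch only: the function name is hypothetical, the cargo is assumed to move without pauses or reversals, and the speeds passed in are placeholders (the 14 µm/sec figure quoted earlier is for axonemal dynein moving microtubules on glass, not a measured axonal transport rate).

```python
def transit_time_hours(axon_length_m, speed_um_per_s):
    """Hours for a cargo moving at constant speed to cover the axon length."""
    if speed_um_per_s <= 0:
        raise ValueError("speed must be positive")
    seconds = axon_length_m * 1e6 / speed_um_per_s  # 1 m = 1e6 micrometers
    return seconds / 3600.0


if __name__ == "__main__":
    # Placeholder speeds chosen for illustration, not values for axonal cargo.
    for speed in (1.0, 14.0):
        hours = transit_time_hours(1.0, speed)
        print(f"{speed} um/s over 1 m: {hours:.1f} hours")
```

Even under these generous assumptions the trip along a meter-long axon takes many hours, which helps explain why continuous, well-organized microtubule tracks matter so much in neurons.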


Motile Cilia and Flagella Are Built from Microtubules and Dyneins 


Just as myofibrils are highly specialized and efficient motility machines built from 
actin and myosin filaments, cilia and flagella are highly specialized and efficient 
motility structures built from microtubules and dynein. Both cilia and flagella are 
hairlike cell appendages that have a bundle of microtubules at their core. Flagella 
are found on sperm and many protozoa. By their undulating motion, they enable 
the cells to which they are attached to swim through liquid media. Cilia are orga- 
nized in a similar fashion, but they beat with a whiplike motion that resembles the 
breaststroke in swimming. Ciliary beating can either propel single cells through a 
fluid (as in the swimming of the protozoan Paramecium) or can move fluid over 
the surface of a group of cells in a tissue. In the human body, huge numbers of cilia 
(10⁹/cm² or more) line our respiratory tract, sweeping layers of mucus, trapped 
particles of dust, and bacteria up to the mouth where they are swallowed and ulti- 
mately eliminated. Likewise, cilia along the oviduct help to sweep eggs toward the 
uterus. 

The movement of a cilium or a flagellum is produced by the bending of its 
core, which is called the axoneme. The axoneme is composed of microtubules 
and their associated proteins, arranged in a distinctive and regular pattern. Nine 
special doublet microtubules (comprising one complete and one partial micro- 
tubule fused together so that they share a common tubule wall) are arranged in 
a ring around a pair of single microtubules (Figure 16-63). Almost all forms of 
motile eukaryotic flagella and cilia (from protozoans to humans) have this char- 
acteristic arrangement. The microtubules extend continuously for the length of 
the axoneme, which can be 10-200 µm. At regular positions along the length of the 
microtubules, accessory proteins cross-link the microtubules together. 


Figure 16-63 The arrangement of microtubules in a flagellum or cilium. (A) Electron micrograph of the flagellum of a green-alga cell 
(Chlamydomonas) shown in cross section, illustrating the distinctive “9 + 2” arrangement of microtubules. (B) Diagram of the parts of a flagellum or 
cilium. The various projections from the microtubules link the microtubules together and occur at regular intervals along the length of the axoneme. 
(C) High-resolution electron tomography image of an outer doublet microtubule showing structural details and features inside the microtubules called 
microtubule inner proteins (MIPs). (A, courtesy of Lewis Tilney; C, courtesy of Daniela Nicastro.) 
Molecules of axonemal dynein form bridges between the neighboring doublet 
microtubules around the circumference of the axoneme (Figure 16-64). When 
the motor domain of this dynein is activated, the dynein molecules attached to 
one microtubule doublet (see Figure 16-59) attempt to walk along the adjacent 
microtubule doublet, tending to force the adjacent doublets to slide relative to 
one another, much as actin thin filaments slide during muscle contraction. How- 
ever, the presence of other links between the microtubule doublets prevents this 
sliding, and the dynein force is instead converted into a bending motion (Figure 
16-65). 

In humans, hereditary defects in axonemal dynein cause a condition called 
primary ciliary dyskinesia or Kartagener’s syndrome. This syndrome is character- 
ized by inversion of the normal asymmetry of internal organs (situs inversus) due 
to disruption of fluid flow in the developing embryo, male sterility due to immo- 
tile sperm, and a high susceptibility to lung infections due to paralyzed cilia being 
unable to clear the respiratory tract of debris and bacteria. 

Bacteria also swim using cell-surface structures called flagella, but these do 
not contain microtubules or dynein and do not wave or beat. Instead, bacterial 
flagella are long, rigid helical filaments, made up of repeating subunits of the pro- 
tein flagellin. The flagella rotate like propellers, driven by a special rotary motor 
embedded in the bacterial cell wall. The use of the same name to denote these two 
very different types of swimming apparatus is an unfortunate historical accident. 


Primary Cilia Perform Important Signaling Functions in Animal Cells 


Many cells possess a shorter, nonmotile counterpart of cilia and flagella called 
the primary cilium. Primary cilia can be viewed as specialized cellular compart- 
ments or organelles that perform a wide range of cellular functions, but share 
Figure 16-64 Ciliary dynein. Ciliary 
(axonemal) dynein is a large protein 
assembly (nearly 2 million daltons) 
composed of 9-12 polypeptide chains, the 
largest of which is the heavy chain of more 
than 500,000 daltons. (A) The heavy chains 
form the major portion of the globular 
head and stem domains, and many of the 
smaller chains are clustered around the 
base of the stem. There are two heads 
in the outer dynein in metazoans (shown 
here), but three heads in protozoa, each 
formed from their own heavy chain. The 
tail of the molecule binds tightly to an 
A microtubule, while the large globular 
heads have an ATP-dependent binding 
site for a B microtubule (see Figure 16-63). 
When the heads hydrolyze their bound ATP, 
they move toward the minus end of the 
B microtubule, thereby producing a sliding 
force between the adjacent microtubule 
doublets in a cilium or flagellum (see 
Figure 16-59). (B) Freeze-etch electron 
micrograph of a cilium showing the dynein 
arms projecting at regular intervals from 
the doublet microtubules. (B, courtesy of 
John Heuser.) 


Figure 16-65 The bending of an 
axoneme. (A) When axonemes are 
exposed to the proteolytic enzyme trypsin, 
the linkages holding neighboring doublet 
microtubules together are broken. In this 
case, the addition of ATP allows the motor 
action of the dynein heads to slide one 
pair of doublet microtubules against the 
other pair. (B) In an intact axoneme (such 
as in a sperm), flexible protein links prevent 
the sliding of the doublet. The motor 
action therefore causes a bending motion, 
creating waves or beating motions. 
many structural features with motile cilia. Both motile and nonmotile cilia are 
generated during interphase at plasma-membrane-associated structures called 
basal bodies that firmly root them at the cell surface. At the core of each basal body 
is a centriole, the same structure found embedded at the center of animal cen- 
trosomes, with nine groups of fused triplet microtubules arranged in a cartwheel 
(see Figure 16-48). Centrioles are multifunctional, contributing to assembly of the 
mitotic spindle in dividing cells but migrating to the plasma membrane of inter- 
phase cells to template the nucleation of the axoneme (Figure 16-66). Because 
no protein translation occurs in cilia, construction of the axoneme requires intra- 
flagellar transport (IFT), a transport system discovered in the green alga Chlam- 
ydomonas. Analogous to the axon, motors move cargoes in both anterograde and 
retrograde directions, in this case driven by kinesin-2 and cytoplasmic dynein 2, 
respectively. 

Primary cilia are found on the surface of almost all cell types, where they sense 
and respond to the exterior environment, functions best understood in the con- 
text of smell and sight. In the nasal epithelium, cilia protruding from dendrites of 
olfactory neurons are the site of both odorant reception and signal amplification. 
Similarly, the rod and cone cells of the vertebrate retina possess a primary cilium 
equipped with an expanded tip called the outer segment, which is specialized for 
converting light into a neural signal (see Figure 15-38). Maintenance of the outer 
segment requires continuous IFT-mediated transport of large quantities of lipids 
and proteins into the cilium, at rates of up to 2000 molecules per minute. The links 
between cilia function and the senses of sight and smell are underscored by Bar- 
det-Biedl syndrome, a set of disorders associated with defects in IFT, the cilium, or 
the basal body. Patients with Bardet-Biedl syndrome cannot smell and suffer from 
retinal degeneration. Other characteristics of this multifaceted disorder include 
hearing loss, polycystic kidney disease, diabetes, obesity, and polydactyly, sug- 
gesting that primary cilia have functions in many aspects of human physiology. 


Summary 


Microtubules are stiff polymers of tubulin molecules. They assemble by addition of 
GTP-containing tubulin subunits to the free end of a microtubule, with one end (the 
plus end) growing faster than the other. Hydrolysis of the bound GTP takes place 
Figure 16-66 Primary cilia. (A) Electron micrograph and diagram of the basal body of a mouse neuron primary cilium. The axoneme of the primary 
cilium (black arrow) is nucleated by the mother centriole at the basal body, which localizes at the plasma membrane near the cell surface. 
(B) Centrioles function alternately as basal bodies and as the core of centrosomes. Before a cell enters the cell division cycle, the primary cilium is 
shed or resorbed. The centrioles recruit pericentriolar material and duplicate during S phase, generating two centrosomes, each of which contains 
a pair of centrioles. The centrosomes nucleate microtubules and localize to the poles of the mitotic spindle. Upon exit from mitosis, a primary cilium 
again grows from the mother centriole. (A, courtesy of Josef Spacek.) 
after assembly and weakens the bonds that hold the microtubule together. Microtu- 
bules are dynamically unstable and liable to catastrophic disassembly, but they can 
be stabilized in cells by association with other structures. Microtubule-organizing 
centers such as centrosomes protect the minus ends of microtubules and continu- 
ally nucleate the formation of new microtubules. Microtubule-associated proteins 
(MAPs) stabilize microtubules, and those that localize to the plus end (+TIPs) can 
alter the dynamic properties of the microtubule or mediate their interaction with 
other structures. Counteracting the stabilizing activity of MAPs are catastrophe 
factors, such as kinesin-13 proteins, that act to peel apart microtubule ends. Other 
kinesin family members as well as dynein use the energy of ATP hydrolysis to move 
unidirectionally along a microtubule. The motor dynein moves toward the minus 
end of microtubules, and its sliding of axonemal microtubules underlies the beating 
of cilia and flagella. Primary cilia are nonmotile sensory organs found on many 
cell types. 


INTERMEDIATE FILAMENTS AND SEPTINS 


All eukaryotic cells contain actin and tubulin. But the third major type of cyto- 
skeletal protein, the intermediate filament, forms a cytoplasmic filament only in 
some metazoans—including vertebrates, nematodes, and mollusks. Intermediate 
filaments are particularly prominent in the cytoplasm of cells that are subject to 
mechanical stress and are generally not found in animals that have rigid exoskel- 
etons, such as arthropods and echinoderms. It seems that intermediate filaments 
impart mechanical strength to tissues for the squishier animals. 

Cytoplasmic intermediate filaments are closely related to their ancestors, the 
much more prevalent nuclear lamins, which are found in many eukaryotes but 
missing from unicellular organisms. The nuclear lamins form a meshwork lining 
the inner membrane of the nuclear envelope, where they provide anchorage sites 
for chromosomes and nuclear pores. Several times during metazoan evolution, 
lamin genes have apparently duplicated, and the duplicates have evolved to pro- 
duce ropelike, cytoplasmic intermediate filaments. In contrast to the highly con- 
served actins and tubulin isoforms that are encoded by a handful of genes, differ- 
ent families of intermediate filaments are much more diverse and are encoded by 
70 different human genes with distinct, cell type-specific functions (Table 16-2). 


TABLE 16-2 

Type of intermediate filament   Component polypeptides             Location 
Nuclear                         Lamins A, B, and C                 Nuclear lamina (inner lining of nuclear envelope) 
Vimentin-like                   Vimentin                           Many cells of mesenchymal origin 
                                Glial fibrillary acidic protein    Glial cells (astrocytes and some Schwann cells) 
Epithelial                      Type I keratins (acidic)           Epithelial cells and their derivatives (e.g., hair and nails) 
                                Type II keratins (neutral/basic) 
Axonal                          Neurofilament proteins             Neurons 
                                (NF-L, NF-M, and NF-H) 
Figure 16-67 A model of intermediate filament construction. The monomer shown in (A) pairs with another monomer to
form a dimer (B), in which the conserved central rod domains are aligned in parallel and wound together into a coiled-coil.
(C) Two dimers then line up side by side to form an antiparallel tetramer of four polypeptide chains. Dimers and tetramers are
the soluble subunits of intermediate filaments. (D) Within each tetramer, the two dimers are offset with respect to one another,
thereby allowing it to associate with another tetramer. (E) In the final 10-nm ropelike filament, tetramers are packed together in a
helical array, which has 16 dimers (32 α-helical coils) in cross section. Half of these dimers are pointing in each direction. An electron
micrograph of intermediate filaments is shown on the upper left (Movie 16.12). (Electron micrograph courtesy of Roy Quinlan.)
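
The subunit bookkeeping in this model can be checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and variable names are ours, and the single input is the figure of 16 dimers per cross section quoted in the legend.

```python
# Subunit bookkeeping for one cross section of a 10-nm intermediate filament.
CHAINS_PER_DIMER = 2             # two monomers wound into one parallel coiled-coil
DIMERS_PER_TETRAMER = 2          # two dimers arranged antiparallel
DIMERS_PER_CROSS_SECTION = 16    # value quoted in the figure legend

# Each tetramer contributes two dimers, so the dimers group into tetramers.
tetramers = DIMERS_PER_CROSS_SECTION // DIMERS_PER_TETRAMER

# Each dimer contributes two alpha-helical polypeptide chains.
chains = DIMERS_PER_CROSS_SECTION * CHAINS_PER_DIMER

# Antiparallel packing means half of the dimers point each way.
dimers_each_direction = DIMERS_PER_CROSS_SECTION // 2

print(f"tetramers per cross section: {tetramers}")
print(f"alpha-helical chains per cross section: {chains}")
print(f"dimers pointing in each direction: {dimers_each_direction}")
```

Because equal numbers of dimers point in each direction, the two ends of the filament are equivalent, which is why the assembled filament has no overall polarity.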


Intermediate Filament Structure Depends on the Lateral Bundling 
and Twisting of Coiled-Coils 


Although their amino- and carboxy-terminal domains differ, all intermediate filament
family members are elongated proteins with a conserved central α-helical
domain containing 40 or so heptad repeat motifs that form an extended coiled- 
coil structure with another monomer (see Figure 3-9). A pair of parallel dimers 
then associates in an antiparallel fashion to form a staggered tetramer (Figure 
16-67). Unlike actin or tubulin subunits, intermediate filament subunits do not 
contain a binding site for a nucleotide. Furthermore, since the tetrameric sub- 
unit is made up of two dimers pointing in opposite directions, its two ends are the 
same. The assembled intermediate filament therefore lacks the overall structural 
polarity that is critical for actin filaments and microtubules. The tetramers pack 
together laterally to form the filament, which includes eight parallel protofila- 
ments made up of tetramers. Each individual intermediate filament therefore has 
a cross section of 32 individual α-helical coils. This large number of polypeptides
all lined up together, with the strong lateral hydrophobic interactions typical of 
coiled-coil proteins, gives intermediate filaments a ropelike character. They can 
be easily bent, with a persistence length of less than one micrometer (compared 
to several millimeters for microtubules and about ten micrometers for actin), but
they are extremely difficult to break and can be stretched to over three times their 
length (see Figure 16-6). 
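
To give a feel for what these persistence lengths mean, the sketch below compares how well each polymer "remembers" its direction over a 1 µm stretch. It assumes an ideal worm-like chain in three dimensions, for which the mean cosine of the angle between tangents separated by a contour length s is exp(-s/Lp); that model, the function name, and the round-number values chosen for "less than one micrometer" and "several millimeters" are our assumptions, not measurements from the text.

```python
import math


def tangent_correlation(contour_length_um: float, persistence_length_um: float) -> float:
    """Mean cosine of the angle between filament tangents separated by
    contour_length_um, for an ideal worm-like chain in three dimensions."""
    return math.exp(-contour_length_um / persistence_length_um)


# Illustrative persistence lengths in micrometers (assumed round numbers).
PERSISTENCE_LENGTH_UM = {
    "intermediate filament": 1.0,   # text: "less than one micrometer" (upper bound used)
    "actin filament": 10.0,         # text: "about ten micrometers"
    "microtubule": 5000.0,          # text: "several millimeters" (5 mm assumed)
}

SEGMENT_UM = 1.0  # contour length over which direction is compared

for name, lp in PERSISTENCE_LENGTH_UM.items():
    print(f"{name:22s} {tangent_correlation(SEGMENT_UM, lp):.4f}")
```

A value near 1 means the polymer is essentially straight over that distance; a value well below 1 means thermal fluctuations alone are enough to bend it, as expected for a ropelike intermediate filament.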

Less is understood about the mechanism of assembly and disassembly of 
intermediate filaments than of actin filaments and microtubules. In pure protein 
solutions, intermediate filaments are extremely stable due to tight association 
of subunits, but some types of intermediate filaments, including vimentin, form 
highly dynamic structures in cells such as fibroblasts. Protein phosphorylation 
probably regulates their disassembly, in much the same way that phosphorylation 
regulates the disassembly of nuclear lamins in mitosis (see Figure 12-18). As evi- 
dence for rapid turnover, labeled subunits microinjected into tissue-culture cells 
incorporate into intermediate filaments within a few minutes. Remodeling of the 
intermediate filament network accompanies events requiring dynamic cellular 
reorganization, such as division, migration, and differentiation. 


Intermediate Filaments Impart Mechanical Stability 
to Animal Cells 


Keratins are the most diverse intermediate filament family: there are about 20 
found in different types of human epithelial cells and about 10 more that are spe- 
cific to hair and nails; analysis of the human genome sequence has revealed that 
there are 54 distinct keratins. Every keratin filament is made up of an equal mix- 
ture of type I (acidic) and type II (neutral/basic) keratin proteins; these form a 
heterodimer filament subunit (see Figure 16-67). Cross-linked keratin networks 
held together by disulfide bonds can survive even the death of their cells, forming 
tough coverings for animals, as in the outer layer of skin and in hair, nails, claws, 
and scales. The diversity in keratins is clinically useful in the diagnosis of epithe- 
lial cancers (carcinomas), as the particular set of keratins expressed gives an indi- 
cation of the epithelial tissue in which the cancer originated and thus can help to 
guide the choice of treatment. 

A single epithelial cell may produce multiple types of keratins, and these 
copolymerize into a single network (Figure 16-68). Keratin filaments impart 
mechanical strength to epithelial tissues in part by anchoring the intermediate 
filaments at sites of cell-cell contact, called desmosomes, or cell-matrix contact, 
called hemidesmosomes (see Figure 16-4). We discuss these important adhesive 
structures in Chapter 19. Accessory proteins, such as filaggrin, bundle keratin fil- 
aments in differentiating cells of the epidermis to give the outermost layers of the 
Figure 16-68 Keratin filaments in
epithelial cells. Immunofluorescence
micrograph of the network of keratin
filaments (blue) in a sheet of epithelial
cells in culture. The filaments in each cell 
are indirectly connected to those of its 
neighbors by desmosomes (discussed in 
Chapter 19). A second protein (red) has 
been stained to reveal the location of the 
cell boundaries. (Courtesy of Kathleen 
Green and Evangeline Amargo.) 
skin their special toughness. Individuals with mutations in the gene encoding filaggrin
are strongly predisposed to dry skin diseases such as eczema.

Mutations in keratin genes cause several human genetic diseases. For exam- 
ple, when defective keratins are expressed in the basal cell layer of the epidermis, 
they produce a disorder called epidermolysis bullosa simplex, in which the skin 
blisters in response to even very slight mechanical stress, which ruptures the basal 
cells (Figure 16-69). Other types of blistering diseases, including disorders of the 
mouth, esophageal lining, and the cornea of the eye, are caused by mutations in 
the different keratins whose expression is specific to those tissues. All of these 
maladies are typified by cell rupture as a consequence of mechanical trauma 
and a disorganization or clumping of the keratin filament cytoskeleton. Many of 
the specific mutations that cause these diseases alter the ends of the central rod 
domain, demonstrating the importance of this particular part of the protein for 
correct filament assembly. 

Members of another family of intermediate filaments, called neurofilaments, 
are found in high concentrations along the axons of vertebrate neurons (Figure 
16-70). Three types of neurofilament proteins (NF-L, NF-M, and NF-H) coassem- 
ble in vivo, forming heteropolymers. The NF-H and NF-M proteins have lengthy 
C-terminal tail domains that bind to neighboring filaments, generating aligned 
arrays with a uniform interfilament spacing. During axonal growth, new neuro- 
filament subunits are incorporated all along the axon in a dynamic process that 
involves the addition of subunits along the filament length as well as the ends. 
After an axon has grown and connected with its target cell, the diameter of the 
axon may increase as much as fivefold. The level of neurofilament gene expres- 
sion seems to directly control axonal diameter, which in turn influences how 
fast electrical signals travel down the axon. In addition, neurofilaments provide 
strength and stability to the long cell processes of neurons. 

The neurodegenerative disease amyotrophic lateral sclerosis (ALS, or Lou 
Gehrig’s disease) is associated with an accumulation and abnormal assembly 
of neurofilaments in motor neuron cell bodies and in the axon, aberrations that 
may interfere with normal axonal transport. The degeneration of the axons leads 
to muscle weakness and atrophy, which is usually fatal. The overexpression of 
human NF-L or NF-H in mice results in mice that have an ALS-like disease. How- 
ever, a causative link between neurofilament pathology and ALS has not been 
firmly established. 
Figure 16-69 Blistering of the skin caused by a mutant keratin gene. A mutant gene encoding a truncated keratin protein (lacking both the
N- and C-terminal domains) was expressed in a transgenic mouse. The defective protein assembles with the normal keratins and thereby disrupts
the keratin filament network in the basal cells of the skin. Light micrographs of cross sections of (A) normal and (B) mutant skin show that the
blistering results from the rupturing of cells in the basal layer of the mutant epidermis (short red arrows). (C) A sketch of three cells in the basal
layer of the mutant epidermis, as observed by electron microscopy. As indicated by the red arrow, the cells rupture between the nucleus and the
hemidesmosomes (discussed in Chapter 19), which connect the keratin filaments to the underlying basal lamina. (From P.A. Coulombe et al., J. Cell
Biol. 115:1661-1674, 1991. With permission from The Rockefeller University Press.)
The vimentin-like filaments are a third family of intermediate filaments.
Desmin, a member of this family, is expressed in skeletal, cardiac, and smooth 
muscle, where it forms a scaffold around the Z disc of the sarcomere (see Figure 
16-34). Mice lacking desmin show normal initial muscle development, but adults 
have various muscle-cell abnormalities, including misaligned muscle fibers. In 
humans, mutations in desmin are associated with various forms of muscular dys- 
trophy and cardiac myopathy, illustrating the important role of desmin in stabi- 
lizing muscle fibers. 

Besides their well-established role in maintaining the mechanical stability 
of the nucleus, it is becoming increasingly evident that one class of lamins, the 
A-type, together with many proteins of the nuclear envelope, are scaffolds for pro- 
teins that control myriad cellular processes including transcription, chromatin 
organization, and signal transduction. The majority of laminopathies are associ- 
ated with mutant versions of lamin A and include tissue-specific diseases. Skeletal 
and cardiac abnormalities might be explained by a weakened nuclear envelope 
leading to cell damage and death, but laminopathies are also thought to arise 
from pathogenic and tissue-specific alterations in gene expression. 


Linker Proteins Connect Cytoskeletal Filaments and Bridge the 
Nuclear Envelope 


The intermediate filament network is linked to the rest of the cytoskeleton by 
members of a family of proteins called plakins. Plakins are large and modular, 
containing multiple domains that connect cytoskeletal filaments to each other 
and to junctional complexes. Plectin is a particularly interesting example. In addi- 
tion to bundling intermediate filaments, it links the intermediate filaments to 
microtubules, actin filament bundles, and filaments of the motor protein myosin
II; it also helps attach intermediate filament bundles to adhesive structures at the
plasma membrane (Figure 16-71). 

Plectin and other plakins can interact with protein complexes that connect 
the cytoskeleton to the nuclear interior. These complexes consist of SUN proteins 
Figure 16-70 Two types of intermediate
filaments in cells of the nervous system.
(A) Freeze-etch electron microscopic
image of neurofilaments in a nerve cell
axon, showing the extensive cross-linking
through protein cross-bridges, an
arrangement believed to give this long cell 
process great tensile strength. The cross- 
bridges are formed by the long, nonhelical 
extensions at the C-terminus of the largest 
neurofilament protein (NF-H). (B) Freeze- 
etch image of glial filaments in glial cells, 
showing that these intermediate filaments 
are smooth and have few cross-bridges. 
(C) Conventional transmission electron 
micrograph of a cross section of an axon 
showing the regular side-to-side spacing 
of the neurofilaments, which greatly 
outnumber the microtubules. 

(A and B, courtesy of Nobutaka Hirokawa; 
C, courtesy of John Hopkins.) 


Figure 16-71 Plectin cross-linking of 
diverse cytoskeletal elements. Plectin 
(green) is seen here making cross-links
from intermediate filaments (blue)
to microtubules (red). In this electron 
micrograph, the dots (yellow) are gold 
particles linked to anti-plectin antibodies. 
The entire actin filament network was 
removed to reveal these proteins. (From 
T.M. Svitkina et al., J. Cell Biol. 135:991- 
1007, 1996. With permission from The 
Rockefeller University Press.) 
of the inner nuclear membrane and KASH proteins (also called nesprins) of the
outer nuclear membrane (Figure 16-72). SUN and KASH proteins bind to each 
other within the lumen of the nuclear envelope, forming a bridge that connects the 
nuclear and cytoplasmic cytoskeletons. Inside the nucleus, the SUN proteins bind 
to the nuclear lamina or chromosomes, whereas in the cytoplasm, KASH proteins 
can bind directly to actin filaments and indirectly to microtubules and intermedi- 
ate filaments through association with motor proteins and plakins, respectively. 
This linkage serves to mechanically couple the nucleus to the cytoskeleton and is 
involved in many cellular functions, including chromosome movements inside 
the nucleus during meiosis, nuclear and centrosome positioning, nuclear migra- 
tion, and global cytoskeletal organization. 

Mutations in the gene for plectin cause a devastating human disease that 
combines epidermolysis bullosa (caused by disruption of skin keratin filaments), 
muscular dystrophy (caused by disruption of desmin filaments), and neuro- 
degeneration (caused by disruption of neurofilaments). Mice lacking a functional 
plectin gene die within a few days of birth, with blistered skin and abnormal skel- 
etal and heart muscles. Thus, although plectin may not be necessary for the ini- 
tial formation and assembly of intermediate filaments, its cross-linking action is 
required to provide cells with the strength they need to withstand the mechanical 
stresses inherent to vertebrate life. 


Septins Form Filaments That Regulate Cell Polarity 


GTP-binding proteins called septins serve as an additional filament system in all 
eukaryotes except terrestrial plants. Septins assemble into nonpolar filaments 
that form rings and cagelike structures, which act as scaffolds to compartmen- 
talize membranes into distinct domains, or recruit and organize the actin and 
microtubule cytoskeletons. First identified in budding yeast, septin filaments 
localize to the neck between a dividing yeast mother cell and its growing bud 
(Figure 16-73A). At this location, septins block the movement of proteins from 
one side of the bud neck to the other, thereby concentrating cell growth preferen- 
tially within the bud. Septins also recruit the actin-myosin machinery that forms 
the contractile ring required for cytokinesis. In animal cells, septins function in 
cell division, migration, and vesicle trafficking. In primary cilia, for example, a 
ring of septin filaments assembles at the base of the cilium and serves as a dif- 
fusion barrier at the plasma membrane, restricting the movement of membrane 
proteins and establishing a specific composition in the ciliary membrane (Figure 
16-73B and C). Reduction of septin levels impairs primary cilium formation and 
signaling. 

There are 7 septin genes in yeast and 13 in humans, and septin proteins fall
into four groups on the basis of sequence relationships. In a test tube, purified 
septins assemble into symmetrical hetero-hexamers or hetero-octamers that 
Figure 16-72 SUN-KASH protein
complexes connect the nucleus
and cytoplasm through the nuclear
envelope. The cytoplasmic cytoskeleton
is linked across the nuclear envelope
to the nuclear lamina or chromosomes
through SUN and KASH proteins (orange 
and purple, respectively). The SUN and 
KASH domains of these proteins bind 
within the lumen of the nuclear envelope. 
From the inner nuclear envelope, SUN 
proteins connect to the nuclear lamina 
or chromosomes. KASH proteins in the 
outer nuclear envelope connect to the 
cytoplasmic cytoskeleton by binding 
microtubule motor proteins, actin 
filaments, or plectin. 
form nonpolar paired filaments (Figure 16-74). GTP binding is required for the
folding of septin polypeptides, but the role of GTP hydrolysis in septin function is 
not understood. Septin structures assemble and disassemble inside cells, but they 
are not as dynamic as actin filaments and microtubules. 


Summary 


Whereas tubulin and actin have been highly conserved in evolution, intermediate 
filament proteins are very diverse. There are many tissue-specific forms of interme- 
diate filaments in the cytoplasm of animal cells, including keratin filaments in epi- 
thelial cells, neurofilaments in nerve cells, and desmin filaments in muscle cells. The 
primary function of these filaments is to provide mechanical strength. Septins com- 
prise an additional system of filaments that organize compartments inside cells. 
Figure 16-74 Septins polymerize to form paired filaments and sheets.
(A) Electron micrograph of a septin rod assembled by combining two copies
each of the four yeast septins illustrated at the right. The eight-subunit rod is
nonpolar because the central pair of subunits (Cdc10) creates a symmetrical 
dimer. (B) Electron micrograph of paired septin filaments and sheets, assembled 
from purified septins in the presence of high salt concentrations. (C) Paired septin 
filaments may assemble by lateral association between filaments, mediated by 
coiled-coils formed between the paired C-terminal extensions of Cdc3 and Cdc12 
that project from each filament. (Images and schematics adapted from A. Bertin 
et al., Proc. Natl Acad. Sci. USA 105:8274-8279, 2008. With permission from the 
National Academy of Sciences.) 
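
The link between a symmetrical central dimer and a nonpolar rod can be illustrated with a short check: a linear rod is nonpolar if it presents the same sequence of subunits when read from either end. The sketch below is a simplification under stated assumptions. The legend specifies only that two copies each of four yeast septins are used and that Cdc10 forms the central pair; the particular order shown (with Cdc11 at the ends) is the arrangement commonly reported for the yeast octamer and is an assumption here, as are the function and variable names.

```python
def is_nonpolar(rod):
    """Return True if the rod reads the same from both ends,
    i.e. its subunit order is unchanged by end-to-end reversal."""
    return list(rod) == list(reversed(rod))


# Assumed subunit order for the yeast septin octamer (not stated in the legend).
YEAST_OCTAMER = ["Cdc11", "Cdc12", "Cdc3", "Cdc10",
                 "Cdc10", "Cdc3", "Cdc12", "Cdc11"]

# A hypothetical half-rod, for contrast: its two ends differ, so it is polar.
HALF_ROD = YEAST_OCTAMER[:4]

print("octamer nonpolar:", is_nonpolar(YEAST_OCTAMER))
print("half-rod nonpolar:", is_nonpolar(HALF_ROD))
```

The same reasoning applies to the intermediate filament tetramer, whose two antiparallel dimers give it identical ends.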


Figure 16-73 Cell compartmentalization 
by septins. (A) Septins form filaments in 
the neck region between a mother yeast 
cell and bud. (B) In this photomicrograph 
of human cultured cells, the DNA is 
stained blue and septins are labeled in 
green. The microtubules of primary cilia are 
labeled with an antibody that recognizes a 
modified (acetylated) form of tubulin (red) 
that is enriched in the axoneme. (C) A 
magnified image reveals a collar of septin 
at the base of the cilium. (A, from B. Byers 
and L. Goetsch, J. Cell Biol. 69:717-721, 
1976. With permission from Rockefeller 
University Press. B and C, from Q. Hu et 
al., Science 329:436-439, 2010. With 
permission from AAAS.) 
CELL POLARIZATION AND MIGRATION


A central challenge in cell biology is to understand how multiple individual 
molecular components collaborate to produce complex cell behaviors. The pro- 
cess of cell migration, which we describe in this final section, relies on the coor- 
dinated deployment of the components and processes that we have explored in 
this chapter: the dynamic assembly and disassembly of cytoskeletal polymers, the 
regulation and modification of their structure by polymer-associated proteins, 
and the actions of motor proteins moving along the polymers or exerting tension 
against them. How does the cell coordinate all these activities to define its polarity 
and enable it to crawl? 


Many Cells Can Crawl Across a Solid Substratum 


Many cells move by crawling over surfaces rather than by using cilia or flagella 
to swim. Predatory amoebae crawl continuously in search of food, and they can 
easily be observed to attack and devour smaller ciliates and flagellates in a drop 
of pond water (see Movie 1.4). In animals, almost all cell locomotion occurs by 
crawling, with the notable exception of swimming sperm. During embryogene- 
sis, the structure of an animal is created by the migrations of individual cells to 
specific target locations and by the coordinated movements of whole epithelial 
sheets (discussed in Chapter 21). In vertebrates, neural crest cells are remarkable 
for their long-distance migrations from their site of origin in the neural tube to a 
variety of sites throughout the embryo (see Movie 21.5). Long-distance crawling 
is fundamental to the construction of the entire nervous system: it is in this way 
that the actin-rich growth cones at the advancing tips of developing axons travel 
to their eventual synaptic targets, guided by combinations of soluble signals and 
signals bound to cell surfaces and extracellular matrix along the way. 

The adult animal also seethes with crawling cells. Macrophages and neutro- 
phils crawl to sites of infection and engulf foreign invaders as a critical part of 
the innate immune response. Osteoclasts tunnel into bone, forming channels that 
are filled in by the osteoblasts that follow after them, in a continuous process of 
bone remodeling and renewal. Similarly, fibroblasts migrate through connective 
tissues, remodeling them where necessary and helping to rebuild damaged struc- 
tures at sites of injury. In an ordered procession, the cells in the epithelial lining 
of the intestine travel up the sides of the intestinal villi, replacing absorptive cells 
lost at the tip of the villus. Unfortunately, cell crawling also has a role in many 
cancers, when cells in a primary tumor invade neighboring tissues and crawl into 
blood vessels or lymph vessels and then emerge at other sites in the body to form 
metastases. 

Cell migration is a complex process that depends on the actin-rich cortex 
beneath the plasma membrane. Three distinct activities are involved: protrusion, 
in which the plasma membrane is pushed out at the front of the cell; attachment, 
in which the actin cytoskeleton connects across the plasma membrane to the sub- 
stratum; and traction, in which the bulk of the trailing cytoplasm is drawn forward 
(Figure 16-75). In some crawling cells, such as keratocytes from the fish epider- 
mis, these activities occur simultaneously, and the cells seem to glide forward 
smoothly without changing shape. In other cells, such as fibroblasts, these activi- 
ties are more independent, and the locomotion is jerky and irregular. 


Actin Polymerization Drives Plasma Membrane Protrusion 


The first step in locomotion, protrusion of a leading edge, frequently relies on 
forces generated by actin polymerization pushing the plasma membrane out- 
ward. Different cell types generate different types of protrusive structures, includ- 
ing filopodia (also known as microspikes) and lamellipodia. These are filled with 
dense cores of filamentous actin, which excludes membrane-enclosed organ- 
elles. The structures differ primarily in the way in which the actin is organized by 
actin-cross-linking proteins (see Figure 16-22). 
Filopodia, formed by migrating growth cones of neurons and some types of
fibroblasts, are essentially one-dimensional. They contain a core of long, bun- 
dled actin filaments, which are reminiscent of those in microvilli but longer and 
thinner, as well as more dynamic. Lamellipodia, formed by epithelial cells and 
fibroblasts, as well as by some neurons, are two-dimensional, sheetlike structures. 
They contain a cross-linked mesh of actin filaments, most of which lie in a plane 
parallel to the solid substratum. Invadopodia and related structures known as 
podosomes represent a third type of actin-rich protrusion. These extend in three 
dimensions and are important for cells to cross tissue barriers, as when a meta- 
static cancer cell invades the surrounding tissue. Invadopodia contain many of 
the same actin-regulatory components as filopodia and lamellipodia, and they 
also degrade the extracellular matrix, which requires the delivery of vesicles con- 
taining matrix-degrading proteases. 

A distinct form of membrane protrusion called blebbing is often observed in 
vivo or when cells are cultured on a pliable extracellular matrix substratum. Blebs 
form when the plasma membrane detaches locally from the underlying actin 
cortex, thereby allowing cytoplasmic flow to push the membrane outward (Fig- 
ure 16-76). Bleb formation also depends on hydrostatic pressure within the cell, 
which is generated by the contraction of actin and myosin assemblies. Once blebs 
have extended, actin filaments reassemble on the bleb membrane to form a new 
Figure 16-75 A model of how forces
generated in the actin-rich cortex move
a cell forward. The actin-polymerization-
dependent protrusion and firm attachment
of a lamellipodium at the leading edge
of the cell move the edge forward (green
arrows at front) and stretch the actin
cortex. Contraction at the rear of the
cell propels the body of the cell forward
(green arrow at back) to relax some of
the tension (traction). New focal contacts
are made at the front, and old ones are 
disassembled at the back as the cell crawls 
forward. The same cycle can be repeated, 
moving the cell forward in a stepwise 
fashion. Alternatively, all steps can be 
tightly coordinated, moving the cell forward 
smoothly. The newly polymerized cortical 
actin is shown in red. 


Figure 16-76 Membrane bleb induced
by disruption of the actin cortex. On
the left is a light micrograph showing a
spherical membrane protrusion or bleb 
induced by laser ablation of a small region 
of the actin cortex. The cortex is labeled 
green in the middle image by expression of 
GFP-actin. (Courtesy of Ewa Paluch.) 
actin cortex. Recruitment of myosin II and contraction of actin and myosin can
then power retraction of membrane blebs. Alternatively, extension of new blebs 
from old ones can drive cell migration. 


Lamellipodia Contain All of the Machinery Required for Cell Motility


Lamellipodia have been particularly well studied in the epithelial cells of the epi- 
dermis of fish and frogs; these epithelial cells are known as keratocytes because 
of their abundant keratin filaments. These cells normally cover the animal by 
forming an epithelial sheet, and they are specialized to close wounds very rapidly, 
moving at rates of up to 30 µm/min. When cultured as individual cells, keratocytes
assume a distinctive shape with a very large lamellipodium and a small, trailing 
cell body that is not attached to the substratum (Figure 16-77). Fragments of this 
lamellipodium can be sliced off with a micropipette. Although the fragments gen- 
erally lack microtubules and membrane-enclosed organelles, they continue to 
crawl normally, looking like tiny keratocytes. 

The dynamic behavior of actin filaments in keratocyte lamellipodia can be 
studied by labeling a small patch of actin and examining its fate. This reveals that, 
while the lamellipodia crawl forward, the actin filaments remain stationary with 
respect to the substratum. The actin filaments in the meshwork are mostly ori- 
ented with their plus ends facing forward. The minus ends are frequently attached 
to the sides of other actin filaments by Arp 2/3 complexes (see Figure 16-16), 
helping to form the two-dimensional web (Figure 16-78). The web as a whole is 
undergoing treadmilling, assembling at the front and disassembling at the back, 
reminiscent of the treadmilling that occurs in individual actin filaments discussed 
previously (see Figure 16-14). 


Figure 16-78 Actin filament nucleation and web formation by the
Arp 2/3 complex in lamellipodia. (A) A keratocyte with actin filaments
labeled in red by fluorescent phalloidin and the Arp 2/3 complex labeled in
green with an antibody against one of its subunits. The Arp 2/3 complex
is highly concentrated near the front of the lamellipodium, where actin
nucleation is most active. (B) Electron micrograph of a platinum-shadowed
replica of the leading edge of a keratocyte, showing the dense actin filament
meshwork. The labels denote areas enlarged in (C). (C) Close-up views of the
marked regions of the actin web at the leading edge shown in (B). Numerous
branched filaments can be seen, with the characteristic 70° angle formed
when the Arp 2/3 complex nucleates a new actin filament off the side of a
preexisting filament (see Figure 16-16). (From T. Svitkina and G. Borisy,
J. Cell Biol. 145:1009-1026, 1999. With permission from the authors.)
Figure 16-77 Migratory keratocytes from
a fish epidermis. (A) Light micrographs of
a keratocyte in culture, taken about
15 seconds apart. This cell is moving
at about 15 µm/min (Movie 16.13 and
see Movie 1.1). (B) Keratocyte seen by
scanning electron microscopy, showing
its broad, flat lamellipodium and small
cell body, including the nucleus, carried
up above the substratum at the rear.
(C) Distribution of cytoskeletal filaments in
this cell. Actin filaments (red) fill the large
lamellipodium and are responsible for
the cell’s rapid movement. Microtubules
(green) and intermediate filaments (blue)
are restricted to the regions close to the
nucleus. (A and B, courtesy of Juliet Lee.)
Maintenance of unidirectional motion by lamellipodia is thought to require
the cooperation and mechanical integration of several factors. Filament nucle- 
ation is localized at the leading edge, with new actin filament growth occurring 
primarily in that location to push the plasma membrane forward. Most filament 
depolymerization occurs at sites located well behind the leading edge. Because 
cofilin (see Figure 16-20) binds cooperatively and preferentially to actin filaments 
containing ADP-actin (the D form), the new T-form filaments generated at the 
leading edge should be resistant to depolymerization by cofilin (Figure 16-79). 
As the filaments age and ATP hydrolysis proceeds, cofilin can efficiently disas- 
semble the older filaments. Thus, the delayed ATP hydrolysis by filamentous 
actin is thought to provide the basis for a mechanism that maintains an efficient, 
unidirectional treadmilling process in the lamellipodium (Figure 16-80); it also 
explains the intracellular movement of bacterial pathogens such as Listeria (see 
Figure 16-25). 
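
The logic of this mechanism can be made concrete with a deliberately simple toy model, sketched below. It is not taken from the text: it assumes a one-dimensional network in which exactly one subunit is added at the leading edge per time step, subunits never move once incorporated, and a subunit is removed as soon as it is older than a fixed delay (standing in for ATP hydrolysis followed by cofilin action). Real hydrolysis and cofilin binding are stochastic, and all names and parameters are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Snapshot:
    step: int
    front: int             # position of the leading edge, substratum frame
    rear: int              # position of the oldest surviving subunit
    label: Optional[int]   # position of the labeled subunit, None if absent


def simulate_treadmilling(steps: int, hydrolysis_delay: int,
                          label_step: int) -> List[Snapshot]:
    """Toy 1D lamellipodium: one subunit is added at the front per step and
    never moves afterwards; a subunit is removed once it is older than
    hydrolysis_delay steps (a stand-in for cofilin acting on ADP-actin)."""
    network = []      # list of (position, birth_step), oldest first
    snapshots = []
    for t in range(steps):
        network.append((t, t))   # new T-form subunit at the current leading edge
        # Keep only subunits young enough to resist disassembly.
        network = [(x, born) for x, born in network if t - born <= hydrolysis_delay]
        # The labeled subunit is the one incorporated at label_step, if it survives.
        label = next((x for x, born in network if born == label_step), None)
        snapshots.append(Snapshot(t, front=t, rear=network[0][0], label=label))
    return snapshots


for snap in simulate_treadmilling(steps=10, hydrolysis_delay=4, label_step=3):
    print(snap)
```

In this sketch the front and rear both advance one position per step while the labeled subunit keeps a fixed position until it is disassembled, mirroring the observation that the network moves forward even though individual filaments stay stationary with respect to the substratum.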


Myosin Contraction and Cell Adhesion Allow Cells to Pull 
Themselves Forward 


Forces generated by actin filament polymerization at the front of a migrating cell 
are transmitted to the underlying substratum to drive cell motion. For the leading 
Figure 16-79 Cofilin in lamellipodia.
(A) A keratocyte with actin filaments
labeled in red by fluorescent phalloidin, and
cofilin labeled in green with a fluorescent
antibody. Although the dense actin
meshwork reaches all the way through the
lamellipodium, cofilin is not found at the
very leading edge. (B) Close-up view of
the region marked with the white rectangle
in (A). The actin filaments closest to the 
leading edge, which are also the ones that 
have formed most recently and that are 
most likely to contain ATP-actin (rather than 
ADP-actin), are generally not associated 
with cofilin. (From T. Svitkina and G. Borisy, 
J. Cell Biol. 145:1009-1026, 1999. With 
permission from the authors.) 


Figure 16-80 A model for protrusion
of the actin meshwork at the leading
edge. Two time points during advance of
the lamellipodium are illustrated, with newly
assembled structures at the later time
point shown in a lighter color. Nucleation
is mediated by the Arp 2/3 complex at
the front. Newly nucleated actin filaments
are attached to the sides of preexisting
filaments, primarily at a 70° angle.
Filaments elongate, pushing the plasma
membrane forward because of some
sort of anchorage of the array behind. At
a steady rate, actin filament plus ends
become capped. After newly polymerized 
actin subunits hydrolyze their bound ATP in 
the filament lattice, the filaments become 
susceptible to depolymerization by cofilin. 
This cycle causes a spatial separation 
between net filament assembly at the 
front and net filament disassembly at the 
rear, so that the actin filament network as 
a whole can move forward, even though 
the individual filaments within it remain 
stationary with respect to the substratum. 
Not all of the actin disassembles, however, 
and actin at the rear of the lamellipodium 
contributes to subsequent steps of 
migration together with myosin. 
Figure 16-81 Contribution of myosin II to polarized cell motility. 

(A) Myosin II bipolar filaments bind to actin filaments in the lamellipodial 
meshwork and cause network contraction. The myosin-driven reorientation 
of the actin filaments forms an actin bundle that recruits more myosin II 
and helps generate the contractile forces required for retraction of the 
trailing edge of the moving cell. (B) A fragment of the large lamellipodium 
of a keratocyte can be separated from the main cell body either by surgery 
with a micropipette or by treating the cell with certain drugs. Many of these 
fragments continue to move rapidly, with the same overall cytoskeletal 
organization as the intact keratocytes. Actin (blue) forms a protrusive 
meshwork at the front of the fragment. Myosin II (pink) is gathered into 
a band at the rear. (From A. Verkhovsky et al., Curr. Biol. 9:11-20, 1999. 
With permission from Elsevier.) 


edge of a migrating cell to advance, protrusion of the membrane must be followed 
by adhesion to the substratum at the front. Conversely, in order for the cell body 
to follow, contraction must be coupled with de-adhesion at the rear of the cell. 
The processes contributing to migration are therefore tightly regulated in space 
and time, with actin polymerization, dynamic adhesions, and myosin contraction 
being employed to coordinate movement. Myosin II operates in at least two ways 
to assist cell migration. The first is by helping to connect the actin cytoskeleton to 
the substratum through integrin-mediated adhesions. Forces generated by both 
actin polymerization and myosin activity create tension at attachment sites, pro- 
moting their maturation into focal adhesions, which are dynamic assemblies of 
structural and signaling proteins that link the migrating cell to the extracellular 
matrix (see Figure 19-59). A second mechanism involves bipolar myosin II fila- 
ments, which associate with the actin filaments at the rear of the lamellipodium 
and pull them into a new orientation—from nearly perpendicular to the leading 
edge to almost parallel to the leading edge. This sarcomere-like contraction pre- 
vents protrusion, and it pinches in the sides of the locomoting lamellipodium, 
helping to gather in the sides of the cell as it moves forward (Figure 16-81). 

Actin-mediated protrusions can only push the leading edge of the cell forward 
if there are strong interactions between the actin network and the focal adhesions 
that link the cell to the substrate. When these interactions are disengaged, polym- 
erization pressure at the leading edge and myosin-dependent contraction cause 
the actin network to slip back, resulting in a phenomenon known as retrograde 
flow (Figure 16-82). 

The traction forces generated by locomoting cells exert a significant pull on the 
substratum. By growing cells on a surface coated with tiny flexible posts, the force 
exerted on the substratum can be calculated by measuring the deflection of each 
post from its vertical position (Figure 16-83). In a living animal, most crawling 
cells move across a semiflexible substratum made of extracellular matrix, which 
can be deformed and rearranged by these cell forces. Conversely, mechanical ten- 
sion or stretching applied externally to a cell will cause it to assemble stress fibers 
and focal adhesions, and become more contractile. Although poorly understood, 
this two-way mechanical interaction between cells and their physical environ- 
ment is thought to help vertebrate tissues organize themselves. 
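
The post-deflection measurement described above can be turned into a force estimate with a short calculation. The sketch below is a minimal illustration, assuming that each post behaves as a uniform cylindrical cantilever under small deflections, so that force = (3EI/L^3) x deflection. Only the 6.1 μm post height comes from Figure 16-83; the stiffness (Young's modulus), post diameter, and deflection are hypothetical values chosen for illustration, and the function names are not from the original study.

```python
import math


def post_spring_constant(youngs_modulus_pa, diameter_m, height_m):
    """Bending stiffness (N/m) of a cylindrical post clamped at its base.

    Uses the small-deflection cantilever formula k = 3*E*I / L**3, where
    I = pi * d**4 / 64 is the second moment of area of a circular section.
    """
    second_moment = math.pi * diameter_m ** 4 / 64.0
    return 3.0 * youngs_modulus_pa * second_moment / height_m ** 3


def traction_force(deflection_m, youngs_modulus_pa, diameter_m, height_m):
    """Force (N) needed to displace the post tip sideways by deflection_m."""
    k = post_spring_constant(youngs_modulus_pa, diameter_m, height_m)
    return k * deflection_m


# Post height from Figure 16-83; every other value is an assumed placeholder.
POST_HEIGHT_M = 6.1e-6
ASSUMED_MODULUS_PA = 2.5e6     # hypothetical elastomer stiffness
ASSUMED_DIAMETER_M = 1.8e-6    # hypothetical post diameter
ASSUMED_DEFLECTION_M = 0.5e-6  # hypothetical measured tip displacement

force_n = traction_force(ASSUMED_DEFLECTION_M, ASSUMED_MODULUS_PA,
                         ASSUMED_DIAMETER_M, POST_HEIGHT_M)
print(f"Estimated traction force on one post: {force_n * 1e9:.1f} nN")
```

Because stiffness scales with the inverse cube of post height, taller posts of the same material are much softer, which is one way such an array could in principle be tuned to the forces a given cell type exerts.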


Cell Polarization Is Controlled by Members of the Rho Protein 
Family 


Cell migration requires long-distance communication and coordination between 
one end of a cell and the other. During directed migration, it is important that the 
front end of the cell remain structurally and functionally distinct from the back 
end. In addition to driving local mechanical processes such as protrusion at the 
front and retraction at the rear, the cytoskeleton is responsible for coordinating 
cell shape, organization, and mechanical properties from one end of the cell to 
the other, a distance that is typically tens of micrometers for animal cells. 

In many cases, including but not limited to cell migration, large-scale cyto- 
skeletal coordination takes the form of the establishment of cell polarity, where 
a cell builds different structures with distinct molecular components at the front 
versus the back, or at the top versus the bottom. Cell locomotion requires an 
initial polarization of the cell to set it off in a particular direction. Carefully controlled 
cell-polarization processes are also required for oriented cell divisions in tissues 
and for formation of a coherent, organized multicellular structure. Genetic stud- 
ies in yeast, flies, and worms have provided most of our current understanding 
of the molecular basis of cell polarity. The mechanisms that generate cell polar- 
ity in vertebrates are only beginning to be explored. In all known cases, however, 
the cytoskeleton has a central role, and many of the molecular components have 
been evolutionarily conserved. 


The establishment of many kinds of cell polarity depends on the local regula- 
tion of the actin cytoskeleton by external signals. Many of these signals seem to 
converge inside the cell on a group of closely related monomeric GTPases that 
are members of the Rho protein family—Cdc42, Rac, and Rho. Like other mono- 
meric GTPases, the Rho proteins act as molecular switches that cycle between 
an active GTP-bound state and an inactive GDP-bound state (see Figure 3-66). 
Activation of Cdc42 on the inner surface of the plasma membrane triggers actin 
polymerization and bundling to form filopodia. Activation of Rac promotes actin 
polymerization at the cell periphery, leading to the formation of sheetlike lamel- 
lipodial extensions. Activation of Rho promotes both the bundling of actin fila- 
ments with myosin II filaments into stress fibers and the clustering of integrins 
Figure 16-82 Control of cell-substratum 
adhesion at the leading 
edge of a migrating cell. (A) Actin 
monomers assemble on the barbed end 
of actin filaments at the leading edge. 
Transmembrane integrin proteins (blue) 
help form focal adhesions that link the 

cell membrane to the substrate. (B) If 
there is no interaction between the actin 
filaments and focal adhesions, the actin 
filament is driven rearward by newly 
assembled actin. Myosin motors (green) 
also contribute to filament movement. 

(C) Interactions between actin-binding 
adaptor proteins (brown) and integrins link 
the actin cytoskeleton to the substratum. 
Myosin-mediated contractile forces are 
then transmitted through the focal adhesion 
to generate traction on the extracellular 
matrix, and new actin polymerization drives 
the leading edge forward in a protrusion. 


Figure 16-83 Traction forces exerted 
by a motile cell. (A) Tiny flexible pillars 
attached to the substratum bend in 
response to traction forces. (B) Scanning 
electron micrograph of a cell on a 
substratum coated with pillars that are 
6.1 μm in height. Pillar deflections are used 
to calculate force vectors corresponding 
to inward pulling forces on the underlying 
substratum. (Adapted from J. Fu et al., 
Nat. Methods 7:733-736, 2010. With 
permission from Macmillan Publishers.) 
Figure 16-84 The dramatic effects 
of Cdc42, Rac, and Rho on actin 
organization in fibroblasts. In each case, 
the actin filaments have been labeled with 
fluorescent phalloidin. (A) Serum-starved 
fibroblasts have actin filaments primarily 

in the cortex, and relatively few stress 
fibers. (B) Microinjection of a constitutively 
activated form of Cdc42 causes the 
protrusion of many long filopodia at 

the cell periphery. (C) Microinjection of 

a constitutively activated form of Rac, 

a closely related monomeric GTPase, 
causes the formation of an enormous 
lamellipodium that extends from the 

entire circumference of the cell. 

(D) Microinjection of a constitutively 
activated form of Rho causes the rapid 
assembly of many prominent stress fibers. 
(From A. Hall, Science 279:509-514, 1998. 
With permission from AAAS.) 





and associated proteins to form focal adhesions (Figure 16-84). These dramatic 
and complex structural changes occur because each of these three molecular 
switches has numerous downstream target proteins that affect actin organization 
and dynamics. 

Some key targets of activated Cdc42 are members of the WASp protein fam- 
ily. Human patients deficient in WASp suffer from Wiskott-Aldrich Syndrome, a 
severe form of immunodeficiency in which immune system cells have abnormal 
actin-based motility and platelets do not form normally. Although WASp itself 
is expressed only in blood cells and immune system cells, other more ubiqui- 
tous versions enable activated Cdc42 to enhance actin polymerization in many 
cell types. WASp proteins can exist in an inactive folded conformation and an 
activated open conformation. Association with Cdc42-GTP stabilizes the open 
form of WASp, enabling it to bind to the Arp 2/3 complex and strongly enhance 
its actin-nucleating activity (see Figure 16-16). In this way, activation of Cdc42 
increases actin nucleation. 

Rac-GTP also activates WASp family members. Additionally, it activates the 
cross-linking activity of the gel-forming protein filamin and inhibits the contrac- 
tile activity of the motor protein myosin II. It thereby stabilizes lamellipodia and 
inhibits the formation of contractile stress fibers (Figure 16-85A). 

Rho-GTP has a very different set of targets. Instead of activating the Arp 2/3 
complex to build actin networks, Rho-GTP turns on formin proteins to construct 
parallel actin bundles. At the same time, Rho-GTP activates a protein kinase that 
indirectly inhibits the activity of cofilin, leading to actin filament stabilization. The 
same protein kinase inhibits a phosphatase acting on myosin light chains (see 
Figure 16-39). The consequent increase in the net amount of myosin light chain 
phosphorylation increases the amount of contractile myosin motor protein activ- 
ity in the cell, enhancing the formation of tension-dependent structures such as 
stress fibers (Figure 16-85B). 

In some cell types, Rac-GTP activates Rho, usually at a rate that is slow com- 
pared to Rac’s activation of the Arp 2/3 complex. This enables cells to use the Rac 
pathway to build a new actin structure while subsequently activating the Rho 
pathway to generate a contractility that builds up tension in this structure. This 
occurs, for example, during the formation and maturation of cell-cell contacts. 
As we will explore in more detail below, the communication between the Rac and 
Rho pathways also facilitates maintenance of the large-scale differences between 
the cell front and the cell rear during migration. 


Extracellular Signals Can Activate the Three Rho Protein Family 
Members 


The activation of the monomeric GTPases Rho, Rac, and Cdc42 occurs through an 
exchange of GTP for a tightly bound GDP molecule, catalyzed by guanine nucle- 
otide exchange factors (GEFs). Of the many GEFs that have been identified in the 
human genome, some are specific for an individual Rho family GTPase, whereas 
others seem to act on multiple family members. Different GEFs are restricted to 
specific tissues and even specific subcellular locations, and they are sensitive to 
distinct kinds of regulatory inputs. GEFs can be activated by extracellular cues 
through cell-surface receptors, or in response to intracellular signals. GEFs may 
also act as scaffolds that direct GTPases to downstream effectors. Interestingly, 
several of the Rho family GEFs associate with the growing ends of microtubules 
by binding to one of the +TIPs. This provides a connection between the dynamics 
of the microtubule cytoskeleton and the large-scale organization of the actin cyto- 
skeleton; such a connection is important for the overall integration of cell shape 
and movement. 


External Signals Can Dictate the Direction of Cell Migration 


Chemotaxis is the movement of a cell toward or away from a source of some dif- 
fusible chemical. These external signals act through Rho family proteins to set up 
large-scale cell polarity by influencing the organization of the cell motility appa- 
ratus. One well-studied example is the chemotactic movement of a class of white 
blood cells, called neutrophils, toward a source of bacterial infection. Receptor 
proteins on the surface of neutrophils enable them to detect very low concen- 
trations of N-formylated peptides that are derived from bacterial proteins (only 
prokaryotes begin protein synthesis with N-formylmethionine). Using these 
receptors, neutrophils are guided to bacterial targets by their ability to detect a 
difference of only 1% in the concentration of these diffusible peptides on one side 
of the cell versus the other (Figure 16-86A). 

In this case, and in the chemotaxis of Dictyostelium amoebae toward a source 
of cyclic AMP, binding of the chemoattractant to its G-protein-coupled receptor 
activates phosphoinositide 3-kinases (PI3Ks) (see Figure 15-52), which gener- 
ate a signaling molecule [PI(3,4,5)P3] that in turn activates the Rac GTPase. Rac 


Figure 16-85 The contrasting effects 

of Rac and Rho activation on actin 
organization. (A) Activation of the small 
GTPase Rac leads to alterations in actin 
accessory proteins that tend to favor 

the formation of actin networks, as in 
lamellipodia. Several different pathways 
contribute independently. Rac-GTP 
activates members of the WASp protein 
family, which in turn activate actin 
nucleation and branched web formation by 
the Arp 2/3 complex. In a parallel pathway, 
Rac-GTP activates a protein kinase, PAK, 
which has several targets including the 
web-forming cross-linker filamin, which 

is activated by phosphorylation, and the 
myosin light chain kinase (MLCK), which is 
inhibited by phosphorylation. Inhibition of 
MLCK results in decreased phosphorylation 
of the myosin regulatory light chain and 
leads to myosin II filament disassembly and 
a decrease in contractile activity. In some 
cells, PAK also directly inhibits myosin II 
activity by phosphorylation of the myosin 
heavy chain (MHC). (B) Activation of the 
related GTPase Rho leads to nucleation of 
actin filaments by formins and increases 
contraction by myosin II, promoting the 
formation of contractile actin bundles 
such as stress fibers. Activation of myosin 
II by Rho requires a Rho-dependent 
protein kinase called Rock. This kinase 
inhibits the phosphatase that removes the 
activating phosphate groups from myosin 
II light chains (MLC); it may also directly 
phosphorylate the myosin light chains in 
some cell types. Rock also activates other 
protein kinases, such as LIM kinase, which 
in turn contributes to the formation of 
stable contractile actin filament bundles by 
inhibiting the actin depolymerizing factor 
cofilin. A similar signaling pathway 
is important for forming the contractile 
ring necessary for cytokinesis (see 
Figure 17-44). 
then activates the Arp 2/3 complex, leading to lamellipodial protrusion. Through 
an unknown mechanism, accumulation of the polarized actin web at the lead- 
ing edge causes further local enhancement of PI3K activity in a positive feedback 
loop, strengthening the induction of protrusion. The PI(3,4,5)P3 that activates Rac 
cannot diffuse far from its site of synthesis, since it is rapidly converted back into 
PI(4,5)P2 by a constitutively active lipid phosphatase. At the same time, binding 
of the chemoattractant ligand to its receptor activates another signaling pathway 
that turns on Rho and enhances myosin-based contractility. The two processes 
directly inhibit each other, such that Rac activation dominates in the front of the 
cell and Rho activation dominates in the rear (Figure 16-86B). This enables the 
cell to maintain its functional polarity with protrusion at the leading edge and 
contraction at the back. 
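
The consequence of this mutual inhibition can be illustrated with a toy calculation. The sketch below is not a model taken from the text or from any published study: it assumes that Rac and Rho activities each decay at a fixed rate, that each is produced in proportion to a local input, and that each input is damped by the other GTPase through an arbitrary Hill-type term. All parameter values and names are hypothetical. Under these assumptions, a modest bias in input is enough to drive one region of the cell into a Rac-dominated state and another into a Rho-dominated state.

```python
def simulate_rac_rho(rac_input, rho_input, steps=20000, dt=0.001, hill=4):
    """Integrate a toy mutual-inhibition model with the forward Euler method.

    d(rac)/dt = rac_input / (1 + rho**hill) - rac
    d(rho)/dt = rho_input / (1 + rac**hill) - rho

    Units are arbitrary; the functional form is an illustrative assumption.
    """
    rac, rho = 0.1, 0.1  # start both activities at the same low level
    for _ in range(steps):
        d_rac = rac_input / (1.0 + rho ** hill) - rac
        d_rho = rho_input / (1.0 + rac ** hill) - rho
        rac += dt * d_rac
        rho += dt * d_rho
    return rac, rho


# Hypothetical inputs: the Rac-activating signal is slightly stronger at the
# front of the cell, and the Rho-activating signal slightly stronger at the rear.
front_rac, front_rho = simulate_rac_rho(rac_input=2.0, rho_input=1.5)
rear_rac, rear_rho = simulate_rac_rho(rac_input=1.5, rho_input=2.0)

print(f"front: Rac = {front_rac:.2f}, Rho = {front_rho:.2f}")
print(f"rear:  Rac = {rear_rac:.2f}, Rho = {rear_rho:.2f}")
```

In this sketch the difference between the final Rac and Rho levels is far larger than the difference between the inputs, which is the qualitative behavior one would expect from two pathways that suppress each other.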

Nondiffusible chemical cues attached to the extracellular matrix or to the sur- 
face of cells can also influence the direction of cell migration. When these sig- 
nals activate receptors, they can cause increased cell adhesion and directed actin 
polymerization. Most long-distance cell migrations in animals, including neural- 
crest-cell migration and the travels of neuronal growth cones, depend on a com- 
bination of diffusible and nondiffusible signals to steer the locomoting cells or 
growth cones to their proper destinations. 


Communication Among Cytoskeletal Elements Coordinates 
Whole-Cell Polarization and Locomotion 


The interconnected cytoskeleton is crucial for cell migration. Although movement 
is driven primarily by actin polymerization and myosin contractility, septins and 
intermediate filaments also participate. For example, vimentin intermediate fil- 
ament networks associate with integrins at focal adhesions, and vimentin-defi- 
cient fibroblasts display impaired mechanical stability, migration, and contractile 
capacity. Furthermore, disruption of linker proteins that connect different cyto- 
skeletal elements, including several plakins and KASH proteins, leads to defects 
in cell polarization and migration. Thus, interactions among cytoplasmic filament 
systems, as well as mechanical linkage to the nucleus, are required for complex, 
whole-cell behaviors such as migration. 

Cells also use microtubules to help organize persistent movement in a spe- 
cific direction. In many locomoting cells, the position of the centrosome is influ- 
enced by the location of protrusive actin polymerization. Activation of receptors 
on the protruding front edge of a cell might locally activate dynein motor proteins 
that move the centrosome by pulling on its microtubules. Several effector pro- 
teins downstream of Rac and Rho modulate microtubule dynamics directly: for 
example, a protein kinase activated by Rac can phosphorylate (and thereby 
inhibit) the tubulin-binding protein stathmin (see Panel 16-4), thereby stabilizing 
microtubules. 
Figure 16-86 Neutrophil polarization 
and chemotaxis. (A) The pipette tip at 
the right is leaking a small amount of 

the bacterial peptide formyl-Met-Leu- 
Phe, which is recognized by the human 
neutrophil as the product of a foreign 
invader. The neutrophil quickly extends a 
new lamellipodium toward the source of 
the chemoattractant peptide (top). It then 
extends this lamellipodium and polarizes 
its cytoskeleton so that contractile myosin 
II is located primarily at the rear, opposite 
the position of the lamellipodium (middle). 
Finally, the cell crawls toward the source 
of the peptide (bottom). If a real bacterium 
were the source of the peptide, rather than 
an investigator’s pipette, the neutrophil 
would engulf the bacterium and destroy 

it (see also Figure 16-3 and Movie 
16.14). (B) Binding of bacterial molecules 
to G-protein-coupled receptors on the 
neutrophil stimulates directed motility. 
These receptors are found all over the 
surface of the cell, but are more likely to be 
bound to the bacterial ligand at the front. 
Two distinct signaling pathways contribute 
to the cell’s polarization. At the front of the 
cell, stimulation of the Rac pathway leads, 
via the trimeric G protein Gi, to growth 

of protrusive actin networks. Second 
messengers within this pathway are short- 
lived, so protrusion is limited to the region 
of the cell closest to the stimulant. The 
same receptor also stimulates a second 
signaling pathway, via the trimeric G 
proteins G12 and G13, that triggers the 
activation of Rho. The two pathways are 
mutually antagonistic. Since Rac-based 
protrusion is active at the front of the cell, 
Rho is activated only at the rear of the cell, 
stimulating contraction of the cell rear and 
assisting directed movement. (A, from 
O.D. Weiner et al., Nat. Cell Biol. 1:75-81, 
1999. With permission from Macmillan 
Publishers Ltd.) 
In turn, microtubules influence actin rearrangements and cell adhesion. The 
centrosome nucleates a large number of dynamic microtubules, and its reposi- 
tioning means that the plus ends of many of these microtubules extend into the 
protrusive region of the cell. Direct interactions with microtubules help guide focal 
adhesion dynamics in migrating cells. Microtubules might also influence actin fil- 
ament formation by delivering Rac-GEFs that bind to the +TIPs traveling on grow- 
ing microtubule ends. Microtubules also transport cargoes to and from the focal 
adhesions, thereby affecting their signaling and disassembly. Thus, microtubules 
reinforce the polarity information that the actin cytoskeleton receives from the 
outside world, allowing a sensitive response to weak signals and enabling motility 
to persist in the same direction for a prolonged period. 


Summary 


Whole-cell movements and the large-scale shaping and structuring of cells require 
the coordinated activities of all three basic filament systems along with a large vari- 
ety of cytoskeletal accessory proteins, including motor proteins. Cell crawling—a 
widespread behavior important in embryonic development and also in wound 
healing, tissue maintenance, and immune system function in the adult animal— 
is a prime example of such complex, coordinated cytoskeletal action. For a cell to 
crawl, it must generate and maintain an overall structural polarity, which is influ- 
enced by external cues. In addition, the cell must coordinate protrusion at the lead- 
ing edge (by assembly of new actin filaments), adhesion of the newly protruded part 
of the cell to the substratum, and forces generated by molecular motors to bring the 
cell body forward. 


PROBLEMS 


Which statements are true? Explain why or why not. 


16-1 The role of ATP hydrolysis in actin polymerization 
is similar to the role of GTP hydrolysis in tubulin polymerization: 
both serve to weaken the bonds in the polymer 
and thereby promote depolymerization. 


16-2 Motor neurons trigger action potentials in muscle 
cell membranes that open voltage-sensitive Ca²⁺ channels 
in T tubules, allowing extracellular Ca²⁺ to enter the cytosol, 
bind to troponin C, and initiate rapid muscle contraction. 


WHAT WE DON’T KNOW 


• How is the cell cortex regulated 
locally and globally to coordinate its 
activities at different places on the 
cell surface? What determines, for 
example, where filopodia form? 


• How are actin-regulatory proteins 
controlled spatially in the cytoplasm to 
generate multiple distinct types of actin 
arrays in the same cell? 


• Are there biologically important 
processes occurring inside a 
microtubule? 


• How can we account for the fact that 
there are many different kinesins and 
myosins in the cytoplasm but only one 
dynein? 


• Mutations in the nuclear lamin 
proteins cause a large number of 
diseases called laminopathies. What do 
we not understand about the nuclear 
lamina that could account for this fact? 




16-3 In most animal cells, minus-end directed microtu- 
bule motors deliver their cargo to the periphery of the cell, 
whereas plus-end directed microtubule motors deliver 
their cargo to the interior of the cell. 


Figure Q16-1 Tension as a function of sarcomere length during 
isometric contraction (Problem 16-5). 


Discuss the following problems. 


16-4 The concentration of actin in cells is 50-100 times 
greater than the critical concentration observed for pure 
actin in a test tube. How is this possible? What prevents the 
actin subunits in cells from polymerizing into filaments? 
Why is it advantageous to the cell to maintain such a large 
pool of actin subunits? 


16-5 Detailed measurements of sarcomere length and 
tension during isometric contraction in striated muscle 
provided crucial early support for the sliding-filament 
model of muscle contraction. Based on your understanding 
of the sliding-filament model and the structure of a 
sarcomere, propose a molecular explanation for the relationship 
of tension to sarcomere length in the portions of 
Figure Q16-1 marked I, II, III, and IV. (In this muscle, the 
length of the myosin filament is 1.6 μm, and the lengths of 
the actin thin filaments that project from the Z discs are 1.0 μm.) 


16-6 At 1.4 mg/mL pure tubulin, microtubules grow at 
a rate of about 2 μm/min. At this growth rate, how many 
αβ-tubulin dimers (8 nm in length) are added to the ends 
of a microtubule each second? 
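
One way to set up the arithmetic for Problem 16-6 is sketched below. The growth rate and dimer length come from the problem; the number of protofilaments per microtubule (13) is not stated in the problem and is an assumption here, as is the simplification that all growth occurs by adding one dimer layer at a time across every protofilament.

```python
GROWTH_RATE_UM_PER_MIN = 2.0   # from the problem statement
DIMER_LENGTH_NM = 8.0          # from the problem statement
ASSUMED_PROTOFILAMENTS = 13    # assumption: not given in the problem


def dimers_added_per_second(growth_um_per_min, dimer_length_nm, protofilaments):
    """Convert a linear growth rate into a dimer addition rate."""
    growth_nm_per_sec = growth_um_per_min * 1000.0 / 60.0
    layers_per_sec = growth_nm_per_sec / dimer_length_nm
    return layers_per_sec * protofilaments


rate = dimers_added_per_second(GROWTH_RATE_UM_PER_MIN, DIMER_LENGTH_NM,
                               ASSUMED_PROTOFILAMENTS)
print(f"About {rate:.0f} dimers per second")
```

With a different protofilament count the answer scales proportionally, so the assumption should be checked against the microtubule structure described earlier in the chapter.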


CHAPTER 16 END-OF-CHAPTER PROBLEMS 


Figure Q16-2 Model for microtubule nucleation by pure αβ-tubulin 
dimers (Problem 16-7). (Panel labels: LINEAR GROWTH, LATERAL ASSOCIATION.) 


16-7 A solution of pure αβ-tubulin dimers is thought to 
nucleate microtubules by forming a linear protofilament 
about seven dimers in length. At that point, the probabilities 
that the next αβ-dimer will bind laterally or to the end 
of the protofilament are about equal. The critical event 
for microtubule formation is thought to be the first lateral 
association (Figure Q16-2). How does lateral association 
promote the subsequent rapid formation of a microtu- 
bule? 


16-8 How does a centrosome “know” when it has found 
the center of the cell? 


16-9 The movements of single motor-protein molecules 
can be analyzed directly. Using polarized laser light, it is 
possible to create interference patterns that exert a cen- 
trally directed force, ranging from zero at the center to a 
few piconewtons at the periphery (about 200 nm from the 
center). Individual molecules that enter the interference 
pattern are rapidly pushed to the center, allowing them to 
be captured and moved at the experimenter’s discretion. 

Using such “optical tweezers,” single kinesin molecules 
can be positioned on a microtubule that is fixed to 
a coverslip. Although a single kinesin molecule cannot 
be seen optically, it can be tagged with a silica bead and 
tracked indirectly by following the bead (Figure Q16-3A). 
In the absence of ATP, the kinesin molecule remains at the 
center of the interference pattern, but with ATP it moves 
toward the plus end of the microtubule. As kinesin moves 
along the microtubule, it encounters the force of the inter- 
ference pattern, which simulates the load kinesin carries 
during its actual function in the cell. Moreover, the pres- 
sure against the silica bead counters the effects of Brown- 
ian (thermal) motion, so that the position of the bead more 
accurately reflects the position of the kinesin molecule on 
the microtubule. 

A trace of the movements of a kinesin molecule 
along a microtubule is shown in Figure Q16-3B. 


A. As shown in Figure Q16-3B, all movement of kine- 
sin is in one direction (toward the plus end of the micro- 
tubule). What supplies the free energy needed to ensure a 
unidirectional movement along the microtubule? 
B. What is the average rate of movement of kinesin 
along the microtubule? 


C. What is the length of each step that a kinesin takes 
as it moves along a microtubule? 


D. From other studies it is known that kinesin has two 
globular domains that can each bind to β-tubulin, and that 
kinesin moves along a single protofilament in a microtubule. 
In each protofilament, the β-tubulin subunit repeats 
at 8-nm intervals. Given the step length and the interval 
between β-tubulin subunits, how do you suppose a kinesin 
molecule moves along a microtubule? 


E. Is there anything in the data in Figure Q16-3B that 
tells you how many ATP molecules are hydrolyzed per 
step? 


16-10 A mitochondrion 1 μm long can travel the 1 meter 
length of the axon from the spinal cord to the big toe in a 
day. The Olympic men’s freestyle swimming record for 200 
meters is 1.75 minutes. In terms of body lengths per day, 
who is moving faster: the mitochondrion or the Olympic 
record holder? (Assume that the swimmer is 2 meters tall.) 
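
The comparison in Problem 16-10 comes down to a unit conversion, which can be sketched as follows. All numbers are taken from the problem statement; the function name is chosen here for illustration, and the swimmer is assumed to hold the record pace for a full day.

```python
MINUTES_PER_DAY = 24 * 60


def body_lengths_per_day(distance_m, time_min, body_length_m):
    """Speed expressed as the number of body lengths covered in one day."""
    meters_per_day = distance_m / time_min * MINUTES_PER_DAY
    return meters_per_day / body_length_m


# Mitochondrion: 1 m in one day, body length 1 micrometer.
mitochondrion = body_lengths_per_day(1.0, MINUTES_PER_DAY, 1e-6)

# Swimmer: 200 m in 1.75 min, body length 2 m.
swimmer = body_lengths_per_day(200.0, 1.75, 2.0)

print(f"mitochondrion: {mitochondrion:.3g} body lengths per day")
print(f"swimmer:       {swimmer:.3g} body lengths per day")
print(f"ratio:         {mitochondrion / swimmer:.1f}")
```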


16-11 Cofilin preferentially binds to older actin filaments 
and promotes their disassembly. How does cofilin distin- 
guish old filaments from new ones? 


16-12 Why is it that intermediate filaments have identical 
ends and lack polarity, whereas actin filaments and 
microtubules have two distinct ends with a defined polarity? 


16-13 How is the unidirectional motion of a lamellipo- 
dium maintained? 


Figure Q16-3 Movement of kinesin along a microtubule (Problem 
16-9). (A) Experimental set-up, with kinesin linked to a silica bead, 
moving along a microtubule. (B) Position of kinesin (as visualized by 
the position of the silica bead) relative to the center of the interference 
pattern, as a function of time of movement along the microtubule. 
The jagged nature of the trace results from Brownian motion of 
the bead. 
The Cell Cycle 


The only way to make a new cell is to duplicate a cell that already exists. This sim- 
ple fact, first established in the middle of the nineteenth century, carries with it a 
profound message for the continuity of life. All living organisms, from the unicel- 
lular bacterium to the multicellular mammal, are products of repeated rounds of 
cell growth and division extending back in time to the beginnings of life on Earth 
over three billion years ago. 

A cell reproduces by performing an orderly sequence of events in which it 
duplicates its contents and then divides in two. This cycle of duplication and divi- 
sion, known as the cell cycle, is the essential mechanism by which all living things 
reproduce. In unicellular species, such as bacteria and yeasts, each cell division 
produces a complete new organism. In multicellular species, long and complex 
sequences of cell divisions are required to produce a functioning organism. Even 
in the adult body, cell division is usually needed to replace cells that die. In fact, 
each of us must manufacture many millions of cells every second simply to sur- 
vive: if all cell division were stopped—by exposure to a very large dose of x-rays, 
for example—we would die within a few days. 

The details of the cell cycle vary from organism to organism and at different 
times in an organism’s life. Certain characteristics, however, are universal. At a 
minimum, the cell must accomplish its most fundamental task: the passing on 
of its genetic information to the next generation of cells. To produce two geneti- 
cally identical daughter cells, the DNA in each chromosome must first be faith- 
fully replicated to produce two complete copies. The replicated chromosomes 
must then be accurately distributed (segregated) to the two daughter cells, so that 
each receives a copy of the entire genome (Figure 17-1). In addition to duplicat- 
ing their genome, most cells also duplicate their other organelles and macromole- 
cules; otherwise, daughter cells would get smaller with each division. To maintain 
their size, dividing cells must coordinate their growth (that is, their increase in cell 
mass) with their division. 

This chapter describes the events of the cell cycle and how they are controlled 
and coordinated. We begin with a brief overview of the cell cycle. We then describe 
the cell-cycle control system, a complex network of regulatory proteins that triggers 
the different events of the cycle. We next consider in detail the major stages of the 
cell cycle, in which the chromosomes are duplicated and then segregated into the 
two daughter cells. Finally, we consider how extracellular signals govern the rates 
of cell growth and division and how these two processes are coordinated. 


OVERVIEW OF THE CELL CYCLE 


The most basic function of the cell cycle is to duplicate the vast amount of DNA 
in the chromosomes and then segregate the copies into two genetically identical 
daughter cells. These processes define the two major phases of the cell cycle. Chro- 
mosome duplication occurs during S phase (S for DNA synthesis), which requires 
10-12 hours and occupies about half of the cell-cycle time in a typical mammalian 
cell. After S phase, chromosome segregation and cell division occur in M phase (M 
for mitosis), which requires much less time (less than an hour in a mammalian 
cell). M phase comprises two major events: nuclear division, or mitosis, during
which the copied chromosomes are distributed into a pair of daughter nuclei;
and cytoplasmic division, or cytokinesis, when the cell itself divides in two (Figure 
17-2). 

At the end of S phase, the DNA molecules in each pair of duplicated chromo- 
somes are intertwined and held tightly together by specialized protein linkages. 
Early in mitosis at a stage called prophase, the two DNA molecules are gradu- 
ally disentangled and condensed into pairs of rigid, compact rods called sister 
chromatids, which remain linked by sister-chromatid cohesion. When the nuclear 
envelope disassembles later in mitosis, the sister-chromatid pairs become 
attached to the mitotic spindle, a giant bipolar array of microtubules (discussed 
in Chapter 16). Sister chromatids are attached to opposite poles of the spindle 
and, eventually, align at the spindle equator in a stage called metaphase. The 
destruction of sister-chromatid cohesion at the start of anaphase separates the 
sister chromatids, which are pulled to opposite poles of the spindle. The spindle is 
then disassembled, and the segregated chromosomes are packaged into separate 
nuclei at telophase. Cytokinesis then cleaves the cell in two, so that each daughter 
cell inherits one of the two nuclei (Figure 17-3). 


The Eukaryotic Cell Cycle Usually Consists of Four Phases 


Most cells require much more time to grow and double their mass of proteins and 
organelles than they require to duplicate their chromosomes and divide. Partly to 
allow time for growth, most cell cycles have gap phases: a G1 phase between M
phase and S phase and a G2 phase between S phase and mitosis. Thus, the eukaryotic
cell cycle is traditionally divided into four sequential phases: G1, S, G2, and M.
G1, S, and G2 together are called interphase (Figure 17-4, and see Figure 17-3). In 
a typical human cell proliferating in culture, interphase might occupy 23 hours of 
a 24-hour cycle, with 1 hour for M phase. Cell growth occurs throughout the cell 
cycle, except during mitosis. 

The two gap phases are more than simple time delays to allow cell growth. They
also provide time for the cell to monitor the internal and external environment
to ensure that conditions are suitable and preparations are complete before the
cell commits itself to the major upheavals of S phase and mitosis. The G1 phase
is especially important in this respect. Its length can vary greatly depending on
external conditions and extracellular signals from other cells. If extracellular
conditions are unfavorable, for example, cells delay progress through G1 and may
even enter a specialized resting state known as G0 (G zero), in which they can
remain for days, weeks, or even years before resuming proliferation. Indeed, many
cells remain permanently in G0 until they or the organism dies. If extracellular
conditions are favorable and signals to grow and divide are present, cells in early
G1 or G0 progress through a commitment point near the end of G1 known as Start (in
yeasts) or the restriction point (in mammalian cells). We will use the term Start
for both yeast and animal cells. After passing this point, cells are committed to
DNA replication, even if the extracellular signals that stimulate cell growth and
division are removed.


Figure 17-1 The cell cycle. The division
of a hypothetical eukaryotic cell with two
chromosomes (one red, and one black)
is shown to illustrate how two genetically
identical daughter cells are produced in
each cycle. Each of the daughter cells will
often continue to divide by going through
additional cell cycles.


Figure 17-2 The major events of the cell cycle. The major chromosomal
events of the cell cycle occur in S phase, when the chromosomes are
duplicated, and M phase, when the duplicated chromosomes are segregated
into a pair of daughter nuclei (in mitosis), after which the cell itself divides into
two (cytokinesis).


Figure 17-3 The events of eukaryotic cell division as seen under a microscope. The easily visible processes of nuclear
division (mitosis) and cell division (cytokinesis), collectively called M phase, typically occupy only a small fraction of the cell cycle.
The other, much longer, part of the cycle is known as interphase, which includes S phase and the gap phases (discussed in
text). The five stages of mitosis are shown: an abrupt change in the biochemical state of the cell occurs at the transition from
metaphase to anaphase. A cell can pause in metaphase before this transition point, but once it passes this point, the cell carries
on to the end of mitosis and through cytokinesis into interphase.


Cell-Cycle Control Is Similar in All Eukaryotes 


Some features of the cell cycle, including the time required to complete certain
events, vary greatly from one cell type to another, even in the same organism. The
basic organization of the cycle, however, is essentially the same in all eukaryotic
cells, and all eukaryotes appear to use similar machinery and control mechanisms
to drive and regulate cell-cycle events. The proteins of the cell-cycle control
system, for example, first appeared over a billion years ago. Remarkably, they have
been so well conserved over the course of evolution that many of them function
perfectly when transferred from a human cell to a yeast cell. We can therefore
study the cell cycle and its regulation in a variety of organisms and use the findings
from all of them to assemble a unified picture of how eukaryotic cells divide.

Several model organisms are used in the analysis of the eukaryotic cell cycle.
The budding yeast Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces
pombe are simple eukaryotes in which powerful molecular and genetic
approaches can be used to identify and characterize the genes and proteins that
govern the fundamental features of cell division. The early embryos of certain
animals, particularly those of the frog Xenopus laevis, are excellent tools for
biochemical dissection of cell-cycle control mechanisms, while the fruit fly Drosophila
melanogaster is useful for the genetic analysis of mechanisms underlying the
control and coordination of cell growth and division in multicellular organisms.
Cultured human cells provide an excellent system for the molecular and microscopic
exploration of the complex processes by which our own cells divide.


Figure 17-4 The four phases of the cell
cycle. In most cells, gap phases separate
the major events of S phase and M phase.
G1 is the gap between M phase and
S phase, while G2 is the gap between
S phase and M phase.


Figure 17-5 Mammalian cells proliferating in culture. The cells in this
scanning electron micrograph are rat fibroblasts. Cells at the lower left have
rounded up and are in mitosis. (Courtesy of Guenter Albrecht-Buehler.)


Cell-Cycle Progression Can Be Studied in Various Ways 


How can we tell what stage a cell has reached in the cell cycle? One way is simply 
to look at living cells with a microscope. A glance at a population of mammalian 
cells proliferating in culture reveals that a fraction of the cells have rounded up 
and are in mitosis (Figure 17-5). Others can be observed in the process of cyto- 
kinesis. Similarly, looking at budding yeast cells under a microscope is very use- 
ful, because the size of the bud provides an indication of cell-cycle stage (Figure 
17-6). We can gain additional clues about cell-cycle position by staining cells with 
DNA-binding fluorescent dyes (which reveal the condensation of chromosomes 
in mitosis) or with antibodies that recognize specific cell components such as the 
microtubules (revealing the mitotic spindle). S-phase cells can be identified in 
the microscope by supplying them with visualizable molecules that are incorpo- 
rated into newly synthesized DNA, such as the artificial thymidine analog bromo- 
deoxyuridine (BrdU); cell nuclei that have incorporated BrdU are then revealed 
by staining with anti-BrdU antibodies (Figure 17-7). 

Typically, in a population of cultured mammalian cells that are all proliferat- 
ing rapidly but asynchronously, about 30-40% will be in S phase at any instant and 
become labeled by a brief pulse of BrdU. From the proportion of cells in such a 
population that are labeled, we can estimate the duration of S phase as a fraction 
of the whole cell-cycle duration. Similarly, from the proportion of cells in mitosis 
(the mitotic index), we can estimate the duration of M phase. 
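
The estimate described above can be sketched in a few lines of Python. This is a minimal illustration, not a published method: it assumes that every cell in the population is cycling, that the population is fully asynchronous, and that the fraction of cells found in a phase is simply proportional to the time spent in that phase. The function name and the example numbers (a 24-hour cycle, 35% BrdU-labeled cells, a mitotic index of 1/24) are hypothetical.

```python
def estimate_phase_duration(fraction_in_phase: float, cycle_hours: float) -> float:
    """Estimate the length of a phase from the fraction of cells found in it.

    Assumption: the fraction of an asynchronous population observed in a
    phase equals the fraction of the cycle time occupied by that phase.
    """
    if not 0.0 <= fraction_in_phase <= 1.0:
        raise ValueError("fraction_in_phase must be between 0 and 1")
    if cycle_hours <= 0:
        raise ValueError("cycle_hours must be positive")
    return fraction_in_phase * cycle_hours


# Hypothetical measurements for a culture with a 24-hour cell cycle.
brdu_labeled_fraction = 0.35   # cells labeled by a brief BrdU pulse
mitotic_index = 1 / 24         # fraction of cells seen in mitosis

s_phase_hours = estimate_phase_duration(brdu_labeled_fraction, 24.0)
m_phase_hours = estimate_phase_duration(mitotic_index, 24.0)
print(f"S phase: about {s_phase_hours:.1f} h; M phase: about {m_phase_hours:.1f} h")
```

A more careful treatment would correct for the fact that a growing population contains more young cells than old ones, so the simple proportion is best read as a first approximation.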

Another way to assess the stage that a cell has reached in the cell cycle is by 
measuring its DNA content, which doubles during S phase. This approach is 
greatly facilitated by the use of fluorescent DNA-binding dyes and a flow cytometer,
which allows the rapid and automatic analysis of large numbers of cells (Figure
17-8). We can use flow cytometry to determine the lengths of G1, S, and G2 + M
phases, by measuring DNA content in a synchronized cell population as it
progresses through the cell cycle. 
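
The logic of sorting cells by DNA content can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not the analysis performed by any particular flow cytometer: DNA content is expressed relative to the unreplicated (G1) amount, and the 10% tolerance around each peak, the function name, and the sample measurements are all hypothetical.

```python
from collections import Counter


def classify_by_dna_content(dna: float, g1_content: float = 1.0,
                            tolerance: float = 0.1) -> str:
    """Assign a cell to G1, S, or G2/M from its relative DNA content.

    Assumption: cells within `tolerance` of the unreplicated amount are in
    G1, cells within `tolerance` of twice that amount are in G2 or M, and
    cells with an intermediate amount are in S phase.
    """
    g1_upper = g1_content * (1 + tolerance)
    g2m_lower = 2 * g1_content * (1 - tolerance)
    if dna <= g1_upper:
        return "G1"
    if dna >= g2m_lower:
        return "G2/M"
    return "S"


# Hypothetical fluorescence readings, normalized so that G1 cells read about 1.0.
measurements = [1.0, 1.02, 0.97, 1.4, 1.7, 2.0, 1.95, 1.05]
counts = Counter(classify_by_dna_content(x) for x in measurements)
print(dict(counts))
```

Note that DNA content alone cannot separate G2 cells from M-phase cells, which is why the two are reported together.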


Figure 17-6 The morphology of budding yeast cells. In a normal 
population of proliferating yeast cells, buds vary in size according to the 
cell-cycle stage. Unbudded cells are in G1. Progression through the Start
transition triggers formation of a tiny bud, which grows in size during the 
S and M phases until it is almost the size of the mother cell. (Courtesy of 
Jeff Ubersax.) 










Figure 17-7 Labeling S-phase cells. An immunofluorescence micrograph 
of BrdU-labeled epithelial cells of the zebrafish gut. The fish was exposed 
to BrdU, after which the tissue was fixed and prepared for labeling with 
fluorescent anti-BrdU antibodies (green). All the cells are stained with a red 
fluorescent dye. (Courtesy of Cécile Crosnier.) 


Summary 


Cell division usually begins with duplication of the cell’s contents, followed by
distribution of those contents into two daughter cells. Chromosome duplication occurs
during S phase of the cell cycle, whereas most other cell components are duplicated
continuously throughout the cycle. During M phase, the replicated chromosomes
are segregated into individual nuclei (mitosis), and the cell then splits in two
(cytokinesis). S phase and M phase are usually separated by gap phases called G1 and
G2, when various intracellular and extracellular signals regulate cell-cycle
progression. Cell-cycle organization and control have been highly conserved during
evolution, and studies in a wide range of systems have led to a unified view of
eukaryotic cell-cycle control.


THE CELL-CYCLE CONTROL SYSTEM 


For many years, cell biologists watched the puppet show of DNA synthesis, mito- 
sis, and cytokinesis but had no idea of what lay behind the curtain controlling 
these events. It was not even clear whether there was a separate control system, or 
whether the processes of DNA synthesis, mitosis, and cytokinesis somehow con- 
trolled themselves. A major breakthrough came in the late 1980s with the identi- 
fication of the key proteins of the control system, along with the realization that 
they are distinct from the proteins that perform the processes of DNA replication, 
chromosome segregation, and so on. 

In this section, we first consider the basic principles upon which the cell-cycle 
control system operates. We then discuss the protein components of the system 
and how they work together to time and coordinate the events of the cell cycle. 


The Cell-Cycle Control System Triggers the Major Events of the 
Cell Cycle 


The cell-cycle control system operates much like a timer that triggers the events 
of the cell cycle in a set sequence (Figure 17-9). In its simplest form—as seen in 
the stripped-down cell cycles of early animal embryos, for example—the control 
system is rigidly programmed to provide a fixed amount of time for the comple- 
tion of each cell-cycle event. The control system in these early embryonic divi- 
sions is independent of the events it controls, so that its timing mechanisms con- 
tinue to operate even if those events fail. In most cells, however, the control system 
does respond to information received back from the processes it controls. If some 
malfunction prevents the successful completion of DNA synthesis, for example, 
signals are sent to the control system to delay progression to M phase. Such delays 
provide time for the machinery to be repaired and also prevent the disaster that 
might result if the cycle progressed prematurely to the next stage and segregated
incompletely replicated chromosomes, for example. 

The cell-cycle control system is based on a connected series of biochemi- 
cal switches, each of which initiates a specific cell-cycle event. This system of 
switches possesses many important features that increase the accuracy and reli- 
ability of cell-cycle progression. First, the switches are generally binary (on/off) 
and launch events in a complete, irreversible fashion. It would clearly be disas- 
trous, for example, if events like chromosome condensation or nuclear-envelope 
breakdown were only partially initiated or started but not completed. Second, the 
cell-cycle control system is remarkably robust and reliable, partly because backup 
mechanisms and other features allow the system to operate effectively under a 
variety of conditions and even if some components fail. Finally, the control system 
is highly adaptable and can be modified to suit specific cell types or to respond to 
specific intracellular or extracellular signals.


Figure 17-8 Analysis of DNA content
with a flow cytometer. This graph shows
typical results obtained for a proliferating
cell population when the DNA content of
its individual cells is determined in a flow
cytometer. (A flow cytometer, also called a
fluorescence-activated cell sorter, or FACS,
can also be used to sort cells according to
their fluorescence; see Figure 8-2.) The
cells analyzed here were stained with a dye
that becomes fluorescent when it binds to
DNA, so that the amount of fluorescence
is directly proportional to the amount of
DNA in each cell. The cells fall into three
categories: those that have an unreplicated
complement of DNA and are therefore
in G1, those that have a fully replicated
complement of DNA (twice the G1 DNA
content) and are in G2 or M phase, and
those that have an intermediate amount of
DNA and are in S phase. The distribution
of cells indicates that there are greater
numbers of cells in G1 than in G2 + M
phase, showing that G1 is longer than
G2 + M in this population.




[Figure 17-9 diagram labels. Center: CONTROLLER. Start transition: Is environment favorable? G2/M transition: Is all DNA replicated? Is environment favorable? Metaphase-to-anaphase transition: Are all chromosomes attached to the spindle?]


In most eukaryotic cells, the cell-cycle control system governs cell-cycle pro- 
gression at three major regulatory transitions (see Figure 17-9). The first is Start 
(or the restriction point) in late G1, where the cell commits to cell-cycle entry and 
chromosome duplication. The second is the G2/M transition, where the control 
system triggers the early mitotic events that lead to chromosome alignment on 
the mitotic spindle in metaphase. The third is the metaphase-to-anaphase tran- 
sition, where the control system stimulates sister-chromatid separation, leading 
to the completion of mitosis and cytokinesis. The control system blocks progres- 
sion through each of these transitions if it detects problems inside or outside the 
cell. If the control system senses problems in the completion of DNA replication, 
for example, it will hold the cell at the G2/M transition until those problems are
solved. Similarly, if extracellular conditions are not appropriate for cell prolifera- 
tion, the control system blocks progression through Start, thereby preventing cell 
division until conditions become favorable. 
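
The gating behavior of the three transitions can be sketched as a small lookup in Python. This is only an illustrative model: the conditions listed for each transition follow the questions posed in Figure 17-9, while the dictionary, the function name, and the status keys are hypothetical names chosen for the example.

```python
# Conditions that must hold before each regulatory transition (after Figure 17-9).
REQUIREMENTS = {
    "Start": ("environment_favorable",),
    "G2/M": ("dna_fully_replicated", "environment_favorable"),
    "metaphase-to-anaphase": ("chromosomes_attached_to_spindle",),
}


def may_proceed(transition: str, status: dict[str, bool]) -> bool:
    """Return True only if every condition for the transition is satisfied.

    A condition missing from `status` is treated as unmet, so the cycle
    arrests by default rather than proceeding on incomplete information.
    """
    return all(status.get(name, False) for name in REQUIREMENTS[transition])


# Hypothetical cell with unfinished DNA replication in a favorable environment.
cell_status = {
    "environment_favorable": True,
    "dna_fully_replicated": False,
    "chromosomes_attached_to_spindle": False,
}

for transition in REQUIREMENTS:
    verdict = "proceed" if may_proceed(transition, cell_status) else "arrest"
    print(f"{transition}: {verdict}")
```

In this example the cell passes Start but is held at the G2/M transition until replication is complete, which mirrors the behavior described in the text.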


The Cell-Cycle Control System Depends on Cyclically Activated 
Cyclin-Dependent Protein Kinases (Cdks) 


Central components of the cell-cycle control system are members of a family of 
protein kinases known as cyclin-dependent kinases (Cdks). The activities of these 
kinases rise and fall as the cell progresses through the cycle, leading to cyclical 
changes in the phosphorylation of intracellular proteins that initiate or regulate 
the major events of the cell cycle. An increase in Cdk activity at the G2/M transi- 
tion, for example, increases the phosphorylation of proteins that control chromo- 
some condensation, nuclear-envelope breakdown, spindle assembly, and other 
events that occur in early mitosis. 

Cyclical changes in Cdk activity are controlled by a complex array of enzymes 
and other proteins. The most important of these Cdk regulators are proteins 
known as cyclins. Cdks, as their name implies, are dependent on cyclins for their 
activity: unless they are bound tightly to a cyclin, they have no protein kinase 
activity (Figure 17-10). Cyclins were originally named because they undergo a 
cycle of synthesis and degradation in each cell cycle. The levels of the Cdk pro- 
teins, by contrast, are constant. Cyclical changes in cyclin protein levels result in 
the cyclic assembly and activation of cyclin-Cdk complexes at specific stages of 
the cell cycle. 


Figure 17-9 The control of the cell cycle. 
A cell-cycle control system triggers the 
essential processes of the cycle—such as 
DNA replication, mitosis, and cytokinesis. 
The control system is represented here as 
a central arm—the controller—that rotates 
clockwise, triggering essential processes 
when it reaches specific transitions on the 
outer dial (yellow boxes). Information about 
the completion of cell-cycle events, as well 
as signals from the environment, can cause 
the control system to arrest the cycle at 
these transitions.


Figure 17-10 Two key components of
the cell-cycle control system. When
cyclin forms a complex with Cdk, the
protein kinase is activated to trigger specific
cell-cycle events. Without cyclin, Cdk is
inactive.


There are four classes of cyclins, each defined by the stage of the cell cycle at
which they bind Cdks and function. All eukaryotic cells require three of these 
classes (Figure 17-11): 


1. G1/S-cyclins activate Cdks in late G1 and thereby help trigger progression
through Start, resulting in a commitment to cell-cycle entry. Their levels
fall in S phase.


2. S-cyclins bind Cdks soon after progression through Start and help stimulate
chromosome duplication. S-cyclin levels remain elevated until mitosis,
and these cyclins also contribute to the control of some early mitotic
events.


3. M-cyclins activate Cdks that stimulate entry into mitosis at the G2/M
transition. M-cyclin levels fall in mid-mitosis.

In most cells, a fourth class of cyclins, the G1-cyclins, helps govern the activities
of the G1/S-cyclins, which control progression through Start in late G1.

In yeast cells, a single Cdk protein binds all classes of cyclins and triggers
different cell-cycle events by changing cyclin partners at different stages of the cycle.
In vertebrate cells, by contrast, there are four Cdks. Two interact with G1-cyclins,
one with G1/S- and S-cyclins, and one with S- and M-cyclins. In this chapter, we
simply refer to the different cyclin-Cdk complexes as G1-Cdk, G1/S-Cdk, S-Cdk,
and M-Cdk. Table 17-1 lists the names of the individual Cdks and cyclins.

How do different cyclin-Cdk complexes trigger different cell-cycle events? 
The answer, at least in part, seems to be that the cyclin protein does not simply 
activate its Cdk partner but also directs it to specific target proteins. As a result, 
each cyclin-Cdk complex phosphorylates a different set of substrate proteins. 
The same cyclin-Cdk complex can also induce different effects at different times 
in the cycle, probably because the accessibility of some Cdk substrates changes 
during the cell cycle. Certain proteins that function in mitosis, for example, may 
become available for phosphorylation only in G2.


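TABLE 17-1


Cyclin-Cdk complex | Vertebrate cyclin | Vertebrate Cdk partner | Budding yeast cyclin | Budding yeast Cdk partner
G1-Cdk | Cyclin D* | Cdk4, Cdk6 | Cln3 | Cdk1**
G1/S-Cdk | Cyclin E | Cdk2 | Cln1, 2 | Cdk1
S-Cdk | Cyclin A | Cdk2, Cdk1** | Clb5, 6 | Cdk1
M-Cdk | Cyclin B | Cdk1 | Clb1, 2, 3, 4 | Cdk1


* There are three D cyclins in mammals (cyclins D1, D2, and D3).
** The original name of Cdk1 was Cdc2 in both vertebrates and fission yeast, and Cdc28 in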
budding yeast.


Figure 17-11 Cyclin-Cdk complexes
of the cell-cycle control system. The
concentrations of the three major cyclin
types oscillate during the cell cycle, while
the concentrations of Cdks (not shown) do
not change and exceed cyclin amounts.
In late G1, rising G1/S-cyclin levels lead
to the formation of G1/S-Cdk complexes
that trigger progression through the
Start transition. S-Cdk complexes form
at the start of S phase and trigger DNA
replication, as well as some early mitotic
events. M-Cdk complexes form during
G2 but are held in an inactive state; they
are activated at the end of G2 and trigger
entry into mitosis at the G2/M transition.
A separate regulatory protein complex,
the APC/C, initiates the metaphase-to-anaphase
transition, as we discuss later.


Studies of the three-dimensional structures of Cdk and cyclin proteins have
revealed that, in the absence of cyclin, the active site in the Cdk protein is partly 
obscured by a protein loop, like a stone blocking the entrance to a cave (Figure 
17-12A). Cyclin binding causes the loop to move away from the active site, result- 
ing in partial activation of the Cdk enzyme (Figure 17-12B). Full activation of 
the cyclin-Cdk complex then occurs when a separate kinase, the Cdk-activating 
kinase (CAK), phosphorylates an amino acid near the entrance of the Cdk active 
site. This causes a small conformational change that further increases the activity 
of the Cdk, allowing the kinase to phosphorylate its target proteins effectively and 
thereby induce specific cell-cycle events (Figure 17-12C). 


Cdk Activity Can Be Suppressed By Inhibitory Phosphorylation
and Cdk Inhibitor Proteins (CKIs)


The rise and fall of cyclin levels is the primary determinant of Cdk activity during 
the cell cycle. Several additional mechanisms, however, help control Cdk activity 
at specific stages of the cycle. 

Phosphorylation at a pair of amino acids in the roof of the kinase active site 
inhibits the activity of a cyclin-Cdk complex. Phosphorylation of these sites by 
a protein kinase known as Wee1 inhibits Cdk activity, while dephosphorylation
of these sites by a phosphatase known as Cdc25 increases Cdk activity (Figure 
17-13). We will see later that this regulatory mechanism is particularly important 
in the control of M-Cdk activity at the onset of mitosis. 

Binding of Cdk inhibitor proteins (CKIs) inactivates cyclin-Cdk complexes. 
The three-dimensional structure of a cyclin-Cdk-CKI complex reveals that CKI 
binding stimulates a large rearrangement in the structure of the Cdk active site, 
rendering it inactive (Figure 17-14). Cells use CKIs primarily to help govern the 
activities of G1/S- and S-Cdks early in the cell cycle.
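
The layers of Cdk regulation described so far can be summarized in a small Python sketch. It is a deliberately coarse illustration: real Cdk activity is graded rather than falling into three discrete levels, and the class name, field names, and the labels "inactive", "partial", and "full" are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CdkState:
    """Simplified on/off description of the inputs that regulate one Cdk."""
    cyclin_bound: bool = False
    cak_phosphorylated: bool = False     # activating phosphate added by CAK
    inhibitory_phosphate: bool = False   # added by Wee1, removed by Cdc25
    cki_bound: bool = False              # for example, p27

    def activity(self) -> str:
        # Without cyclin, the Cdk has no protein kinase activity.
        if not self.cyclin_bound:
            return "inactive"
        # Inhibitory phosphorylation or a bound CKI overrides activation.
        if self.inhibitory_phosphate or self.cki_bound:
            return "inactive"
        # Cyclin binding alone gives partial activity; CAK completes it.
        return "full" if self.cak_phosphorylated else "partial"


print(CdkState().activity())
print(CdkState(cyclin_bound=True).activity())
print(CdkState(cyclin_bound=True, cak_phosphorylated=True).activity())
print(CdkState(cyclin_bound=True, cak_phosphorylated=True,
               inhibitory_phosphate=True).activity())
```

The ordering of the checks reflects the text: cyclin binding is required first, and either inhibitory mechanism is sufficient to shut off a complex that would otherwise be active.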


Regulated Proteolysis Triggers the Metaphase-to-Anaphase 
Transition 


Whereas activation of specific cyclin-Cdk complexes drives progression through 
the Start and G2/M transitions (see Figure 17-11), progression through the meta- 
phase-to-anaphase transition is triggered not by protein phosphorylation but by 
protein destruction, leading to the final stages of cell division. 

The key regulator of the metaphase-to-anaphase transition is the anaphase- 
promoting complex, or cyclosome (APC/C), a member of the ubiquitin ligase 
family of enzymes. As discussed in Chapter 3, these enzymes are used in numer- 
ous cell processes to stimulate the proteolytic destruction of specific regulatory 
proteins. They polyubiquitylate specific target proteins, resulting in their destruc- 
tion in proteasomes. Other ubiquitin ligases mark proteins for purposes other 
than destruction (discussed in Chapter 3). 


Figure 17-12 The structural basis of 
Cdk activation. These drawings are 
based on three-dimensional structures of 
human Cdk2 and cyclin A, as determined 
by x-ray crystallography. The location of 
the bound ATP is indicated. The enzyme 
is shown in three states. (A) In the inactive 
state, without cyclin bound, the active
site is blocked by a region of the protein
called the T-loop (red). (B) The binding of 
cyclin causes the T-loop to move out of the 
active site, resulting in partial activation of 
the Cdk2. (C) Phosphorylation of Cdk2 (by 
CAK) at a threonine residue in the T-loop 
further activates the enzyme by changing 
the shape of the T-loop, improving the 
ability of the enzyme to bind its protein 
substrates (Movie 17.1).


Figure 17-13 The regulation of Cdk
activity by phosphorylation. The active
cyclin-Cdk complex is turned off when
the kinase Wee1 phosphorylates two
closely spaced sites above the active
site. Removal of these phosphates by the
phosphatase Cdc25 activates the cyclin-Cdk
complex. For simplicity, only one
inhibitory phosphate is shown. CAK adds
the activating phosphate, as shown
in Figure 17-12.




[Figure 17-14 diagram labels: cyclin; Cdk; p27; active cyclin-Cdk complex; inactive p27-cyclin-Cdk complex]

Figure 17-14 The inhibition of a cyclin-Cdk complex by a CKI. This drawing is based on 
the three-dimensional structure of the human cyclin A-Cdk2 complex bound to the CKI p27, as 
determined by x-ray crystallography. The p27 binds to both the cyclin and Cdk in the complex, 
distorting the active site of the Cdk. It also inserts into the ATP-binding site, further inhibiting the 
enzyme activity. 


The APC/C catalyzes the ubiquitylation and destruction of two major types 
of proteins. The first is securin, which protects the protein linkages that hold 
sister-chromatid pairs together in early mitosis. Destruction of securin in meta- 
phase activates a protease that separates the sisters and unleashes anaphase, as 
described later. The S- and M-cyclins are the second major targets of the APC/C. 
Destroying these cyclins inactivates most Cdks in the cell (see Figure 17-11). As 
a result, the many proteins phosphorylated by Cdks from S phase to early mitosis 
are dephosphorylated by various phosphatases in the anaphase cell. This dephos- 
phorylation of Cdk targets is required for the completion of M phase, including the 
final steps in mitosis and then cytokinesis. Following its activation in mid-mitosis, 
the APC/C remains active in G1 to provide a stable period of Cdk inactivity. When
G1/S-Cdk is activated in late G1, the APC/C is turned off, thereby allowing cyclin
accumulation in the next cell cycle. 

The cell-cycle control system also uses another ubiquitin ligase called SCF (see 
Figure 3-71). It has many functions in the cell, but its major role in the cell cycle is 
to ubiquitylate certain CKI proteins in late G1, thereby helping to control the
activation of S-Cdks and DNA replication. SCF is also responsible for the destruction
of G1/S-cyclins in early S phase.

The APC/C and SCF are both large, multisubunit complexes with some related 
components (see Figure 3-71), but they are regulated differently. APC/C activity 
changes during the cell cycle, primarily as a result of changes in its association 
with an activating subunit—either Cdc20 in mid-mitosis or Cdh1 from late mito- 
sis through early G1. These subunits help the APC/C recognize its target proteins 
(Figure 17-15A). SCF activity depends on substrate-binding subunits called F-box 
proteins. Unlike APC/C activity, however, SCF activity is constant during the cell 
cycle. Ubiquitylation by SCF is controlled instead by changes in the phosphor- 
ylation state of its target proteins, as F-box subunits recognize only specifically 
phosphorylated proteins (Figure 17-15B). 
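
The contrast between the two ligases can be captured in a brief Python sketch. It is a schematic illustration only: each ligase is reduced to a yes-or-no decision, and the function names and parameters are hypothetical. The point is where the regulation sits, on the ligase itself for the APC/C and on the target for SCF.

```python
from typing import Optional

APC_C_ACTIVATORS = {"Cdc20", "Cdh1"}


def apc_c_ubiquitylates(activator: Optional[str], target_recognized: bool) -> bool:
    """APC/C acts only when an activating subunit is bound.

    The activator (Cdc20 or Cdh1) also helps recognize the target, so both
    conditions must hold. With no activator bound, the APC/C is off.
    """
    return activator in APC_C_ACTIVATORS and target_recognized


def scf_ubiquitylates(target_phosphorylated: bool, f_box_matches: bool) -> bool:
    """SCF is constitutively active; control lies in the target.

    The target must be phosphorylated and must match the F-box subunit
    that recognizes it.
    """
    return target_phosphorylated and f_box_matches


# APC/C without an activator leaves M-cyclin untouched; with Cdc20 it acts.
print(apc_c_ubiquitylates(None, target_recognized=True))
print(apc_c_ubiquitylates("Cdc20", target_recognized=True))

# SCF ignores an unphosphorylated CKI but acts once the CKI is phosphorylated.
print(scf_ubiquitylates(target_phosphorylated=False, f_box_matches=True))
print(scf_ubiquitylates(target_phosphorylated=True, f_box_matches=True))
```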


Cell-Cycle Control Also Depends on Transcriptional Regulation 


In the simple cell cycles of early animal embryos, gene transcription does not 
occur. Cell-cycle control depends exclusively on post-transcriptional mechanisms 
that involve the regulation of Cdks and ubiquitin ligases and their target proteins. 
In the more complex cell cycles of most cell types, however, transcriptional con- 
trol provides an important additional level of regulation. Changes in cyclin gene 
transcription, for example, help control cyclin levels in most cells. 

A variety of methods discussed in Chapter 8 have been used to analyze changes 
in the expression of all genes in the genome as the cell progresses through the cell 
cycle. The results of these studies are surprising. In budding yeast, for example, 
about 10% of the genes encode mRNAs whose levels oscillate during the cell cycle. 
Some of these genes encode proteins with known cell-cycle functions, but the 
functions of many others are unknown.


The Cell-Cycle Control System Functions as a Network of
Biochemical Switches


Table 17-2 summarizes some of the major components of the cell-cycle control 
system. These proteins are functionally linked to form a robust network, which 
operates essentially autonomously to activate a series of biochemical switches, 
each of which triggers a specific cell-cycle event. 

When conditions for cell proliferation are right, various external and internal 
signals stimulate the activation of G1-Cdk, which in turn stimulates the expression 
of genes encoding G1/S- and S-cyclins (Figure 17-16). The resulting activation of 
G1/S-Cdk then drives progression through the Start transition. By mechanisms 
we discuss later, G1/S-Cdks unleash a wave of S-Cdk activity, which initiates 

Figure 17-15 The control of proteolysis by APC/C and SCF during the cell cycle. (A) The 
APC/C is activated in mitosis by association with Cdc20, which recognizes specific amino acid 
sequences on M-cyclin and other target proteins. With the help of two additional proteins called 
E1 and E2, the APC/C assembles polyubiquitin chains on the target protein. The polyubiquitylated 
target is then recognized and degraded in a proteasome. (B) The activity of the ubiquitin ligase SCF 
depends on substrate-binding subunits called F-box proteins, of which there are many different 
types. The phosphorylation of a target protein, such as the CKI shown, allows the target to be 
recognized by a specific F-box subunit. 

TABLE 17-2 

Protein | Function 

Protein kinases and protein phosphatases that modify Cdks 
Cdk-activating kinase (CAK) | Phosphorylates an activating site in Cdks 
Wee1 kinase | Phosphorylates inhibitory sites in Cdks; primarily involved in suppressing Cdk1 activity before mitosis 
Cdc25 phosphatase | Removes inhibitory phosphates from Cdks; three family members (Cdc25A, B, C) in mammals; primarily involved in controlling Cdk1 activation at the onset of mitosis 

Cdk inhibitor proteins (CKIs) 
Sic1 (budding yeast) | Suppresses Cdk1 activity in G1; phosphorylation by Cdk1 at the end of G1 triggers its destruction 
p27 (mammals) | Suppresses G1/S-Cdk and S-Cdk activities in G1; helps cells withdraw from cell cycle when they terminally differentiate; phosphorylation by Cdk2 triggers its ubiquitylation by SCF 
p21 (mammals) | Suppresses G1/S-Cdk and S-Cdk activities following DNA damage 
p16 (mammals) | Suppresses G1-Cdk activity in G1; frequently inactivated in cancer 

Ubiquitin ligases and their activators 
APC/C | Catalyzes ubiquitylation of regulatory proteins involved primarily in exit from mitosis, including securin and S- and M-cyclins; regulated by association with activating subunits Cdc20 or Cdh1 
Cdc20 | APC/C-activating subunit in all cells; triggers initial activation of APC/C at metaphase-to-anaphase transition; stimulated by M-Cdk activity 
Cdh1 | APC/C-activating subunit that maintains APC/C activity after anaphase and throughout G1; inhibited by Cdk activity 
SCF | Catalyzes ubiquitylation of regulatory proteins involved in G1 control, including some CKIs (Sic1 in budding yeast, p27 in mammals); phosphorylation of target protein usually required for this activity 





chromosome duplication in S phase and also contributes to some early events of 
mitosis. M-Cdk activation then triggers progression through the G2/M transition 
and the events of early mitosis, leading to the alignment of sister-chromatid pairs 
at the equator of the mitotic spindle. Finally, the APC/C, together with its acti- 
vator Cdc20, triggers the destruction of securin and cyclins, thereby unleashing 
sister-chromatid separation and segregation and the completion of mitosis. When 
mitosis is complete, multiple mechanisms collaborate to suppress Cdk activity, 
resulting in a stable G1 period. We are now ready to discuss these cell-cycle stages 
in more detail, starting with S phase. 
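
The ordered logic of this network can be made concrete with a small sketch. The Python below is only an illustration under simplifying assumptions: each regulator is treated as an on/off switch that fires in the order given in the text, and the function name `run_cycle` and its `blocked_at` argument are hypothetical stand-ins for any inhibitory signal (DNA damage, for example) that holds a switch off. Graded kinase activities and feedback are ignored.

```python
# Switches in the order described in the text.
SWITCHES = ["G1-Cdk", "G1/S-Cdk", "S-Cdk", "M-Cdk", "APC/C-Cdc20"]

# The cell-cycle event that each switch triggers, paraphrased from the text.
EVENTS = {
    "G1-Cdk": "expression of G1/S- and S-cyclin genes",
    "G1/S-Cdk": "progression through Start",
    "S-Cdk": "chromosome duplication",
    "M-Cdk": "G2/M transition and early mitosis",
    "APC/C-Cdc20": "sister-chromatid separation and cyclin destruction",
}


def run_cycle(blocked_at=None):
    """Return the events triggered, in order.

    blocked_at is a hypothetical argument naming a switch that an
    inhibitory signal holds off; the cycle arrests before that switch.
    """
    triggered = []
    for switch in SWITCHES:
        if switch == blocked_at:
            break  # arrest: this switch and all later ones stay off
        triggered.append(EVENTS[switch])
    return triggered
```

Calling `run_cycle()` returns all five events, whereas `run_cycle(blocked_at="M-Cdk")` stops after chromosome duplication, which mirrors an arrest before entry into mitosis.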



Figure 17-16 An overview of the cell-cycle control system. The core of the cell-cycle 
control system consists of a series of cyclin-Cdk complexes (yellow). The activity of each 
complex is also influenced by various inhibitory mechanisms, which provide information 
about the extracellular environment, cell damage, and incomplete cell-cycle events (top). 
These inhibitory mechanisms are not present in all cell types; many are missing in early 
embryonic cell cycles, for example. 
[Diagram labels: favorable extracellular environment; DNA damage; unreplicated DNA; 
chromosome unattached to spindle; G1-Cdk, G1/S-Cdk, S-Cdk, M-Cdk, APC/C; 
G1/S-cyclin synthesis; S-cyclin synthesis; DNA re-replication] 


Summary 


The cell-cycle control system triggers the events of the cell cycle and ensures that they 
are properly timed and coordinated with each other. The control system responds 
to various intracellular and extracellular signals and arrests the cycle when the cell 
either fails to complete an essential cell-cycle process or encounters unfavorable 
environmental or intracellular conditions. 

Central components of the control system are the cyclin-dependent protein 
kinases (Cdks), which depend on cyclin subunits for their activity. Oscillations in the 
activities of different cyclin-Cdk complexes control various cell-cycle events. Thus, 
activation of S-phase cyclin-Cdk complexes (S-Cdk) initiates S phase, whereas acti- 
vation of M-phase cyclin-Cdk complexes (M-Cdk) triggers mitosis. The mechanisms 
that control the activities of cyclin-Cdk complexes include phosphorylation of the 
Cdk subunit, binding of Cdk inhibitor proteins (CKIs), proteolysis of cyclins, and 
changes in the transcription of genes encoding Cdk regulators. The cell-cycle control 
system also depends crucially on two additional enzyme complexes, the APC/C and 
SCF ubiquitin ligases, which catalyze the ubiquitylation and consequent destruc- 
tion of specific regulatory proteins that control critical events in the cycle. 


S PHASE 


The linear chromosomes of eukaryotic cells are vast and dynamic assemblies of 
DNA and protein, and their duplication is a complex process that takes up a major 
fraction of the cell cycle. Not only must the long DNA molecule of each chromo- 
some be duplicated accurately—a remarkable feat in itself—but the protein pack- 
aging surrounding each region of that DNA must also be reproduced, ensuring 
that the daughter cells inherit all features of chromosome structure. 

The central event of chromosome duplication—DNA replication—poses two 
problems for the cell. First, replication must occur with extreme accuracy to mini- 
mize the risk of mutations in the next cell generation. Second, every nucleotide in 
the genome must be copied once, and only once, to prevent the damaging effects 
of gene amplification. In Chapter 5, we discuss the sophisticated protein machin- 
ery that performs DNA replication with astonishing speed and accuracy. In this 
section, we consider the elegant mechanisms by which the cell-cycle control sys- 
tem initiates the replication process and, at the same time, prevents it from hap- 
pening more than once per cycle. 


S-Cdk Initiates DNA Replication Once Per Cycle 


DNA replication begins at origins of replication, which are scattered at numerous 
locations in every chromosome. During S phase, DNA replication is initiated at 
these origins when a DNA helicase unwinds the double helix and DNA replica- 
tion enzymes are loaded onto the two single-stranded templates. This leads to the 
elongation phase of replication, when the replication machinery moves outward 
from the origin at two replication forks (discussed in Chapter 5). 

To ensure that chromosome duplication occurs only once per cell cycle, the 
initiation phase of DNA replication is divided into two distinct steps that occur at 
different times in the cell cycle (Figure 17-17). The first step occurs in late mitosis 
and early G1, when a pair of inactive DNA helicases is loaded onto the replication 
origin, forming a large complex called the prereplicative complex or preRC. This 
step is sometimes called licensing of replication origins because initiation of DNA 
synthesis is permitted only at origins containing a preRC. The second step occurs 
in S phase, when the DNA helicases are activated, resulting in DNA unwinding 
and the initiation of DNA synthesis. Once a replication origin has been fired in 
this way, the two helicases move out from the origin with the replication forks, and 
that origin cannot be reused until a new preRC is assembled there at the end of 
mitosis. As a result, origins can be activated only once per cell cycle. 
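
The two-step logic of licensing and firing can be sketched as a tiny state machine. This is a minimal illustration, not a model of the real biochemistry: the class name `Origin`, its methods, and the phase labels passed as strings are all hypothetical, and the only rule encoded is the one stated above, namely that a preRC can be assembled only in late mitosis or early G1 and is consumed when the origin fires in S phase.

```python
class Origin:
    """A replication origin that must be licensed before it can fire."""

    def __init__(self):
        self.licensed = False  # True while a preRC sits on the origin
        self.fired_count = 0

    def license(self, phase):
        # preRC assembly is permitted only in late mitosis and early G1.
        if phase in ("late M", "early G1"):
            self.licensed = True
        return self.licensed

    def fire(self, phase):
        # Firing requires S phase and an existing preRC. The helicases
        # leave with the forks, so firing removes the license.
        if phase == "S" and self.licensed:
            self.licensed = False
            self.fired_count += 1
            return True
        return False
```

An origin licensed in early G1 fires on the first call to `fire("S")` and refuses every later call in the same cycle, because `license("S")` cannot restore the preRC until the next late mitosis.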

Figure 17-18 illustrates some of the molecular details underlying the control 
of the two steps in the initiation of DNA replication. A key player is a large mul- 
tiprotein complex called the origin recognition complex (ORC), which binds to 
replication origins throughout the cell cycle. In late mitosis and early G1, the pro- 
teins Cdc6 and Cdt1 collaborate with the ORC to load the inactive DNA helicases 
around the DNA next to the origin. The resulting large complex is the preRC, and 
the origin is now licensed for replication. 

At the onset of S phase, S-Cdk triggers origin activation by phosphorylating 
specific initiator proteins, which then nucleate the assembly of a large protein 
complex that activates the DNA helicase and recruits the DNA synthesis machin- 
ery. Another protein kinase called DDK is also activated in S phase and helps 
drive origin activation by phosphorylating specific subunits of the DNA helicase. 

At the same time as S-Cdk initiates DNA replication, several mechanisms 
prevent assembly of new preRCs. S-Cdk phosphorylates and thereby inhibits the 
ORC and Cdc6 proteins. Inactivation of the APC/C in late G1 also helps turn off 
preRC assembly. In late mitosis and early G1, the APC/C triggers the destruction 
of a Cdt1 inhibitor called geminin, thereby allowing Cdt1 to be active. When the 
APC/C is turned off in late G1, geminin accumulates and inhibits the Cdt1 that 
is not associated with DNA. Also, the association of Cdt1 with a protein at active 
replication forks stimulates Cdt1 destruction. In these various ways, preRC for- 
mation is prevented from S phase to mitosis, thereby ensuring that each origin 
is fired only once per cell cycle. How, then, is the cell-cycle control system reset 
to allow replication in the next cell cycle? At the end of mitosis, APC/C activation 
leads to the inactivation of Cdks and the destruction of geminin. ORC and Cdc6 
are dephosphorylated and Cdt1 is activated, allowing preRC assembly to prepare 
the cell for the next S phase. 
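
The inhibitory logic in the preceding paragraph reduces, in a deliberately crude form, to a Boolean rule. The sketch below assumes that Cdk activity and APC/C activity are each simply on or off; the function name `prerc_assembly_allowed` and its parameters are hypothetical, and the replication-fork-dependent destruction of Cdt1 is left out.

```python
def prerc_assembly_allowed(cdk_active: bool, apc_active: bool) -> bool:
    """Return True if a new preRC could assemble under this simplified rule."""
    # The APC/C destroys geminin, so geminin is present only when APC/C is off.
    geminin_present = not apc_active
    # Cdk phosphorylation inhibits the ORC and Cdc6.
    orc_cdc6_active = not cdk_active
    # Geminin inhibits Cdt1.
    cdt1_active = not geminin_present
    return orc_cdc6_active and cdt1_active
```

Under this rule assembly is allowed only when Cdks are off and the APC/C is on, which corresponds to late mitosis and early G1; from S phase to mitosis (Cdks on, APC/C off) the function returns `False`.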


Chromosome Duplication Requires Duplication of Chromatin 
Structure 


The DNA of the chromosomes is extensively packaged in a variety of protein com- 
ponents, including histones and various regulatory proteins involved in the control 
of gene expression (discussed in Chapter 4). Thus, duplication of a chromosome is 

Figure 17-17 Control of chromosome 
duplication. Preparations for DNA 
replication begin in late mitosis and G1, 
when the DNA helicases are loaded by 
multiple proteins at the replication origin, 
forming the prereplicative complex (preRC). 
S-Cdk activation leads to activation of the 
DNA helicases, which unwind the DNA 
at origins to initiate DNA replication. Two 
replication forks move out from each origin 
until the entire chromosome is duplicated. 
Duplicated chromosomes are then 
segregated in M phase. S-Cdk activation 
in S phase also prevents assembly of new 
preRCs at any origin until the following 
G1, thereby ensuring that each origin is 
activated only once in each cell cycle. 

not simply a matter of replicating the DNA at its core but also requires the duplica- 
tion of these chromatin proteins and their proper assembly on the DNA. 

The production of chromatin proteins increases during S phase to provide the 
raw materials needed to package the newly synthesized DNA. Most importantly, 
S-Cdks stimulate a large increase in the synthesis of the four histone subunits that 
form the histone octamers at the core of each nucleosome. These subunits are 
assembled into nucleosomes on the DNA by nucleosome assembly factors, which 
typically associate with the replication fork and distribute nucleosomes on both 
strands of the DNA as they emerge from the DNA synthesis machinery. 

Chromatin packaging helps to control gene expression. In some parts of the 
chromosome, the chromatin is highly condensed and is called heterochromatin, 
whereas in other regions it has a more open structure and is called euchromatin 
(discussed in Chapter 4). These differences in chromatin structure depend on a 
variety of mechanisms, including modification of histone tails and the presence of 
non-histone proteins. Because these differences are important in gene regulation, 

Figure 17-18 Control of the initiation 
of DNA replication. The replication origin 
is bound by the ORC throughout the cell 
cycle. In early G1, Cdc6 associates with 
the ORC, and these proteins bind the 
DNA helicase, which contains six closely 
related subunits called Mcm proteins. The 
helicase also associates with a protein 
called Cdt1. Using energy provided by ATP 
hydrolysis, the ORC and Cdc6 proteins 
load two copies of the DNA helicase, in an 
inactive form, around the DNA next to the 
origin, thereby forming the prereplicative 
complex (preRC). At the onset of S 
phase, S-Cdk stimulates the assembly 
of several initiator proteins on each DNA 
helicase, while another protein kinase, 
DDK, phosphorylates subunits of the DNA 
helicase. As a result, the DNA helicases 
are activated and unwind the DNA. DNA 
polymerase and other replication proteins 
are recruited to the origin, and DNA 
replication begins. The ORC is displaced 
by the replication machinery and then 
rebinds. S-Cdk and other mechanisms 
also inactivate the preRC components 
ORC, Cdc6, and Cdt1, thereby preventing 
formation of new preRCs at the origins until 
the end of mitosis (see text). 

it is crucial that chromatin structure, like the DNA within, is reproduced accu- 
rately during S phase. How chromatin structure is reproduced is not well under- 
stood, however. During DNA synthesis, histone-modifying enzymes and various 
non-histone proteins are probably deposited onto the two new DNA strands as 
they emerge from the replication fork, and these proteins are thought to repro- 
duce the local chromatin structure of the parent chromosome (see Figure 4-45). 


Cohesins Hold Sister Chromatids Together 


At the end of S phase, each replicated chromosome consists of a pair of identical 
sister chromatids glued together along their length. This sister-chromatid cohe- 
sion sets the stage for a successful mitosis because it greatly facilitates the attach- 
ment of the two sister chromatids to opposite poles of the mitotic spindle. Imagine 
how difficult it would be to achieve this bipolar attachment if sister chromatids 
were allowed to drift apart after S phase. Indeed, defects in sister-chromatid cohe- 
sion—in yeast mutants, for example—lead inevitably to major errors in chromo- 
some segregation. 

Sister-chromatid cohesion depends on a large protein complex called cohesin, 
which is deposited at many locations along the length of each sister chromatid as 
the DNA is replicated in S phase. Two of the subunits of cohesin are members of a 
large family of proteins called SMC proteins (for Structural Maintenance of Chro- 
mosomes). Cohesin forms giant ringlike structures, and it has been proposed that 
these surround the two sister chromatids (Figure 17-19). 

Sister-chromatid cohesion also results, at least in part, from DNA catenation, 
the intertwining of sister DNA molecules that occurs when two replication forks 
meet during DNA synthesis. The enzyme topoisomerase II gradually disentangles 
the catenated sister DNAs between S phase and early mitosis by cutting one DNA 
molecule, passing the other through the break, and then resealing the cut DNA 
(see Figure 5-22). Once the catenation has been removed, sister-chromatid cohe- 
sion depends primarily on cohesin complexes. The sudden and synchronous loss 
of sister cohesion at the metaphase-to-anaphase transition therefore depends 
primarily on disruption of these complexes, as we describe later. 


Summary 


Duplication of the chromosomes in S phase involves the accurate replication of the 
entire DNA molecule in each chromosome, as well as the duplication of the chro- 
matin proteins that associate with the DNA and govern various aspects of chromo- 
some function. Chromosome duplication is triggered by the activation of S-Cdk, 
which activates proteins that unwind the DNA and initiate its replication at repli- 
cation origins. Once a replication origin is activated, S-Cdk also inhibits proteins 
that are required to allow that origin to initiate DNA replication again. Thus, each 
origin is fired once and only once in each S phase and cannot be reused until the 
next cell cycle. 

Figure 17-19 Cohesin. Cohesin is a 
protein complex with four subunits. (A) Two 
subunits, Smc1 and Smc3, are coiled-coil 
proteins with an ATPase domain at one 
end; (B) two additional subunits, Scc1 and 
Scc3, connect the ATPase head domains, 
forming a ring structure that may encircle 
the sister chromatids as shown in (C). The 
ATPase domains are required for cohesin 
loading on the DNA. 


MITOSIS 


Following the completion of S phase and transition through G2, the cell under- 
goes the dramatic upheaval of M phase. This begins with mitosis, during which 
the sister chromatids are separated and distributed (segregated) to a pair of iden- 
tical daughter nuclei, each with its own copy of the genome. Mitosis is tradition- 
ally divided into five stages—prophase, prometaphase, metaphase, anaphase, and 
telophase—defined primarily on the basis of chromosome behavior as seen in a 
microscope. As mitosis is completed, the second major event of M phase—cytoki- 
nesis—divides the cell into two halves, each with an identical nucleus. Panel 17-1 
summarizes the major events of M phase (Movie 17.2, Movie 17.3, Movie 17.4, 
and Movie 17.5). 

From a regulatory point of view, mitosis can be divided into two major parts, 
each governed by distinct components of the cell-cycle control system. First, 
an abrupt increase in M-Cdk activity at the G2/M transition triggers the events 
of early mitosis (prophase, prometaphase, and metaphase). During this period, 
M-Cdk and several other mitotic protein kinases phosphorylate a variety of pro- 
teins, leading to the assembly of the mitotic spindle and its attachment to the 
sister-chromatid pairs. The second major part of mitosis begins at the metaphase- 
to-anaphase transition, when the APC/C triggers the destruction of securin, liber- 
ating a protease that cleaves cohesin and thereby initiates separation of the sister 
chromatids. The APC/C also promotes the destruction of cyclins, which leads to 
Cdk inactivation and the dephosphorylation of Cdk targets, which is required for 
all events of late M phase, including the completion of anaphase, the disassembly 
of the mitotic spindle, and the division of the cell by cytokinesis. 

In this section, we describe the key mechanical events of mitosis and how 
M-Cdk and the APC/C orchestrate them. 


M-Cdk Drives Entry Into Mitosis 


One of the most remarkable features of cell-cycle control is that a single protein 
kinase, M-Cdk, brings about all of the diverse and complex cell rearrangements 
that occur in the early stages of mitosis. At a minimum, M-Cdk must induce the 
assembly of the mitotic spindle and ensure that each sister chromatid in a pair 
is attached to the opposite pole of the spindle. It also triggers chromosome con- 
densation, the large-scale reorganization of the intertwined sister chromatids into 
compact, rodlike structures. In animal cells, M-Cdk also promotes the breakdown 
of the nuclear envelope and rearrangements of the actin cytoskeleton and the 
Golgi apparatus. Each of these processes is thought to be initiated when M-Cdk 
phosphorylates specific proteins involved in the process, although most of these 
proteins have not yet been identified. 

M-Cdk does not act alone to phosphorylate key proteins involved in early 
mitosis. Two additional families of protein kinases, the Polo-like kinases and the 
Aurora kinases, also make important contributions to the control of early mitotic 
events. The Polo-like kinase Plk, for example, is required for the normal assembly 
of a bipolar mitotic spindle, in part because it phosphorylates proteins involved 
in separation of the spindle poles early in mitosis. The Aurora kinase Aurora-A 
also helps control proteins that govern the assembly and stability of the spindle, 
whereas Aurora-B controls attachment of sister chromatids to the spindle, as we 
discuss later. 


Dephosphorylation Activates M-Cdk at the Onset of Mitosis 


M-Cdk activation begins with the accumulation of M-cyclin (cyclin B in vertebrate 
cells; see Table 17-1). In embryonic cell cycles, the synthesis of M-cyclin is con- 
stant throughout the cell cycle, and M-cyclin accumulation results from the high 
stability of the protein in interphase. In most cell types, however, M-cyclin synthe- 
sis increases during G2 and M, owing primarily to an increase in M-cyclin gene 
transcription. The increase in M-cyclin protein leads to a corresponding accu- 
mulation of M-Cdk (the complex of Cdk1 and M-cyclin) as the cell approaches 

Figure 17-20 The activation of M-Cdk. Cdk1 associates with M-cyclin as the levels 
of M-cyclin gradually rise. The resulting M-Cdk complex is phosphorylated on an 
activating site by the Cdk-activating kinase (CAK) and on a pair of inhibitory sites by 
the Wee1 kinase. The resulting inactive M-Cdk complex is then activated at the 
end of G2 by the phosphatase Cdc25. Cdc25 is further stimulated by active 
M-Cdk, resulting in positive feedback. This feedback is enhanced by the ability of 
M-Cdk to inhibit Wee1. 
[Diagram labels: Cdk1; Cdk-inhibitory kinase; activating phosphate; inactive 
phosphatase; active M-Cdk; positive feedback] 



mitosis. Although the Cdk in these complexes is phosphorylated at an activating 
site by the Cdk-activating kinase (CAK), as discussed earlier, the protein kinase 
Wee1 holds it in an inactive state by inhibitory phosphorylation at two neighbor- 
ing sites (see Figure 17-13). Thus, by the time the cell reaches the end of G2, it 
contains an abundant stockpile of M-Cdk that is primed and ready to act but is 
suppressed by phosphates that block the active site of the kinase. 

What, then, triggers the activation of the M-Cdk stockpile? The crucial event 
is the activation of the protein phosphatase Cdc25, which removes the inhibitory 
phosphates that restrain M-Cdk (Figure 17-20). At the same time, the inhibitory 
activity of the kinase Wee1 is suppressed, further ensuring that M-Cdk activity 
increases. The mechanisms that unleash Cdc25 activity in early mitosis are not 
well understood. One possibility is that the S-Cdks that are active in G2 and early 
prophase stimulate Cdc25. 

Interestingly, Cdc25 can also be activated, at least in part, by its target, M-Cdk. 
M-Cdk may also inhibit the inhibitory kinase Wee1. The ability of M-Cdk to acti- 
vate its own activator (Cdc25) and inhibit its own inhibitor (Wee1) suggests that 
M-Cdk activation in mitosis involves positive feedback loops (see Figure 17-20). 
According to this attractive model, the partial activation of Cdc25 (perhaps by 
S-Cdk) leads to the partial activation of a subpopulation of M-Cdk complexes, 
which then phosphorylate Cdc25 and Wee1 molecules. This leads to more M-Cdk 
activation, and so on. Such a mechanism would quickly promote the activation of 
all M-Cdk complexes in the cell. As mentioned earlier, similar molecular switches 
operate at various points in the cell cycle to promote the abrupt and complete 
transition from one cell-cycle state to the next. 
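
A toy calculation shows why such feedback matters. The sketch below is an assumption-laden illustration, not a published model: `x` is the active fraction of M-Cdk, `trigger` stands for the small initial Cdc25 activity (perhaps from S-Cdk), and the rate constants `f`, `g`, and `wee1` are arbitrary numbers chosen only to make the behavior visible. The equation integrated by simple Euler steps is dx/dt = (trigger + f·x)(1 − x) − wee1·x/(1 + g·x), in which the `f·x` term represents M-Cdk activating Cdc25 and the `1 + g·x` term represents M-Cdk inhibiting Wee1.

```python
def simulate_mcdk(feedback=True, trigger=0.05, t_end=50.0, dt=0.01):
    """Return the active M-Cdk fraction after t_end time units.

    All parameter values are illustrative assumptions, not measurements.
    """
    f = 2.0 if feedback else 0.0   # strength of M-Cdk -> Cdc25 activation
    g = 10.0 if feedback else 0.0  # strength of M-Cdk -> Wee1 inhibition
    wee1 = 1.0                     # baseline Wee1 activity
    x = 0.0                        # start with all M-Cdk inactive
    for _ in range(int(t_end / dt)):
        activation = (trigger + f * x) * (1.0 - x)  # Cdc25 acting on inactive M-Cdk
        inhibition = wee1 * x / (1.0 + g * x)       # Wee1 acting on active M-Cdk
        x += dt * (activation - inhibition)         # forward Euler step
    return x
```

With these numbers the same small trigger drives the active fraction above 0.9 when both feedback loops are present, but leaves it below 0.1 when they are removed, which is the qualitative difference between a switch and a weak graded response.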


Condensin Helps Configure Duplicated Chromosomes for 
Separation 


At the end of S phase, the immensely long DNA molecules of the sister chroma- 
tids are tangled in a mass of partially catenated DNA and proteins. Any attempt to 
pull the sisters apart in this state would undoubtedly lead to breaks in the chro- 
mosomes. To avoid this disaster, the cell devotes a great deal of energy in early 
mitosis to gradually reorganizing the sister chromatids into relatively short, dis- 
tinct structures that can be pulled apart more easily in anaphase. These chromo- 
somal changes involve two processes: chromosome condensation, in which the 
chromatids are dramatically compacted; and sister-chromatid resolution, whereby 
the two sisters are resolved into distinct, separable units (Figure 17-21). Resolu- 
tion results from the decatenation of the sister DNAs, accompanied by the partial 
removal of cohesin molecules along the chromosome arms. As a result, when the 
cell reaches metaphase, the sister chromatids appear in the microscope as com- 
pact, rodlike structures that are joined tightly at their centromeric regions and 
only loosely along their arms. 

Figure 17-21 The mitotic chromosome. Scanning electron micrograph of a 
human mitotic chromosome, consisting of two sister chromatids joined along their 
length. The constricted regions are the centromeres. (Courtesy of Terry D. Allen.) 

[Panel fragment, late mitosis and cytokinesis. Recoverable labels: contractile ring 
creating cleavage furrow; re-formation of interphase array of microtubules nucleated 
by the centrosome. (Micrographs courtesy of Julie Canman and Ted Salmon.)] 

Figure 17-22 Condensin. (A) Condensin is a 
five-subunit protein complex that resembles 
cohesin (see Figure 17-19). The ATPase head 
domains of its two major subunits, Smc2 and 
Smc4, are held together by three additional 
subunits. (B) It is not clear how condensin 
catalyzes the restructuring and compaction 
of chromosome DNA, but it may form a ring 
structure that encircles loops of DNA within each 
sister chromatid. 
[Diagram labels: hinge; Smc2; Smc4; ATPase domain; CAP-D2; CAP-G] 

The condensation and resolution of sister chromatids depend, at least in 
part, on a five-subunit protein complex called condensin. Condensin structure 
is related to that of the cohesin complex that holds sister chromatids together 
(see Figure 17-19). It contains two SMC subunits like those of cohesin, plus three 
non-SMC subunits (Figure 17-22). Condensin may form a ringlike structure that 
somehow uses the energy provided by ATP hydrolysis to promote the compaction 
and resolution of sister chromatids. Condensin is able to change the coiling of 
DNA molecules in a test tube, and this coiling activity is thought to be important 
for chromosome condensation during mitosis. Interestingly, phosphorylation 
of condensin subunits by M-Cdk stimulates this coiling activity, providing one 
mechanism by which M-Cdk may promote chromosome restructuring in early 
mitosis. 


The Mitotic Spindle Is a Microtubule-Based Machine 


The central event of mitosis—chromosome segregation—depends in all eukary- 
otes on a complex and beautiful machine called the mitotic spindle (see Panel 
17-1). The spindle is a bipolar array of microtubules, which pulls sister chroma- 
tids apart in anaphase, thereby segregating the two sets of chromosomes to oppo- 
site ends of the cell, where they are packaged into daughter nuclei (Movie 17.6). 
M-Cdk triggers the assembly of the spindle early in mitosis, in parallel with the 
chromosome restructuring just described. Before we consider how the spindle 
assembles and how its microtubules attach to sister chromatids, we briefly review 
the basic features of spindle structure. 

The core of the mitotic spindle is a bipolar array of microtubules, the minus 
ends of which are focused at the two spindle poles, and the plus ends of which 
radiate outward from the poles (Figure 17-23). The plus ends of some microtu- 
bules—called the interpolar microtubules—overlap with the plus ends of micro- 
tubules from the other pole, resulting in an antiparallel array in the spindle mid- 
zone. The plus ends of other microtubules—the kinetochore microtubules—are 
attached to sister-chromatid pairs at large protein structures called kinetochores, 
which are located at the centromere of each sister chromatid. Finally, many spin- 
dles also contain astral microtubules that radiate outward from the poles and 
contact the cell cortex, helping to position the spindle in the cell. 

In most somatic animal cells, each spindle pole is focused at a protein organ- 
elle called the centrosome (see Figures 16-47 and 16-48). Each centrosome 
consists of a cloud of amorphous material (called the pericentriolar matrix) that 
surrounds a pair of centrioles (Figure 17-24). The pericentriolar matrix nucleates 
a radial array of microtubules, with their fast-growing plus ends projecting out- 
ward and their minus ends associated with the centrosome. The matrix contains a 
variety of proteins, including microtubule-dependent motor proteins, coiled-coil 
proteins that link the motors to the centrosome, structural proteins, and compo- 
nents of the cell-cycle control system. Most important, it contains γ-tubulin ring 
complexes, which are the components mainly responsible for nucleating microtu- 
bules (see Figure 16-46). 

Some cells—notably the cells of higher plants and the oocytes of many ver- 
tebrates—do not have centrosomes, and microtubule-dependent motor proteins 
and other proteins associate with microtubule minus ends to organize and focus 
the spindle poles. 

Figure 17-23 The metaphase mitotic 
spindle in an animal cell. The plus ends 
of the microtubules project away from the 
spindle pole, while the minus ends are 
anchored at the spindle poles, which in this 
example are organized by centrosomes. 
Kinetochore microtubules connect the 
spindle poles with the kinetochores of sister 
chromatids, while interpolar microtubules 
from the two poles interdigitate at the 
spindle equator. Astral microtubules radiate 
out from the poles into the cytoplasm. 
[Diagram labels: spindle pole; centrosome; replicated chromosome (sister 
chromatids); kinetochore; motor protein] 


Microtubule-Dependent Motor Proteins Govern Spindle Assembly 
and Function 


The function of the mitotic spindle depends on numerous microtubule-depen- 
dent motor proteins. As discussed in Chapter 16, these proteins belong to two 
families—the kinesin-related proteins, which usually move toward the plus end 
of microtubules, and dyneins, which move toward the minus end. In the mitotic 
spindle, these motor proteins generally operate at or near the ends of the micro- 
tubules. Four major types of motor proteins—kinesin-5, kinesin-14, kinesins-4/10, 
and dynein—are particularly important in spindle assembly and function (Figure 
17-25). 

Kinesin-5 proteins contain two motor domains that interact with the plus 
ends of antiparallel microtubules in the spindle midzone. Because the two motor 
domains move toward the plus ends of the microtubules, they slide the two anti- 
parallel microtubules past each other toward the spindle poles, pushing the 
poles apart. Kinesin-14 proteins, by contrast, are minus-end directed motors 
with a single motor domain and other domains that can interact with a neighbor- 
ing microtubule. They can cross-link antiparallel interpolar microtubules at the 
spindle midzone and tend to pull the poles together. Kinesin-4 and kinesin-10 







Figure 17-24 The centrosome. (A) Electron micrograph of an S-phase mammalian cell in culture, showing a duplicated centrosome. Each 
centrosome contains a pair of centrioles; although the centrioles have duplicated, they remain together in a single complex, as shown in the drawing 
of the micrograph in (B). One centriole of each centriole pair has been cut in cross section, while the other is cut in longitudinal section, indicating 
that the two members of each pair are aligned at right angles to each other. The two halves of the replicated centrosome, each consisting of a 
centriole pair surrounded by pericentriolar matrix, will split and migrate apart to initiate the formation of the two poles of the mitotic spindle when the
cell enters M phase. (A, from M. McGill, D.P. Highfield, T.M. Monahan, and B.R. Brinkley, J. Ultrastruct. Res. 57:43-53, 1976. With permission from 
Academic Press.) 

proteins, also called chromokinesins, are plus-end directed motors that associate
with chromosome arms and push the attached chromosome away from the pole 
(or the pole away from the chromosome). Finally, dyneins are minus-end directed 
motors that, together with associated proteins, organize microtubules at various 
locations in the cell. They link the plus ends of astral microtubules to components 
of the actin cytoskeleton at the cell cortex, for example; by moving toward the 
minus end of the microtubules, the dynein motors pull the spindle poles toward 
the cell cortex and away from each other. 
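
The roles of the four motor classes can be restated compactly as a small lookup table. The following Python sketch is only an illustrative summary of the preceding paragraphs; the names Motor, SPINDLE_MOTORS, and motors_moving_toward are invented for this example, and each entry records only what the text above states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Motor:
    name: str
    direction: str  # microtubule end the motor moves toward: "plus" or "minus"
    site: str       # where the text places the motor
    effect: str     # effect described in the text


# Hypothetical summary table; entries paraphrase the paragraphs above.
SPINDLE_MOTORS = [
    Motor("kinesin-5", "plus",
          "antiparallel microtubules in the spindle midzone",
          "pushes the poles apart"),
    Motor("kinesin-14", "minus",
          "antiparallel interpolar microtubules at the midzone",
          "tends to pull the poles together"),
    Motor("kinesin-4/10", "plus",
          "chromosome arms",
          "pushes chromosomes away from the pole"),
    Motor("dynein", "minus",
          "astral microtubule plus ends at the cell cortex",
          "pulls the poles toward the cortex and apart"),
]


def motors_moving_toward(end):
    """Return the names of motors that move toward the given microtubule end."""
    return [m.name for m in SPINDLE_MOTORS if m.direction == end]


print(motors_moving_toward("minus"))
```

Grouping by direction makes the opposition explicit: kinesin-5 and kinesin-14 act on the same antiparallel microtubules but move toward opposite ends, which is why they have opposite effects on pole separation.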


Multiple Mechanisms Collaborate in the Assembly of a Bipolar 
Mitotic Spindle 


The mitotic spindle must have two poles if it is to pull the two sets of sister chromatids
to opposite ends of the cell in anaphase. In most animal cells, several mechanisms
ensure the bipolarity of the spindle. One depends on centrosomes. A typical
animal cell enters mitosis with a pair of centrosomes, each of which nucleates
a radial array of microtubules. The two centrosomes provide prefabricated spin- 
dle poles that greatly facilitate bipolar spindle assembly. The other mechanisms 
depend on the ability of mitotic chromosomes to nucleate and stabilize microtu- 
bules and on the ability of motor proteins to organize microtubules into a bipolar 
array. These “self-organization” mechanisms can produce a bipolar spindle even 
in cells lacking centrosomes. 

We now describe the steps of spindle assembly, beginning with centrosome- 
dependent assembly in early mitosis. We then consider the self-organization 
mechanisms that do not require centrosomes and become particularly important 
after nuclear-envelope breakdown. 


Centrosome Duplication Occurs Early in the Cell Cycle 


Most animal cells contain a single centrosome that nucleates most of the cell’s 
cytoplasmic microtubules. The centrosome duplicates when the cell enters the 
cell cycle, so that by the time the cell reaches mitosis there are two centrosomes. 
Centrosome duplication begins at about the same time as the cell enters S phase. 
The G1/S-Cdk (a complex of cyclin E and Cdk2 in animal cells; see Table 17-1) that
triggers cell-cycle entry also helps initiate centrosome duplication. The two cen- 
trioles in the centrosome separate, and each nucleates the formation of a single 
new centriole, resulting in two centriole pairs within an enlarged pericentriolar 
matrix (Figure 17-26). This centrosome pair remains together on one side of the 
nucleus until the cell enters mitosis. 

There are interesting parallels between centrosome duplication and chromo- 
some duplication. Both use a semiconservative mechanism of duplication, in 
which the two halves separate and serve as templates for construction of a new 
half. Centrosomes, like chromosomes, must replicate once and only once per 
cell cycle, to ensure that the cell enters mitosis with only two copies: an incorrect 
number of centrosomes could lead to defects in spindle assembly and thus errors 
in chromosome segregation. 


Figure 17-25 Major motor proteins of the 
spindle. Four major classes of microtubule- 
dependent motor proteins (yellow boxes) 
contribute to spindle assembly and function 
(see text). The colored arrows indicate the 
direction of motor protein movement along 
a microtubule — blue toward the minus end 
and red toward the plus end. 

The mechanisms that limit centrosome duplication to once per cell cycle are
uncertain. In many cell types, experimental inhibition of DNA synthesis blocks 
centrosome duplication, providing one mechanism by which centrosome number 
is kept in check. Other cell types, however, including those in the early embryos 
of flies, sea urchins, and frogs, do not have such a mechanism and centrosome 
duplication continues if chromosome duplication is blocked. It is not known how 
such cells limit centrosome duplication to once per cell cycle. 


M-Cdk Initiates Spindle Assembly in Prophase


Spindle assembly begins in early mitosis, when the two centrosomes move apart 
along the nuclear envelope, pulled by dynein motor proteins that link astral micro- 
tubules to the cell cortex (see Figure 17-25). The plus ends of the microtubules 
between the centrosomes interdigitate to form the interpolar microtubules, and 
kinesin-5 motor proteins associate with these microtubules and push the centrosomes
apart (see Figure 17-25). Also in early mitosis, the number of γ-tubulin ring
complexes in each centrosome increases greatly, increasing the ability of the cen- 
trosomes to nucleate new microtubules, a process called centrosome maturation. 

The balance of opposing forces generated by different types of motor proteins 
determines the final length of the spindle. Dynein and kinesin-5 motors generally 
promote centrosome separation and increase spindle length. Kinesin-14 proteins 
do the opposite: they tend to pull the poles together (see Figure 17-25). It is not 
clear how the cell regulates the balance of opposing forces to generate the appro- 
priate spindle length. 

M-Cdk and other mitotic protein kinases are required for centrosome separa- 
tion and maturation. M-Cdk and Aurora-A phosphorylate kinesin-5 motors and 
stimulate them to drive centrosome separation. Aurora-A and Plk also phosphor- 
ylate components of the centrosome and thereby promote its maturation. 


The Completion of Spindle Assembly in Animal Cells Requires 
Nuclear-Envelope Breakdown 


The centrosomes and microtubules of animal cells are located in the cytoplasm, 
separated from the chromosomes by the double-membrane barrier of the nuclear 
envelope (discussed in Chapter 12). Clearly, the attachment of sister-chromatid 
pairs to the spindle requires the removal of this barrier. In addition, many of the 
motor proteins and microtubule regulators that promote spindle assembly are 
associated with the chromosomes inside the nucleus, and they require nuclear- 
envelope breakdown to carry out their functions. 

Nuclear-envelope breakdown is a complex, multistep process, which is 
thought to begin when M-Cdk phosphorylates several subunits of the nuclear pore 
complexes in the nuclear envelope. This phosphorylation initiates the disassem- 
bly of nuclear pore complexes and their dissociation from the envelope. M-Cdk 

Figure 17-26 Centriole replication. The
centrosome consists of a centriole pair and
associated pericentriolar matrix (green). At
a certain point in G1, the two centrioles of
the pair separate by a few micrometers.
During S phase, a daughter centriole
begins to grow near the base of each
mother centriole and at a right angle to it.
The elongation of the daughter centriole is
usually completed in G2. The two centriole
pairs remain close together in a single
centrosomal complex until the beginning
of M phase, when the complex splits in
two and the two daughter centrosomes
begin to separate. Each centrosome
now nucleates its own radial array of
microtubules (called an aster), mainly from
the mother centriole.

also phosphorylates components of the nuclear lamina, the structural framework
beneath the envelope. The phosphorylation of these lamina components and of 
several inner-nuclear-envelope proteins leads to disassembly of the nuclear lam- 
ina and the breakdown of the envelope membranes into small vesicles. 


Microtubule Instability Increases Greatly in Mitosis 


Most animal cells in interphase contain a cytoplasmic array of microtubules radi- 
ating out from the single centrosome. As discussed in Chapter 16, the microtubules 
of this interphase array are in a state of dynamic instability, in which individual 
microtubules are either growing or shrinking and stochastically switch between 
the two states. The switch from growth to shrinkage is called a catastrophe, and 
the switch from shrinkage to growth is called a rescue. New microtubules are con- 
tinually being created to balance the loss of those that disappear completely by 
depolymerization. 

Entry into mitosis signals an abrupt change in the cell’s microtubules. The 
interphase array of few, long microtubules radiating from the single centrosome is 
converted to a larger number of shorter and more dynamic microtubules emanat- 
ing from both centrosomes. During prophase, and particularly in prometaphase 
and metaphase (see Panel 17-1), the half-life of microtubules decreases dramati- 
cally. This increase in microtubule instability, coupled with the increased ability 
of centrosomes to nucleate microtubules as mentioned earlier, results in remark- 
ably dense and dynamic arrays of spindle microtubules that are ideally suited for 
capturing sister chromatids. 

Microtubule dynamics are controlled in the cell by a variety of regulatory pro- 
teins, including microtubule-associated proteins (MAPs) that promote stability 
and catastrophe factors that destabilize microtubule plus ends. Changes in the 
activities of these regulatory proteins are responsible for the changes in micro- 
tubule dynamics that occur during mitosis. Many of these changes result from 
phosphorylation of specific proteins by M-Cdk and other mitotic protein kinases. 
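
The effect of a higher catastrophe frequency on microtubule lifetime can be illustrated with a toy simulation of dynamic instability. The sketch below is a minimal two-state model, not a description of any measured system: the growth and shrinkage speeds, the catastrophe and rescue rates, and the function names are all assumed values in arbitrary units, chosen only for illustration.

```python
import random


def simulate_lifetime(catastrophe_rate, rescue_rate, rng, growth_speed=1.0,
                      shrink_speed=2.0, dt=0.01, max_time=1000.0):
    """Time until a microtubule nucleated at length 0 shrinks back to 0."""
    length = 0.0
    growing = True
    t = 0.0
    while t < max_time:
        if growing:
            length += growth_speed * dt
            # Catastrophe: switch from growth to shrinkage.
            if rng.random() < catastrophe_rate * dt:
                growing = False
        else:
            length -= shrink_speed * dt
            if length <= 0.0:
                return t  # microtubule has disappeared completely
            # Rescue: switch from shrinkage back to growth.
            if rng.random() < rescue_rate * dt:
                growing = True
        t += dt
    return max_time


def mean_lifetime(catastrophe_rate, rescue_rate, n=200, seed=0):
    """Average lifetime over n simulated microtubules (seeded for repeatability)."""
    rng = random.Random(seed)
    total = sum(simulate_lifetime(catastrophe_rate, rescue_rate, rng)
                for _ in range(n))
    return total / n


# Hypothetical "interphase-like" and "mitosis-like" catastrophe rates.
print("low catastrophe rate :", mean_lifetime(0.5, 0.5))
print("high catastrophe rate:", mean_lifetime(2.0, 0.5))
```

In this model, raising only the catastrophe rate shortens the average lifetime, which is the qualitative change the text attributes to catastrophe factors in mitosis. The model says nothing about how M-Cdk brings that change about.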


Mitotic Chromosomes Promote Bipolar Spindle Assembly 


Chromosomes are not just passive passengers in the process of spindle assem- 
bly. By creating a local environment that favors both microtubule nucleation 
and microtubule stabilization, they play an active part in spindle formation. The 
influence of the chromosomes can be demonstrated by using a fine glass needle 
to reposition them after the spindle has formed. For some cells in metaphase, if 
a single chromosome is tugged out of alignment, a mass of new spindle micro- 
tubules rapidly appears around the newly positioned chromosome, while the 
spindle microtubules at the chromosome’s former position depolymerize. This
property of the chromosomes seems to depend, at least in part, on a guanine 
nucleotide exchange factor (GEF) that is bound to chromatin; the GEF stimulates 
a small GTPase in the cytosol called Ran to bind GTP in place of GDP. The activated 
Ran-GTP, which is also involved in nuclear transport (discussed in Chapter 12), 
releases microtubule-stabilizing proteins from protein complexes in the cytosol, 
thereby stimulating the local nucleation and stabilization of microtubules around 
chromosomes (Figure 17-27). Local microtubule stabilization is also promoted 
by the protein kinase Aurora-B, which associates with mitotic chromosomes. 


Figure 17-27 Activation of the GTPase Ran around mitotic 
chromosomes. The Ran protein, like other members of the small 
GTPase family (discussed in Chapter 15), can exist in two conformations 
depending on whether it is bound to GDP (inactive state) or GTP (active 
state). The localization of active Ran in mitosis was determined using
a protein that emits fluorescence at a specific wavelength when it is
activated by Ran-GTP. In the metaphase human cell shown here, Ran
activity (yellow and red) is highest around the chromosomes, between
the poles of the mitotic spindle (indicated by asterisks). (From P. Kaláb
et al., Nature 440:697-701, 2006. With permission from Macmillan
Publishers Ltd.) 

Figure 17-28 Spindle self-organization by motor proteins. Mitotic chromosomes stimulate the local activation of proteins
that nucleate and promote the formation of microtubules in the vicinity of the chromosomes. Kinesin-5 motor proteins (see 
Figure 17-25) organize these microtubules into antiparallel bundles, while plus-end directed kinesins-4 and 10 link the 
microtubules to chromosome arms and push minus ends away from the chromosomes. Dynein and kinesin-14 motors, 
together with numerous other proteins, focus these minus ends into a pair of spindle poles. 


The ability of chromosomes to stabilize and organize microtubules enables 
cells to form bipolar spindles in the absence of centrosomes. Acentrosomal spin- 
dle assembly is thought to begin with the formation of microtubules around the 
chromosomes. Various motor proteins then organize the microtubules into a 
bipolar spindle, as illustrated in Figure 17-28. 

Cells that normally lack centrosomes, such as those of higher plants and 
many animal oocytes, use this chromosome-based self-organization process to 
form spindles. It is also the process used to assemble spindles in certain animal 
embryos that have been induced to develop from eggs without fertilization (that 
is, parthenogenetically); as the sperm normally provides the centrosome when it 
fertilizes an egg, the mitotic spindles in these parthenogenetic embryos develop 
without centrosomes (Figure 17-29). Even in cells that normally contain centro- 
somes, the chromosomes help organize the spindle microtubules and, with the 
help of various motor proteins, can promote the assembly of a bipolar mitotic 
spindle if the centrosomes are removed. Although the resulting acentrosomal 
spindle can segregate chromosomes normally, it lacks astral microtubules, which 
are responsible for positioning the spindle in animal cells; as a result, the spindle 
is often mispositioned in the cell. 


Kinetochores Attach Sister Chromatids to the Spindle 


Following the assembly of a bipolar microtubule array, the second major step 
in spindle formation is the attachment of the array to the sister-chromatid pairs. 
Spindle microtubules become attached to each chromatid at its kinetochore, 
a giant, multilayered protein structure that is built at the centromeric region of 
the chromatid (Figure 17-30; also see Chapter 4). In metaphase, the plus ends 
of kinetochore microtubules are embedded head-on in specialized microtubule- 
attachment sites within the outer region of the kinetochore, furthest from the 
DNA. The kinetochore of an animal cell can bind 10-40 microtubules, whereas 
a budding yeast kinetochore can bind only one. Attachment of each microtubule 
depends on multiple copies of a rod-shaped protein complex called the Ndc80 
complex, which is anchored in the kinetochore at one end and interacts with the 
sides of the microtubule at the other, thereby linking the microtubule to the kinet- 
ochore while still allowing the addition or removal of tubulin subunits at this end 
(Figure 17-31). Regulation of plus-end polymerization and depolymerization at 
the kinetochore is critical for the control of chromosome movement on the spin- 
dle, as we discuss later. 

Kinetochore attachment to the spindle occurs by a complex sequence of 
events. At the end of prophase in animal cells, the centrosomes of the growing 
spindle generally lie on opposite sides of the nuclear envelope. Thus, when the 
envelope breaks down, the sister-chromatid pairs are bombarded by microtubule 

Figure 17-29 Bipolar spindle assembly
without centrosomes in parthenogenetic 
embryos of the insect Sciara (or fungus 
gnat). The microtubules are stained 
green, the chromosomes red. The top 
fluorescence micrograph shows a normal 
spindle formed with centrosomes in a 
normally fertilized Sciara embryo. The 
bottom micrograph shows a spindle 
formed without centrosomes in an 
embryo that initiated development without 
fertilization. Note that the spindle with 
centrosomes has an aster at each pole of 
the spindle, whereas the spindle formed 
without centrosomes does not. Both 
types of spindles are able to segregate 
the replicated chromosomes. (From B. de 
Saint Phalle and W. Sullivan, J. Cell Biol. 
141:1383-1391, 1998. With permission 
from The Rockefeller University Press.) 

plus ends coming from two directions. However, the kinetochores do not instantly
achieve the correct ‘end-on’ microtubule attachment to both spindle poles. 
Instead, detailed studies with light and electron microscopy show that most initial 
attachments are unstable lateral attachments, in which a kinetochore attaches to 
the side of a passing microtubule, with assistance from kinesin motor proteins in 
the outer kinetochore. Soon, however, the dynamic microtubule plus ends cap- 
ture the kinetochores in the correct end-on orientation (Figure 17-32). 

Another attachment mechanism also plays a part, particularly in the absence 
of centrosomes. Careful microscopic analysis suggests that short microtubules 
in the vicinity of the chromosomes become embedded in the plus-end-binding 
sites of the kinetochore. Polymerization at these plus ends then results in growth 
of the microtubules away from the kinetochore. The minus ends of these kineto- 
chore microtubules are eventually cross-linked to other minus ends and focused 
by motor proteins at the spindle pole (see Figure 17-28). 


Bi-orientation Is Achieved by Trial and Error 


The success of mitosis demands that sister chromatids in a pair attach to opposite 
poles of the mitotic spindle, so that they move to opposite ends of the cell when 

Figure 17-30 The kinetochore.
(A) A fluorescence micrograph of a 
metaphase chromosome stained with a 
DNA-binding fluorescent dye and with 
human autoantibodies that react with 
specific kinetochore proteins. The two 
kinetochores, one associated with each 
sister chromatid, are stained red.
(B) A drawing of a metaphase chromosome 
showing its two sister chromatids 
attached to the plus ends of kinetochore 
microtubules. Each kinetochore forms a 
plaque on the surface of the centromere. 
(C) Electron micrograph of an anaphase 
chromatid with microtubules attached to 
its kinetochore. While most kinetochores 
have a trilaminar structure, the one shown 
here (from a green alga) has an unusually 
complex structure with additional layers. 
(A, courtesy of B.R. Brinkley; C, from 
J.D. Pickett-Heaps and L.C. Fowke, 
Aust. J. Biol. Sci. 23:71-92, 1970. With 
permission from CSIRO.) 








Figure 17-31 Microtubule attachment sites in the kinetochore. (A) In this electron micrograph of a mammalian kinetochore, the chromosome 
is on the right, and the plus ends of multiple microtubules are embedded in the outer kinetochore on the left. (B) Electron tomography (discussed 
in Chapter 9) was used to construct a low-resolution three-dimensional image of the outer kinetochore in (A). Several microtubules (in multiple 
colors) are embedded in fibrous material of the kinetochore, which is thought to be composed of the Ndc80 complex and other proteins. (C) Each 
microtubule is attached to the kinetochore by interactions with multiple copies of the Ndc80 complex (blue). This complex binds to the sides of the 
microtubule near its plus end, allowing polymerization and depolymerization to occur while the microtubule remains attached to the kinetochore. 
(A and B, from Y. Dong et al., Nature Cell Biol. 9:516-522, 2007. With permission from Macmillan Publishers Ltd.) 

Figure 17-32 Chromosome attachment to the mitotic spindle in animal cells. (A) In late prophase of most animal cells, the
mitotic spindle poles have moved to opposite sides of the nuclear envelope, with an array of overlapping microtubules between 
them. (B) Following nuclear envelope breakdown, the sister-chromatid pairs are exposed to the large number of dynamic plus 
ends of microtubules radiating from the spindle poles. In most cases, the kinetochores are first attached to the sides of these 
microtubules, while at the same time the arms of the chromosomes are pushed outward from the spindle interior, preventing 
the arms from blocking microtubule access to the kinetochores. (C) Eventually, the laterally-attached sister chromatids are 
arranged in a ring around the outside of the spindle. Most of the microtubules are concentrated in this ring, so that the spindle 
is relatively hollow inside. (D) Dynamic microtubule plus ends eventually encounter the kinetochores in an end-on orientation and 
are captured and stabilized. (E) Stable end-on attachment to both poles results in bi-orientation. Additional microtubules are 
attached to the kinetochore, resulting in a kinetochore fiber containing 10-40 microtubules. 


they separate in anaphase. How is this mode of attachment, called bi-orienta- 
tion, achieved? What prevents the attachment of both kinetochores to the same 
spindle pole or the attachment of one kinetochore to both spindle poles? Part of 
the answer is that sister kinetochores are constructed in a back-to-back orienta- 
tion that reduces the likelihood that both kinetochores can face the same spindle 
pole. Nevertheless, incorrect attachments do occur, and elegant regulatory mech- 
anisms have evolved to correct them. 

Incorrect attachments are corrected by a system of trial and error that is based 
on a simple principle: incorrect attachments are highly unstable and do not last, 
whereas correct attachments become locked in place. How does the kinetochore 
sense a correct attachment? The answer appears to be tension (Figure 17-33). 
When a sister-chromatid pair is properly bi-oriented on the spindle, the two kine- 
tochores are pulled in opposite directions by strong poleward forces. Sister-chro- 
matid cohesion resists these poleward forces, creating high levels of tension within 
the kinetochores. When chromosomes are incorrectly attached—when both sis- 
ter chromatids are attached to the same spindle pole, for example—tension is low 


Figure 17-33 Alternative forms of
kinetochore attachment to the spindle
poles. (A) Initially, a single microtubule
from a spindle pole binds to one
kinetochore in a sister-chromatid pair.
Additional microtubules can then bind to
the chromosome in various ways.
(B) A microtubule from the same spindle
pole can attach to the other sister
kinetochore, or (C) microtubules from
both spindle poles can attach to the same
kinetochore. These incorrect attachments
are unstable, however, so that one of the
two microtubules tends to dissociate.
(D) When a microtubule from the opposite
pole binds to the second kinetochore,
the sister kinetochores are thought to
sense tension across their
microtubule-binding sites. This triggers an
increase in microtubule binding affinity,
thereby locking the correct attachment in place.

and the kinetochore sends an inhibitory signal that loosens the grip of its microtubule
attachment site, allowing detachment to occur. When bi-orientation occurs,
the high tension at the kinetochore shuts off the inhibitory signal, strengthening 
microtubule attachment. In animal cells, tension not only increases the affinity of 
the attachment site but also leads to the attachment of additional microtubules 
to the kinetochore. This results in the formation of a thick kinetochore fiber com- 
posed of multiple microtubules. 

The tension-sensing mechanism depends on the protein kinase Aurora-B, 
which is associated with the kinetochore and is thought to generate the inhibi- 
tory signal that reduces the strength of microtubule attachment in the absence 
of tension. It phosphorylates several components of the microtubule attachment 
site, including the Ndc80 complex, decreasing the site’s affinity for a microtubule 
plus end. When bi-orientation occurs, the resulting tension somehow reduces 
phosphorylation by Aurora-B, thereby increasing the affinity of the attachment 
site (Figure 17-34). 
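
The trial-and-error logic can be sketched as a simple loop: an attachment that produces no tension is released and tried again, and the first attachment that produces tension is kept. The Python sketch below is a deliberately crude illustration under stated assumptions. Each new microtubule is assumed to come from either pole with equal probability, attachment of one kinetochore to both poles is ignored, and the function name and pole labels are invented for this example.

```python
import random

POLES = ("left", "right")  # hypothetical labels for the two spindle poles


def biorient(rng, max_attempts=1000):
    """Retry attachment of the second sister until the pair is under tension.

    Returns (pole_a, pole_b, attempts).
    """
    pole_a = rng.choice(POLES)  # first sister kinetochore captures a microtubule
    for attempt in range(1, max_attempts + 1):
        pole_b = rng.choice(POLES)  # second sister captures a microtubule
        tension = pole_a != pole_b  # only opposite poles pull against cohesion
        if tension:
            # Tension is assumed to reduce phosphorylation by Aurora-B,
            # so the attachment is kept.
            return pole_a, pole_b, attempt
        # No tension: affinity stays low and the microtubule is released.
    raise RuntimeError("bi-orientation not reached within max_attempts")


pole_a, pole_b, attempts = biorient(random.Random(1))
print(pole_a, pole_b, attempts)
```

The loop keeps no record of earlier failures. Correct attachments are selected only because they are the only ones that persist, which is the principle stated in the text.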

Following their attachment to the two spindle poles, the chromosomes are 
tugged back and forth, eventually assuming a position equidistant between the 
two poles, a position called the metaphase plate. In vertebrate cells, the chro- 
mosomes then oscillate gently at the metaphase plate, awaiting the signal for the 
sister chromatids to separate. The signal is produced, with a predictable lag time, 
after the bi-oriented attachment of the last of the chromosomes. 


Multiple Forces Act on Chromosomes in the Spindle


Multiple mechanisms generate the forces that move chromosomes back and forth 
after they are attached to the spindle, and produce the tension that is so important 
for the stabilization of correct attachments. In anaphase, similar forces pull the 
separated chromatids to opposite ends of the spindle. Three major spindle forces 
are particularly critical, although their strength and importance vary at different 
stages of mitosis. 

The first major force pulls the kinetochore and its associated chromatid along 
the kinetochore microtubule toward the spindle pole. It is produced by proteins at 
the kinetochore itself. By an uncertain mechanism, depolymerization at the plus 
end of the microtubule generates a force that pulls the kinetochore poleward. This 
force pulls on chromosomes during prometaphase and metaphase but is particu- 
larly important for moving sister chromatids toward the poles after they separate 
in anaphase. Interestingly, this kinetochore-generated poleward force does not 
require ATP or motor proteins. This might seem implausible at first, but it has 
been shown that purified kinetochores in a test tube, with no ATP present, can 
remain attached to depolymerizing microtubules and thereby move. The energy 
that drives the movement is stored in the microtubule and is released when the 
microtubule depolymerizes; it ultimately comes from the hydrolysis of GTP that 
occurs after a tubulin subunit adds to the end of a microtubule (discussed in 
Chapter 16). 


Figure 17-34 How tension might 
increase microtubule attachment to the 
kinetochore. These diagrams illustrate 
one speculative mechanism by which 
bi-orientation might increase microtubule 
attachment to the kinetochore. A single 
kinetochore is shown for clarity; the spindle 
pole is on the right. (A) When a sister- 
chromatid pair is unattached to the spindle 
or attached to just one spindle pole, there 
is little tension between the outer and inner 
kinetochores. The protein kinase Aurora-B 
is tethered to the inner kinetochore and 
phosphorylates the microtubule attachment 
sites, including the Ndc80 complex
(blue), in the outer kinetochore as shown,
thereby reducing the affinity of microtubule
binding. Microtubules therefore associate
and dissociate rapidly, and attachment
is unstable. (B) When bi-orientation is
achieved, the forces pulling the kinetochore 
toward the spindle pole are resisted by 
forces pulling the other sister kinetochore 
toward the opposite pole, and the resulting 
tension pulls the outer kinetochore away 
from the inner kinetochore. As a result, 
Aurora-B is unable to reach the outer 
kinetochore, and microtubule attachment 
sites are not phosphorylated. Microtubule 
binding affinity is therefore increased, 
resulting in the stable attachment of 
multiple microtubules to both kinetochores. 
The dephosphorylation of outer kinetochore 
proteins depends on a phosphatase that is 
not shown here. 

How does plus-end depolymerization drive the kinetochore toward the pole?
As we discussed earlier (see Figure 17-31C), Ndc80 complexes in the kineto- 
chore make multiple low-affinity attachments along the side of the microtubule. 
Because the attachments are constantly breaking and re-forming at new sites, the 
kinetochore remains attached to a microtubule even as the microtubule depoly- 
merizes. In principle, this could move the kinetochore toward the spindle pole. 

A second poleward force is provided in some cell types by microtubule flux, 
whereby the microtubules themselves are pulled toward the spindle poles and 
dismantled at their minus ends. The mechanism underlying this poleward move- 
ment is not clear, although it might depend on forces generated by motor proteins 
and minus-end depolymerization at the spindle pole. In metaphase, the addition 
of new tubulin at the plus end of a microtubule compensates for the loss of tubulin 
at the minus end, so that microtubule length remains constant despite the move- 
ment of microtubules toward the spindle pole (Figure 17-35). Any kinetochore 
that is attached to a microtubule undergoing such flux experiences a poleward 
force, which contributes to the generation of tension at the kinetochore in meta- 
phase. Together with the kinetochore-based forces discussed above, flux also con- 
tributes to the poleward forces that move sister chromatids after they separate in 
anaphase. 
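
The bookkeeping behind flux can be made explicit with a short calculation. The sketch below tracks, in the frame of the spindle pole, the position of a microtubule plus end and of a fluorescent speckle on the microtubule lattice. It assumes a flux rate of 0.75 µm/min (the value reported for newt lung cells in Figure 17-35) and a hypothetical 10 µm microtubule; the function name and parameters are invented for illustration.

```python
def simulate_flux(plus_end_um, speckle_um, flux_rate, plus_end_growth,
                  minutes, dt=0.01):
    """Track plus-end and speckle distance from the pole, in micrometers.

    flux_rate: poleward lattice speed, matched by minus-end loss at the pole.
    plus_end_growth: rate of tubulin addition at the plus end.
    Both rates are in micrometers per minute.
    """
    steps = round(minutes / dt)
    for _ in range(steps):
        # The whole lattice, including the speckle, slides toward the pole.
        speckle_um = max(0.0, speckle_um - flux_rate * dt)
        # The plus end slides poleward too, but is extended by new subunits.
        plus_end_um += (plus_end_growth - flux_rate) * dt
    return plus_end_um, speckle_um


length, speckle = simulate_flux(10.0, 8.0, flux_rate=0.75,
                                plus_end_growth=0.75, minutes=4.0)
print(f"length {length:.2f} um, speckle {speckle:.2f} um from the pole")
```

With equal rates the length stays at its starting value while the speckle moves about 3 µm toward the pole in 4 minutes. If plus-end addition is set below the flux rate, the microtubule shortens instead.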

A third force acting on chromosomes is the polar ejection force, or polar wind. 
Plus-end directed kinesin-4 and 10 motors on chromosome arms interact with 
interpolar microtubules and transport the chromosomes away from the spindle 
poles (see Figure 17-25). This force is particularly important in prometaphase and 
metaphase, when it helps push chromosome arms out from the spindle. This force 
might also help align the sister-chromatid pairs at the metaphase plate (Figure 
17-36). 
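
One way to see how opposing ejection forces could center a chromosome is a toy calculation. The sketch below assumes, purely for illustration, that each pole pushes the chromosome away with a force inversely proportional to the chromosome's distance from that pole, and that the chromosome moves at a speed proportional to the net force. Neither assumption comes from the text; the force law, the constant k, and the function name are hypothetical.

```python
def settle_position(x0, k=0.1, dt=0.01, total_time=60.0):
    """Follow a chromosome between poles at x = -1 and x = +1.

    x0 must lie strictly between the poles. Returns the final position.
    """
    if not -1.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between -1 and 1")
    x = x0
    for _ in range(round(total_time / dt)):
        push_from_left = k / (x + 1.0)   # ejection force from the pole at -1
        push_from_right = k / (1.0 - x)  # ejection force from the pole at +1
        # Overdamped motion: velocity is proportional to the net force.
        x += (push_from_left - push_from_right) * dt
    return x


for start in (-0.8, 0.3, 0.9):
    print(start, round(settle_position(start), 4))
```

Under these assumptions the two pushes cancel only at the spindle equator (x = 0), so every starting position drifts toward the middle. The sketch leaves out the poleward forces and the switching between states described in the text.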

One of the most striking aspects of mitosis in vertebrate cells is the continu- 
ous oscillatory movement of the chromosomes in prometaphase and metaphase. 
When studied by video microscopy in newt lung cells, the movements are seen to 
switch between two states—a poleward state, when the chromosomes are pulled 
toward the pole, and an away-from-the-pole, or neutral, state, when the poleward 
forces are turned off and the polar ejection force pushes the chromosomes away 

Figure 17-35 Microtubule flux in the metaphase spindle. (A) To observe microtubule flux, a very small amount of fluorescent tubulin is injected
into living cells so that individual microtubules form with a very small proportion of fluorescent tubulin. Such microtubules have a speckled 
appearance when viewed by fluorescence microscopy. (B) Fluorescence micrograph of a mitotic spindle in a living newt lung epithelial cell. The 
chromosomes are colored brown, and the tubulin speckles are red. (C) The movement of individual speckles can be followed by time-lapse video 
microscopy. Images of the thin vertical boxed region (arrow) in (B), taken at sequential times, show that individual speckles move toward the poles 
at a rate of about 0.75 µm/min, indicating that the microtubules are moving poleward. (D) Microtubule length in the metaphase spindle does not
change significantly because new tubulin subunits are added at the microtubule plus end at the same rate as tubulin subunits are removed from the 
minus end. (B and C, from T.J. Mitchison and E.D. Salmon, Nat. Cell Biol. 3:E17-21, 2001. With permission from Macmillan Publishers Ltd.) 
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from the pole. The switch between the two states may depend on the degree of 
tension in the kinetochore. It has been proposed, for example, that, as chromo- 
somes move toward a spindle pole, an increasing polar ejection force generates 
tension in the kinetochore nearest the pole, triggering a switch to the away-from- 
the-pole state and gradually resulting in the accumulation of chromosomes at the 
equator of the spindle. 


The APC/C Triggers Sister-Chromatid Separation and the 
Completion of Mitosis 


After M-Cdk has triggered the complex processes leading up to metaphase, the 
cell cycle reaches its climax with the separation of the sister chromatids at the 
metaphase-to-anaphase transition (Figure 17-37). Although M-Cdk activity sets 
the stage for this event, the anaphase-promoting complex (APC/C) discussed ear- 
lier throws the switch that initiates sister-chromatid separation by ubiquitylating 
several mitotic regulatory proteins and thereby triggering their destruction (see 
Figure 17-15A). 

During metaphase, cohesins holding the sister chromatids together resist the 
poleward forces that pull the sister chromatids apart. Anaphase begins with the 
sudden loss of sister-chromatid cohesion, which allows the sisters to separate 
and move to opposite poles of the spindle. The APC/C initiates the process by 
targeting the inhibitory protein securin for destruction. Before anaphase, securin 




Figure 17-36 How opposing forces may
drive chromosomes to the metaphase
plate. (A) Evidence for a polar ejection
force that pushes chromosomes away
from the spindle poles toward the spindle
equator. In this experiment, a laser beam
severs a prometaphase chromosome
that is attached to a single pole by a
kinetochore microtubule. The part of
the severed chromosome without a
kinetochore is pushed rapidly away
from the pole, whereas the part with
the kinetochore moves toward the pole,
reflecting a decreased repulsion. (B) A
model of how two opposing forces may
cooperate to move chromosomes to the
metaphase plate. Plus-end-directed motor
proteins (kinesin-4 and kinesin-10) on the
chromosome arms are thought to interact
with microtubules to generate the polar
ejection force, which pushes chromosomes
toward the spindle equator (see Figure
17-25). Poleward forces generated by
depolymerization at the kinetochore,
together with microtubule flux, are thought
to pull chromosomes toward the pole.





Figure 17-37 Sister-chromatid separation at anaphase. In the transition from metaphase (A) to anaphase (B), sister chromatids suddenly 
and synchronously separate and move toward opposite poles of the mitotic spindle—as shown in these light micrographs of Haemanthus (lily) 
endosperm cells that were stained with gold-labeled antibodies against tubulin. (Courtesy of Andrew Bajer.)

binds to and inhibits the activity of a protease called separase. The destruction
of securin at the end of metaphase releases separase, which is then free to cleave 
one of the subunits of cohesin. The cohesins fall away, and the sister chromatids 
separate (Figure 17-38). 

In addition to securin, the APC/C also targets the S- and M-cyclins for destruc- 
tion, leading to the loss of most Cdk activity in anaphase. Cdk inactivation allows 
phosphatases to dephosphorylate the many Cdk target substrates in the cell, as 
required for the completion of mitosis and cytokinesis. 

If the APC/C triggers anaphase, what activates the APC/C? The answer is 
only partly known. As mentioned earlier, APC/C activation requires binding to 
the protein Cdc20 (see Figure 17-15A). At least two processes regulate Cdc20 
and its association with the APC/C. First, Cdc20 synthesis increases as the cell 
approaches mitosis, owing to an increase in the transcription of its gene. Second, 
phosphorylation of the APC/C helps Cdc20 bind to the APC/C, thereby helping to 
create an active complex. Among the kinases that phosphorylate and thus activate 
the APC/C is M-Cdk. Thus, M-Cdk not only triggers the early mitotic events lead- 
ing up to metaphase, but it also sets the stage for progression into anaphase. The 
ability of M-Cdk to promote Cdc20-APC/C activity creates a negative feedback 
loop: M-Cdk sets in motion a regulatory process that leads to cyclin destruction 
and thus its own inactivation. 
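
The feedback loop just described can be illustrated with a toy simulation. The sketch below is not a published model: the two equations, the rate constants (`k_syn`, `k_deg`, `k_act`, `k_inact`), and the function name are assumptions chosen only to show the qualitative behavior. Cyclin level stands in for M-Cdk activity, M-Cdk activates the APC/C, and the active APC/C destroys cyclin.

```python
def simulate_mcdk_feedback(k_act, k_syn=1.0, k_deg=5.0, k_inact=0.5,
                           dt=0.01, steps=2000):
    """Euler integration of a two-variable toy model.

    cyclin: M-cyclin level, used here as a proxy for M-Cdk activity.
    apc:    fraction of Cdc20-APC/C that is active (0 to 1).
    All rate constants are hypothetical and in arbitrary units.
    """
    cyclin, apc = 0.0, 0.0
    cyclin_trace, apc_trace = [], []
    for _ in range(steps):
        # Cyclin is made at a constant rate and destroyed by active APC/C.
        d_cyclin = k_syn - k_deg * apc * cyclin
        # M-Cdk (proportional to cyclin) activates the APC/C; activity decays.
        d_apc = k_act * cyclin * (1.0 - apc) - k_inact * apc
        cyclin += d_cyclin * dt
        apc += d_apc * dt
        cyclin_trace.append(cyclin)
        apc_trace.append(apc)
    return cyclin_trace, apc_trace


# Compare the loop switched on (k_act > 0) with the loop broken (k_act = 0).
with_feedback, _ = simulate_mcdk_feedback(k_act=1.0)
without_feedback, _ = simulate_mcdk_feedback(k_act=0.0)
print("peak cyclin with feedback:", max(with_feedback))
print("final cyclin without feedback:", without_feedback[-1])
```

With the loop intact, cyclin in this sketch is held to a low level by the APC/C activity it induces; with the loop broken, cyclin simply accumulates. The model ignores Cdc20 synthesis and the spindle assembly checkpoint, both of which also regulate the real system.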


Unattached Chromosomes Block Sister-Chromatid Separation: 
The Spindle Assembly Checkpoint 


Drugs that destabilize microtubules, such as colchicine or vinblastine (discussed 
in Chapter 16), arrest cells in mitosis for hours or even days. This observation led 
to the identification of a spindle assembly checkpoint mechanism that is acti- 
vated by the drug treatment and blocks progression through the metaphase-to- 
anaphase transition. The checkpoint mechanism ensures that cells do not enter 
anaphase until all chromosomes are correctly bi-oriented on the mitotic spindle. 

The spindle assembly checkpoint depends on a sensor mechanism that 
monitors the strength of microtubule attachment at the kinetochore, possi- 
bly by sensing tension as described earlier (see Figure 17-34). Any kinetochore 
that is not properly attached to the spindle sends out a diffusible negative sig-
nal that blocks Cdc20-APC/C activation throughout the cell and thus blocks the

Figure 17-38 The initiation of sister-
chromatid separation by the APC/C.
The activation of APC/C by Cdc20 leads
to the ubiquitylation and destruction of
securin, which normally holds separase
in an inactive state. The destruction of
securin allows separase to cleave Scc1,
a subunit of the cohesin complex holding
the sister chromatids together (see Figure
17-19). The pulling forces of the mitotic
spindle then pull the sister chromatids
apart. In animal cells, phosphorylation by
Cdks also inhibits separase (not shown).
Thus, Cdk inactivation in anaphase
(resulting from cyclin destruction) also
promotes separase activation by allowing
its dephosphorylation.

metaphase-to-anaphase transition. When the last sister-chromatid pair is prop-
erly bi-oriented, this block is removed, allowing sister-chromatid separation to
occur.

The negative checkpoint signal depends on several proteins, including 
Mad2, which are recruited to unattached kinetochores (Figure 17-39). Detailed 
structural analyses of Mad2 suggest that the unattached kinetochore acts like 
an enzyme that catalyzes a change in the conformation of Mad2, so that Mad2, 
together with other proteins, can bind and inhibit Cdc20-APC/C. 
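
The logic of the checkpoint described above, in which a single improperly attached kinetochore is enough to hold the whole cell in metaphase, can be sketched as a simple all-or-none gate. The class and function names below are hypothetical, and the sketch reduces attachment to two yes/no flags per sister-chromatid pair; it does not model tension sensing or Mad2 conformational change.

```python
from dataclasses import dataclass


@dataclass
class SisterPair:
    """One sister-chromatid pair; each sister should attach to a different pole."""
    attached_to_pole_a: bool
    attached_to_pole_b: bool

    def bi_oriented(self) -> bool:
        return self.attached_to_pole_a and self.attached_to_pole_b


def checkpoint_signal_on(pairs) -> bool:
    """The inhibitory signal is present if ANY pair is not bi-oriented."""
    return any(not pair.bi_oriented() for pair in pairs)


def apc_can_activate(pairs) -> bool:
    """Cdc20-APC/C is free to act only when no pair emits the signal."""
    return not checkpoint_signal_on(pairs)


# Three pairs are bi-oriented; one is attached to a single pole.
pairs = [SisterPair(True, True) for _ in range(3)] + [SisterPair(True, False)]
print(apc_can_activate(pairs))

# The last pair attaches to the second pole, and the block is removed.
pairs[-1].attached_to_pole_b = True
print(apc_can_activate(pairs))
```

The point of the sketch is that the output depends on the worst-attached pair, not on the average: the number of correctly attached pairs is irrelevant as long as one remains unattached.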

In mammalian somatic cells, the spindle assembly checkpoint determines 
the normal timing of anaphase. The destruction of securin in these cells begins 
moments after the last sister-chromatid pair becomes bi-oriented on the spin- 
dle, and anaphase begins about 20 minutes later. Experimental inhibition of the 
checkpoint mechanism causes premature sister-chromatid separation and ana- 
phase. Surprisingly, the normal timing of anaphase does not depend on the spin- 
dle assembly checkpoint in some cells, such as yeasts and the cells of early frog 
and fly embryos. Other mechanisms, as yet unknown, must determine the timing 
of anaphase in these cells. 


Chromosomes Segregate in Anaphase A and B 


The sudden loss of sister-chromatid cohesion at the onset of anaphase leads to sis- 
ter-chromatid separation, which allows the forces of the mitotic spindle to pull the 
sisters to opposite poles of the cell—called chromosome segregation. The chromo- 
somes move by two independent and overlapping processes. The first, anaphase 
A, is the initial poleward movement of the chromosomes, which is accompanied 
by shortening of the kinetochore microtubules. The second, anaphase B, is the 
separation of the spindle poles themselves, which begins after the sister chroma- 
tids have separated and the daughter chromosomes have moved some distance 
apart (Figure 17-40). 

Chromosome movement in anaphase A depends on a combination of the two 
major poleward forces described earlier. The first is the force generated by micro- 
tubule depolymerization at the kinetochore, which results in the loss of tubulin 
subunits at the plus end as the kinetochore moves toward the pole. The second 
is provided by microtubule flux, which is the poleward movement of the micro- 
tubules toward the spindle pole, where minus-end depolymerization occurs. The 
relative importance of these two forces during anaphase varies in different cell 
types: in embryonic cells, chromosome movement depends mainly on microtu- 
bule flux, for example, whereas movement in yeast and vertebrate somatic cells 
results primarily from forces generated at the kinetochore. 
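
As a rough illustration of how the two poleward contributions combine, the sketch below assumes that they simply add: a chromosome's poleward speed is taken to be the rate of plus-end shortening at the kinetochore plus the rate of flux. The function name is hypothetical. The flux value of 0.75 µm/min is the metaphase rate quoted for newt lung cells in Figure 17-35, and the chromosome speed of 2.0 µm/min is an invented number used only for the arithmetic.

```python
def anaphase_a_contributions(chromosome_speed, flux_speed):
    """Split a poleward chromosome speed into its two assumed components.

    Both speeds are in µm/min. Assumes the simple additive relation
    chromosome_speed = kinetochore_speed + flux_speed.
    Returns the fraction of the movement attributed to each mechanism.
    """
    if chromosome_speed <= 0:
        raise ValueError("chromosome_speed must be positive")
    if not 0 <= flux_speed <= chromosome_speed:
        raise ValueError("flux_speed must lie between 0 and chromosome_speed")
    kinetochore_speed = chromosome_speed - flux_speed
    return {
        "kinetochore": kinetochore_speed / chromosome_speed,
        "flux": flux_speed / chromosome_speed,
    }


# Hypothetical chromosome speed; flux rate taken from Figure 17-35.
shares = anaphase_a_contributions(chromosome_speed=2.0, flux_speed=0.75)
print(shares)
```

In a cell type where flux accounts for nearly all of the movement, the "flux" share would approach 1; where forces at the kinetochore dominate, it would approach 0. The flux rate in anaphase need not equal the metaphase value used here.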

Spindle-pole separation during anaphase B depends on motor-driven mecha- 
nisms similar to those that separate the two centrosomes in early mitosis. Plus- 
end directed kinesin-5 motor proteins, which cross-link the overlapping plus 
ends of the interpolar microtubules, push the poles apart. In addition, dynein 
motors that anchor astral microtubule plus ends to the cell cortex pull the poles 
apart (see Figure 17-25). 

Although sister-chromatid separation initiates the chromosome movements 
of anaphase A, other mechanisms also ensure correct chromosome move- 
ments in anaphase A and spindle elongation in anaphase B. Most importantly, 
the completion of a normal anaphase depends on the dephosphorylation of Cdk 
substrates, which in most cells results from the APC/C-dependent destruction of 
cyclins. If M-cyclin destruction is prevented—by the production of a mutant form 
that is not recognized by the APC/C, for example—sister-chromatid separation 
generally occurs, but the chromosome movements and microtubule behavior of 
anaphase are abnormal. 

The relative contributions of anaphase A and anaphase B to chromosome seg- 
regation vary greatly, depending on the cell type. In mammalian cells, anaphase 
B begins shortly after anaphase A and stops when the spindle is about twice its 
metaphase length; in contrast, the spindles of yeasts and certain protozoa primar- 
ily use anaphase B to separate the chromosomes at anaphase, and their spindles 
elongate to up to 15 times their metaphase length.

Figure 17-39 Mad2 protein on
unattached kinetochores. This
fluorescence micrograph shows a
mammalian cell in prometaphase, with
the mitotic spindle in green and the sister
chromatids in blue. One sister-chromatid
pair is attached to only one pole of the
spindle. Staining with anti-Mad2 antibodies
indicates that Mad2 is bound to the
kinetochore of the unattached sister
chromatid (red dot, indicated by red
arrow). A small amount of Mad2 is
associated with the kinetochore of the
sister chromatid that is attached to the
spindle pole (pale dot, indicated by white
arrow). (From J.C. Waters et al., J. Cell Biol.
141:1181-1191, 1998. With permission
from the authors.)


Segregated Chromosomes Are Packaged in Daughter Nuclei at
Telophase 


By the end of anaphase, the daughter chromosomes have segregated into two 
equal groups at opposite ends of the cell. In telophase, the final stage of mitosis, 
the two sets of chromosomes are packaged into a pair of daughter nuclei. The first 
major event of telophase is the disassembly of the mitotic spindle, followed by 
the re-formation of the nuclear envelope. Initially, nuclear membrane fragments 
associate with the surface of individual chromosomes. These membrane frag- 
ments fuse to partly enclose clusters of chromosomes and then coalesce to re- 
form the complete nuclear envelope. Nuclear pore complexes are incorporated 
into the envelope, the nuclear lamina re-forms, and the envelope once again 
becomes continuous with the endoplasmic reticulum. Once the nuclear enve- 
lope has re-formed, the pore complexes pump in nuclear proteins, the nucleus 
expands, and the mitotic chromosomes are reorganized into their interphase 
state, allowing gene transcription to resume. A new nucleus has been created, and 
mitosis is complete. All that remains is for the cell to complete its division into two. 

We saw earlier that phosphorylation of various proteins by M-Cdk promotes 
spindle assembly, chromosome condensation, and nuclear-envelope breakdown 
in early mitosis. It is thus not surprising that the dephosphorylation of these same 
proteins is required for spindle disassembly and the re-formation of daughter 
nuclei in telophase. In principle, these dephosphorylations and the completion 
of mitosis could be triggered by the inactivation of Cdks, the activation of phos- 
phatases, or both. Although Cdk inactivation—resulting primarily from cyclin 
destruction—is mainly responsible in most cells, some cells also rely on activa- 
tion of phosphatases. In budding yeast, for example, the completion of mitosis 
depends on the activation of a phosphatase called Cdc14, which dephosphory- 
lates a subset of Cdk substrates involved in anaphase and telophase. 


Summary 


M-Cdk triggers the events of early mitosis, including chromosome condensation, 
assembly of the mitotic spindle, and bipolar attachment of the sister-chromatid 
pairs to microtubules of the spindle. Spindle formation in animal cells depends 
largely on the ability of mitotic chromosomes to stimulate local microtubule nucle- 
ation and stability, as well as on the ability of motor proteins to organize micro- 
tubules into a bipolar array. Many cells also use centrosomes to facilitate spindle 
assembly. Anaphase is triggered by the APC/C, which stimulates the destruction of 


Figure 17-40 The two processes of
anaphase in mammalian cells. Separated
sister chromatids move toward the poles
in anaphase A. In anaphase B, the two
spindle poles move apart.

the proteins that hold the sister chromatids together. APC/C also promotes cyclin
destruction and thus the inactivation of M-Cdk. The resulting dephosphorylation of 
Cdk targets is required for the events that complete mitosis, including the disassem- 
bly of the spindle and the re-formation of the nuclear envelope. 


CYTOKINESIS 


The final step in the cell cycle is cytokinesis, the division of the cytoplasm in two. 
In most cells, cytokinesis follows every mitosis, although some cells, such as early 
Drosophila embryos and some mammalian hepatocytes and heart muscle cells, 
undergo mitosis without cytokinesis and thereby acquire multiple nuclei. In most 
animal cells, cytokinesis begins in anaphase and ends shortly after the comple- 
tion of mitosis in telophase. 

The first visible change of cytokinesis in an animal cell is the sudden appear- 
ance of a pucker, or cleavage furrow, on the cell surface. The furrow rapidly deep- 
ens and spreads around the cell until it completely divides the cell in two. The 
structure underlying this process is the contractile ring—a dynamic assembly 
composed of actin filaments, myosin II filaments, and many structural and regula- 
tory proteins. During anaphase, the ring assembles just beneath the plasma mem- 
brane (Figure 17-41; see also Panel 17-1). The ring gradually contracts, and, at 
the same time, fusion of intracellular vesicles with the plasma membrane inserts 
new membrane adjacent to the ring. This addition of membrane compensates for 
the increase in surface area that accompanies cytoplasmic division. When ring 
contraction is completed, membrane insertion and fusion seal the gap between 
the daughter cells. 


Actin and Myosin II in the Contractile Ring Generate the Force for
Cytokinesis 


In interphase cells, actin and myosin II filaments form a cortical network underly- 
ing the plasma membrane. In some cells, they also form large cytoplasmic bundles 
called stress fibers (discussed in Chapter 16). As cells enter mitosis, these arrays 
of actin and myosin disassemble; much of the actin reorganizes, and myosin II 
filaments are released. As the sister chromatids separate in anaphase, actin and 
myosin II begin to accumulate in the rapidly assembling contractile ring (Figure 
17-42), which also contains numerous other proteins that provide structural sup- 
port or assist in ring assembly. Assembly of the contractile ring results in part from 
the local formation of new actin filaments, which depends on formin proteins that

Figure 17-41 Cytokinesis. (A) The actin-myosin bundles of the contractile ring are oriented as shown, so that their contraction pulls the membrane
inward. (B) In this low-magnification scanning electron micrograph of a cleaving frog egg, the cleavage furrow is especially prominent, as the cell
is unusually large. The furrowing of the cell membrane is caused by the activity of the contractile ring underneath it. (C) The surface of a furrow at
higher magnification. (B and C, from H.W. Beams and R.G. Kessel, Am. Sci. 64:279-290, 1976. With permission from Sigma Xi.)

nucleate the assembly of parallel arrays of linear, unbranched actin filaments (dis-
cussed in Chapter 16). After anaphase, the overlapping arrays of actin and myosin
II filaments contract to generate the force that divides the cytoplasm in two. Once 
contraction begins, the ring exerts a force large enough to bend a fine glass needle 
that is inserted in its path. As the ring constricts, it maintains the same thickness, 
suggesting that its total volume and the number of filaments it contains decrease 
steadily. Moreover, unlike actin in muscle, the actin filaments in the ring are 
highly dynamic, and their arrangement changes continually during cytokinesis. 
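
The inference that the ring loses material as it constricts follows from simple geometry. If the ring is approximated as a thin torus, its volume is its circumference multiplied by its cross-sectional area, so at constant thickness the volume falls in direct proportion to the radius. The sketch below works this through; the function name and the numerical dimensions are hypothetical and serve only to illustrate the proportionality.

```python
import math


def ring_volume(ring_radius, cross_section_area):
    """Volume of a thin torus: circumference times cross-sectional area.

    ring_radius is measured from the center of the cell to the center of
    the ring's cross section. Units are arbitrary but must be consistent.
    """
    return 2.0 * math.pi * ring_radius * cross_section_area


# Hypothetical dimensions: the ring keeps the same thickness (same
# cross-sectional area) while its radius shrinks to half.
area = 0.1
volume_start = ring_volume(ring_radius=10.0, cross_section_area=area)
volume_half = ring_volume(ring_radius=5.0, cross_section_area=area)
print(volume_half / volume_start)
```

Under this approximation, a ring that has constricted to half its starting radius at unchanged thickness contains half the material it started with, which is consistent with filaments being lost steadily during constriction.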

The contractile ring is finally dispensed with altogether when cleavage ends, 
as the plasma membrane of the cleavage furrow narrows to form the midbody. 
The midbody persists as a tether between the two daughter cells and contains 
the remains of the central spindle, a large protein structure derived from the anti- 
parallel interpolar microtubules of the spindle midzone, packed tightly together 
within a dense matrix material (Figure 17-43). After the daughter cells separate 
completely, some of the components of the residual midbody often remain on the 
inside of the plasma membrane of each cell, where they may serve as a mark on 
the cortex that helps to orient the spindle in the subsequent cell division. 


Local Activation of RhoA Triggers Assembly and Contraction 
of the Contractile Ring 


RhoA, a small GTPase of the Ras superfamily (see Table 15-5), controls the assem- 
bly and function of the contractile ring at the site of cleavage. RhoA is activated at 
the cell cortex at the future division site, where it promotes actin filament forma-
tion, myosin II assembly, and ring contraction. It stimulates actin filament forma-
tion by activating formins, and it promotes myosin II assembly and contractions
by activating multiple protein kinases, including the Rho-activated kinase Rock 
(Figure 17-44). These kinases phosphorylate the regulatory myosin light chain, 
a subunit of myosin II, thereby stimulating bipolar myosin II filament formation 
and motor activity. 
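
The two branches of RhoA signaling described in this paragraph can be written down as a small directed graph and traversed to list everything downstream of a given component. This is only a bookkeeping sketch of the relationships stated in the text: the node labels and the function name are illustrative, and only Rock is included among the "multiple protein kinases" that RhoA activates.

```python
from collections import deque

# Each key activates or promotes the items in its list.
RHOA_PATHWAY = {
    "RhoA-GTP": ["formins", "Rock"],
    "formins": ["actin filament formation"],
    "Rock": ["myosin light chain phosphorylation"],
    "myosin light chain phosphorylation": [
        "myosin II filament formation",
        "myosin II motor activity",
    ],
}


def downstream(start, pathway=RHOA_PATHWAY):
    """Return every node reachable from start, in breadth-first order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        for target in pathway.get(node, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


for effect in downstream("RhoA-GTP"):
    print(effect)
```

Listing the effects this way makes the branching explicit: one branch runs through formins to actin assembly, and the other runs through kinases such as Rock to myosin II assembly and motor activity.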

RhoA is thought to be activated by a guanine nucleotide exchange factor (RhoGEF),
which is found at the cell cortex at the future division site and stimulates
the release of GDP and binding of GTP to RhoA (see Figure 17-44). We know little
about how the RhoGEF is localized or activated at the division site, although the
microtubules of the anaphase spindle seem to be involved, as we discuss next.


The Microtubules of the Mitotic Spindle Determine the Plane of
Animal Cell Division


The central problem in cytokinesis is how to ensure that division occurs at the
right time and in the right place. Cytokinesis must occur only after the two sets of
chromosomes are fully segregated from each other, and the site of division must

Figure 17-42 The contractile ring.
(A) A drawing of the cleavage furrow in a
dividing cell. (B) An electron micrograph of
the ingrowing edge of a cleavage furrow
of a dividing animal cell. (C) Fluorescence
micrographs of a dividing slime mold
amoeba stained for actin (red) and myosin
II (green). Whereas all of the visible myosin
II has redistributed to the contractile ring,
only some of the actin has done so; the
rest remains in the cortex of the nascent
daughter cells. (B, from H.W. Beams and
R.G. Kessel, Am. Sci. 64:279-290, 1976.
With permission from Sigma Xi; C, courtesy
of Yoshio Fukui.)

be placed between the two sets of daughter chromosomes, thereby ensuring that
each daughter cell receives a complete set. The correct timing and positioning 
of cytokinesis in animal cells are achieved by mechanisms that depend on the 
mitotic spindle. During anaphase, the spindle generates signals that initiate fur- 
row formation at a position midway between the spindle poles, thereby ensuring 
that division occurs between the two sets of separated chromosomes. Because 
these signals originate in the anaphase spindle, this mechanism also contrib- 
utes to the correct timing of cytokinesis in late mitosis. Cytokinesis also occurs 
at the correct time because dephosphorylation of Cdk substrates, which depends 
on cyclin destruction in metaphase and anaphase, initiates cytokinesis. We now 
describe these regulatory mechanisms in more detail, with an emphasis on cyto- 
kinesis in animal cells. 

Studies of the fertilized eggs of marine invertebrates first revealed the impor- 
tance of spindle microtubules in determining the placement of the contractile 
ring. After fertilization, these embryos cleave rapidly without intervening periods 
of growth. In this way, the original egg is progressively divided into smaller and 
smaller cells. Because the cytoplasm is clear, the spindle can be observed in real 
time with a microscope. If the spindle is tugged into a new position with a fine 
glass needle in early anaphase, the incipient cleavage furrow disappears, and a 
new one develops in accord with the new spindle site—supporting the idea that 
signals generated by the spindle induce local furrow formation. 

How does the mitotic spindle specify the site of division? Three general mech- 
anisms have been proposed, and most cells appear to employ a combination of 
these (Figure 17-45). The first is termed the astral stimulation model, in which the 


Figure 17-43 The midbody. (A) A 
scanning electron micrograph of a cultured 
animal cell dividing; the midbody still joins 
the two daughter cells. (B) A conventional 
electron micrograph of the midbody of a 
dividing animal cell. Cleavage is almost 
complete, but the daughter cells remain 
attached by this thin strand of cytoplasm 
containing the remains of the central 
spindle. (A, courtesy of Guenter Albrecht-
Buehler; B, courtesy of J.M. Mullins.)

Figure 17-44 Regulation of the
contractile ring by the GTPase RhoA.
Like other Rho family GTPases, RhoA
is activated by a RhoGEF protein and
inactivated by a Rho GTPase-activating
protein (RhoGAP). The active GTP-
bound form of RhoA is focused at the
future cleavage site. By binding formins,
activated RhoA promotes the assembly of
actin filaments in the contractile ring. By
activating Rho-activated protein kinases,
such as Rock, it stimulates myosin II
filament formation and activity, thereby
promoting contraction of the ring.

astral microtubules carry furrow-inducing signals, which are somehow focused
in a ring on the cell cortex, halfway between the spindle poles. Evidence for this 
model comes from ingenious experiments in large embryonic cells, which dem- 
onstrate that a cleavage furrow forms midway between two asters, even when 
the two centrosomes nucleating the asters are not connected to each other by a 
mitotic spindle (Figure 17-46). 

A second possibility, called the central spindle stimulation model, is that the 
spindle midzone, or central spindle, generates a furrow-inducing signal that spec- 
ifies the site of furrow formation at the cell cortex (see Figure 17-45). The over- 
lapping interpolar microtubules of the central spindle associate with numerous 
signaling proteins, including proteins that may stimulate RhoA (Figure 17-47). 
Defects in the functions of these proteins (in Drosophila mutants, for example) 
result in failure of cytokinesis. 

A third model proposes that, in some cell types, the astral microtubules pro- 
mote the local relaxation of actin-myosin bundles at the cell cortex. According to 
this astral relaxation model, the cortical relaxation is minimal at the spindle equa- 
tor, thus promoting cortical contraction at that site (see Figure 17-45). In the early 
embryos of Caenorhabditis elegans, for example, treatments that result in the loss 
of astral microtubules lead to increased contractile activity throughout the cell 
cortex, consistent with this model. 

In some cell types, the site of ring assembly is chosen before mitosis. In bud-
ding yeasts, for example, a ring of proteins called septins assembles in late G1 at
the future division site. The septins are thought to form a scaffold onto which 
other components of the contractile ring, including myosin II, assemble. In plant 
cells, an organized band of microtubules and actin filaments, called the prepro- 
phase band, assembles just before mitosis and marks the site where the cell wall 
will assemble and divide the cell in two, as we now discuss.

Figure 17-45 Three current models of
how the microtubules of the anaphase
spindle generate signals that influence
the positioning of the contractile
ring. No single model explains all the
observations, and furrow positioning is
probably determined by a combination of
these mechanisms, with the importance
of the different mechanisms varying in
different organisms. See text for details.


Figure 17-46 An experiment
demonstrating the influence of the
position of microtubule asters on the
subsequent plane of cleavage in a
large egg cell. If the mitotic spindle is
mechanically pushed to one side of the
cell with a glass bead, the membrane
furrowing is incomplete, failing to occur on
the opposite side of the cell. Subsequent
cleavages occur not only at the midzone
of each of the two subsequent mitotic
spindles (yellow arrowheads), but also
between the two adjacent asters that are
not linked by a mitotic spindle but that, in this
abnormal cell, share the same cytoplasm
(red arrowhead). Apparently, the contractile
ring that produces the cleavage furrow
in these cells always forms in the region
midway between two asters, suggesting
that the asters somehow alter the adjacent
region of cell cortex to induce furrow
formation between them.


The Phragmoplast Guides Cytokinesis in Higher Plants


In most animal cells, the inward movement of the cleavage furrow depends on an 
increase in the surface area of the plasma membrane. New membrane is added 
at the inner edge of the cleavage furrow and is generally provided by small mem- 
brane vesicles that are transported on microtubules from the Golgi apparatus to 
the furrow. 

Membrane deposition is particularly important for cytokinesis in higher-plant 
cells. These cells are enclosed by a semirigid cell wall. Rather than a contractile 
ring dividing the cytoplasm from the outside in, the cytoplasm of the plant cell is 
partitioned from the inside out by the construction of a new cell wall, called the 
cell plate, between the two daughter nuclei (Figure 17-48). The assembly of the 
cell plate begins in late anaphase and is guided by a structure called the phrag- 
moplast, which contains microtubules derived from the mitotic spindle. Motor 
proteins transport small vesicles along these microtubules from the Golgi appara- 
tus to the cell center. These vesicles, filled with polysaccharide and glycoproteins 
required for the synthesis of the new cell wall, fuse to form a disc-like, membrane-
enclosed structure called the early cell plate. The plate expands outward by further

Figure 17-48 Cytokinesis in a plant cell in telophase. In this light micrograph, the early cell plate
(between the two arrowheads) has formed in a plane perpendicular to the plane of the page. The 
microtubules of the spindle are stained with gold-labeled antibodies against tubulin, and the DNA 
in the two sets of daughter chromosomes is stained with a fluorescent dye. Note that there are no 
astral microtubules, because there are no centrosomes in higher-plant cells. (Courtesy of Andrew 
Bajer.) 


Figure 17-47 Localization of cytokinesis 
regulators at the central spindle of the 
human cell. (A) At center is a cultured 
human cell at the beginning of cytokinesis, 
showing the locations of the GTPase RhoA 
(red) and a protein called Cyk4 (green), 
which is one of several regulatory proteins 
that form complexes at the overlapping 
plus ends of interpolar microtubules. 
These proteins are thought to generate 
signals that help control RhoA activity at 
the cell cortex. (B) When the same three- 
dimensional image is viewed in the plane 
of the contractile ring, as shown here, 
RhoA (red) is seen as a ring beneath the 
cell surface, while the central spindle 
protein Cyk4 (green) is associated with 
microtubule bundles scattered throughout 
the equatorial plane of the cell. (Courtesy of 
Alisa Piekny and Michael Glotzer.)

vesicle fusion until it reaches the plasma membrane and the original cell wall and
divides the cell in two. Later, cellulose microfibrils are laid down within the matrix 
of the cell plate to complete the construction of the new cell wall (Figure 17-49). 


Membrane-Enclosed Organelles Must Be Distributed to Daughter 
Cells During Cytokinesis 


The process of mitosis ensures that each daughter cell receives a full comple- 
ment of chromosomes. When a eukaryotic cell divides, however, each daughter 
cell must also inherit all of the other essential cell components, including the 
membrane-enclosed organelles. As discussed in Chapter 12, organelles such as 
mitochondria and chloroplasts cannot be assembled de novo from their individ- 
ual components; they can arise only by the growth and division of the preexisting 
organelles. Similarly, cells cannot make a new endoplasmic reticulum (ER) unless 
some part of it is already present. 

How, then, do the various membrane-enclosed organelles segregate when a 
cell divides? Organelles such as mitochondria and chloroplasts are usually pres- 
ent in large enough numbers to be safely inherited if, on average, their numbers 
roughly double once each cycle. The ER in interphase cells is continuous with 
the nuclear membrane and is organized by the microtubule cytoskeleton. Upon 
entry into M phase, the reorganization of the microtubules and breakdown of 
the nuclear envelope releases the ER. In most cells, the ER remains largely intact 
and is cut in two during cytokinesis. The Golgi apparatus is reorganized and frag- 
mented during mitosis. Golgi fragments associate with the spindle poles and are 
thereby distributed to opposite ends of the spindle, ensuring that each daughter 
cell inherits the materials needed to reconstruct the Golgi in telophase. 


Some Cells Reposition Their Spindle to Divide Asymmetrically 


Most animal cells divide symmetrically: the contractile ring forms around the 
equator of the parent cell, producing two daughter cells of equal size and with 
the same components. This symmetry results from the placement of the mitotic 
spindle, which in most cases tends to center itself in the cytoplasm. Astral micro- 
tubules and motor proteins that either push or pull on these microtubules con- 
tribute to the centering process. 

Figure 17-49 The special features of cytokinesis in a higher-plant cell. The
division plane is established before M phase by a band of microtubules and
actin filaments (the preprophase band) at the cell cortex. At the beginning of
telophase, after the chromosomes have segregated, a new cell wall starts to
assemble inside the cell at the equator of the old spindle. The interpolar
microtubules of the mitotic spindle remaining at telophase form the
phragmoplast. The plus ends of these microtubules no longer overlap but end at
the cell equator. Golgi-derived vesicles, filled with cell-wall material, are
transported along these microtubules and fuse to form the new cell wall, which
grows outward to reach the plasma membrane and original cell wall. The plasma
membrane and the membrane surrounding the new cell wall fuse, separating the
two daughter cells.

There are many instances in development, however, when cells divide
asymmetrically to produce two cells that differ in size, in the cytoplasmic
contents they inherit, or in both. Usually, the two different daughter cells are
destined to develop along different pathways. To create daughter cells with
different fates in this way, the mother cell must first segregate certain
components (called cell fate determinants) to one side of the cell and then
position the plane of division so that the appropriate daughter cell inherits
these components (Figure 17-50). To position the plane of division
asymmetrically, the spindle has to be moved in a controlled manner within the
dividing cell. It seems likely that changes in local regions of the cell cortex
direct such spindle movements and that motor proteins localized there pull one
of the spindle poles, via its astral microtubules, to the appropriate region.
Genetic analyses in C. elegans and Drosophila have identified some of the
proteins required for such asymmetric divisions, and some of these proteins seem
to have a similar role in vertebrates.


Mitosis Can Occur Without Cytokinesis 


Although nuclear division is usually followed by cytoplasmic division, there are 
exceptions. Some cells undergo multiple rounds of nuclear division without inter- 
vening cytoplasmic division. In the early Drosophila embryo, for example, the 
first 13 rounds of nuclear division occur without cytoplasmic division, resulting in 
the formation of a single large cell containing several thousand nuclei, arranged 
in a monolayer near the surface. A cell in which multiple nuclei share the same 
cytoplasm is called a syncytium. This arrangement greatly speeds up early devel- 
opment, as the cells do not have to take the time to go through all the steps of 
cytokinesis for each division. After these rapid nuclear divisions, membranes are 
created around each nucleus in one round of coordinated cytokinesis called cellu- 
larization. The plasma membrane extends inward and, with the help of an actin- 
myosin ring, pinches off to enclose each nucleus (Figure 17-51). 
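
The arithmetic behind "several thousand nuclei" can be checked with a short sketch. It assumes, as a simplification not stated in the text, that every nucleus divides in every round, so the result is an upper bound; the function name `max_nuclei` is purely illustrative.

```python
def max_nuclei(rounds, starting_nuclei=1):
    """Upper bound on the number of nuclei after synchronous nuclear
    divisions without cytokinesis (assumes every nucleus divides each round)."""
    return starting_nuclei * 2 ** rounds


# Doubling at each of the 13 syncytial divisions of the early Drosophila embryo
for completed_rounds in (1, 5, 10, 13):
    print(completed_rounds, max_nuclei(completed_rounds))
```

Thirteen doublings from a single nucleus give at most 8192 nuclei, which is consistent with the "several thousand" quoted above.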

Nuclear division without cytokinesis also occurs in some types of mammalian 
cells. Megakaryocytes, which produce blood platelets, and some hepatocytes and 
heart muscle cells, for example, become multinucleated in this way. 

After cytokinesis, most cells enter G1, in which Cdks are mostly inactive. We
end this section by discussing how this state is achieved at the end of M phase. 


The G1 Phase Is a Stable State of Cdk Inactivity


A key regulatory event in late M phase is the inactivation of Cdks, which is
driven primarily by APC/C-dependent cyclin destruction. As described earlier,
the inactivation of Cdks in late M phase has many functions: it triggers the
events of late mitosis, promotes cytokinesis, and enables the synthesis of
prereplicative complexes at DNA replication origins. It also provides a
mechanism for resetting the cell-cycle control system to a state of Cdk
inactivity as the cell prepares to enter a new cell cycle. In most cells, this
state of Cdk inactivity generates a G1 gap phase, during which the cell grows
and monitors its environment before committing to a new cell cycle.

Figure 17-50 An asymmetric cell division segregating cytoplasmic components to
only one daughter cell. These light micrographs illustrate the controlled
asymmetric segregation of specific cytoplasmic components to one daughter cell
during the first division of a fertilized egg of the nematode C. elegans. The
fertilized egg is shown in the left micrographs and the two daughter cells in
the right micrographs. The cells above have been stained with a blue,
DNA-binding, fluorescent dye to show the nucleus (and polar bodies); they are
viewed by both differential-interference-contrast and fluorescence microscopy.
The cells below are the same cells stained with an antibody against P-granules
and viewed by fluorescence microscopy. These small granules are made of RNA and
proteins and determine which cells become germ cells. They are distributed
randomly throughout the cytoplasm of the unfertilized egg (not shown) but become
segregated to the posterior pole of the fertilized egg. The cleavage plane is
oriented to ensure that only the posterior daughter cell receives the P-granules
when the egg divides. The same segregation process is repeated in several
subsequent cell divisions, so that the P-granules end up only in cells that give
rise to eggs and sperm. (Courtesy of Susan Strome.)

Figure 17-51 Mitosis without cytokinesis in the early Drosophila embryo.
(A) The first 13 nuclear divisions occur synchronously and without cytoplasmic
division to create a large syncytium. Most of the nuclei migrate to the cortex,
and the plasma membrane extends inward and pinches off to surround each nucleus
to form individual cells in a process called cellularization. (B) Fluorescence
micrograph of multiple mitotic spindles in a Drosophila embryo before
cellularization. The microtubules are stained green and the centrosomes red.
Note that all the nuclei go through the cycle synchronously; here, they are all
in metaphase, with the unlabeled chromosomes seen as a dark band at the spindle
equator. (B, courtesy of Kristina Yu and William Sullivan.)

In early animal embryos, the inactivation of M-Cdk in late mitosis is due 
almost entirely to the action of Cdc20-APC/C, discussed earlier. Recall, however, 
that M-Cdk stimulates Cdc20-APC/C activity. Thus, the destruction of M-cyclin 
in late mitosis soon leads to the inactivation of all APC/C activity in an embry- 
onic cell. This APC/C inactivation immediately after mitosis is especially useful 
in rapid embryonic cell cycles, as it allows the cell to quickly begin accumulating 
new M-cyclin for the next cycle (Figure 17-52A). 

Rapid cyclin accumulation immediately after mitosis is not useful, however, for
cells in which a G1 phase is needed to allow control of entry into the next cell cycle.
These cells employ several mechanisms to prevent Cdk reactivation after mito-
sis. One mechanism uses another APC/C-activating protein called Cdh1, men-
tioned earlier as a close relative of Cdc20 (see Table 17-2). Although both Cdh1
and Cdc20 bind to and activate the APC/C, they differ in one important respect. 
Whereas M-Cdk activates the Cdc20-APC/C complex, it inhibits the Cdh1-APC/C 
complex by directly phosphorylating Cdh1. As a result of this relationship, Cdh1- 
APC/C activity increases in late mitosis after the Cdc20-APC/C complex has initi- 
ated the destruction of M-cyclin. M-cyclin destruction therefore continues after 
mitosis: although Cdc20-APC/C activity has declined, Cdh1-APC/C activity is 
high (Figure 17-52B). 
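
The contrast between the two situations can be illustrated with a deliberately crude discrete-time sketch. Everything numerical here is an assumption made for illustration: the thresholds, the synthesis and degradation rates, the use of M-cyclin level as a stand-in for M-Cdk activity, and the function name `simulate_m_cyclin`. It is not a quantitative model of the cell cycle; it only encodes the two rules stated above, that M-Cdk stimulates Cdc20-APC/C and inhibits Cdh1-APC/C.

```python
def simulate_m_cyclin(steps, has_cdh1, synthesis=1.0, degradation=4.0,
                      cdc20_on=10.0, cdc20_off=2.0, cdh1_on_below=3.0):
    """Toy trace of M-cyclin level after metaphase (all parameters are
    hypothetical). M-cyclin level stands in for M-Cdk activity."""
    cyclin = cdc20_on          # start in mitosis, with high M-cyclin
    cdc20_active = False
    trace = []
    for _ in range(steps):
        # M-Cdk stimulates Cdc20-APC/C: it switches on when M-Cdk is high
        # and switches off once M-cyclin has been largely destroyed.
        if cyclin >= cdc20_on:
            cdc20_active = True
        elif cyclin < cdc20_off:
            cdc20_active = False
        # M-Cdk inhibits Cdh1-APC/C: Cdh1 is active only when M-Cdk is low.
        cdh1_active = has_cdh1 and cyclin < cdh1_on_below
        # Constant synthesis, plus destruction if either APC/C form is active.
        cyclin += synthesis
        if cdc20_active or cdh1_active:
            cyclin = max(0.0, cyclin - degradation)
        trace.append(cyclin)
    return trace


embryonic = simulate_m_cyclin(30, has_cdh1=False)  # early embryonic cycle
with_g1 = simulate_m_cyclin(30, has_cdh1=True)     # cell with a G1 phase
print("embryonic:", embryonic)
print("with Cdh1:", with_g1)
```

In this sketch the embryonic trace falls and then climbs back to the mitotic threshold, because nothing destroys M-cyclin once Cdc20-APC/C switches off. With Cdh1 present, the level stays near zero after mitosis, mirroring the stable G1 state described above.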

A second mechanism that suppresses Cdk activity in G1 depends on the increased
production of CKIs, the Cdk inhibitor proteins discussed earlier. Budding yeast
cells, in which this mechanism is best understood, contain a CKI protein called
Sic1, which binds to and inactivates M-Cdk in late mitosis and G1 (see
Table 17-2). Like Cdh1, Sic1 is inhibited by M-Cdk, which phosphorylates Sic1
during mitosis and thereby promotes its ubiquitylation by SCF. Thus, Sic1 and
M-Cdk, like Cdh1 and M-Cdk, inhibit each other. As a result, the decline in
M-Cdk activity that occurs in late mitosis causes the Sic1 protein to
accumulate, and this CKI helps keep M-Cdk activity low after mitosis. A CKI
protein called p27 (see Figure 17-14) may serve similar functions in animal
cells.

Figure 17-52 The creation of a G1 phase by stable Cdk inhibition after mitosis.
(A) In early embryonic cell cycles, Cdc20-APC/C activity rises at the end of
metaphase, triggering M-cyclin destruction. Because M-Cdk activity stimulates
Cdc20-APC/C activity, the loss of M-cyclin leads to APC/C inactivation after
mitosis, which allows M-cyclins to begin accumulating again. (B) In cells that
have a G1 phase, the drop in M-Cdk activity in late mitosis leads to the
activation of Cdh1-APC/C (as well as to the accumulation of Cdk inhibitor
proteins; not shown). This ensures a continued suppression of Cdk activity after
mitosis, as required for a G1 phase. [Graph labels: Cdc20-APC/C activity;
M-cyclin level; Cdh1-APC/C activity keeps M-cyclin level low in G1.]

In most cells, decreased transcription of M-cyclin genes also inactivates 
M-Cdks in late mitosis. In budding yeast, for example, M-Cdk promotes the 
expression of these genes, resulting in a positive feedback loop. This loop is turned 
off as cells exit from mitosis: the inactivation of M-Cdk by Cdh1 and Sic1 leads
to decreased M-cyclin gene transcription and thus decreased M-cyclin synthesis.
Gene regulatory proteins that promote the expression of G1/S- and S-cyclins are
also inhibited during G1.

Thus, Cdh1-APC/C activation, CKI accumulation, and decreased cyclin gene
expression act together to ensure that the early G1 phase is a time when essen-
tially all Cdk activity is suppressed. As in many other aspects of cell-cycle control,
the use of multiple regulatory mechanisms allows the system to operate with rea-
sonable efficiency even if one mechanism fails. So how does the cell escape from
this stable G1 state to initiate a new cell cycle? The answer is that G1/S-Cdk activ-
ity, which rises in late G1, releases all the braking mechanisms that suppress Cdk
activity, as we describe later, in the last section of this chapter.


Summary 


After mitosis completes the formation of a pair of daughter nuclei, cytokinesis fin- 
ishes the cell cycle by dividing the cell itself. Cytokinesis depends on a ring of actin 
and myosin filaments that contracts in late mitosis at a site midway between the 
segregated chromosomes. In animal cells, the positioning of the contractile ring is 
determined by signals emanating from the microtubules of the anaphase spindle. 
Dephosphorylation of Cdk targets, which results from Cdk inactivation in ana- 
phase, triggers cytokinesis at the correct time after anaphase. After cytokinesis, the 
cell enters a stable G1 state of low Cdk activity, where it awaits signals to enter a new
cell cycle.


MEIOSIS 


Most eukaryotic organisms reproduce sexually: the genomes of two parents mix 
to generate offspring that are genetically distinct from either parent. The cells of 
these organisms are generally diploid: that is, they contain two slightly different 
copies, or homologs, of each chromosome, one from each parent. Sexual repro- 
duction depends on a specialized nuclear division process called meiosis, which 
produces haploid cells carrying only a single copy of each chromosome. In many 
organisms, the haploid cells differentiate into specialized reproductive cells called 
gametes—eggs and sperm in most species. In these species, the reproductive cycle 
ends when a sperm and egg fuse to form a diploid zygote, which has the poten- 
tial to form a new individual. In this section, we consider the basic mechanisms 
and regulation of meiosis, with an emphasis on how they compare with those of 
mitosis. 


Meiosis Includes Two Rounds of Chromosome Segregation 


Meiosis reduces the chromosome number by half using many of the same molec- 
ular machines and control systems that operate in mitosis. As in the mitotic cell 
cycle, the cell begins the meiotic program by duplicating its chromosomes in mei- 
otic S phase, resulting in pairs of sister chromatids that are tightly linked along 
their entire lengths by cohesin complexes. Unlike mitosis, however, two succes- 
sive rounds of chromosome segregation then occur (Figure 17-53). The first of 
these divisions (meiosis I) solves the problem, unique to meiosis, of segregating 
the homologs. The duplicated paternal and maternal homologs pair up alongside 
each other and become physically linked by the process of genetic recombination.
These pairs of homologs, each containing a pair of sister chromatids, then line 
up on the first meiotic spindle. In the first meiotic anaphase, duplicated homo- 
logs rather than sister chromatids are pulled apart and segregated into the two 
daughter nuclei. Only in the second division (meiosis II), which occurs without 
further DNA replication, are the sister chromatids pulled apart and segregated (as 
in mitosis) to produce haploid daughter nuclei. In this way, each diploid nucleus 
that enters meiosis produces four haploid nuclei, each of which contains either 
the maternal or paternal copy of each chromosome, but not both (Movie 17.7). 
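
The bookkeeping in this paragraph can be written out as a small sketch. The stage labels, the tuple layout, and the function name `meiosis_counts` are illustrative choices rather than standard terminology; the haploid number of 23 is the human value.

```python
def meiosis_counts(haploid_number):
    """Return (stage, nuclei, chromosomes per nucleus, chromatids per nucleus)
    for one diploid nucleus passing through meiosis. A 'chromatid' here means
    one double-stranded DNA molecule."""
    n = haploid_number
    return [
        ("before meiotic S phase", 1, 2 * n, 2 * n),
        ("after meiotic S phase",  1, 2 * n, 4 * n),  # each chromosome duplicated
        ("after meiosis I",        2, n,     2 * n),  # homologs segregated
        ("after meiosis II",       4, n,     n),      # sister chromatids segregated
    ]


for stage, nuclei, chromosomes, chromatids in meiosis_counts(23):
    print(f"{stage:24s} nuclei={nuclei} chromosomes={chromosomes} chromatids={chromatids}")
```

Note that the total number of DNA molecules is unchanged after S phase (1 x 92 = 2 x 46 = 4 x 23 for humans), which reflects the absence of DNA replication between the two divisions.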
Figure 17-53 Comparison of meiosis and mitosis. For clarity, only one pair of
homologous chromosomes (homologs) is shown. (A) Meiosis is a form of nuclear
division in which a single round of chromosome duplication (meiotic S phase) is
followed by two rounds of chromosome segregation. The duplicated homologs, each
consisting of tightly bound sister chromatids, pair up and are segregated into
different daughter nuclei in meiosis I; the sister chromatids are segregated in
meiosis II. As indicated by the formation of chromosomes that are partly red and
partly gray, homolog pairing in meiosis leads to genetic recombination
(crossing-over) during meiosis I. Each diploid cell that enters meiosis
therefore produces four genetically different haploid nuclei, which are
distributed by cytokinesis into haploid cells that differentiate into gametes.
(B) In mitosis, by contrast, homologs do not pair up, and the sister chromatids
are segregated during the single division. Thus, each diploid cell that divides
by mitosis produces two genetically identical diploid daughter nuclei, which are
distributed by cytokinesis into a pair of daughter cells.
Duplicated Homologs Pair During Meiotic Prophase


During mitosis in most organisms, homologous chromosomes behave indepen- 
dently of each other. During meiosis I, however, it is crucial that homologs rec- 
ognize each other and associate physically in order for the maternal and pater- 
nal homologs to be bi-oriented on the first meiotic spindle. Special mechanisms 
mediate these interactions. 

The gradual juxtaposition of homologs occurs during a prolonged period 
called meiotic prophase (or prophase I), which can take hours in yeasts, days in 
mice, and weeks in higher plants. Like their mitotic counterparts, duplicated mei- 
otic prophase chromosomes first appear as long threadlike structures, in which 
the sister chromatids are so tightly glued together that they appear as one. It is 
during early prophase I that the homologs begin to associate along their length in 
a process called pairing, which, in some organisms at least, begins with interac- 
tions between complementary DNA sequences (called pairing sites) in the two 
homologs. As prophase progresses, the homologs become more closely juxta- 
posed, forming a four-chromatid structure called a bivalent (Figure 17-54A). In 
most species, homolog pairs are then locked together by homologous recombi- 
nation: DNA double-strand breaks are formed at several locations in each sister 
chromatid, resulting in large numbers of DNA recombination events between the 
homologs (as described in Chapter 5). Some of these events lead to reciprocal 
DNA exchanges called crossovers, where the DNA of a chromatid crosses over to 
become continuous with the DNA of a homologous chromatid (Figure 17-54B; 
also see Figure 5-54). 


Homolog Pairing Culminates in the Formation of a Synaptonemal 
Complex 


The paired homologs are brought into close juxtaposition, with their structural 
axes (axial cores) about 400 nm apart, by a mechanism that depends in most 
species on the double-strand DNA breaks that occur in sister chromatids. What 
pulls the axes together? One possibility is that the large protein machine, called 
a recombination complex, which assembles on a double-strand break in a chro- 
matid, binds the matching DNA sequence in the nearby homolog and helps reel 
in this partner. This so-called presynaptic alignment of the homologs is followed 
by synapsis, in which the axial core of a homolog becomes tightly linked to the 
axial core of its partner by a closely packed array of transverse filaments to cre- 
ate a synaptonemal complex, which bridges the gap, now only 100 nm, between 
the homologs (Figure 17-55). Although crossing-over begins before the synap- 
tonemal complex assembles, the final steps occur while the DNA is held in the 
complex. 

The morphological changes that occur during homolog pairing are the basis for
dividing meiotic prophase into five sequential stages: leptotene, zygotene,
pachytene, diplotene, and diakinesis (Figure 17-56). Prophase starts with
leptotene, when homologs condense and pair and genetic recombination begins. At
zygotene, the synaptonemal complex begins to assemble at sites where the
homologs are closely associated and recombination events are occurring. At
pachytene, the assembly process is complete, and the homologs are synapsed along
their entire lengths (see Figure 9-35). The pachytene stage can persist for days
or longer, until desynapsis begins at diplotene with the disassembly of the
synaptonemal complexes and the concomitant condensation and shortening of the
chromosomes. It is only at this stage, after the complexes have disassembled,
that the individual crossover events between nonsister chromatids can be seen as
inter-homolog connections called chiasmata (singular chiasma), which now play a
crucial part in holding the compact homologs together (Figure 17-57). The
homologs are now ready to begin the process of segregation.

Figure 17-54 Homolog pairing and crossing-over. (A) The structure formed by two
closely aligned duplicated homologs is called a bivalent. As in mitosis, the
sister chromatids in each homolog are tightly connected along their entire
lengths, as well as at their centromeres. At this stage, the homologs are
usually joined by a protein complex called the synaptonemal complex (not shown;
see Figure 17-55). (B) A later-stage bivalent in which a single crossover has
occurred between nonsister chromatids. It is only when the synaptonemal complex
disassembles and the paired homologs separate a little at the end of prophase I,
as shown, that the crossover is seen microscopically as a thin connection
between the homologs called a chiasma.

Figure 17-55 Simplified schematic drawing of a synaptonemal complex. Each
homolog is organized around a protein axial core, and the synaptonemal complex
forms when these homolog axes are linked by rod-shaped transverse filaments. The
axial core of each homolog also interacts with the cohesin complexes that hold
the sister chromatids together (see Figure 9-35). (Modified from K. Nasmyth,
Annu. Rev. Genet. 35:673-745, 2001.)
Figure 17-56 Homolog synapsis and desynapsis during the different stages of
prophase I. (A) A single bivalent is shown schematically. At leptotene, the two
sister chromatids coalesce, and their chromatid loops extend out from a common
axial core. Assembly of the synaptonemal complex begins in early zygotene and is
complete in pachytene. The complex disassembles in diplotene. (B) An electron
micrograph of a synaptonemal complex from a meiotic cell at pachytene in a lily
flower. (C and D) Immunofluorescence micrographs of prophase I cells of the
fungus Sordaria. Partially synapsed bivalents at zygotene are shown in (C) and
fully synapsed bivalents are shown in (D). Red arrowheads in (C) point to
regions where synapsis is still incomplete. (B, courtesy of Brian Wells; C and
D, from A. Storlazzi et al., Genes Dev. 17:2675-2687, 2003. With permission from
Cold Spring Harbor Laboratory Press.)
Figure 17-57 A bivalent with three chiasmata resulting from three crossover
events. (A) Light micrograph of a grasshopper bivalent. (B) Drawing showing the
arrangement of the crossovers in (A). Note that chromatid 1 has undergone an
exchange with chromatid 3, and chromatid 2 has undergone exchanges with
chromatids 3 and 4. Note also how the combination of the chiasmata and the tight
attachment of the sister chromatid arms to each other (mediated by cohesin
complexes) holds the two homologs together after the synaptonemal complex has
disassembled; if either the chiasmata or the sister-chromatid cohesion failed to
form, the homologs would come apart at this stage and not be segregated properly
in meiosis I. (A, courtesy of Bernard John.)


Homolog Segregation Depends on Several Unique Features of
Meiosis I


A fundamental difference between meiosis I and mitosis (and meiosis II) is that 
in meiosis I homologs rather than sister chromatids separate and then segregate 
(see Figure 17-53). This difference depends on three features of meiosis I that dis- 
tinguish it from mitosis (Figure 17-58). 

Figure 17-58 Comparison of chromosome behavior in meiosis I, meiosis II, and
mitosis. Chromosomes behave similarly in mitosis and meiosis II, but they behave
very differently in meiosis I. (A) In meiosis I, the two sister kinetochores are
located side-by-side on each homolog and attach to microtubules from the same
spindle pole. The proteolytic cleavage of cohesin along the sister-chromatid
arms unglues the arms and resolves the crossovers, allowing the duplicated
homologs to separate at anaphase I, while the residual cohesin at the
centromeres keeps the sisters together. Cleavage of centromeric cohesin allows
the sister chromatids to separate at anaphase II. (B) In mitosis, by contrast,
the two sister kinetochores attach to microtubules from different spindle poles,
and the two sister chromatids come apart at the start of anaphase and segregate
into separate daughter nuclei.

First, both sister kinetochores in a homolog must attach stably to the same
spindle pole. This type of attachment is normally avoided during mitosis (see
Figure 17-33). In meiosis I, however, the two sister kinetochores are fused into
a single microtubule-binding unit that attaches to just one pole (see Figure
17-58A). The fusion of sister kinetochores is achieved by a complex of proteins
that is localized at the kinetochores in meiosis I, but we do not know in any
detail how these proteins work. They are removed from kinetochores after meiosis
I, so that in meiosis II the sister-chromatid pairs can be bi-oriented on the
spindle as they are in mitosis.

Second, crossovers generate a strong physical linkage between homologs, 
allowing their bi-orientation at the equator of the spindle—much like cohesion 
between sister chromatids is important for their bi-orientation in mitosis (and 
meiosis II). Crossovers hold homolog pairs together only because the arms of 
the sister chromatids are connected by sister-chromatid cohesion (see Figure 
17-58A). 

Third, cohesion is removed in anaphase I only from chromosome arms and 
not from the regions near the centromeres, where the kinetochores are located. 
The loss of arm cohesion triggers homolog separation at the onset of anaphase I. 
This process depends on APC/C activation, which leads to securin destruction, 
separase activation, and cohesin cleavage along the arms (see Figure 17-38). 

Cohesins near the centromeres are protected from separase in meiosis I by 
a kinetochore-associated protein called shugoshin (from the Japanese word 
for “guardian spirit”). Shugoshin acts by recruiting a protein phosphatase that 
removes phosphates from centromeric cohesins. Cohesin phosphorylation is 
normally required for separase to cleave cohesin; thus, removal of this phosphor- 
ylation near the centromere prevents cohesin cleavage. Sister-chromatid pairs 
therefore remain linked through meiosis I, allowing their correct bi-orientation 
on the spindle in meiosis II. Shugoshin is inactivated after meiosis I. At the onset 
of anaphase II, APC/C activation triggers centromeric cohesin cleavage and sister- 
chromatid separation—much as it does in mitosis. Following anaphase II, nuclear 
envelopes form around the chromosomes to produce four haploid nuclei, after 
which cytokinesis and other differentiation processes lead to the production of 
haploid gametes. 


Crossing-Over Is Highly Regulated 


Crossing-over has two distinct functions in meiosis: it helps hold homologs 
together so that they are properly segregated to the two daughter nuclei produced 
by meiosis I, and it contributes to the genetic diversification of the gametes that 
are eventually produced. As might be expected, therefore, crossing-over is highly 
regulated: the number and location of double-strand breaks along each chromo- 
some is controlled, as is the likelihood that a break will be converted into a cross- 
over. On average, the result of this regulation is that each pair of human homologs 
is linked by about two or three crossovers (Figure 17-59). 

Although the double-strand breaks that occur in meiosis I can be located
almost anywhere along the chromosome, they are not distributed uniformly: they
cluster at “hot spots,” where the DNA is accessible, and occur only rarely in “cold
spots,” such as the heterochromatin regions around centromeres and telomeres.
Figure 17-59 Crossovers between homologs in the human testis. In these
immunofluorescence micrographs, antibodies have been used to stain the
synaptonemal complexes (red), the centromeres (blue), and the sites of
crossing-over (green). Note that all of the bivalents have at least one
crossover and none have more than four. (Modified from A. Lynn et al., Science
296:2222-2225, 2002. With permission from AAAS.)
At least two kinds of regulation influence the location and number of cross-
overs that form, neither of which is well understood. Both operate before the
synaptonemal complex assembles. One ensures that at least one crossover forms 
between the members of each homolog pair, as is necessary for normal homolog 
segregation in meiosis I. In the other, called crossover interference, the presence 
of one crossover event inhibits another from forming close by, perhaps by locally 
depleting proteins required for converting a double-strand DNA break into a sta- 
ble crossover. 
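
Because the mechanism of crossover interference is not well understood, the following is only a toy sketch of its consequence, not of how cells achieve it. It assumes a hard minimum spacing between crossovers, positions measured in arbitrary units along one chromosome, and candidate double-strand breaks tried in random order; the function name `place_crossovers` and all numbers are hypothetical. In this sketch the first candidate is always accepted, so at least one crossover forms by construction whenever any break exists.

```python
import random


def place_crossovers(break_sites, interference, rng):
    """Toy model: convert candidate double-strand breaks into crossovers,
    rejecting any break that lies within `interference` units of a crossover
    already chosen (an assumed hard-exclusion rule)."""
    candidates = list(break_sites)
    rng.shuffle(candidates)  # breaks are considered in random order
    crossovers = []
    for site in candidates:
        if all(abs(site - chosen) >= interference for chosen in crossovers):
            crossovers.append(site)
    return sorted(crossovers)


rng = random.Random(0)  # fixed seed so the run is repeatable
breaks = [rng.uniform(0.0, 100.0) for _ in range(20)]
crossovers = place_crossovers(breaks, interference=30.0, rng=rng)
print(len(breaks), "breaks ->", len(crossovers), "crossovers at", crossovers)
```

With 20 breaks on a chromosome of length 100 and an exclusion distance of 30, no more than four crossovers can form in this model, so most breaks are not converted into crossovers.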


Meiosis Frequently Goes Wrong 


The sorting of chromosomes that takes place during meiosis is a remarkable feat 
of intracellular bookkeeping. In humans, each meiosis requires that the starting 
cell keep track of 92 chromatids (46 chromosomes, each of which has duplicated), 
distributing one complete set of each type of autosome to each of the four hap- 
loid progeny. Not surprisingly, mistakes can occur in allocating the chromosomes 
during this elaborate process. Mistakes are especially common in human female 
meiosis, which arrests for years after diplotene: meiosis I is completed only at ovu- 
lation, and meiosis II only after the egg is fertilized. Indeed, such chromosome 
segregation errors during egg development are the most common cause of both 
spontaneous abortion (miscarriage) and mental retardation in humans. 

When homologs fail to separate properly—a phenomenon called nondisjunc- 
tion—the result is that some of the resulting haploid gametes lack a particular 
chromosome, while others have more than one copy of it. Upon fertilization, 
these gametes form abnormal embryos, most of which die. Some survive, how- 
ever. Down syndrome in humans, for example, which is the leading cause of men- 
tal retardation, is caused by an extra copy of chromosome 21, usually resulting 
from nondisjunction during meiosis I in the female ovary. Segregation errors dur- 
ing meiosis I increase greatly with advancing maternal age. 
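
The consequence of nondisjunction for a single chromosome can be traced with a short sketch. It follows one homolog pair only, treats meiosis II as normal, and assumes the partner gamete at fertilization carries one copy; the function names are illustrative.

```python
def gamete_copies(nondisjunction_in_meiosis_i=False):
    """Copies of one chromosome in each of the four haploid products of a
    single meiosis (meiosis II is assumed to proceed normally)."""
    # Meiosis I normally sends one duplicated homolog to each daughter nucleus;
    # with nondisjunction, both homologs go to the same nucleus.
    homologs_per_nucleus = [2, 0] if nondisjunction_in_meiosis_i else [1, 1]
    gametes = []
    for homologs in homologs_per_nucleus:
        # Meiosis II separates sister chromatids, so each daughter receives
        # one chromatid from every duplicated homolog in the nucleus.
        gametes.extend([homologs, homologs])
    return gametes


def zygote_copies(gamete, partner_gamete=1):
    """Copies of the chromosome in the zygote after fertilization."""
    return gamete + partner_gamete


for label, failed in (("normal", False), ("nondisjunction", True)):
    gametes = gamete_copies(failed)
    print(label, gametes, [zygote_copies(g) for g in gametes])
```

After nondisjunction in meiosis I, two of the four products carry two copies and two carry none. Fertilization by a normal gamete then gives zygotes with three copies (as for chromosome 21 in Down syndrome) or with one copy.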


Summary 


Haploid gametes are produced by meiosis, in which a diploid nucleus undergoes 
two successive cell divisions after one round of DNA replication. Meiosis is domi- 
nated by a prolonged prophase. At the start of prophase, the chromosomes have 
replicated and consist of two tightly joined sister chromatids. Homologous chromo- 
somes then pair up and become progressively more closely juxtaposed as prophase 
proceeds. The tightly aligned homologs undergo genetic recombination, forming 
crossovers that help hold each pair of homologs together during metaphase I. Meio- 
sis-specific, kinetochore-associated proteins help ensure that both sister chromatids 
in a homolog attach to the same spindle pole; other kinetochore-associated proteins 
ensure that the homologs remain connected at their centromeres during anaphase 
I, so that homologs rather than sister chromatids are segregated in meiosis I. After 
meiosis I, meiosis II follows rapidly, without DNA replication, in a process that 
resembles mitosis, in that sister chromatids are pulled apart at anaphase. 


CONTROL OF CELL DIVISION AND CELL GROWTH 


A fertilized mouse egg and a fertilized human egg are similar in size, yet they pro- 
duce animals of very different sizes. What factors in the control of cell behavior in 
humans and mice are responsible for these size differences? The same fundamen- 
tal question can be asked for each organ and tissue in an animal’s body. What fac- 
tors determine the length of an elephant’s trunk or the size of its brain or its liver? 
These questions are largely unanswered, but it is nevertheless possible to say what 
the ingredients of an answer must be. 

The size of an organ or organism depends on its total cell mass, which depends 
on both the total number of cells and their size. Cell number, in turn, depends 
on the amounts of cell division and cell death. Organ and body size are therefore 
determined by three fundamental processes: cell growth, cell division, and cell 
survival. Each is tightly regulated, both by intracellular programs and by
extracellular signal molecules that control these programs.
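
The bookkeeping in this paragraph can be written out as a minimal sketch. The function names and the numbers below are hypothetical and chosen only for illustration; they are not measurements from any tissue.

```python
def cell_number_after(initial_cells: float, divisions: float, deaths: float) -> float:
    """Net cell number: each division adds one cell, each death removes one."""
    return initial_cells + divisions - deaths


def total_cell_mass(cell_number: float, mean_cell_mass: float) -> float:
    """Total cell mass as cell number times mean mass per cell."""
    return cell_number * mean_cell_mass


# Hypothetical values, in arbitrary units.
cells = cell_number_after(initial_cells=1000, divisions=500, deaths=200)
mass = total_cell_mass(cells, mean_cell_mass=2.0)
print(cells, mass)
```

The sketch makes the point of the text explicit: the same total mass can be reached by changing cell growth (mean mass per cell), cell division, or cell survival.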

The extracellular signal molecules that regulate cell growth, division, and 
survival are generally soluble secreted proteins, proteins bound to the surface of 
cells, or components of the extracellular matrix. They can be divided operation- 
ally into three major classes: 


1. Mitogens, which stimulate cell division, primarily by triggering a wave of 
G1/S-Cdk activity that relieves intracellular negative controls that
otherwise block progress through the cell cycle. 


2. Growth factors, which stimulate cell growth (an increase in cell mass) by 
promoting the synthesis of proteins and other macromolecules and by 
inhibiting their degradation. 


3. Survival factors, which promote cell survival by suppressing the form of 
programmed cell death known as apoptosis. 


Many extracellular signal molecules promote all of these processes, while oth- 
ers promote one or two of them. Indeed, the term growth factor is often used inap- 
propriately to describe a factor that has any of these activities. Even worse, the 
term cell growth is often used to mean an increase in cell number, or cell prolifera- 
tion. 

In addition to these three classes of stimulating signals, there are extracellular 
signal molecules that suppress cell proliferation, cell growth, or both; in general, 
less is known about them. There are also extracellular signal molecules that acti- 
vate apoptosis. 

In this section, we focus primarily on how mitogens and other factors, such as 
DNA damage, control the rate of cell division. We then turn to the important but 
poorly understood problem of how a proliferating cell coordinates its growth with 
cell division so as to maintain its appropriate size. We discuss the control of cell 
survival and cell death by apoptosis in Chapter 18. 


Mitogens Stimulate Cell Division 


Unicellular organisms tend to grow and divide as fast as they can, and their rate of 
proliferation depends largely on the availability of nutrients in the environment. 
The cells of a multicellular organism, however, divide only when the organism 
needs more cells. Thus, for an animal cell to proliferate, it must receive stimu- 
latory extracellular signals, in the form of mitogens, from other cells, usually its 
neighbors. Mitogens overcome intracellular braking mechanisms that block prog- 
ress through the cell cycle. 

One of the first mitogens to be identified was platelet-derived growth factor 
(PDGF), and it is typical of many others discovered since. The path to its isola- 
tion began with the observation that fibroblasts in a culture dish proliferate when 
provided with serum but not when provided with plasma. Plasma is prepared by 
removing the cells from blood without allowing clotting to occur; serum is pre- 
pared by allowing blood to clot and taking the cell-free liquid that remains. When 
blood clots, platelets incorporated in the clot are stimulated to release the con- 
tents of their secretory vesicles (Figure 17-60). The superior ability of serum to 
support cell proliferation suggested that platelets contain one or more mitogens. 
This hypothesis was confirmed by showing that extracts of platelets could serve 
instead of serum to stimulate fibroblast proliferation. The crucial factor in the 
extracts was shown to be a protein, which was subsequently purified and named 
PDGF. In the body, PDGF liberated from blood clots helps stimulate cell division 
during wound healing. 

PDGF is only one of over 50 animal proteins that are known to act as mitogens. 
Most of these proteins have a broad specificity. PDGF, for example, can stimulate 
many types of cells to divide, including fibroblasts, smooth muscle cells, and neu- 
roglial cells. Similarly, epidermal growth factor (EGF) acts not only on epidermal 
cells but also on many other cell types, including both epithelial and
nonepithelial cells. Some mitogens, however, have a narrow specificity;
erythropoietin, for example, only induces the proliferation of red blood cell
precursors. Many mitogens, including PDGF, also have actions other than the
stimulation of cell division: they can stimulate cell growth, survival,
differentiation, or migration, depending on the circumstances and the cell type.

Figure 17-60 A platelet. Platelets are miniature cells without a nucleus. They
circulate in the blood and help stimulate blood clotting at sites of tissue
damage, thereby preventing excessive bleeding. They also release various factors
that stimulate wound healing. The platelet shown here has been cut in half to
show its secretory vesicles, some of which contain platelet-derived growth
factor (PDGF).

In some tissues, inhibitory extracellular signal proteins oppose the positive 
regulators and thereby inhibit organ growth. The best-understood inhibitory 
signal proteins are transforming growth factor-β (TGFβ) and its relatives. TGFβ 
inhibits the proliferation of several cell types, mainly by blocking cell-cycle
progression in G1. 


Cells Can Enter a Specialized Nondividing State 


In the absence of a mitogenic signal to proliferate, Cdk inhibition in G1 is
maintained by the multiple mechanisms discussed earlier, and progression into a new 
cell cycle is blocked. In some cases, cells partly disassemble their cell-cycle con- 
trol system and withdraw from the cycle to a specialized nondividing state called 
G0. 

Most cells in our body are in G0, but the molecular basis and reversibility of 
this state vary in different cell types. Most of our neurons and skeletal muscle cells, 
for example, are in a terminally differentiated G0 state, in which their cell-cycle 
control system is completely dismantled: the expression of the genes encoding 
various Cdks and cyclins is permanently turned off, and cell division rarely occurs. 
Some cell types withdraw from the cell cycle only transiently and retain the ability 
to reassemble the cell-cycle control system quickly and re-enter the cycle. Most 
liver cells, for example, are in G0, but they can be stimulated to divide if the liver 
is damaged. Still other types of cells, including fibroblasts and some lymphocytes, 
withdraw from and re-enter the cell cycle repeatedly throughout their lifetime. 

Almost all the variation in cell-cycle length in the adult body occurs during the 
time the cell spends in G1 or G0. By contrast, the time a cell takes to progress from 
the beginning of S phase through mitosis is usually brief (typically 12-24 hours in 
mammals) and relatively constant, regardless of the interval from one division to 
the next. 


Mitogens Stimulate G1-Cdk and G1/S-Cdk Activities 


For the vast majority of animal cells, mitogens control the rate of cell division by 
acting in the G1 phase of the cell cycle. As discussed earlier, multiple mechanisms 
act during G1 to suppress Cdk activity. Mitogens release these brakes on Cdk 
activity, thereby allowing entry into a new cell cycle. 

As we discuss in Chapter 15, mitogens interact with cell-surface receptors to 
trigger multiple intracellular signaling pathways. One major pathway acts through 
the monomeric GTPase Ras, which leads to the activation of a mitogen-activated 
protein kinase (MAP kinase) cascade (see Figure 15-49). This leads to an increase 
in the production of transcription regulatory proteins, including Myc. Myc is 
thought to promote cell-cycle entry by several mechanisms, one of which is to 
increase the expression of genes encoding G1 cyclins (D cyclins), thereby
increasing G1-Cdk (cyclin D-Cdk4) activity. Myc also has a major role in stimulating the 
transcription of genes that increase cell growth. 

The key function of G1-Cdk complexes in animal cells is to activate a group 
of gene regulatory factors called the E2F proteins, which bind to specific DNA 
sequences in the promoters of a wide variety of genes that encode proteins required 
for S-phase entry, including G1/S-cyclins, S-cyclins, and proteins involved in DNA 
synthesis and chromosome duplication. In the absence of mitogenic stimulation, 
E2F-dependent gene expression is inhibited by an interaction between E2F and 
members of the retinoblastoma protein (Rb) family. When cells are stimulated 
to divide by mitogens, active G1-Cdk accumulates and phosphorylates Rb family 
members, reducing their binding to E2F. The liberated E2F proteins then activate 
expression of their target genes (Figure 17-61). 

This transcriptional control system, like so many other control systems that 
regulate the cell cycle, includes feedback loops that ensure that entry into the 
cell cycle is complete and irreversible. The liberated E2F proteins, for example, 
increase the transcription of their own genes. In addition, E2F-dependent
transcription of G1/S-cyclin (cyclin E) and S-cyclin (cyclin A) genes leads to increased 
G1/S-Cdk and S-Cdk activities, which in turn increase Rb protein phosphorylation 
and promote further E2F release (see Figure 17-61). 
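
How positive feedback can make cell-cycle entry switch-like and irreversible can be illustrated with a toy model. The sketch below is not a model from this chapter: the single variable, the Hill-type feedback term, and every parameter value are assumptions chosen only to show the qualitative behavior. Free E2F activity (arbitrary units) is driven by a transient mitogen input, stimulates its own production, and decays at a constant rate.

```python
def simulate_e2f(pulse_strength, pulse_duration=20.0, total_time=60.0, dt=0.01):
    """Euler-integrate a toy model of free E2F activity (arbitrary units).

    dE/dt = mitogen_input + E^2 / (K^2 + E^2) - E

    The middle term stands in for the positive feedback loops (E2F driving
    its own release and transcription); the last term is first-order loss.
    """
    K = 0.3  # assumed half-maximal activity for the feedback term
    e2f = 0.0
    steps = round(total_time / dt)
    for step in range(steps):
        t = step * dt
        # The mitogen is present only during the initial pulse.
        mitogen_input = pulse_strength if t < pulse_duration else 0.0
        feedback = e2f ** 2 / (K ** 2 + e2f ** 2)
        e2f += dt * (mitogen_input + feedback - e2f)
    return e2f


strong = simulate_e2f(pulse_strength=0.2)   # mitogen pulse above threshold
weak = simulate_e2f(pulse_strength=0.01)    # mitogen pulse below threshold
print(f"E2F long after a strong pulse: {strong:.2f}")
print(f"E2F long after a weak pulse:   {weak:.2f}")
```

In this sketch a sufficiently strong pulse pushes E2F activity past a threshold, after which the feedback term sustains a high level even though the mitogen has been removed; a weak pulse leaves the system in the low state. Real cells achieve this with several interlocking loops rather than a single equation.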

The central member of the Rb family, the Rb protein itself, was identified orig- 
inally through studies of an inherited form of eye cancer in children, known as 
retinoblastoma (discussed in Chapter 20). The loss of both copies of the Rb gene 
leads to excessive proliferation of some cells in the developing retina, suggesting 
that the Rb protein is particularly important for restraining cell division in this tis- 
sue. The complete loss of Rb does not immediately cause increased proliferation 
of retinal or other types of cells, in part because Cdh1 and CKIs also help inhibit 
progression through G1, and in part because other cell types contain Rb-related 
proteins that provide backup support in the absence of Rb. It is also likely that 
other proteins, unrelated to Rb, help to regulate the activity of E2F. 

Figure 17-61 Mitogen stimulation of cell-cycle entry. As discussed in Chapter
15, mitogens bind to cell-surface receptors to initiate intracellular signaling
pathways. One of the major pathways involves activation of the small GTPase Ras,
which activates a MAP kinase cascade, leading to increased expression of
numerous immediate early genes, including the gene encoding the transcription
regulatory protein Myc. Myc increases the expression of many delayed-response
genes, including some that lead to increased G1-Cdk activity (cyclin D-Cdk4),
which triggers the phosphorylation of members of the Rb family of proteins. This
inactivates the Rb proteins, freeing the gene regulatory protein E2F to activate
the transcription of G1/S genes, including the genes for a G1/S-cyclin (cyclin
E) and S-cyclin (cyclin A). The resulting G1/S-Cdk and S-Cdk activities further
enhance Rb protein phosphorylation, forming a positive feedback loop. E2F
proteins also stimulate the transcription of their own genes, forming another
positive feedback loop.


Additional layers of control promote an overwhelming increase in S-Cdk 
activity at the beginning of S phase. We mentioned earlier that the APC/C
activator Cdh1 suppresses cyclin levels after mitosis. In animal cells, however, G1-
and G1/S-cyclins are resistant to Cdh1-APC/C and can therefore act unopposed 
by the APC/C to promote Rb protein phosphorylation and E2F-dependent gene 
expression. S-cyclin, by contrast, is not resistant, and its level is initially restrained 
by Cdh1-APC/C activity. However, G1/S-Cdk also phosphorylates and inactivates 
Cdh1-APC/C, thereby allowing the accumulation of S-cyclin, further
promoting S-Cdk activation. G1/S-Cdk also inactivates CKI proteins that suppress S-Cdk 
activity. The overall effect of all these interactions is the rapid and complete
activation of the S-Cdk complexes required for S-phase initiation. 


DNA Damage Blocks Cell Division: The DNA Damage Response 


Progression through the cell cycle, and thus the rate of cell proliferation, is con- 
trolled not only by extracellular mitogens but also by other extracellular and intra- 
cellular signals. One of the most important influences is DNA damage, which can 
occur as a result of spontaneous chemical reactions in DNA, errors in DNA repli- 
cation, or exposure to radiation or certain chemicals (discussed in Chapter 5). It is 
essential that damaged chromosomes are repaired before attempting to duplicate 
or segregate them. The cell-cycle control system can readily detect DNA damage 
and arrest the cycle at either of two transitions—one at Start, which prevents entry 
into the cell cycle and into S phase, and one at the G2/M transition, which pre- 
vents entry into mitosis (see Figure 17-16). 

DNA damage initiates a signaling pathway by activating one of a pair of related 
protein kinases called ATM and ATR, which associate with the site of damage and 
phosphorylate various target proteins, including two other protein kinases called 
Chk1 and Chk2. These various kinases phosphorylate other target proteins that 
lead to cell-cycle arrest. A major target is the gene regulatory protein p53, which 
stimulates transcription of the gene encoding p21, a CKI protein; p21 binds to 
G1/S-Cdk and S-Cdk complexes and inhibits their activities, thereby helping to 
block entry into the cell cycle (Figure 17-62 and Movie 17.8). 

DNA damage activates p53 by an indirect mechanism. In undamaged cells, p53 
is highly unstable and is present at very low concentrations. This is largely because 
it interacts with another protein, Mdm2, which acts as a ubiquitin ligase that tar- 
gets p53 for destruction by proteasomes. Phosphorylation of p53 after DNA dam- 
age reduces its binding to Mdm2. This decreases p53 degradation, which results 
in a marked increase in p53 concentration in the cell. In addition, the decreased 
binding to Mdm2 enhances the ability of p53 to stimulate gene transcription (see 
Figure 17-62). 
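
The link between reduced degradation and increased concentration can be made concrete with a minimal steady-state sketch. It assumes, purely for illustration, that p53 is made at a constant rate and lost by first-order decay, so that its steady-state level is the synthesis rate divided by the total decay rate. The rate constants are hypothetical and are not measured values.

```python
def p53_steady_state(synthesis_rate, basal_decay, mdm2_decay, fraction_bound_to_mdm2):
    """Steady-state p53 level for constant synthesis and first-order decay.

    Total decay = basal decay + Mdm2-dependent decay scaled by the fraction
    of p53 that can still bind Mdm2. All rates are hypothetical.
    """
    total_decay = basal_decay + mdm2_decay * fraction_bound_to_mdm2
    return synthesis_rate / total_decay


# Undamaged cell: p53 binds Mdm2 freely and is degraded rapidly.
undamaged = p53_steady_state(1.0, basal_decay=0.1, mdm2_decay=1.9,
                             fraction_bound_to_mdm2=1.0)
# After DNA damage: phosphorylated p53 no longer binds Mdm2 in this sketch.
damaged = p53_steady_state(1.0, basal_decay=0.1, mdm2_decay=1.9,
                           fraction_bound_to_mdm2=0.0)
print(undamaged, damaged, damaged / undamaged)
```

With these assumed rates, blocking Mdm2 binding raises the steady-state level about twentyfold without any change in the rate of p53 synthesis, which is the essence of the indirect mechanism described above.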

The protein kinases Chk1 and Chk2 also block cell-cycle progression by 
phosphorylating members of the Cdc25 family of protein phosphatases, thereby 
inhibiting their function. As described earlier, these phosphatases are particu- 
larly important in the activation of M-Cdk at the beginning of mitosis (see Figure 
17-20). Chk1 and Chk2 phosphorylate Cdc25 at inhibitory sites that are distinct 
from the phosphorylation sites that stimulate Cdc25 activity. The inhibition of 
Cdc25 activity by DNA damage helps block entry into mitosis (see Figure 17-16). 

The DNA damage response can also be activated by problems that arise when 
a replication fork fails during DNA replication. When nucleotides are depleted, for 
example, replication forks stall during the elongation phase of DNA synthesis. To 
prevent the cell from attempting to segregate partially replicated chromosomes, 
the same mechanisms that respond to DNA damage detect the stalled replication 
forks and block entry into mitosis until the problems are resolved. 

A low level of DNA damage occurs in the normal life of any cell, and this dam- 
age accumulates in the cell’s progeny if the DNA damage response is not func- 
tioning. Over the long term, the accumulation of genetic damage in cells lacking 
the DNA damage response leads to an increased frequency of cancer-promoting 
mutations. Indeed, mutations in the p53 gene occur in at least half of all human 
cancers (discussed in Chapter 20). This loss of p53 function allows the cancer cell 
to accumulate mutations more readily. Similarly, a rare genetic disease known as 
ataxia telangiectasia is caused by a defect in ATM, one of the protein kinases that 
are activated in response to x-ray-induced DNA damage; patients with this
disease are very sensitive to x-rays and suffer from increased rates of cancer. 

What happens if DNA damage is so severe that repair is not possible? The 
answer differs in different organisms. Unicellular organisms such as budding 
yeast arrest their cell cycle to try to repair the damage, but the cycle resumes even 
if the repair cannot be completed. For a single-celled organism, life with muta- 
tions is apparently better than no life at all. In multicellular organisms, however, 
the health of the organism takes precedence over the life of an individual cell. 
Cells that divide with severe DNA damage threaten the life of the organism, since 
genetic damage can often lead to cancer and other diseases. Thus, animal cells 
with severe DNA damage do not attempt to continue division, but instead
commit suicide by undergoing apoptosis. In short, unless the DNA damage is repaired, 
the DNA damage response can lead to either cell-cycle arrest or cell death. DNA 
damage-induced apoptosis often depends on the activation of p53. Indeed, it is 
this apoptosis-promoting function of p53 that is apparently most important in 
protecting us against cancer. 

Figure 17-62 How DNA damage arrests the cell cycle in G1. When DNA is damaged,
various protein kinases are recruited to the site of damage and initiate a
signaling pathway that causes cell-cycle arrest. The first kinase at the damage
site is either ATM or ATR, depending on the type of damage. Additional protein
kinases, called Chk1 and Chk2, are then recruited and activated, resulting in
the phosphorylation of the transcription regulatory protein p53. Mdm2 normally
binds to p53 and promotes its ubiquitylation and destruction in proteasomes.
Phosphorylation of p53 blocks its binding to Mdm2; as a result, p53 accumulates
to high levels and stimulates transcription of numerous genes, including the
gene that encodes the CKI protein p21. The p21 binds and inactivates G1/S-Cdk
and S-Cdk complexes, arresting the cell in G1. In some cases, DNA damage also
induces either the phosphorylation of Mdm2 or a decrease in Mdm2 production,
which causes a further increase in p53 (not shown).


Many Human Cells Have a Built-In Limitation on the Number of 
Times They Can Divide 


Many human cells divide a limited number of times before they stop and undergo 
a permanent cell-cycle arrest. Fibroblasts taken from normal human tissue, for 
example, go through only about 25-50 population doublings when cultured in 
a standard mitogenic medium. Toward the end of this time, proliferation slows 
down and finally halts, and the cells enter a nondividing state from which they 
never recover. This phenomenon is called replicative cell senescence. 

Replicative cell senescence in human fibroblasts seems to be caused by 
changes in the structure of the telomeres, the repetitive DNA sequences and asso- 
ciated proteins at the ends of chromosomes. As discussed in Chapter 5, when a 
cell divides, telomeric DNA sequences are not replicated in the same manner as 
the rest of the genome but instead are synthesized by the enzyme telomerase. 
Telomerase also promotes the formation of protein cap structures that protect the 
chromosome ends. Because human fibroblasts, and many other human somatic 
cells, do not produce telomerase, their telomeres become shorter with every cell 
division, and their protective protein caps progressively deteriorate. Eventu- 
ally, the exposed chromosome ends are sensed as DNA damage, which activates 
a p53-dependent cell-cycle arrest (see Figure 17-62). Rodent cells, by contrast, 
maintain telomerase activity when they proliferate in culture and therefore do not 
have such a telomere-dependent mechanism for limiting proliferation. The forced 
expression of telomerase in normal human fibroblasts, using genetic engineer- 
ing techniques, blocks this form of senescence. Unfortunately, most cancer cells 
have regained the ability to produce telomerase and therefore maintain telomere 
function as they proliferate; as a result, they do not undergo replicative cell senes- 
cence. 
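
The counting logic behind replicative senescence can be sketched as follows. The telomere lengths and the loss per division used here are hypothetical round numbers picked so that the result falls within the 25-50 doublings mentioned above; they are not measured values, and the function name is an illustrative assumption.

```python
def doublings_until_senescence(initial_length_bp, loss_per_division_bp,
                               critical_length_bp, telomerase_active=False):
    """Count divisions until telomeres would fall below a critical length.

    Returns None when telomerase is active, because in this sketch telomere
    length is then maintained and there is no telomere-dependent limit.
    """
    if telomerase_active:
        return None
    length = initial_length_bp
    doublings = 0
    # Keep dividing while one more division leaves the telomere above the limit.
    while length - loss_per_division_bp >= critical_length_bp:
        length -= loss_per_division_bp
        doublings += 1
    return doublings


# Hypothetical fibroblast: 10,000 bp telomeres, 100 bp lost per division,
# arrest once the telomere would drop below 6,000 bp.
print(doublings_until_senescence(10_000, 100, 6_000))
print(doublings_until_senescence(10_000, 100, 6_000, telomerase_active=True))
```

The second call corresponds to cells that produce telomerase, such as the engineered fibroblasts and most cancer cells described in the text, for which this particular limit does not apply.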


Abnormal Proliferation Signals Cause Cell-Cycle Arrest or 
Apoptosis, Except in Cancer Cells 


Many of the components of mitogenic signaling pathways are encoded by genes 
that were originally identified as cancer-promoting genes, because mutations in 
them contribute to the development of cancer. The mutation of a single amino 
acid in the small GTPase Ras, for example, causes the protein to become perma- 
nently overactive, leading to constant stimulation of Ras-dependent signaling 
pathways, even in the absence of mitogenic stimulation. Similarly, mutations that 
cause an overexpression of Myc stimulate excessive cell growth and proliferation 
and thereby promote the development of cancer (discussed in Chapter 20). 

Surprisingly, however, when a hyperactivated form of Ras or Myc is experi- 
mentally overproduced in most normal cells, the result is not excessive prolif- 
eration but the opposite: the cells undergo either permanent cell-cycle arrest or 
apoptosis. The normal cell seems able to detect abnormal mitogenic stimulation, 
and it responds by preventing further division. Such responses help prevent the 
survival and proliferation of cells with various cancer-promoting mutations. 

Although it is not known how a cell detects excessive mitogenic stimulation, 
such stimulation often leads to the production of a cell-cycle inhibitor protein 
called Arf, which binds and inhibits Mdm2. As discussed earlier, Mdm2 nor- 
mally promotes p53 degradation. Activation of Arf therefore causes p53 levels to 
increase, inducing either cell-cycle arrest or apoptosis (Figure 17-63). 

How do cancer cells ever arise if these mechanisms block the division or sur- 
vival of mutant cells with overactive proliferation signals? The answer is that the 
protective system is often inactivated in cancer cells by mutations in the genes 
that encode essential components of the blocking mechanisms, such as Arf or p53 
or the proteins that help activate them. 


Cell Proliferation Is Accompanied by Cell Growth 


If cells proliferated without growing, they would get progressively smaller and 
there would be no net increase in total cell mass. In most proliferating cell popula- 
tions, therefore, cell growth accompanies cell division. In single-celled organisms 
such as yeasts, both cell growth and cell division require only nutrients. In
animals, by contrast, both cell growth and cell proliferation depend on extracellular 
signal molecules, produced by other cells, which we call growth factors and mito- 
gens, respectively. 

Like mitogens, the extracellular growth factors that stimulate animal cell 
growth bind to receptors on the cell surface and activate intracellular signaling 
pathways. These pathways stimulate the accumulation of proteins and other mac- 
romolecules, and they do so by both increasing their rate of synthesis and decreas- 
ing their rate of degradation. They also trigger increased uptake of nutrients and 
production of the ATP required to fuel the increased protein synthesis. One of the 
most important intracellular signaling pathways activated by growth factor
receptors involves the enzyme phosphoinositide 3-kinase (PI 3-kinase), which adds 
a phosphate from ATP to the 3’ position of inositol phospholipids in the plasma 
membrane (discussed in Chapter 15). The activation of PI 3-kinase leads to the 
activation of a kinase called TOR, which lies at the heart of cell growth regulatory 
pathways in all eukaryotes. TOR activates many targets in the cell that stimulate 
metabolic processes, including protein synthesis. One target is a protein kinase 
called S6 kinase (S6K), which phosphorylates ribosomal protein S6, increasing the 
ability of ribosomes to translate a subset of mRNAs that mostly encode ribosomal 
components. TOR also indirectly activates a translation initiation factor called 
eIF4E and directly activates transcription regulators that promote the increased 
expression of genes encoding ribosomal subunits (Figure 17-64). 

Figure 17-63 Cell-cycle arrest or apoptosis induced by excessive stimulation of
mitogenic pathways. Abnormally high levels of Myc cause the activation of Arf,
which binds and inhibits Mdm2 and thereby increases p53 levels (see Figure
17-62). Depending on the cell type and extracellular conditions, p53 then causes
either cell-cycle arrest or apoptosis.

Figure 17-64 Stimulation of cell growth by extracellular growth factors and
nutrients. The occupation of cell-surface receptors by growth factors leads to
the activation of PI 3-kinase, which promotes protein synthesis through a
complex signaling pathway that leads to the activation of the protein kinase
TOR; extracellular nutrients such as amino acids also help activate TOR. TOR
phosphorylates multiple proteins to stimulate protein synthesis, as shown; it
also inhibits protein degradation (not shown). Growth factors also stimulate
increased production of the transcription regulatory protein Myc (not shown),
which activates the transcription of various genes that promote cell metabolism
and growth. 4E-BP is an inhibitor of the translation initiation factor eIF4E.
PI(4,5)P2, phosphatidylinositol 4,5-bisphosphate; PI(3,4,5)P3,
phosphatidylinositol 3,4,5-trisphosphate.


Figure 17-65 Potential mechanisms for coordinating cell growth and division. In
proliferating cells, cell size is maintained by mechanisms that coordinate rates
of cell division and cell growth. Numerous alternative coupling mechanisms are
thought to exist, and different cell types appear to employ different
combinations of these mechanisms. (A) In many cell types, particularly yeast,
the rate of cell division is governed by the rate of cell growth, so that
division occurs only when growth rate achieves some minimal threshold; in
yeasts, it is mainly the levels of extracellular nutrients that regulate the
rate of cell growth and thereby the rate of cell division. (B) In some animal
cell types, growth and division can each be controlled by separate extracellular
factors (growth factors and mitogens, respectively), and cell size depends on
the relative levels of the two types of factors. (C) Some extracellular factors
can stimulate both cell growth and cell division by simultaneously activating
signaling pathways that promote growth and other pathways that promote
cell-cycle progression.


Proliferating Cells Usually Coordinate Their Growth and Division 


For proliferating cells to maintain a constant size, they must coordinate their 
growth with cell division to ensure that cell size doubles with each division: if cells 
grow too slowly, they will get smaller with each division, and if they grow too fast, 
they will get larger with each division. It is not clear how cells achieve this coordi- 
nation, but it is likely to involve multiple mechanisms that vary in different organ- 
isms and even in different cell types of the same organism (Figure 17-65). 
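
The arithmetic behind this requirement is simple enough to sketch. The function below is a hypothetical illustration: it assumes each cell grows by a fixed factor during every cycle and then splits into two equal daughters, which is a deliberate simplification of real cell behavior.

```python
def size_after_divisions(initial_size, growth_per_cycle, divisions):
    """Track daughter-cell size over successive cycles.

    In each cycle the cell grows by the factor growth_per_cycle and then
    divides into two equal daughters.
    """
    size = initial_size
    sizes = [size]
    for _ in range(divisions):
        size = size * growth_per_cycle / 2.0
        sizes.append(size)
    return sizes


# Growth exactly doubles the cell each cycle: size stays constant.
print(size_after_divisions(1.0, growth_per_cycle=2.0, divisions=5))
# Growth is too slow: daughters shrink with every division.
print(size_after_divisions(1.0, growth_per_cycle=1.8, divisions=5))
# Growth is too fast: daughters enlarge with every division.
print(size_after_divisions(1.0, growth_per_cycle=2.2, divisions=5))
```

Because any fixed mismatch compounds geometrically, even a small, persistent difference between growth and division rates would drive cells far from their normal size, which is why some coupling mechanism is expected to exist.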

Animal cell growth and division are not always coordinated, however. In many 
cases, they are completely uncoupled to allow growth without division or division 
without growth. Muscle cells and nerve cells, for example, can grow dramatically 
after they have permanently withdrawn from the cell cycle. Similarly, the eggs of 
many animals grow to an extremely large size without dividing; after fertilization, 
however, this relationship is reversed, and many rounds of division occur without 
growth. 

Compared to cell division, there has been surprisingly little study of how cell 
size is controlled in animals. As a result, it remains a mystery how cell size is deter- 
mined and why different cell types in the same animal grow to be so different in 
size. One of the best-understood cases in mammals is the adult sympathetic neu- 
ron, which has permanently withdrawn from the cell cycle. Its size depends on the 
amount of nerve growth factor (NGF) secreted by the target cells it innervates; the 
greater the amount of NGF the neuron has access to, the larger it becomes. It seems 
likely that the genes a cell expresses set limits on the size it can be, while extracel- 
lular signal molecules and nutrients regulate the size within these limits. The chal- 
lenge is to identify the relevant genes and signal molecules for each cell type. 


Summary 


In multicellular animals, cell size, cell division, and cell survival are carefully con- 
trolled to ensure that the organism and its organs achieve and maintain an appro- 
priate size. Mitogens stimulate the rate of cell division by removing intracellular 
molecular brakes that restrain cell-cycle progression in G;. Growth factors promote 
cell growth (an increase in cell mass) by stimulating the synthesis and inhibiting 
the degradation of macromolecules. To maintain a constant cell size, proliferating 
cells employ multiple mechanisms to ensure that cell growth is coordinated with 
cell division. 


WHAT WE DON’T KNOW 


• Progression through the cell cycle depends on the phosphorylation of hundreds
of different proteins by cyclin-Cdk complexes. What are the molecular mechanisms
ensuring that these proteins are phosphorylated at precisely the right time and
place?

• During S phase, how are histones and their modifying enzymes controlled to
replicate chromatin structure on the duplicated DNA?

• What is the structural basis of chromosome condensation, and how is the
process stimulated during mitosis?

• What are the mechanisms by which microtubule attachment and tension are sensed
at the kinetochore by the components of the spindle assembly checkpoint?

• How is cell growth coordinated with cell division to ensure that cell size
remains constant?


CHAPTER 17 END-OF-CHAPTER PROBLEMS 


PROBLEMS 


Which statements are true? Explain why or why not. 


17-1 Since there are about 10¹³ cells in an adult human, 
and about 10¹⁰ cells die and are replaced each day, we 
become new people every three years. 


17-2 In order for proliferating cells to maintain a rela- 
tively constant size, the length of the cell cycle must match 
the time it takes for the cell to double in size. 


17-3 While other proteins come and go during the 
cell cycle, the proteins of the origin recognition complex 
remain bound to the DNA throughout. 


17-4 Chromosomes are positioned on the metaphase 
plate by equal and opposite forces that pull them toward 
the two poles of the spindle. 


17-5 Meiosis segregates the paternal homologs into 
sperm and the maternal homologs into eggs. 


17-6 If we could turn on telomerase activity in all our 
cells, we could prevent aging. 


Discuss the following problems. 


17-7 Many cell-cycle genes from human cells function 
perfectly well when expressed in yeast cells. Why do you 
suppose that is considered remarkable? After all, many 
human genes encoding enzymes for metabolic reactions 
also function in yeast, and no one thinks that is remark- 
able. 


17-8 Hoechst 33342 is a membrane-permeant dye that 
fluoresces when it binds to DNA. When a population of 
cells is incubated briefly with Hoechst dye and then sorted 
in a flow cytometer, which measures the fluorescence of 
each cell, the cells display various levels of fluorescence as 
shown in Figure Q17-1. 


A. Which cells in Figure Q17-1 are in the Gy, S, Go, 
and M phases of the cell cycle? Explain the basis for your 
answer. 

B. Sketch the sorting distributions you would expect 
for cells that were treated with inhibitors that block the cell 
cycle in the G1, S, or M phase. Explain your reasoning. 


Figure Q17-1 Analysis of Hoechst 33342 
fluorescence in a population of cells sorted 
in a flow cytometer (Problem 17-8). The plot shows 
number of cells (vertical axis) against relative 
fluorescence per cell (horizontal axis). 


17-9 The yeast cohesin subunit Scc1, which is essential 
for sister-chromatid cohesion, can be artificially 
regulated for expression at any point in the cell cycle. If 
expression is turned on at the beginning of S phase, all the 
cells divide satisfactorily and survive. By contrast, if Scc1 
expression is turned on only after S phase is completed, 
the cells fail to divide and they die, even though Scc1 
accumulates in the nucleus and interacts efficiently with 
chromosomes. Why do you suppose that cohesin must 
be present during S phase for cells to divide normally? 


17-10 High doses of caffeine interfere with the DNA 
damage response in mammalian cells. Why then do you 
suppose the Surgeon General has not yet issued an appro- 
priate warning to heavy coffee and cola drinkers? A typical 
cup of coffee (150 mL) contains 100 mg of caffeine (196 g/ 
mole). How many cups of coffee would you have to drink 
to reach the dose (10 mM) required to interfere with the 
DNA damage response? (A typical adult contains about 40 
liters of water.) 
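
One way to set up the arithmetic in Problem 17-10 is sketched below in Python. It uses only the figures stated in the problem (100 mg of caffeine per cup, 196 g/mole, 40 liters of body water, a 10 mM target dose). The function name is hypothetical, and the sketch assumes that caffeine spreads evenly through the body water and is neither metabolized nor excreted, which is a simplification.

```python
def cups_to_reach_dose(target_mM, body_water_L, mg_per_cup, g_per_mole):
    """Cups needed to reach a target concentration.

    Assumes even distribution in body water and no clearance (a simplification).
    """
    moles_needed = (target_mM / 1000.0) * body_water_L  # mM -> moles per liter, times liters
    grams_needed = moles_needed * g_per_mole            # moles -> grams
    return grams_needed * 1000.0 / mg_per_cup           # grams -> mg, divided by mg per cup


# Values taken from the problem statement.
cups = cups_to_reach_dose(target_mM=10, body_water_L=40, mg_per_cup=100, g_per_mole=196)
print(f"about {cups:.0f} cups")
```

Because a living person also breaks down and excretes caffeine, an estimate made this way is, if anything, a lower bound on the number of cups required.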


17-11 How many kinetochores are there in a human cell 
at mitosis? 


17-12 A living cell from the lung epithelium of a newt 
is shown at different stages in M phase in Figure Q17-2. 
Order these light micrographs into the correct sequence 
and identify the stage in M phase that each represents. 





Figure Q17-2 Light micrographs of a single cell at different stages of 
M phase (Problem 17-12). (Courtesy of Conly L. Rieder.) 


17-13 Down syndrome (trisomy 21) and Edwards syn- 
drome (trisomy 18) are the most common autosomal triso- 
mies seen in human infants. Does this fact mean that these 
chromosomes are the most difficult to segregate properly 
during meiosis? 


17-14 The human genome consists of 23 pairs of chro- 
mosomes (22 pairs of autosomes and one pair of sex chro- 
mosomes). During meiosis, the maternal and paternal sets 
of homologs pair, and then are separated into gametes, so 
that each contains 23 chromosomes. If you assume that 
the chromosomes in the paired homologs are randomly 
assorted to daughter cells, how many potential combina- 
tions of paternal and maternal homologs can be gener- 
ated during meiosis? (For the purposes of this calculation, 
assume that no recombination occurs.) 
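
The counting argument behind Problem 17-14 can be sketched in a few lines of Python. The sketch assumes, as the problem does, that each of the 23 homolog pairs assorts independently and that no recombination occurs, so each pair contributes two possibilities (maternal or paternal). The function name is hypothetical.

```python
def homolog_combinations(pairs):
    """Number of maternal/paternal homolog combinations in a gamete.

    Each homolog pair independently contributes either its maternal or its
    paternal member, giving 2 choices per pair (no recombination assumed).
    """
    return 2 ** pairs


print(homolog_combinations(23))  # 8388608
```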


Cell Death 


The growth, development, and maintenance of multicellular organisms depend 
not only on the production of cells but also on mechanisms to destroy them. The 
maintenance of tissue size, for example, requires that cells die at the same rate as 
they are produced. During development, carefully orchestrated patterns of cell 
death help determine the size and shape of limbs and other tissues. Cells also die 
when they become damaged or infected, ensuring that they are removed before 
they threaten the health of the organism. In these and most other cases, cell death 
is not a random process but occurs by a programmed sequence of molecular 
events, in which the cell systematically destroys itself from within and is then 
eaten by other cells, leaving no trace. In most cases, this programmed cell death 
occurs by a process called apoptosis—from the Greek word meaning “falling off,” 
as leaves from a tree. 

Cells dying by apoptosis undergo characteristic morphological changes. They 
shrink and condense, the cytoskeleton collapses, the nuclear envelope disassem- 
bles, and the nuclear chromatin condenses and breaks up into fragments (Figure 
18-1A). The cell surface often bulges outward and, if the cell is large, it breaks 
up into membrane-enclosed fragments called apoptotic bodies. The surface of the 
cell or apoptotic bodies becomes chemically altered, so that a neighboring cell 
or a macrophage (a specialized phagocytic cell, discussed in Chapter 22) rapidly 
engulfs them, before they can spill their contents (Figure 18-1B). In this way, the 
cell dies neatly and is rapidly cleared away, without causing a damaging inflam- 
matory response. Because the cells are eaten and digested so quickly, there are 
usually few dead cells to be seen, even when large numbers of cells have died by 
apoptosis. This is probably why biologists overlooked apoptosis for many years 
and still might underestimate its extent. 

In contrast to apoptosis, animal cells that die in response to an acute insult, 
such as trauma or a lack of blood supply, usually do so by a process called cell 
necrosis. Necrotic cells swell and burst, spilling their contents over their neighbors 
and eliciting an inflammatory response (Figure 18-1C). In most cases, necrosis is 
likely to be caused by energy depletion, which leads to metabolic defects and loss 
of the ionic gradients that normally exist across the cell membrane. One form of 
necrosis, called necroptosis, is a form of programmed cell death that is triggered by 
a specific regulatory signal from other cells, although we are only just beginning to 
understand the underlying mechanisms. 

Some form of programmed cell death occurs in many organisms, but apopto- 
sis is found primarily in animals. This chapter focuses on the major functions of 
apoptosis, its mechanism and regulation, and how excessive or insufficient apop- 
tosis can contribute to human disease. 


Apoptosis Eliminates Unwanted Cells 


The amount of apoptotic cell death that occurs in developing and adult animal 
tissues is astonishing. In the developing vertebrate nervous system, for exam- 
ple, more than half of many types of nerve cells normally die soon after they are 
formed. It seems remarkably wasteful for so many cells to die, especially as the 
vast majority are perfectly healthy at the time they kill themselves. What purposes 
does this massive cell death serve? 

In some cases, the answer is clear. Cell death helps sculpt hands and feet during 
embryonic development: they start out as spade-like structures, and the individual 
digits separate only as the cells between them die, as illustrated for a mouse 
paw in Figure 18-2. In other cases, cells die when the structure they form is no 
longer needed. When a tadpole changes into a frog at metamorphosis, the cells 
in the tail die, and the tail, which is not needed in the frog, disappears. Apopto- 
sis also functions as a quality-control process in development, eliminating cells 
that are abnormal, misplaced, nonfunctional, or potentially dangerous to the ani- 
mal. Striking examples occur in the vertebrate adaptive immune system, where 
apoptosis eliminates developing T and B lymphocytes that either fail to produce 
potentially useful antigen-specific receptors or produce self-reactive receptors 
that make the cells potentially dangerous (discussed in Chapter 24); it also elimi- 
nates most of the lymphocytes activated by an infection, after they have helped 
destroy the responsible microbes. 

In adult tissues that are neither growing nor shrinking, cell death and cell divi- 
sion must be tightly regulated to ensure that they are exactly in balance. If part of 
the liver is removed in an adult rat, for example, liver cell proliferation increases 
to make up the loss. Conversely, if a rat is treated with the drug phenobarbital— 
which stimulates liver cell division (and thereby liver enlargement)—and then the 
phenobarbital treatment is stopped, apoptosis in the liver greatly increases until 
the liver has returned to its original size, usually within a week or so. Thus, the liver 
is kept at a constant size through the regulation of both the cell death rate and the 
cell birth rate. The control mechanisms responsible for such regulation are largely 
unknown. 

Animal cells can recognize damage in their various organelles and, if the dam- 
age is great enough, they can kill themselves by undergoing apoptosis. An impor- 
tant example is DNA damage, which can produce cancer-promoting mutations 
if not repaired. Cells have various ways of detecting DNA damage, and undergo 
apoptosis if they cannot repair it. 


Apoptosis Depends on an Intracellular Proteolytic Cascade That Is 
Mediated by Caspases 


Apoptosis is triggered by members of a family of specialized intracellular pro- 
teases, which cleave specific sequences in numerous proteins inside the cell, 
thereby bringing about the dramatic changes that lead to cell death and engulf- 
ment. These proteases have a cysteine at their active site and cleave their target 
proteins at specific aspartic acids; they are therefore called caspases (c for cys- 
teine and asp for aspartic acid). Caspases are synthesized in the cell as inactive 
precursors and are activated only during apoptosis. There are two major classes of 
apoptotic caspases: initiator caspases and executioner caspases. 


Figure 18-1 Two distinct forms of cell 
death. These electron micrographs show 
cells that have died by apoptosis (A and 
B) or by necrosis (C). The cells in (A) and 
(C) died in a culture dish, whereas the cell 
in (B) died in a developing tissue and has 
been engulfed by a phagocytic cell. Note 
that the cells in (A) and (B) have condensed 
but seem relatively intact, whereas the cell 
in (C) seems to have exploded. The large 
vacuoles visible in the cytoplasm of the cell 
in (A) are a variable feature of apoptosis. 
(Courtesy of Julia Burne.) 

Figure 18-2 Sculpting the digits in the 
developing mouse paw by apoptosis. 
(A) The paw in this mouse fetus has been 
stained with a dye that specifically labels 
cells that have undergone apoptosis. 
The apoptotic cells appear as bright 
green dots between the developing digits. 
(B) The interdigital cell death has eliminated 
the tissue between the developing digits, 
as seen one day later, when there are very 
few apoptotic cells. (From W. Wood et 
al., Development 127:5245-5252, 2000. 
With permission from The Company of 
Biologists.) 

Initiator caspases, as their name implies, begin the apoptotic process. They 
normally exist as inactive, soluble monomers in the cytosol. An apoptotic signal 
triggers the assembly of large protein platforms that bring multiple initiator cas- 
pases together into large complexes. Within these complexes, pairs of caspases 
associate to form dimers, resulting in protease activation (Figure 18-3). Each cas- 
pase in the dimer then cleaves its partner at a specific site in the protease domain, 
which stabilizes the active complex and is required for the proper function of the 
enzyme in the cell. 

The major function of the initiator caspases is to activate the executioner cas- 
pases. These normally exist as inactive dimers. When they are cleaved by an ini- 
tiator caspase at a site in the protease domain, the active site is rearranged from 
an inactive to an active conformation. One initiator caspase complex can activate 
many executioner caspases, resulting in an amplifying proteolytic cascade. Once 
activated, executioner caspases catalyze the widespread protein cleavage events 
that kill the cell. 

Various experimental approaches have led to the identification of over a thou- 
sand proteins that are cleaved by caspases during apoptosis. Only a few of these 
proteins have been studied in any detail. These include the nuclear lamins, the 
cleavage of which causes the irreversible breakdown of the nuclear lamina (dis- 
cussed in Chapter 12). Another target is a protein that normally holds a DNA- 
degrading endonuclease in an inactive form; its cleavage frees the endonuclease 
to cut up the DNA in the cell nucleus (Figure 18-4). Other target proteins include 
components of the cytoskeleton and cell-cell adhesion proteins that attach cells 
to their neighbors; the cleavage of these proteins helps the apoptotic cell to round 
up and detach from its neighbors, making it easier for a neighboring cell to engulf 
it, or, in the case of an epithelial cell, for the neighbors to extrude the apoptotic cell 
from the cell sheet. The caspase cascade is not only destructive and self-amplify- 
ing but also irreversible, so that once a cell starts out along the path to destruction, 
it cannot turn back. 

How is the initiator caspase first activated in response to an apoptotic signal? 
The two best-understood activation mechanisms in mammalian cells are called 
the extrinsic pathway and the intrinsic, or mitochondrial, pathway. Each uses its 
own initiator caspase and activation system, as we now discuss. 

Figure 18-3 Caspase activation during 
apoptosis. An initiator caspase contains 
a protease domain in its carboxy-terminal 
region and a small protein interaction 
domain near its amino terminus. It is initially 
made in an inactive, monomeric form, 
sometimes called procaspase. Apoptotic 
signals trigger the assembly of adaptor 
proteins carrying multiple binding sites 
for the caspase amino-terminal domain. 
Upon binding to the adaptor proteins, the 
initiator caspases dimerize and are thereby 
activated, leading to cleavage of a specific 
site in their protease domains. Each 
protease domain is then rearranged into 
a large and small subunit. In some cases 
(not shown), the adaptor-binding domain 
of the initiator caspase is also cleaved (see 
Figure 18-5). Executioner caspases are 
initially formed as inactive dimers. Upon 
cleavage at a site in the protease domain 
by an initiator caspase, the executioner 
caspase dimer undergoes an activating 
conformational change. The executioner 
caspases then cleave a variety of key 
proteins, leading to the controlled death of 
the cell. 

Figure 18-4 DNA fragmentation during apoptosis. (A) In healthy cells, the endonuclease CAD associates with its inhibitor, 
iCAD. Activation of executioner caspases in the cell leads to cleavage of iCAD, which unleashes the nuclease. Activated CAD 
cuts the chromosomal DNA between nucleosomes, resulting in the production of DNA fragments that form a ladder pattern 
(see B) upon gel electrophoresis. (B) Mouse thymus lymphocytes were treated with an antibody against the cell-surface death 
receptor Fas (discussed in the text), inducing the cells to undergo apoptosis. DNA was extracted at the times indicated above 
the figure, and the fragments were separated by size by electrophoresis in an agarose gel and stained with ethidium bromide. 
Because the cleavages occur in the linker regions between nucleosomes, the fragments separate into a characteristic ladder 
pattern on these gels. Note that in gel electrophoresis, smaller molecules are more widely separated in the lower part of the gel, 
so that removal of a single nucleosome has a greater apparent effect on their gel mobility. (C) Apoptotic nuclei can be detected 
using a technique that adds a fluorescent label to DNA ends. In the image shown here, this technique was used in a tissue 
section of a developing chick leg bud; this cross section through the skin and underlying tissue is from a region between two 
developing digits, as indicated in the underlying drawing. The procedure is called the TUNEL (TdT-mediated dUTP nick end 
labeling) technique because the enzyme terminal deoxynucleotidyl transferase (TdT) adds chains of labeled deoxynucleotide 
(dUTP) to the 3’-OH ends of DNA fragments. The presence of large numbers of DNA fragments therefore results in bright 
fluorescent dots in apoptotic cells. (B, from D. McIlroy et al., Genes Dev. 14:549-558, 2000. With permission from Cold Spring 
Harbor Laboratory Press; C, from V. Zuzarte-Luis and J.M. Hurlé, Int. J. Dev. Biol. 46:871-876, 2002. With permission from 
UBC Press.) 
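
The ladder pattern described in Figure 18-4 can be illustrated with a short Python sketch. Two things here are assumptions rather than statements from the caption: a nucleosome repeat length of roughly 200 nucleotide pairs, and the common approximation that the distance a fragment migrates in a gel varies with the logarithm of its size. The function names are hypothetical.

```python
import math

# Assumed approximate nucleosome repeat length (core plus linker), in nucleotide pairs.
NUCLEOSOME_REPEAT_BP = 200


def ladder_sizes(max_nucleosomes, repeat_bp=NUCLEOSOME_REPEAT_BP):
    """Fragment sizes expected when cuts fall only in linker DNA: 1x, 2x, 3x the repeat."""
    return [n * repeat_bp for n in range(1, max_nucleosomes + 1)]


def log_spacing(sizes):
    """Difference in log10(size) between neighboring rungs, a rough proxy for band separation."""
    return [math.log10(b) - math.log10(a) for a, b in zip(sizes, sizes[1:])]


sizes = ladder_sizes(6)
for (small, large), gap in zip(zip(sizes, sizes[1:]), log_spacing(sizes)):
    print(f"{small:>5} bp vs {large:>5} bp: log10 gap = {gap:.3f}")
```

Under these assumptions the gap shrinks steadily as the fragments get larger, which is consistent with the caption's remark that removing a single nucleosome has a greater apparent effect on the mobility of the smaller fragments.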


Cell-Surface Death Receptors Activate the Extrinsic Pathway of 
Apoptosis 


Extracellular signal proteins binding to cell-surface death receptors trigger the 
extrinsic pathway of apoptosis. Death receptors are transmembrane proteins 
containing an extracellular ligand-binding domain, a single transmembrane 
domain, and an intracellular death domain, which is required for the receptors 
to activate the apoptotic program. The receptors are homotrimers and belong to 
the tumor necrosis factor (TNF) receptor family, which includes a receptor for TNF 
itself and the Fas death receptor. The ligands that activate the death receptors are 
also homotrimers; they are structurally related to one another and belong to the 
TNF family of signal proteins. 

A well-understood example of how death receptors trigger the extrinsic path- 
way of apoptosis is the activation of Fas on the surface of a target cell by Fas ligand 
on the surface of a killer (cytotoxic) lymphocyte. When activated by the binding 
of Fas ligand, the death domains on the cytosolic tails of the Fas death receptors 
bind intracellular adaptor proteins, which in turn bind initiator caspases (pri- 
marily caspase-8), forming a death-inducing signaling complex (DISC). Once 
dimerized and activated in the DISC, the initiator caspases cleave their partners 
and then activate downstream executioner caspases to induce apoptosis (Figure 
18-5). In some cells, the extrinsic pathway recruits the intrinsic apoptotic path- 
way to amplify the caspase cascade and kill the cell. 

Many cells produce inhibitory proteins that act to restrain the extrinsic pathway. 
For example, some cells produce the protein FLIP, which resembles an initiator 
caspase but has no protease activity because it lacks the key cysteine in its 
active site. FLIP dimerizes with caspase-8 in the DISC; although caspase-8 appears 
to be active in these heterodimers, it is not cleaved at the site required for its stable 
activation, and the apoptotic signal is blocked. Such inhibitory mechanisms help 
prevent the inappropriate activation of the extrinsic pathway of apoptosis. 


The Intrinsic Pathway of Apoptosis Depends on Mitochondria 


Cells can also activate their apoptosis program from inside the cell, often in 
response to stresses, such as DNA damage, or in response to developmen- 
tal signals. In vertebrate cells, these responses are governed by the intrinsic, or 
mitochondrial, pathway of apoptosis, which depends on the release into the cyto- 
sol of mitochondrial proteins that normally reside in the intermembrane space 
of these organelles (see Figure 12-19). Some of the released proteins activate a 
caspase proteolytic cascade in the cytoplasm, leading to apoptosis. 

A key protein in the intrinsic pathway is cytochrome c, a water-soluble com- 
ponent of the mitochondrial electron-transport chain. When released into the 
cytosol (Figure 18-6), it takes on a new function: it binds to an adaptor protein 
called Apaf1 (apoptotic protease activating factor-1), causing the Apaf1 to oligomerize 
into a wheel-like heptamer called an apoptosome. The Apaf1 proteins in 
the apoptosome then recruit initiator caspase-9 proteins, which are thought to 
be activated by proximity in the apoptosome, just as caspase-8 is activated in the 
DISC. The activated caspase-9 molecules then activate downstream executioner 
caspases to induce apoptosis (Figure 18-7). 


Bcl2 Proteins Regulate the Intrinsic Pathway of Apoptosis 


The intrinsic pathway of apoptosis is tightly regulated to ensure that cells kill 
themselves only when it is appropriate. A major class of intracellular regulators of 
the intrinsic pathway is the Bcl2 family of proteins, which, like the caspase family, 
has been conserved in evolution from worms to humans; a human Bcl2 protein, 
for example, can suppress apoptosis when expressed in the worm Caenorhabditis 
elegans. 

Figure 18-5 The extrinsic pathway of 
apoptosis activated through Fas death 
receptors. Trimeric Fas ligands on the 
surface of a killer lymphocyte interact with 
trimeric Fas receptors on the surface of the 
target cell, leading to clustering of several 
ligand-bound receptor trimers (only one 
trimer is shown here for clarity). Receptor 
clustering activates death domains on the 
receptor tails, which interact with similar 
domains on the adaptor protein FADD 
(FADD stands for Fas-associated death 
domain). Each FADD protein then recruits 
an initiator caspase (caspase-8) via a 
death effector domain on both FADD and 
the caspase, forming a death-inducing 
signaling complex (DISC). Within the DISC, 
two adjacent initiator caspases interact and 
cleave one another to form an activated 
protease dimer, which then cleaves itself 
in the region linking the protease to the 
death effector domain. This stabilizes and 
releases the active caspase dimer into 
the cytosol, where it activates executioner 
caspases by cleaving them. 

Mammalian Bcl2 family proteins regulate the intrinsic pathway of apoptosis 
mainly by controlling the release of cytochrome c and other intermembrane 
mitochondrial proteins into the cytosol. Some Bcl2 family proteins are pro-apop- 
totic and promote apoptosis by enhancing the release, whereas others are anti- 
apoptotic and inhibit apoptosis by blocking the release. The pro-apoptotic and 
anti-apoptotic proteins can bind to each other in various combinations to form 
heterodimers in which the two proteins inhibit each other’s function. The bal- 
ance between the activities of these two functional classes of Bcl2 family proteins 
largely determines whether a mammalian cell lives or dies by the intrinsic path- 
way of apoptosis. 

As illustrated in Figure 18-8, the anti-apoptotic Bcl2 family proteins, including 
Bcl2 itself (the founding member of the Bcl2 family) and BclXL, share four distinctive 
Bcl2 homology (BH) domains (BH1-4). The pro-apoptotic Bcl2 family proteins 

Figure 18-6 Release of cytochrome 
c from mitochondria in the intrinsic 
pathway of apoptosis. Fluorescence 
micrographs of human cancer cells 
in culture. (A) The control cells were 
transfected with a gene encoding a fusion 
protein consisting of cytochrome c linked to 
green fluorescent protein (cytochrome-c-GFP); 
they were also treated with a red 
dye that accumulates in mitochondria. 
The overlapping distribution of the green 
and red indicates that the cytochrome-c-GFP 
is located in mitochondria. 
(B) Cells expressing cytochrome-c-GFP 
were irradiated with ultraviolet (UV) light to 
induce the intrinsic pathway of apoptosis 
and were photographed 5 hours later. 
The six cells in the bottom half of this 
micrograph have released their cytochrome 
c from mitochondria into the cytosol, 
whereas the cells in the upper half of the 
micrograph have not yet done so 
(Movie 18.1). (From J.C. Goldstein et al., 
Nat. Cell Biol. 2:156-162, 2000. With 
permission from Macmillan Publishers Ltd.) 


Figure 18-7 The intrinsic pathway of apoptosis. Intracellular apoptotic stimuli cause mitochondria to release cytochrome c, which interacts 
with Apaf1. The binding of cytochrome c causes Apaf1 to unfold partly, exposing a domain that interacts with the same domain in other activated 
Apaf1 molecules. Seven activated Apaf1 proteins form a large ring complex called the apoptosome. Each Apaf1 protein contains a caspase 
recruitment domain (CARD), and these are clustered above the central hub of the apoptosome. The CARDs bind similar domains in multiple 
caspase-9 molecules, which are thereby recruited into the apoptosome and activated. The mechanism of caspase-9 activation is not clear: it 
probably results from dimerization and cleavage of adjacent caspase-9 proteins, but it might also depend on interactions between caspase-9 and 
Apaf1. Once activated, caspase-9 cleaves and thereby activates downstream executioner caspases. Note that the CARD is related in structure 
and function to the death effector domain of caspase-8 (see Figure 18-5). Some scientists use the term “apoptosome” to refer to the complex 
containing caspase-9. 




Figure 18-8 panel labels: BH domains shown in the order BH4, BH3, BH1, BH2. 
Anti-apoptotic Bcl2 family protein (e.g., Bcl2, BclXL); 
pro-apoptotic effector Bcl2 family protein (e.g., Bax, Bak); 
pro-apoptotic BH3-only protein (e.g., Bad, Bim, Bid, Puma, Noxa). 


consist of two subfamilies—the effector Bcl2 family proteins and the BH3-only pro- 
teins. The main effector proteins are Bax and Bak, which are structurally similar to 
Bcl2 but lack the BH4 domain. The BH3-only proteins share sequence homology 
with Bcl2 in only the BH3 domain. 
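
The domain composition of the three classes can be written down as a small lookup table, sketched here in Python. The domain sets follow the description in the text (four BH domains in the anti-apoptotic proteins, no BH4 in the effectors, only BH3 in the BH3-only proteins). The variable and function names are hypothetical, and the sketch ignores any other sequence features of these proteins.

```python
# BH domains present in each class of Bcl2 family protein, as described in the text.
BH_DOMAINS = {
    "anti-apoptotic (e.g., Bcl2)": {"BH1", "BH2", "BH3", "BH4"},
    "pro-apoptotic effector (e.g., Bax, Bak)": {"BH1", "BH2", "BH3"},  # lacks BH4
    "pro-apoptotic BH3-only": {"BH3"},
}


def shared_domains(classes):
    """Return the BH domains common to every class in the table."""
    return set.intersection(*classes.values())


print(shared_domains(BH_DOMAINS))  # {'BH3'}
```

The intersection contains only BH3, the one BH domain that all three classes have in common.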

When an apoptotic stimulus triggers the intrinsic pathway, the pro-apoptotic 
effector Bcl2 family proteins become activated and aggregate to form oligomers 
in the mitochondrial outer membrane, inducing the release of cytochrome c and 
other intermembrane proteins by an unknown mechanism (Figure 18-9). In 
mammalian cells, Bax and Bak are the main effector Bcl2 family proteins, and 
at least one of them is required for the intrinsic pathway of apoptosis to operate: 
mutant mouse cells that lack both proteins are resistant to all pro-apoptotic sig- 
nals that normally activate this pathway. Whereas Bak is bound to the mitochon- 
drial outer membrane even in the absence of an apoptotic signal, Bax is mainly 
located in the cytosol and translocates to the mitochondria only after an apop- 
totic signal activates it. As we discuss below, the activation of Bax and Bak usually 
depends on activated pro-apoptotic BH3-only proteins. 

The anti-apoptotic Bcl2 family proteins such as Bcl2 itself and BclXL are also 
located on the cytosolic surface of the outer mitochondrial membrane, where 
they help prevent inappropriate release of intermembrane proteins. The anti- 
apoptotic Bcl2 family proteins inhibit apoptosis mainly by binding to and inhibit- 
ing pro-apoptotic Bcl2 family proteins—either on the mitochondrial membrane 
or in the cytosol. On the outer mitochondrial membrane, for example, they bind 
to Bak and prevent it from oligomerizing, thereby inhibiting the release of cyto- 
chrome c and other intermembrane proteins. There are at least five mammalian 
anti-apoptotic Bcl2 family proteins, and every mammalian cell requires at least 
one to survive. Moreover, a number of these proteins must be inhibited for the 
intrinsic pathway to induce apoptosis; the BH3-only proteins mediate the inhibi- 
tion. 

The BH3-only proteins are the largest subclass of Bcl2 family proteins. The cell 
either produces or activates them in response to an apoptotic stimulus, and they 
are thought to promote apoptosis mainly by inhibiting anti-apoptotic Bcl2 family 

Figure 18-8 The three classes of Bcl2 
family proteins. Note that the BH3 domain 
is the only BH domain shared by all Bcl2 
family members; it mediates the direct 
interactions between pro-apoptotic and 
anti-apoptotic family members. 


Figure 18-9 The role of pro-apoptotic 
effector Bcl2 family proteins (mainly Bax 
and Bak) in the release of mitochondrial 
intermembrane proteins in the intrinsic 
pathway of apoptosis. When activated 
by an apoptotic stimulus, the effector Bcl2 
family proteins aggregate on the outer 
mitochondrial membrane and release 
cytochrome c and other proteins from the 
intermembrane space into the cytosol by 
an unknown mechanism. 

proteins. Their BH3 domain binds to a long hydrophobic groove on anti-apoptotic 
Bcl2 family proteins, neutralizing their activity. This binding and inhibition 
enables the aggregation of Bax and Bak on the surface of mitochondria, which 
triggers the release of the intermembrane mitochondrial proteins that induce 
apoptosis (Figure 18-10). Some BH3-only proteins may bind directly to Bax and 
Bak to help stimulate their aggregation. 

BH3-only proteins provide the crucial link between apoptotic stimuli and the 
intrinsic pathway of apoptosis, with different stimuli activating different BH3- 
only proteins. Some extracellular survival signals, for example, block apopto- 
sis by inhibiting the synthesis or activity of certain BH3-only proteins (see Fig- 
ure 18-12B). Similarly, in response to DNA damage that cannot be repaired, the 
tumor suppressor protein p53 accumulates (discussed in Chapters 17 and 20) and 
activates the transcription of genes that encode the BH3-only proteins Puma and 
Noxa. These BH3-only proteins then trigger the intrinsic pathway, thereby elimi- 
nating a potentially dangerous cell that could otherwise become cancerous. 

As mentioned earlier, in some cells the extrinsic apoptotic pathway recruits 
the intrinsic pathway to amplify the caspase cascade to kill the cell. The BH3-only 
protein Bid is the link between the two pathways. Bid is normally inactive. However, 
when death receptors activate the extrinsic pathway in some cells, the initiator 
caspase, caspase-8, cleaves Bid, producing an active form of Bid that translocates 
to the outer mitochondrial membrane and inhibits anti-apoptotic Bcl2 
family proteins, thereby amplifying the death signal. 
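
The regulatory logic described in the preceding paragraphs can be caricatured as a yes/no rule, sketched below in Python. This is a deliberately crude toy, not a quantitative model: it treats each class of protein as simply present or absent, ignores the relative amounts of the proteins, and leaves out the possible direct activation of Bax and Bak by some BH3-only proteins. The function and scenario names are hypothetical.

```python
def cytochrome_c_released(effector_present, anti_apoptotic_present, bh3_only_active):
    """Toy yes/no rule for release of mitochondrial intermembrane proteins."""
    # Anti-apoptotic proteins restrain Bax/Bak only while no BH3-only protein neutralizes them.
    restraint_active = anti_apoptotic_present and not bh3_only_active
    # Release needs at least one effector (Bax or Bak) and no active restraint.
    return effector_present and not restraint_active


# (effector_present, anti_apoptotic_present, bh3_only_active)
scenarios = {
    "healthy cell": (True, True, False),
    "apoptotic stimulus": (True, True, True),
    "lacks Bax and Bak": (False, True, True),
    "lacks anti-apoptotic proteins": (True, False, False),
}
for name, state in scenarios.items():
    print(f"{name}: release = {cytochrome_c_released(*state)}")
```

The four scenarios mirror statements in the text: no release in a healthy cell, release after BH3-only proteins are activated, resistance in cells lacking both Bax and Bak, and death of cells that lack every anti-apoptotic Bcl2 family protein.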

Figure 18-10 How pro-apoptotic BH3-only 
and anti-apoptotic Bcl2 family 
proteins regulate the intrinsic pathway 
of apoptosis. (A) In the absence of 
an apoptotic stimulus, anti-apoptotic 
Bcl2 family proteins bind to and inhibit 
the effector Bcl2 family proteins on the 
mitochondrial outer membrane (and in the 
cytosol, not shown). (B) In the presence of 
an apoptotic stimulus, BH3-only proteins 
are activated and bind to the anti-apoptotic 
Bcl2 family proteins so that they can 
no longer inhibit the effector Bcl2 family 
proteins; the latter then become activated, 
aggregate in the outer mitochondrial 
membrane, and promote the release of 
intermembrane mitochondrial proteins 
into the cytosol. Some activated BH3-only 
proteins may stimulate mitochondrial 
protein release more directly by binding 
to and activating the effector Bcl2 family 
proteins. Although not shown, the anti-apoptotic 
Bcl2 family proteins are bound to 
the mitochondrial surface. 


IAPs Help Control Caspases 


Because activation of a caspase cascade leads to certain death, the cell employs 
multiple robust mechanisms to ensure that these proteases are activated only 
when appropriate. One line of defense is provided by a family of proteins called 
inhibitors of apoptosis (IAPs). These proteins were first identified in certain 
insect viruses (baculoviruses), which encode IAP proteins to prevent a host cell 
that is infected by the virus from killing itself by apoptosis. It is now known that 
most animal cells also make IAP proteins. 

All IAPs have one or more BIR (baculovirus IAP repeat) domains, which enable 
them to bind to and inhibit activated caspases. Some IAPs also polyubiquitylate 
caspases, marking the caspases for destruction by proteasomes. In this way, the 
IAPs set an inhibitory threshold that caspases must overcome to trigger apoptosis. 

In Drosophila at least, the inhibitory barrier provided by IAPs can be neutral- 
ized by anti-IAP proteins, which are produced in response to various apoptotic 
stimuli. There are numerous anti-IAPs in flies, including Reaper, Grim, and Hid, 
and their only structural similarity is their short, N-terminal, IAP-binding motif, 
which binds to the BIR domain of IAPs, preventing the domain from binding to 
a caspase. Deletion of the three genes encoding Reaper, Grim, and Hid blocks 
apoptosis in flies. Conversely, inactivation of one of the two genes that encode 
IAPs in Drosophila causes all of the cells in the developing fly embryo to undergo 
apoptosis. Clearly, the balance between IAPs and anti-IAPs is tightly regulated 
and is crucial for controlling apoptosis in the fly. 

The role of mammalian IAP and anti-IAP proteins in apoptosis is less clear. 
Anti-IAPs are released from the mitochondrial intermembrane space when the 
intrinsic pathway of apoptosis is activated, blocking IAPs in the cytosol and thereby 
promoting apoptosis. However, mice appear to develop normally if they are miss- 
ing either the major mammalian IAP (called XIAP) or the two known mammalian 
anti-IAPs (called Smac/Diablo and Omi). Worms do not even contain a caspase- 
inhibiting IAP protein. Apparently, the tight control of caspase activity is achieved 
by different mechanisms in different animals. 


Extracellular Survival Factors Inhibit Apoptosis in Various Ways 


Intercellular signals regulate most activities of animal cells, including apoptosis. 
These extracellular signals are part of the normal “social” controls that ensure that 
individual cells behave for the good of the organism as a whole—in this case, by 
surviving when they are needed and killing themselves when they are not. Some 
extracellular signal molecules stimulate apoptosis, whereas others inhibit it. We 
have discussed signal proteins such as Fas ligand that activate death receptors 
and thereby trigger the extrinsic pathway of apoptosis. Other extracellular signal 
molecules that stimulate apoptosis are especially important during vertebrate 
development: a surge of thyroid hormone in the bloodstream, for example, signals 
cells in the tadpole tail to undergo apoptosis at metamorphosis. In mice, locally 
produced signal proteins stimulate cells between developing fingers and toes to 
kill themselves (see Figure 18-2). Here, however, we focus on extracellular signal 
molecules that inhibit apoptosis, which are collectively called survival factors. 

Most animal cells require continuous signaling from other cells to avoid apop- 
tosis. This surprising arrangement apparently helps ensure that cells survive only 
when and where they are needed. Nerve cells, for example, are produced in excess 
in the developing nervous system and then compete for limited amounts of sur- 
vival factors that are secreted by the target cells that they normally connect to (see 
Figure 21-81). Nerve cells that receive enough survival signals live, while the oth- 
ers die. In this way, the number of surviving neurons is automatically adjusted 
so that it is appropriate for the number of target cells they connect with (Figure 
18-11). A similar competition for limited amounts of survival factors produced by 
neighboring cells is thought to control cell numbers in other tissues, both during 
development and in adulthood. 
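
The competition described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not a model taken from the text: each neuron is assumed to contact one randomly chosen target cell, and each target is assumed to release just enough survival factor to keep `capacity` neurons alive. The function name `cull_neurons` and all parameter values are hypothetical.

```python
import random
from collections import Counter


def cull_neurons(n_neurons, n_targets, capacity=1, seed=0):
    """Toy model of overproduction followed by culling.

    Assumptions (illustrative only): every neuron contacts one target
    chosen at random, and every target supplies enough survival factor
    for at most `capacity` neurons; the rest die by apoptosis.
    Returns (number of surviving neurons, number of contacted targets).
    """
    rng = random.Random(seed)
    # Count how many neurons contact each target cell.
    contacts = Counter(rng.randrange(n_targets) for _ in range(n_neurons))
    # Each target can keep at most `capacity` of its neurons alive.
    survivors = sum(min(count, capacity) for count in contacts.values())
    contacted_targets = len(contacts)
    return survivors, contacted_targets


if __name__ == "__main__":
    for n_neurons in (100, 200, 400):
        survivors, contacted = cull_neurons(n_neurons, n_targets=100)
        print(f"{n_neurons} neurons produced: "
              f"{survivors} survive, {contacted} of 100 targets contacted")
```

In this sketch, producing more neurons than targets raises the fraction of targets that are contacted, while the number of survivors can never exceed what the targets are able to support.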
Survival factors usually bind to cell-surface receptors, which activate intracel- 
lular signaling pathways that suppress the apoptotic program, often by regulating 
members of the Bcl2 family of proteins. Some survival factors, for example, stimu- 
late the synthesis of anti-apoptotic Bcl2 family proteins such as Bcl2 itself or BclXL 
(Figure 18-12A). Others act by inhibiting the function of pro-apoptotic BH3-only 
proteins such as Bad (Figure 18-12B). In Drosophila, some survival factors 
act by phosphorylating and inactivating anti-IAP proteins such as Hid, thereby 
enabling IAP proteins to suppress apoptosis (Figure 18-12C). Some develop- 
ing neurons, like those illustrated in Figure 18-11, use an ingenious alternative 
approach: survival-factor receptors stimulate apoptosis—by an unknown mecha- 
nism—when they are not occupied, and then stop promoting death when survival 
factor binds. The end result in all these cases is the same: cell survival depends on 
survival factor binding. 


Phagocytes Remove the Apoptotic Cell 


Apoptotic cell death is a remarkably tidy process: the apoptotic cell and its frag- 
ments do not break open and release their contents, but instead remain intact 
as they are efficiently eaten—or phagocytosed—by neighboring cells, leaving no 
trace and therefore triggering no inflammatory response (see Figure 18-1B and 
Movie 13.5). This engulfment process depends on chemical changes on the sur- 
face of the apoptotic cell, which displays signals that recruit phagocytic cells. An 
especially important change occurs in the distribution of the negatively charged 
phospholipid phosphatidylserine on the cell surface. This phospholipid is nor- 
mally located exclusively in the inner leaflet of the lipid bilayer of the plasma 
membrane (see Figure 10-15), but it flips to the outer leaflet in apoptotic cells. The 
Figure 18-11 The role of survival factors 
and cell death in adjusting the number 
of developing nerve cells to the amount 
of target tissue. More nerve cells are 
produced than can be supported by the 
limited amount of survival factors released 
by the target cells. Therefore, some nerve 
cells receive an insufficient amount of 
survival factors to avoid apoptosis. This 
strategy of overproduction followed by 
culling helps ensure that all target cells are 
contacted by nerve cells and that the extra 
nerve cells are automatically eliminated. 


Figure 18-12 Three ways that 
extracellular survival factors can inhibit 
apoptosis. (A) Some survival factors 
suppress apoptosis by stimulating the 
transcription of genes that encode anti- 
apoptotic Bcl2 family proteins such as Bcl2 
itself or BclXL. (B) Many others activate the 
serine/threonine protein kinase Akt, which, 
among many other targets, phosphorylates 
and inactivates the pro-apoptotic BH3-only 
protein Bad (see Figure 15-53). When not 
phosphorylated, Bad promotes apoptosis 
by binding to and inhibiting Bcl2; once 
phosphorylated, Bad dissociates, freeing 
Bcl2 to suppress apoptosis. Akt also 
suppresses apoptosis by phosphorylating 
and inactivating transcription regulatory 
proteins that stimulate the transcription 
of genes encoding proteins that promote 
apoptosis (not shown). (C) In Drosophila, 
some survival factors inhibit apoptosis by 
stimulating the phosphorylation of the anti- 
IAP protein Hid. When not phosphorylated, 
Hid promotes cell death by inhibiting 
IAPs. Once phosphorylated, Hid no longer 
inhibits IAPs, which become active and 
block apoptosis. MAP kinase, mitogen- 
activated protein kinase. 
underlying mechanism is poorly understood, but the external exposure of phos- 
phatidylserine is likely to depend on caspase cleavage of some protein involved in 
phospholipid distribution in the membrane. A variety of soluble “bridging” pro- 
teins interact with the exposed phosphatidylserine on the apoptotic cell. These 
bridging proteins also interact with specific receptors on the surface of a neigh- 
boring cell or macrophage, triggering cytoskeletal and other changes that initiate 
the engulfment process. 

Macrophages do not phagocytose healthy cells in the animal—despite the fact 
that healthy cells normally expose some phosphatidylserine on their surfaces. 
Healthy cells express signal proteins on their surface that interact with inhibitory 
receptors on macrophages that block phagocytosis. Thus, in addition to express- 
ing cell-surface signals such as phosphatidylserine that stimulate phagocytosis, 
apoptotic cells must lose or inactivate these “don’t eat me” signals that block 
phagocytosis. 


Either Excessive or Insufficient Apoptosis Can Contribute to Disease 


There are many human disorders in which excessive numbers of cells undergo 
apoptosis and thereby contribute to tissue damage. Among the most dramatic 
examples are heart attacks and strokes. In these acute conditions, many cells die 
by necrosis as a result of ischemia (inadequate blood supply), but some of the 
less affected cells die by apoptosis. It is hoped that, in the future, drugs that block 
apoptosis—such as specific caspase inhibitors—will prove useful in saving such 
cells. 

There are other conditions where too few cells die by apoptosis. Mutations 
in mice and humans, for example, that inactivate the genes that encode the Fas 
death receptor or the Fas ligand prevent the normal death of some lymphocytes, 
causing these cells to accumulate in excessive numbers in the spleen and lymph 
glands. In many cases, this leads to autoimmune disease, in which the lympho- 
cytes react against the individual’s own tissues. 

Decreased apoptosis also makes an important contribution to many tumors, as 
cancer cells often regulate their apoptotic program abnormally. The Bcl2 gene, for 
example, was first identified in a common form of lymphocyte cancer in humans, 
where a chromosome translocation causes excessive production of the Bcl2 pro- 
tein; indeed, Bcl2 gets its name from this B cell lymphoma. The high level of Bcl2 
protein in the lymphocytes that carry the translocation promotes the develop- 
ment of cancer by inhibiting apoptosis, thereby prolonging lymphocyte survival 
and increasing their number; it also decreases the cells’ sensitivity to anticancer 
drugs, which commonly work by causing cancer cells to undergo apoptosis. 

Similarly, the gene encoding the tumor suppressor protein p53 is mutated in 
about 50% of human cancers so that it no longer promotes apoptosis or cell-cycle 
arrest in response to DNA damage. The lack of p53 function therefore enables the 
cancer cells to survive and proliferate even when their DNA is damaged; in this 
way, the cells accumulate more mutations, some of which make the cancer more 
malignant (discussed in Chapter 20). As many anticancer drugs induce apoptosis 
(and cell-cycle arrest) by a p53-dependent mechanism (discussed in Chapters 17 
and 20), the loss of p53 function also makes cancer cells less sensitive to these 
drugs. 

If decreased apoptosis contributes to many cancers, then we might be able to 
treat those cancers with drugs that stimulate apoptosis. This line of thinking has 
recently led to the development of small chemicals that interfere with the function 
of anti-apoptotic Bcl2 family proteins such as Bcl2 and BclXL. These chemicals 
bind with high affinity to the hydrophobic groove on anti-apoptotic Bcl2 family 
proteins, blocking their function in essentially the same way that BH3-only pro- 
teins do (Figure 18-13). The intrinsic pathway of apoptosis is thereby stimulated, 
which in certain tumors increases the amount of cell death. 

Most human cancers arise in epithelial tissues such as those in the lung, intes- 
tinal tract, breast, and prostate. Such cancer cells display many abnormalities in 
their behavior, including a decreased ability to adhere to the extracellular matrix 
and to one another at specialized cell-cell junctions. In the next chapter, we dis- 
cuss the remarkable structures and functions of the extracellular matrix and cell 
junctions. 


Summary 


Animal cells can activate an intracellular death program and kill themselves in 
a controlled way when they are irreversibly damaged, no longer needed, or are 
a threat to the organism. In most cases, these deaths occur by apoptosis: the cells 
shrink, condense, and frequently fragment, and neighboring cells or macrophages 
rapidly phagocytose the cells or fragments before there is any leakage of cytoplas- 
mic contents. Apoptosis is mediated by proteolytic enzymes called caspases, which 
cleave specific intracellular proteins to help kill the cell. Caspases are present in 
all nucleated animal cells as inactive precursors. Initiator caspases are activated 
when brought into proximity in activation complexes: once activated, they cleave 
and thereby activate downstream executioner caspases, which then cleave various 
target proteins in the cell, producing an amplifying, irreversible proteolytic cascade. 

Cells use at least two distinct pathways to activate initiator caspases and trig- 
ger a caspase cascade leading to apoptosis: the extrinsic pathway is activated by 
extracellular ligands binding to cell-surface death receptors; the intrinsic pathway 
is activated by intracellular signals generated when cells are stressed. Each path- 
way uses its own initiator caspases, which are activated in distinct activation com- 
plexes: in the extrinsic pathway, the death receptors recruit caspase-8 via adaptor 
proteins to form the DISC; in the intrinsic pathway, cytochrome c released from the 
intermembrane space of mitochondria activates Apaf1, which assembles into an 
apoptosome and recruits and activates caspase-9. 

Intracellular Bcl2 family proteins and IAP proteins tightly regulate the apoptotic 
program to ensure that cells kill themselves only when it benefits the animal. Both 
anti-apoptotic and pro-apoptotic Bcl2 family proteins regulate the intrinsic path- 
way by controlling the release of mitochondrial intermembrane proteins, while IAP 
proteins inhibit activated caspases and promote their degradation. 





Figure 18-13 How the chemical ABT- 
737 inhibits anti-apoptotic Bcl2 family 
proteins. As shown in Figure 18-10B, 
an apoptotic signal results in activation of 
BH3-only proteins, which interact with a 
long hydrophobic groove in anti-apoptotic 
Bcl2 family proteins, thereby preventing 
them from blocking apoptosis. Using 
the crystal structure of the groove, the 
drug shown in (A), called ABT-737, was 
designed and synthesized to bind tightly in 
the groove, as shown for the anti-apoptotic 
Bcl2 family protein, BclXL, in (B). By 
inhibiting the activity of these proteins, the 
drug promotes apoptosis in any cell that 
depends on them for survival. (PDB code: 
2YXJ.) 


WHAT WE DON’T KNOW 


• How many forms of programmed 
cell death exist? What are the 
underlying mechanisms and benefits 
of each? 


• Thousands of caspase substrates 
have been identified. Which ones 
are the critical proteins that must 
be cleaved to trigger the major 
cell remodeling events underlying 
apoptosis? 


• How did the intrinsic pathway of 
apoptosis evolve, and what is the 
advantage of having mitochondria 
play such a central role in regulating 
apoptosis? 


• How are “don’t eat me” signals 
eliminated or inactivated during 
apoptosis to allow the cells to be 
phagocytosed? 


CHAPTER 18 END-OF-CHAPTER PROBLEMS 


PROBLEMS 


Which statements are true? Explain why or why not. 


18-1 In normal adult tissues, cell death usually balances 
cell division. 


18-2 Mammalian cells that do not have cytochrome c 
should be resistant to apoptosis induced by DNA damage. 


Discuss the following problems. 


18-3 One important role of Fas and Fas ligand is to medi- 
ate the elimination of tumor cells by killer lymphocytes. 
In a study of 35 primary lung and colon tumors, half the 
tumors were found to have amplified and overexpressed a 
gene for a secreted protein that binds to Fas ligand. How do 
you suppose that overexpression of this protein might con- 
tribute to the survival of these tumor cells? Explain your 
reasoning. 


18-4 Development of the nematode Caenorhabditis ele- 
gans generates exactly 959 somatic cells; it also produces 
an additional 131 cells that are later eliminated by apop- 
tosis. Classical genetic experiments in C. elegans isolated 
mutants that led to the identification of the first genes 
involved in apoptosis. Of the many mutations affecting 
apoptosis in the nematode, none have ever been found in 
the gene for cytochrome c. Why do you suppose that such 
a central effector molecule in apoptosis was not found in 
the many genetic screens for “death” genes that have been 
carried out in C. elegans? 


18-5 Imagine that you could microinject cytochrome c 
into the cytosol of wild-type mammalian cells and of cells 
that were doubly defective for Bax and Bak. Would you 
expect one, both, or neither type of cell to undergo apopto- 
sis? Explain your reasoning. 


18-6 In contrast to their similar brain abnormalities, 
newborn mice deficient in Apaf1 or caspase-9 have dis- 
tinctive abnormalities in their paws. Apaf1-deficient mice 
fail to eliminate the webs between their developing digits, 
whereas caspase-9-deficient mice have normally formed 
digits (Figure Q18-1). If Apaf1 and caspase-9 function in 
the same apoptotic pathway, how is it possible for these 
deficient mice to differ in web-cell apoptosis? 


Figure Q18-1 Appearance of paws in 
Apaf1-/- and Casp9-/- newborn mice 
relative to normal newborn mice (Problem 
18-6). (From H. Yoshida et al., Cell 94:739-750, 
1998. With permission from Elsevier.) 
18-7 When human cancer cells are exposed to ultra- 
violet (UV) light at 90 mJ/cm², most of the cells undergo 
apoptosis within 24 hours. Release of cytochrome c from 
mitochondria can be detected as early as 6 hours after 
exposure of a population of cells to UV light, and it contin- 
ues to increase for more than 10 hours thereafter. Does this 
mean that individual cells slowly release their cytochrome 
c over this time period? Or, alternatively, do individual 
cells release their cytochrome c rapidly but with different 
cells being triggered over the longer time period? 

To answer this fundamental question, you have 
fused the gene for green fluorescent protein (GFP) to the 
gene for cytochrome c, so that you can observe the behav- 
ior of individual cells by confocal fluorescence microscopy. 
In cells that are expressing the cytochrome c-GFP fusion, 
fluorescence shows the punctate pattern typical of mito- 
chondrial proteins. You then irradiate these cells with UV 
light and observe individual cells for changes in the punc- 
tate pattern. Two such cells (outlined in white) are shown 
in Figure Q18-2A and B. Release of cytochrome c-GFP is 
detected as a change from a punctate to a diffuse pattern 
of fluorescence. Times after UV exposure are indicated as 
hours:minutes below the individual panels. 

Which model for cytochrome c release do these observa- 
tions support? Explain your reasoning. 
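
The following sketch illustrates why population measurements alone cannot separate the two models and why single-cell imaging is needed. It is a toy calculation, not part of the original problem. The 6-hour start and the 10-hour rise come from the problem statement; the assumed 0.1-hour duration of release in a single cell, the uniform spread of onset times, and all function names are hypothetical.

```python
import random


def slow_release(t, start=6.0, duration=10.0):
    """Model 1: every cell releases cytochrome c gradually after `start` hours."""
    return min(1.0, max(0.0, (t - start) / duration))


def rapid_release(t, onset, duration=0.1):
    """Model 2: a cell releases everything within `duration` hours of its own onset."""
    return min(1.0, max(0.0, (t - onset) / duration))


def population_average(fractions):
    """Mean released fraction across a population of cells."""
    return sum(fractions) / len(fractions)


# Assumption: in model 2, onset times are spread evenly between 6 and 16 hours.
rng = random.Random(1)
onsets = [rng.uniform(6.0, 16.0) for _ in range(5000)]

if __name__ == "__main__":
    for t in (6, 8, 11, 14, 16):
        slow_avg = slow_release(t)  # identical in every cell under model 1
        rapid_avg = population_average([rapid_release(t, o) for o in onsets])
        print(f"t = {t:2d} h  slow model: {slow_avg:.2f}  rapid model: {rapid_avg:.2f}")
```

Under these assumptions the two population curves are nearly indistinguishable, whereas the traces of individual cells differ sharply: gradual in the first model, abrupt in the second.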


Figure Q18-2 Time-lapse video fluorescence microscopic analysis 
of cytochrome c-GFP release from mitochondria of individual cells 
(Problem 18-7). (A) Cells observed for 6 minutes, 10 hours after 
UV irradiation. (B) Cells observed for 8 minutes, 17 hours after UV 
irradiation. One cell in (A) and one in (B), each outlined in white, 
have released their cytochrome c-GFP during the time frame of the 
observation, which is shown as hours:minutes below each panel. (From 
J.C. Goldstein et al., Nat. Cell Biol. 2:156-162, 2000. With permission 
from Macmillan Publishers Ltd.) 


18-8 Fas ligand is a trimeric, extracellular protein that 
binds to its receptor, Fas, which is composed of three iden- 
tical transmembrane subunits (Figure Q18-3). The bind- 
ing of Fas ligand alters the conformation of Fas so that it 
binds an adaptor protein, which then recruits and acti- 
vates caspase-8, triggering a caspase cascade that leads to 
cell death. In humans, the autoimmune lymphoprolifera- 
tive syndrome (ALPS) is associated with dominant muta- 
tions in Fas that include point mutations and C-terminal 
truncations. In individuals that are heterozygous for such 
mutations, lymphocytes do not die at their normal rate and 
accumulate in abnormally large numbers, causing a vari- 
ety of clinical problems. In contrast to these patients, indi- 
viduals that are heterozygous for mutations that eliminate 
Fas expression entirely have no clinical symptoms. 

A. Assuming that the normal and dominant forms 
of Fas are expressed to the same level and bind Fas ligand 
equally, what fraction of Fas-Fas ligand complexes on a 
lymphocyte from a heterozygous ALPS patient would be 
expected to be composed entirely of normal Fas subunits? 
B. In an individual heterozygous for a mutation that 
eliminates Fas expression, what fraction of Fas-Fas ligand 
complexes would be expected to be composed entirely of 
normal Fas subunits? 

C. Why are the Fas mutations that are associated with 
ALPS dominant, while those that eliminate expression of 
Fas are recessive? 
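
One way to set up the arithmetic for parts A and B is sketched below. It assumes, beyond what the problem states, that subunits assort at random and independently into trimers, so the composition of a trimer follows from the fraction of normal subunits in the pool. The function name `fraction_all_normal` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def fraction_all_normal(p_normal, subunits=3):
    """Fraction of receptor complexes built only from normal subunits.

    Assumption: subunits are drawn independently, with probability
    `p_normal` of being normal. Every ordered combination of subunits
    is enumerated and the probability of the all-normal one is summed.
    """
    total = Fraction(0)
    for combo in product(("normal", "mutant"), repeat=subunits):
        prob = Fraction(1)
        for subunit in combo:
            prob *= p_normal if subunit == "normal" else 1 - p_normal
        if all(subunit == "normal" for subunit in combo):
            total += prob
    return total


if __name__ == "__main__":
    # Part A: normal and mutant Fas expressed at equal levels.
    print(fraction_all_normal(Fraction(1, 2)))  # prints 1/8
    # Part B: the mutant allele makes no protein, so every subunit is normal.
    print(fraction_all_normal(Fraction(1)))     # prints 1
```

The contrast between the two printed values is a useful starting point for part C.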
Figure Q18-3 The binding of trimeric Fas ligand to Fas (Problem 18-8). 
CELLS IN THEIR SOCIAL CONTEXT 


CHAPTER 19 


Cell Junctions and the Extracellular Matrix 


IN THIS CHAPTER 

CELL-CELL JUNCTIONS 

THE EXTRACELLULAR MATRIX OF ANIMALS 

CELL-MATRIX JUNCTIONS 


Of all the social interactions between cells in a multicellular organism, the most 
fundamental are those that hold the cells together. Cells may be linked by direct 
interactions, or they may be held together within the extracellular matrix, a complex 
network of proteins and polysaccharide chains that the cells secrete. By one 
means or another, cells must cohere if they are to form an organized multicellular 
structure that can withstand and respond to the various external forces that try to 
pull it apart. 

The mechanisms of cohesion govern the architecture of the body: its shape, 
its strength, and the arrangement of its different cell types. The making and break- 
ing of the attachments between cells and the modeling of the extracellular matrix 
govern the way cells move within the organism, guiding them as the body grows, 
develops, and repairs itself. Attachments to other cells and to extracellular matrix 
control the orientation and behavior of the cell’s cytoskeleton, thereby allowing 
cells to sense and respond to changes in the mechanical features of their environ- 
ment. Thus, the apparatus of cell junctions and the extracellular matrix is criti- 
cal for every aspect of the organization, function, and dynamics of multicellular 
structures. Defects in this apparatus underlie an enormous variety of diseases. 

The key features of cell junctions and the extracellular matrix are best illus- 
trated by considering two broad categories of tissues that are found in all animals 
(Figure 19-1). Connective tissues, such as bone or tendon, are formed from an 
extracellular matrix produced by cells that are distributed sparsely in the matrix. 
It is the matrix—rather than the cells—that bears most of the mechanical stress to 
which the tissue is subjected. Direct attachments between one cell and another 
are relatively rare, but the cells have important attachments to the matrix. These 
cell-matrix junctions link the cytoskeleton to the matrix, allowing the cells to move 
through the matrix and monitor changes in its mechanical properties. 

In epithelial tissues, such as the lining of the gut or the epidermal covering of 
the skin, cells are tightly bound together into sheets called epithelia. The extracel- 
lular matrix is less pronounced, consisting mainly of a thin mat called the basal 
lamina (or basement membrane) underlying the sheet. Within the epithelium, 
cells are attached to each other directly by cell-cell junctions, where cytoskeletal 
filaments are anchored, transmitting stresses across the interiors of the cells, from 
adhesion site to adhesion site. The cytoskeleton of epithelial cells is also linked to 
the basal lamina through cell-matrix junctions. 

Figure 19-2 provides a closer view of epithelial cells to illustrate the major 
types of cell-cell and cell-matrix junctions that we will discuss in this chapter. The 
diagram shows the typical arrangement of junctions in a simple columnar epithe- 
lium such as the lining of the small intestine of a vertebrate. Here, a single layer 
of tall cells stands on a basal lamina, with the cells’ uppermost surface, or apex, 
free and exposed to the extracellular medium. On their sides, or lateral surfaces, 
the cells make junctions with one another. Two types of anchoring junctions link 
the cytoskeletons of adjacent cells: adherens junctions are anchorage sites for 
actin filaments; desmosomes are anchorage sites for intermediate filaments. Two 
additional types of anchoring junctions link the cytoskeleton of the epithelial cells 
to the basal lamina: actin-linked cell-matrix junctions anchor actin filaments to 
the matrix, while hemidesmosomes anchor intermediate filaments to it. 
Figure 19-1 Two main ways in which 
animal cells are bound together. In 
connective tissue, the main stress-bearing 
component is the extracellular matrix. In 
epithelial tissue, it is the cytoskeletons 
of the cells themselves, linked from cell 
to cell by adhesive junctions. Cell-matrix 
attachments bond epithelial tissue to the 
connective tissue beneath it. 
Figure labels: (epithelial tissue) mechanical stresses 
are transmitted from cell to cell by cytoskeletal 
filaments anchored to cell-matrix and cell-cell 
adhesion sites; (connective tissue) extracellular matrix 
directly bears mechanical stresses of tension and 
compression. 


tight junction seals gap between 
epithelial cells 


adherens junction connects actin 
filament bundle in one cell with 
that in the next cell 


desmosome connects intermediate 
filaments in one cell to those in 
the next cell 


gap junction allows the passage 
of small water-soluble molecules 
from cell to cell 


hemidesmosome anchors intermediate 
filaments in a cell to extracellular matrix 


Figure 19-2 A summary of the various cell junctions found in a vertebrate epithelial cell, classified according to their primary functions. 

In the most apical portion of the cell, the relative positions of the junctions are the same in nearly all vertebrate epithelia. The tight junction occupies 
the most apical position, followed by the adherens junction (adhesion belt) and then by a special parallel row of desmosomes; together these form 
a structure called a junctional complex. Gap junctions and additional desmosomes are less regularly organized. Two types of cell-matrix anchoring 
junctions tether the basal surface of the cell to the basal lamina. The drawing is based on epithelial cells of the small intestine. 
junctions, mediated typically by cadherins) 
or to extracellular matrix (cell-matrix 
junctions, mediated typically by integrins). 
The internal linkage to the cytoskeleton is 
generally indirect, via intracellular adaptor 
proteins, to be discussed later. 





Figure labels: intracellular adaptor proteins; transmembrane adhesion proteins 


Two other types of cell-cell junction are shown in Figure 19-2. Tight junctions 
hold the cells closely together near the apex, sealing the gap between the cells and 
thereby preventing molecules from leaking across the epithelium. Near the basal 
end of the cells are channel-forming junctions, called gap junctions, that create 
passageways linking the cytoplasms of adjacent cells. 

Each of the four major anchoring junction types depends on transmembrane 
adhesion proteins that span the plasma membrane, with one end linking to the 
cytoskeleton inside the cell and the other end linking to other structures outside it 
(Figure 19-3). These cytoskeleton-linked transmembrane proteins fall neatly into 
two superfamilies, corresponding to the two basic kinds of external attachment. 
Proteins of the cadherin superfamily chiefly mediate attachment of cell to cell 
(Movie 19.1). Proteins of the integrin superfamily chiefly mediate attachment of 
cells to matrix. There is specialization within each family: some cadherins link to 
actin and form adherens junctions, while others link to intermediate filaments 
and form desmosomes; likewise, some integrins link to actin and form actin- 
linked cell-matrix junctions, while others link to intermediate filaments and form 
hemidesmosomes (Table 19-1). 


TABLE 19-1 

Junction | Transmembrane adhesion protein | Extracellular ligand | Intracellular cytoskeletal attachment | Intracellular adaptor proteins 

Cell-Cell 
Adherens junction | Classical cadherins | Classical cadherin on neighboring cell | Actin filaments | α-Catenin, β-catenin, plakoglobin (γ-catenin), p120-catenin, vinculin 
Desmosome | Nonclassical cadherins (desmoglein, desmocollin) | Desmoglein and desmocollin on neighboring cell | Intermediate filaments | Plakoglobin (γ-catenin), plakophilin, desmoplakin 

Cell-Matrix 
Actin-linked cell-matrix junction | Integrin | Extracellular matrix proteins | Actin filaments | Talin, kindlin, vinculin, paxillin, focal adhesion kinase (FAK), numerous others 
Hemidesmosome | α6β4 Integrin, type XVII collagen | Extracellular matrix proteins | Intermediate filaments | Plectin, BP230 
There are some exceptions to these rules. Some integrins, for example, medi- 
ate cell-cell rather than cell-matrix attachment. Moreover, there are other types 
of cell adhesion molecules that can provide transient cell-cell attachments more 
flimsy than anchoring junctions, but sufficient to stick cells together in special 
circumstances. 

We begin the chapter with a discussion of the major forms of cell-cell junc- 
tions. We then consider in turn the extracellular matrix of animals, the structure 
and function of integrin-mediated cell-matrix junctions, and, finally, the plant 
cell wall, a special form of extracellular matrix. 


CELL-CELL JUNCTIONS 


Cell-cell junctions come in many forms and can be regulated by a variety of 
mechanisms. The best understood and most common are the two types of cell- 
cell anchoring junctions, which employ cadherins to link the cytoskeleton of one 
cell with that of its neighbor. Their primary function is to resist the external forces 
that pull cells apart. The epithelial cells of your skin, for example, must remain 
tightly linked when they are stretched, pinched, or poked. Cell-cell anchoring 
junctions must also be dynamic and adaptable, so that they can be altered or rear- 
ranged when tissues are remodeled or repaired, or when there are changes in the 
forces acting on them. 

In this section, we focus primarily on the cadherin-based anchoring junctions. 
We then briefly describe tight junctions and gap junctions. Finally, we consider 
the more transient cell-cell adhesion mechanisms employed by some cells in the 
bloodstream. 


Cadherins Form a Diverse Family of Adhesion Molecules 


Cadherins are present in all multicellular animals whose genomes have been 
analyzed. They are also present in the choanoflagellates, which can exist either as 
free-living unicellular organisms or as multicellular colonies and are thought to 
be representatives of the group of protists from which all animals evolved. Other 
eukaryotes, including fungi and plants, lack cadherins, and they are also absent 
from bacteria and archaea. Cadherins therefore seem to be part of the essence of 
what it is to be an animal. 

The cadherins take their name from their dependence on Ca²⁺ ions: removing 
Ca²⁺ from the extracellular medium causes adhesions mediated by cadherins to 
come apart. The first three cadherins to be discovered were named according to 
the main tissues in which they were found: E-cadherin is present on many types 
of epithelial cells; N-cadherin on nerve, muscle, and lens cells; and P-cadherin on 
cells in the placenta and epidermis. All are also found in other tissues. These and 
other classical cadherins are closely related in sequence throughout their extra- 
cellular and intracellular domains. 

There are also a large number of nonclassical cadherins that are more dis- 
tantly related in sequence, with more than 50 expressed in the brain alone. The 
nonclassical cadherins include proteins with known adhesive function, such as 
the diverse protocadherins found in the brain, and the desmocollins and desmo- 
gleins that form desmosomes (see Table 19-1). Other family members are involved 
primarily in signaling. Together, the classical and nonclassical cadherin proteins 
constitute the cadherin superfamily (Figure 19-4), with more than 180 members 
in humans. 


Cadherins Mediate Homophilic Adhesion 


Anchoring junctions between cells are usually symmetrical: if the linkage is to 
actin in the cell on one side of the junction, it will be to actin in the cell on the 
other side. In fact, the binding between cadherins is generally homophilic (like- 
to-like, Figure 19-5): cadherin molecules of a specific subtype on one cell bind to 
cadherin molecules of the same or closely related subtype on adjacent cells. 

The spacing between the cell membranes at an anchoring junction is precisely 
defined and depends on the structure of the participating cadherin molecules. All 
the members of the superfamily, by definition, have an extracellular portion con- 
sisting of several copies of the extracellular cadherin (EC) domain. Homophilic 
binding occurs at the N-terminal tips of the cadherin molecules—the cadherin 
domains that lie furthest from the membrane. These terminal domains each form 
a knob and a nearby pocket, and the cadherin molecules protruding from oppo- 
site cell membranes bind by insertion of the knob of one domain into the pocket 
of the other (Figure 19-6A). 

Each cadherin domain forms a more-or-less rigid unit, joined to the next cadherin 
domain by a hinge. Ca²⁺ ions bind to sites near each hinge and prevent it 
from flexing, so that the whole string of cadherin domains behaves as a rigid and 
slightly curved rod. When Ca²⁺ is removed, the hinges can flex, and the structure 
becomes floppy (Figure 19-6B). At the same time, the conformation at the N-ter- 
minus is thought to change slightly, weakening the binding affinity for the match- 
ing cadherin molecule on the opposite cell. 

Unlike receptors for soluble signal molecules, which bind their specific ligand 
with high affinity, cadherins (and most other cell-cell adhesion proteins) typi- 
cally bind to their partners with relatively low affinity. Strong attachments result 
from the formation of many such weak bonds in parallel. When binding to oppo- 
sitely oriented partners on another cell, cadherin molecules are often clustered 
side-to-side with many other cadherin molecules on the same cell (Figure 19-6C). 
The strength of this junction is far greater than that of any individual intermolecu- 
lar bond, and yet regulatory mechanisms can easily disassemble the junction by 
separating the molecules sequentially, just as two pieces of fabric can be joined 
strongly by Velcro and yet easily peeled apart from the sides. A similar “Velcro 
principle” also operates at cell-cell and cell-matrix adhesions formed by other 
types of transmembrane adhesion proteins. 
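
The Velcro principle can be made concrete with a toy calculation. The sketch below is not taken from the text: it assumes that each bond in a cluster opens and closes independently, and the bound fraction per bond (p_bound) and the rupture force per bond (f_bond) are hypothetical illustrative numbers, not measured values. Under those assumptions, the chance that every bond is open at the same moment falls geometrically with the number of bonds, whereas peeling from the edge only ever has to break one bond at a time.

```python
def prob_all_open(p_bound: float, n_bonds: int) -> float:
    """Probability that all n independent bonds are open at the same time."""
    return (1.0 - p_bound) ** n_bonds


def force_to_pull_apart(n_bonds: int, f_bond: float) -> float:
    """Force needed to break every bond at once, assuming equal load sharing."""
    return n_bonds * f_bond


def force_to_peel(f_bond: float) -> float:
    """Force needed to peel from the edge, breaking one bond at a time."""
    return f_bond


if __name__ == "__main__":
    p_bound = 0.5  # hypothetical: each weak bond is closed half of the time
    f_bond = 1.0   # hypothetical rupture force of one bond, arbitrary units
    for n in (1, 10, 100):
        print(n, prob_all_open(p_bound, n),
              force_to_pull_apart(n, f_bond), force_to_peel(f_bond))
```

In this simplified picture a junction with many bonds almost never falls apart spontaneously and resists a direct pull in proportion to the number of bonds, yet a regulatory mechanism that works inward from the edge faces only the strength of a single bond at each step.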

Figure 19-4 The cadherin superfamily. The diagram shows some of the diversity 
among cadherin superfamily members. These proteins all have extracellular 
portions containing multiple copies of the extracellular cadherin domain (green 
ovals). In the classical cadherins of vertebrates there are 5 of these domains, 
and in desmogleins and desmocollins there are 4 or 5, but some nonclassical 
cadherins have more than 30. The intracellular portions are more varied, 
reflecting interactions with a wide variety of intracellular ligands, including 
signaling molecules and adaptor proteins that connect the cadherin to the 
cytoskeleton. In some cases, such as T-cadherin, a transmembrane domain is not 
present and the protein is attached to the plasma membrane by a 
glycosylphosphatidylinositol (GPI) anchor. The differently colored motifs in 
Fat, Flamingo, and Ret represent conserved domains that are also found in other 
protein families. 




Figure 19-5 Homophilic versus 
heterophilic binding. Cadherins in general 
bind homophilically; some other cell 
adhesion molecules, discussed later, bind 
heterophilically. 

Figure 19-6 Cadherin structure and function. (A) The extracellular region of 
a classical cadherin contains five copies of the extracellular cadherin domain 
(see Figure 19-4) separated by flexible hinge regions. Ca²⁺ ions (red dots) bind 
in the neighborhood of each hinge, preventing it from flexing. As a result, the 
extracellular region forms a rigid, curved structure as shown here. To generate 
cell-cell adhesion, the cadherin domain at the N-terminal tip of one cadherin 
molecule binds the N-terminal domain from a cadherin molecule on another cell. 
The structure was determined by x-ray diffraction of the crystallized C-cadherin 
extracellular region. (B) In the absence of Ca²⁺, increased flexibility in the hinge 
regions results in a floppier molecule that is no longer oriented correctly to 
interact with a cadherin on another cell, and adhesion fails. (C) At a typical 
cell-cell junction, an organized array of cadherin molecules functions like Velcro 
to hold cells together. Cadherins on the same cell are thought to be coupled by 
side-to-side interactions between their N-terminal head regions, resulting in a 
linear array like the alternating green and light green cadherins on the lower cell 
shown here. These arrays are thought to interact with similar linear arrays on an 
adjacent cell (blue cadherin molecules, top cell). The linear arrays on one cell are 
perpendicular to those on the other cell, as indicated by the red arrows. Multiple 
perpendicular arrays on both cells interact to form a tight-knit mat of cadherin 
proteins. (A, based on T.J. Boggon et al., Science 296:1308-1313, 2002; 
C, based on O.J. Harrison et al., Structure 19:244-256, 2011.) 


Cadherin-Dependent Cell-Cell Adhesion Guides the Organization 
of Developing Tissues 


Cadherins form specific homophilic attachments, explaining why there are so 
many different family members. Cadherins are not like glue, making cell surfaces 
generally sticky. Rather, they mediate highly selective recognition, enabling cells 
of a similar type to stick together and to stay segregated from other types of cells. 
Selectivity in the way that animal cells consort with one another was first dem- 
onstrated in the 1950s, long before the discovery of cadherins, in experiments in 
which amphibian embryos were dissociated into single cells. These cells were then 
mixed up and allowed to reassociate. Remarkably, the dissociated cells often reas- 
sembled into structures resembling those of the original embryo (Figure 19-7). 
These experiments, together with numerous more recent experiments, reveal that 
selective cell-cell recognition systems make cells of the same differentiated tissue 
preferentially adhere to one another. 

Figure 19-7 Sorting out. Cells from different layers of an early amphibian 
embryo will sort out according to their origins. In the classical experiment shown 
here, mesoderm cells (green), neural plate cells (blue), and epidermal cells (red) 
have been disaggregated and then reaggregated in a random mixture. They sort 
out into an arrangement reminiscent of a normal embryo, with a “neural tube” 
internally, epidermis externally, and mesoderm in between. (Modified from 
P.L. Townes and J. Holtfreter, J. Exp. Zool. 128:53-120, 1955. With permission 
from Wiley-Liss.) 

Cadherins play a crucial part in these cell-sorting processes during development. 
The appearance and disappearance of specific cadherins correlate with 
steps in embryonic development where cells regroup and change their contacts 
to create new tissue structures. In the vertebrate embryo, for example, changes in 
cadherin expression are seen when the neural tube forms and pinches off from 
the overlying ectoderm: neural tube cells lose E-cadherin and acquire other cad- 
herins, including N-cadherin, while the cells in the overlying ectoderm continue 
to express E-cadherin (Figure 19-8A and B). Then, when the neural crest cells 
migrate away from the neural tube, these cadherins become scarcely detectable, 
and another cadherin (cadherin 7) appears that helps hold the migrating cells 
together as loosely associated cell groups (Figure 19-8C). Finally, when the cells 
aggregate to form a ganglion, they switch on expression of N-cadherin again. If 
N-cadherin is artificially overexpressed in the emerging neural crest cells, the cells 
fail to escape from the neural tube. 

Studies with cultured cells further support the idea that the homophilic bind- 
ing of cadherins controls these processes of tissue segregation. In a line of cultured 
fibroblasts called L cells, for example, cadherins are not expressed and the cells 
do not adhere to one another. When these cells are transfected with DNA encod- 
ing E-cadherin, E-cadherins on one cell bind to E-cadherins on another, resulting 
in cell-cell adhesion. If L cells expressing different cadherins are mixed together, 
they sort out and aggregate separately, indicating that different cadherins pref- 
erentially bind to their own type (Figure 19-9A), mimicking what happens when 
cells derived from tissues that express different cadherins are mixed together. A 
similar segregation of cells occurs if L cells expressing different amounts of the 
same cadherin are mixed together (Figure 19-9B). It therefore seems likely that 
both qualitative and quantitative differences in the expression of cadherins have 
a role in organizing tissues. 
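
How a simple preference for like-to-like binding can produce sorting is easy to explore in a toy model. The sketch below is an illustration under stated assumptions, not a description of the experiments above: cells sit on a square grid, the labels "E" and "N" stand for two hypothetical populations expressing different cadherins, and a greedy rule swaps two neighboring cells only when the swap increases the number of like-to-like contacts. The grid size, number of attempts, and random seed are arbitrary choices.

```python
import random

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def like_contacts(grid):
    """Count neighbouring pairs (right and down) that share a cell type."""
    n = len(grid)
    count = 0
    for r in range(n):
        for c in range(n):
            if c + 1 < n and grid[r][c] == grid[r][c + 1]:
                count += 1
            if r + 1 < n and grid[r][c] == grid[r + 1][c]:
                count += 1
    return count


def local_like(grid, r, c):
    """Number of the four neighbours of (r, c) that share its cell type."""
    n = len(grid)
    total = 0
    for dr, dc in NEIGHBOURS:
        rr, cc = r + dr, c + dc
        if 0 <= rr < n and 0 <= cc < n and grid[rr][cc] == grid[r][c]:
            total += 1
    return total


def sort_cells(n=20, attempts=20000, seed=0):
    """Return like-to-like contact counts before and after greedy swapping."""
    rng = random.Random(seed)
    cells = ["E"] * (n * n // 2) + ["N"] * (n * n - n * n // 2)
    rng.shuffle(cells)  # start from a random mixture of the two populations
    grid = [cells[i * n:(i + 1) * n] for i in range(n)]
    before = like_contacts(grid)
    for _ in range(attempts):
        r, c = rng.randrange(n), rng.randrange(n)
        dr, dc = rng.choice(NEIGHBOURS)
        rr, cc = r + dr, c + dc
        # Skip moves that leave the grid or that swap two cells of one type.
        if not (0 <= rr < n and 0 <= cc < n) or grid[r][c] == grid[rr][cc]:
            continue
        old = local_like(grid, r, c) + local_like(grid, rr, cc)
        grid[r][c], grid[rr][cc] = grid[rr][cc], grid[r][c]
        new = local_like(grid, r, c) + local_like(grid, rr, cc)
        if new <= old:
            # The swap did not add like-to-like contacts, so undo it.
            grid[r][c], grid[rr][cc] = grid[rr][cc], grid[r][c]
    return before, like_contacts(grid)


if __name__ == "__main__":
    before, after = sort_cells()
    print("like-to-like contacts before:", before, "after:", after)
```

Because a swap is kept only when it adds like-to-like contacts, the count can only rise, and the two populations gradually separate into patches. Real cells move and adhere by far richer mechanisms; the point is only that selective adhesion by itself is enough to drive segregation in such a model.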

Figure 19-8 Changing patterns of cadherin expression during construction of the 
vertebrate nervous system. The figure shows cross sections of the early chick 
embryo, as the neural tube detaches from the ectoderm and then as neural crest 
cells detach from the neural tube. (A, B) Immunofluorescence micrographs 
showing the developing neural tube labeled 
with antibodies against (A) E-cadherin (blue) 
and (B) N-cadherin (yellow). (C) As the 
patterns of gene expression change, the 
different groups of cells segregate from one 
another according to the cadherins they 
express. (Micrographs courtesy of Miwako 
Nomura and Masatoshi Takeichi.) 

Figure 19-9 Cadherin-dependent cell 
sorting. Cells in culture can sort themselves 
out according to the type and level of 
cadherins they express. This can be 
visualized by labeling different populations of 
cells with dyes of different colors. (A) Cells 
expressing N-cadherin sort out from cells 
expressing E-cadherin. (B) Cells expressing 
high levels of E-cadherin sort out from cells 
expressing low levels of E-cadherin. The 
cells expressing high levels adhere more 
strongly and end up internally. 


Epithelial-Mesenchymal Transitions Depend on Control of 
Cadherins 


The assembly of cells into an epithelium is a reversible process. By switching on 
expression of adhesion molecules, dispersed unattached mesenchymal cells, such 
as fibroblasts, can come together to form an epithelium. Conversely, epithelial 
cells can change their character, disassemble, and migrate away from their par- 
ent epithelium as separate cells. Such epithelial-mesenchymal transitions play an 
important part in normal embryonic development; the origin of the neural crest 
is one example. These transitions depend in part on transcription regulatory pro- 
teins called Slug, Snail, and Twist. Increased expression of Twist, for example, 
converts epithelial cells to a mesenchymal character, and switching it off does the 
opposite. Twist exerts its effects, in part, by inhibiting expression of cadherins, 
including E-cadherin, that hold epithelial cells together. 

Epithelial-mesenchymal transitions also occur as pathological events during 
adult life, in cancer. Most cancers originate in epithelia, but become dangerously 
prone to spread—that is, malignant—only when the cancer cells escape from the 
epithelium of origin and invade other tissues. Experiments with malignant breast 
cancer cells in culture show that blocking expression of Twist can convert the cells 
back toward a nonmalignant character. Conversely, by forcing Twist expression, 
one can make normal epithelial cells undergo an epithelial-mesenchymal tran- 
sition and behave like malignant cells. Mutations that disrupt the production or 
function of E-cadherin are often found in cancer cells and are thought to help 
make them malignant. 


Catenins Link Classical Cadherins to the Actin Cytoskeleton 


The extracellular domains of cadherins mediate homophilic binding at adherens 
junctions. The intracellular domains of typical cadherins, including all classical 
and some nonclassical ones, interact with filaments of the cytoskeleton: actin at 
adherens junctions and intermediate filaments at desmosomes (see Table 19-1). 
These cytoskeletal linkages are essential for efficient cell-cell adhesion, as cad- 
herins that lack their cytoplasmic domains cannot stably hold cells together. 

The linkage of cadherins to the cytoskeleton is indirect and depends on adap- 
tor proteins that assemble on the cytoplasmic tail of the cadherin. At adherens 
junctions, the cadherin tail binds two such proteins: β-catenin and a distant relative 
called p120-catenin; a third protein called α-catenin interacts with β-catenin 
and recruits a variety of other proteins to provide a dynamic linkage to actin 
filaments (Figure 19-10). At desmosomes, cadherins are linked to intermediate 
filaments through other adaptor proteins, including a β-catenin-related protein 
called plakoglobin, as we discuss later. 

In their mature form, adherens junctions are enormous protein complexes 
containing hundreds to thousands of cadherin molecules, packed into dense, reg- 
ular arrays that are linked on the extracellular side by lateral interactions between 
cadherin domains, as we discussed earlier (see Figure 19-6C). On the cytoplasmic 
side, a complex network of catenins, actin regulators, and contractile actin bun- 
dles holds the cluster of cadherins together and links it to the actin cytoskeleton. 
Assembling a structure of this complexity is not a simple task, and it involves a 
complex sequence of events controlled by the actin-regulatory proteins discussed 
in Chapter 16. The general features of the assembly process are summarized in 
Figure 19-11. 


Adherens Junctions Respond to Forces Generated by the Actin 
Cytoskeleton 


Most adherens junctions are linked to contractile bundles of actin filaments and 
non-muscle myosin II. These junctions are therefore subjected to pulling forces 
generated by the attached actin. The pulling forces are important for junction 
assembly and maintenance: disruption of myosin activity, for example, results in 
the disassembly of many adherens junctions. Furthermore, the contractile forces 
acting on a junction in one cell are balanced by contractile forces at the junction 
of the opposite cell, so that no cell pulls others toward it and thereby disrupts the 
uniform distribution of cells in the tissue. 

Figure 19-10 The linkage of classical cadherins to actin filaments. The 
cadherins are coupled indirectly to actin filaments through an adaptor protein 
complex containing p120-catenin, β-catenin, and α-catenin. Other proteins, 
including vinculin, associate with α-catenin and help provide the linkage to 
actin. β-Catenin has a second, and very important, function in intracellular 
signaling, as we discuss in Chapter 15 (see Figure 15-60). For clarity, this 
diagram does not show the cadherin of the adjacent cell in the junction. 

We do not understand the mechanisms responsible for maintaining this bal- 
ance. Adherens junctions seem to sense the forces acting on them and modify 
local actin and myosin behavior to balance the forces on both sides of the junc- 
tion. Evidence for these mechanisms comes from studies of pairs of cultured 
mammalian cells connected by adherens junctions. If contractile activity in one 
cell is increased experimentally, the adherens junctions linking the two cells 
increase in size, and the contractile activity of the second cell increases to match 
that of the first—resulting in a balance of forces across the junction. These and 
other experiments suggest that adherens junctions are not simply passive sites 
of protein-protein binding but are dynamic tension sensors that regulate their 
behavior in response to changing mechanical conditions. This ability to transduce 
a mechanical signal into a change in junctional behavior is an example of mecha- 
notransduction. We will see later that it is also important at cell-matrix junctions. 

The mechanotransduction at cell-cell junctions is thought to depend, at least 
in part, on proteins in the cadherin complex that alter their shape when stretched 
by tension. The protein a-catenin, for example, is stretched from a folded to an 
extended conformation when contractile activity increases at the junction. The 
unfolding exposes a cryptic binding site for another protein, vinculin, which pro- 
motes the recruitment of more actin to the junction (Figure 19-12). By mecha- 
nisms such as this, pulling on a junction makes it stronger. Furthermore, as noted 
above, pulling on a junction in one cell will increase the contractile force gener- 
ated in the attached cell. 

In some cell types, actin contractility reduces cell-cell adhesion, particularly if 
large forces are involved. Large actin-based contractile forces might, in some tis- 
sues, pull sufficiently hard on the edges of cell-cell adhesions to peel them apart, 
particularly if contraction is coupled to additional regulatory mechanisms that 
weaken the adhesion. This mechanism might be important in certain forms of 
tissue remodeling during development, as we describe next. 


Tissue Remodeling Depends on the Coordination of Actin-Mediated 
Contraction With Cell-Cell Adhesion 


Adherens junctions are an essential part of the machinery for modeling the shapes 
of multicellular structures in the animal body. By indirectly linking the actin fila- 
ments in one cell to those in its neighbors, they enable the cells in the tissue to use 
their actin cytoskeletons in a coordinated way. 

Adherens junctions occur in various forms. In many nonepithelial tissues, 
they appear as small punctate or linear attachments that connect the cortical 
actin filaments beneath the plasma membranes of two interacting cells. In heart 
muscle, they anchor the actin bundles of the contractile apparatus and act in parallel 
with desmosomes to link the contractile cells end-to-end. But the prototypical 
examples of adherens junctions occur in epithelia, where they often form a 
continuous adhesion belt (or zonula adherens) just beneath the apical face of the 
epithelium, encircling each of the interacting cells in the sheet (Figure 19-13). 
Within each cell, a contractile bundle of actin filaments and myosin II lies adjacent 
to the adhesion belt, oriented parallel to the plasma membrane and tethered to 
it by the cadherins and their associated intracellular adaptor proteins. The 
actin-myosin bundles are thus linked, via the cadherins, into an extensive transcellular 
network. Coordinated contraction of this network provides the motile force for 
a fundamental process in animal morphogenesis: the folding of epithelial cell 
sheets into tubes, vesicles, and other related structures (Figure 19-14). 

Figure 19-11 Assembly of an adherens junction. (A) Assembly begins when 
two unattached epithelial cell precursors explore their surroundings with membrane 
protrusions, generated by local nucleation of actin networks. When the cells make 
contact, small cadherin and catenin clusters take shape at the contact sites and 
associate with actin, leading to activation of the small monomeric GTPase Rac (not 
shown), an important actin regulator (see Figure 16-85). (B) Rac promotes additional 
actin protrusions in the vicinity, expanding the size of the contact zone and thereby 
promoting further recruitment of cadherins and their associated catenin proteins. 
(C) Eventually, Rac is inactivated and replaced by the related GTPase Rho (not 
shown), which shifts actin remodeling toward the assembly of linear, contractile 
filament bundles. Rho also promotes the assembly of myosin II filaments that 
associate with bundles of actin filaments to generate contractile activity. This 
contractile activity generates tension that stimulates further actin recruitment and 
expansion of the junction, in part through the mechanisms illustrated in 
Figure 19-12. 

Figure 19-12 Mechanotransduction in an adherens junction. (A) Cell-cell 
junctions are able to sense increased tension and respond by strengthening 
their actin linkages. Tension sensing is thought to depend in part on α-catenin 
(see Figure 19-10). (B) When actin filaments are pulled from within the cell by 
non-muscle myosin II, the resulting force unfolds a domain in α-catenin, thereby 
exposing an otherwise hidden binding site for the adaptor protein vinculin. Vinculin 
then promotes additional actin recruitment, strengthening the linkages between the 
junction and the cytoskeleton. 

Figure 19-13 Adherens junctions between epithelial cells in the small 
intestine. These cells are specialized for absorption of nutrients; at their apex, 
facing the lumen of the gut, they have many microvilli (protrusions that increase 
the absorptive surface area). The adherens junction takes the form of an adhesion 
belt, encircling each of the interacting cells. Its most obvious feature is a contractile 
bundle of actin filaments running along the cytoplasmic surface of the junctional 
plasma membrane. The actin filament bundles are tethered by intracellular 
proteins to cadherins, which bind to cadherins on the adjacent cell. In this way, 
the actin filament bundles in adjacent cells are tied together. For clarity, this drawing 
does not show most of the other cell-cell and cell-matrix junctions of epithelial cells 
(see Figure 19-2). 

The coordination of cell-cell adhesion and actin contractility is beautifully 
illustrated by cellular rearrangements that occur early in the development of the 
fruit fly Drosophila melanogaster. Soon after gastrulation, the outer epithelium 
of the embryo is elongated by a process called germ-band extension, in which the 
cells converge inward toward the dorsal-ventral axis and extend along the ante- 
rior-posterior axis (Figure 19-15). Actin-dependent contraction along specific 
cell boundaries is coordinated with a loss of specific adherens junctions to allow 
cells to insert themselves between other cells (a process called intercalation), 
resulting in a longer and narrower epithelium. The mechanisms underlying the 
loss of adhesion along specific cell boundaries are not clear, but they depend in 
part on increased degradation of β-catenin, due to its phosphorylation by a protein 
kinase that is localized specifically at those boundaries. 


Desmosomes Give Epithelia Mechanical Strength 


Desmosomes are structurally similar to adherens junctions but contain special- 
ized cadherins that link to intermediate filaments instead of actin filaments. Their 
main function is to provide mechanical strength. Desmosomes are important 
in vertebrates but are not found, for example, in Drosophila. They are present in 
most mature vertebrate epithelia and are particularly plentiful in tissues that are 
subject to high levels of mechanical stress, such as heart muscle and the epidermis, 
the epithelium that forms the outer layer of the skin. 

Figure 19-14 The folding of an epithelial sheet to form an epithelial tube. The 
oriented contraction of the bundles of actin and myosin filaments running along 
adhesion belts causes the epithelial cells to narrow at their apex and helps the 
epithelial sheet to roll up into a tube. An example is the formation of the neural 
tube in early vertebrate development (see Figure 19-8). 

Figure 19-15 Remodeling of cell-cell adhesions in embryonic Drosophila 
epithelium. Depicted at left is a group of cells in the outer epithelium of a Drosophila 
embryo. During germ-band extension, cells converge toward each other (middle) on 
the dorsal-ventral axis and then extend (right) along the anterior-posterior axis. 
The result is intercalation: cells that were originally far apart along the dorsal-ventral 
axis (dark green) are inserted between the cells (light green) that separated 
them. These rearrangements depend on the spatial regulation of actin-myosin 
contractile bundles, which are localized primarily at the vertical cell boundaries 
(red, left). Contraction of these bundles is accompanied by removal of E-cadherin 
(not shown) at the same cell boundaries, resulting in shrinkage and loss of adhesion 
along the vertical axis (middle). New cadherin-based adhesions (blue, right) 
then form and expand along horizontal boundaries, resulting in extension of the 
cells in the anterior-posterior dimension. 

Figure 19-16 Desmosomes. (A) The structural components of a desmosome. On 
the cytoplasmic surface of each interacting plasma membrane is a dense plaque 
composed of a mixture of intracellular adaptor proteins. A bundle of keratin 
intermediate filaments is attached to the surface of each plaque. Transmembrane 
nonclassical cadherins bind to the plaques and interact through their extracellular 
domains to hold the adjacent membranes together. (B) Some of the molecular 
components of a desmosome. Desmoglein and desmocollin are nonclassical 
cadherins. Their cytoplasmic tails bind plakoglobin (γ-catenin) and plakophilin 
(a distant relative of p120-catenin), which in turn bind to desmoplakin. 
Desmoplakin binds to the sides of intermediate filaments, thereby tying the 
desmosome to these filaments. (C) An electron micrograph of desmosome junctions 
between three epidermal cells in the skin of a baby mouse. (D) Part of the same 
tissue at higher magnification, showing a single desmosome, with intermediate 
filaments attached to it. (C and D, from W. He, P. Cowin and D.L. Stokes, 
Science 302:109-118, 2003. With permission from AAAS.) 

Figure 19-16A shows the general structure of a desmosome, and Figure 
19-16B shows some of the proteins that form it. Desmosomes typically appear as 
buttonlike spots of adhesion, riveting the cells together (Figure 19-16C). Inside 
the cell, the bundles of ropelike intermediate filaments that are anchored to 
the desmosomes form a structural framework of great tensile strength (Figure 
19-16D), with linkage to similar bundles in adjacent cells, creating a network that 
extends throughout the tissue (Figure 19-17). The particular type of intermediate 
filaments attached to the desmosomes depends on the cell type: they are kera- 
tin filaments in most epithelial cells, for example, and desmin filaments in heart 
muscle cells. 

The importance of desmosomes is demonstrated by some forms of the poten- 
tially fatal skin disease pemphigus. Affected individuals make antibodies against 
one of their own desmosomal cadherin proteins. These antibodies bind to and 
disrupt the desmosomes that hold their epidermal cells (keratinocytes) together. 
This results in a severe blistering of the skin, with leakage of body fluids into the 
loosened epithelium. 


Tight Junctions Form a Seal Between Cells and a Fence Between 
Plasma Membrane Domains 


Sheets of epithelial cells enclose and partition the animal body, lining all its sur- 
faces and cavities, and creating internal compartments where specialized pro- 
cesses occur. The epithelial sheet seems to be one of the inventions that lie at the 
origin of animal evolution, diversifying in a huge variety of ways but retaining an 
organization based on a set of conserved molecular mechanisms. 

Essentially all epithelia are anchored to other tissue on one side—the basal 
side—and free of such attachment on their opposite side—the apical side. A basal 
lamina lies at the interface with the underlying tissue, mediating the attachment, 
while the apical surface of the epithelium is generally bathed by extracellular fluid. 
Thus, all epithelia are structurally polarized, and so are their individual cells: the 
basal end of a cell, adherent to the basal lamina below, differs from the apical end, 
exposed to the medium above. 

Correspondingly, all epithelia have at least one function in common: they 
serve as selective permeability barriers, separating the fluid that permeates the 
tissue on their basal side from fluid with a different chemical composition on their 
apical side. This barrier function requires that the adjacent cells be sealed together 
by tight junctions, so that molecules cannot leak freely across the cell sheet. 

The epithelium of the small intestine provides a good illustration of tight-junc- 
tion structure and function (see Figure 19-2). This epithelium has a simple colum- 
nar structure; that is, it consists of a single layer of tall (columnar) cells. These are 
of several differentiated types, but the majority are absorptive cells, specialized for 
uptake of nutrients from the internal cavity, or lumen, of the gut. The absorptive 
cells have to transport selected nutrients across the epithelium from the lumen 
into the extracellular fluid on the other side. From there, these nutrients diffuse 
into small blood vessels to provide nourishment to the organism. This transcellu- 
lar transport depends on two sets of transport proteins in the plasma membrane 
of the absorptive cell. One set is confined to the apical surface of the cell (facing 
the lumen) and actively transports selected molecules into the cell from the gut. 
The other set is confined to the basolateral (basal and lateral) surfaces of the cell, 
and it allows the same molecules to leave the cell by passive transport into the 
extracellular fluid on the other side of the epithelium. For this transport activity 
to be effective, the spaces between the epithelial cells must be tightly sealed, so 
that the transported molecules cannot leak back into the gut lumen through these 
spaces (Figure 19-18). Moreover, the transport proteins must be correctly distrib- 
uted in the plasma membranes: the apical transporters must be delivered to the 
cell apex and must not be allowed to drift to the basolateral membrane, and the 
basolateral transporters must be delivered to and remain in the basolateral mem- 
brane. Tight junctions, besides sealing the gaps between the cells, also function 
as “fences” that help prevent apical or basolateral proteins from diffusing into the 
wrong region. 

The sealing function of tight junctions is easy to demonstrate experimentally: a 
low-molecular-weight tracer added to one side of an epithelium will generally not 
pass beyond the tight junction (Figure 19-19). This seal is not absolute, however. 
Although all tight junctions are impermeable to macromolecules, their permea- 
bility to ions and other small molecules varies. Tight junctions in the epithelium 
lining the small intestine, for example, are 10,000 times more permeable to inor- 
ganic ions, such as Na’, than the tight junctions in the epithelium lining the uri- 
nary bladder. The movement of ions and other molecules between epithelial cells 
is called paracellular transport, and tissue-specific differences in transport rates 
generally result from differences in the proteins that form tight junctions. 


Tight Junctions Contain Strands of Transmembrane Adhesion 
Proteins 


When tight junctions are visualized by freeze-fracture electron microscopy, they
are seen as a branching network of sealing strands that completely encircles the
apical end of each cell in the epithelial sheet (Figure 19-20A and B). In conventional
electron micrographs, the outer leaflets of the two interacting plasma membranes
are tightly apposed where sealing strands are present (Figure 19-20C).
Each sealing strand is composed of a long row of transmembrane homophilic
adhesion proteins embedded in each of the two interacting plasma membranes.
The extracellular domains of these proteins adhere directly to one another to
occlude the intercellular space (Figure 19-21).

Figure 19-17 Desmosomes, hemidesmosomes, and the intermediate filament network.
The keratin intermediate filament networks of adjacent cells (in this example,
epithelial cells of the small intestine) are indirectly connected to one another
through desmosomes, and to the basal lamina through hemidesmosomes.

The main transmembrane proteins forming these strands are the claudins,
which are essential for tight-junction formation and function. Mice that lack the
claudin-1 gene, for example, fail to make tight junctions between the cells in the
epidermal layer of the skin; as a result, the baby mice lose water rapidly by
evaporation through the skin and die within a day after birth. Conversely, if
nonepithelial cells such as fibroblasts are artificially caused to express claudin
genes, they will form tight-junctional connections with one another. Normal tight
junctions also contain a second major transmembrane protein called occludin, which
is not essential for the assembly or structure of the tight junction but is important
for limiting junctional permeability. A third transmembrane protein, tricellulin, is
required to seal cell membranes together and prevent transepithelial leakage at
the points where three cells meet.

Figure 19-18 The role of tight junctions in transcellular transport. For clarity, only
the tight junctions are shown. Transport proteins are confined to different regions of
the plasma membrane in epithelial cells of the small intestine. This segregation permits
a vectorial transfer of nutrients across the epithelium from the gut lumen to the blood.
In the example shown, glucose is actively transported into the cell by Na⁺-driven
glucose transporters at its apical surface, and it leaves the cell through passive
glucose transporters in its basolateral membrane. Tight junctions are thought to
confine the transport proteins to their appropriate membrane domains by acting as
diffusion barriers, or “fences,” within the lipid bilayer of the plasma membrane;
these junctions also block the backflow of glucose from the basal side of the
epithelium into the gut lumen (see Movie 11.2).

Figure 19-19 The role of tight junctions in allowing epithelia to serve as barriers
to solute diffusion. (A) The drawing shows how a small extracellular tracer molecule
added on one side of an epithelium is prevented from crossing the epithelium by
the tight junctions that seal adjacent cells together. Adherens junctions and other
cell junctions are not shown for clarity. (B) Electron micrographs of cells in an
epithelium in which a small, extracellular, electron-dense tracer molecule has been
added to either the apical side (on the left) or the basolateral side (on the right).
The tight junction blocks passage of the tracer in both directions. (B, courtesy of
Daniel Friend.)

Figure 19-20 The structure of a tight junction between epithelial cells of the small
intestine. The junctions are shown (A) schematically, (B) in a freeze-fracture electron
micrograph, and (C) in a conventional electron micrograph. In (B), the plane of
the micrograph is parallel to the plane of the membrane, and the tight junction appears
as a band of branching sealing strands that encircle each cell in the epithelium (see
Figure 19-21A). In (C), the junction is seen in cross section as a series of focal
connections between the outer leaflets of the two interacting plasma membranes, each
connection corresponding to a sealing strand in cross section. (B and C, from
N.B. Gilula, in Cell Communication [R.P. Cox, ed.], pp. 1-29. New York: Wiley, 1974.)

The claudin protein family has many members (24 in humans), and these are 
expressed in different combinations in different epithelia to confer particular per- 
meability properties on the epithelial sheet. They are thought to form paracellular 
pores—selective channels allowing specific ions to cross the tight-junctional bar- 
rier, from one extracellular space to another. A specific claudin found in kidney 
epithelial cells, for example, is needed to let Mg²⁺ pass between the cells of the
kidney tubules so that this ion can be resorbed from the urine into the blood. A
mutation in the gene encoding this claudin results in excessive loss of Mg²⁺ in the
urine. 


Scaffold Proteins Organize Junctional Protein Complexes


Like the cadherin molecules of an adherens junction, the claudins and occludins 
of a tight junction interact with each other on their extracellular sides to promote 
junction assembly. Also as in adherens junctions, the organization of adhesion 
proteins in a tight junction depends on additional proteins that bind the cytoplas- 
mic side of the adhesion proteins. The key organizational proteins at tight junc- 
tions are the zonula occludens (ZO) proteins. The three major members of the ZO 
family—ZO-1, ZO-2, and ZO-3—are large scaffold proteins that provide a struc- 
tural support on which the tight junction is built. These intracellular molecules 
consist of strings of protein-binding domains, typically including several PDZ
domains—segments about 80 amino acids long that can recognize and bind the 
C-terminal tails of specific partner proteins (Figure 19-22). One domain of these 
scaffold proteins can attach to a claudin protein, while others can attach to occlu- 
din or the actin cytoskeleton. Moreover, one molecule of scaffold protein can bind 
to another. In this way, the cell can assemble a mat of intracellular proteins that 
organizes and positions the sealing strands of the tight junction. 

The tight-junctional network of sealing strands usually lies just apical to adhe- 
rens and desmosome junctions that bond the cells together mechanically; the 
whole assembly is called a junctional complex (see Figure 19-2). The parts of this 
junctional complex depend on each other for their formation. For example, anti- 
cadherin antibodies that block the formation of adherens junctions also block the 
formation of tight junctions. 


Gap Junctions Couple Cells Both Electrically and Metabolically 


Tight junctions block the passageways through the gaps between epithelial cells,
preventing extracellular molecules from leaking from one side of an epithelium to
the other. Another type of junctional structure has a radically different function:
it bridges gaps between adjacent cells so as to create direct channels from the
cytoplasm of one to that of the other. These channels are called gap junctions.

Figure 19-21 A model of a tight junction. (A) The sealing strands hold adjacent
plasma membranes together. The strands are composed of transmembrane proteins
that make contact across the intercellular space and create a seal. (B) The molecular
composition of a sealing strand. The major extracellular components of the
tight junction are members of a family of proteins with four transmembrane domains.
One of these proteins, claudin, is the most important for the assembly and structure
of the sealing strands, whereas the related protein occludin has the less critical role
of determining junction permeability. The two termini of these proteins are both on the
cytoplasmic side of the membrane, where they interact with large scaffolding proteins
that organize the sealing strands and link the tight junction to the actin cytoskeleton
(not shown here, but see Figure 19-22).

Figure 19-22 Scaffold proteins at the tight junction. The scaffold proteins ZO-1,
ZO-2, and ZO-3 are concentrated beneath the plasma membrane at tight junctions.
Each of the proteins contains multiple protein-binding domains, including three
PDZ domains, an SH3 domain, and a GK domain, linked together like beads on a
flexible string. These domains enable the proteins to interact with each other and
with numerous other partners, as indicated here, to generate a tightly woven protein
network that organizes the sealing strands of the tight junction and links them to
the actin cytoskeleton. Scaffold proteins with similar structure help organize other
junctional complexes, including those at neural synapses.

Gap junctions are present in most animal tissues, including connective tissues 
as well as epithelia and heart muscle. Each gap junction appears in conventional 
electron micrographs as a patch where the membranes of two adjacent cells are 
separated by a uniform narrow gap of about 2-4 nm (Figure 19-23). The gap is 
spanned by channel-forming proteins, of which there are two distinct families, 
called the connexins and the innexins. Connexins are the predominant gap-junc- 
tion proteins in vertebrates, with 21 isoforms in humans. Innexins are found in the 
gap junctions of invertebrates. 

Gap junctions have a pore size of about 1.4 nm, which allows the exchange 
of inorganic ions and other small water-soluble molecules, but not of macro- 
molecules such as proteins or nucleic acids (Figure 19-24). An electric current 
injected into one cell through a microelectrode causes an electrical disturbance 
in the neighboring cell, due to the flow of ions carrying electric charge through 
gap junctions. This electrical coupling via gap junctions serves an obvious pur- 
pose in tissues containing electrically excitable cells: action potentials can spread 
rapidly from cell to cell, without the delay that occurs at chemical synapses. In 
vertebrates, for example, electrical coupling through gap junctions synchronizes 
the contractions of heart muscle cells as well as those of the smooth muscle cells 
responsible for the peristaltic movements of the intestine. Gap junctions also 
occur in many tissues whose cells are not electrically excitable. In principle, the 
sharing of small metabolites and ions provides a mechanism for coordinating the 
activities of individual cells in such tissues and for smoothing out random fluctua- 
tions in small-molecule concentrations in different cells. 


A Gap-Junction Connexon Is Made of Six Transmembrane 
Connexin Subunits 


Connexins are four-pass transmembrane proteins, six of which assemble to form 
a hemichannel, or connexon. When the connexons in the plasma membranes of 
two cells in contact are aligned, they form a continuous aqueous channel that 
connects the two cell interiors (Figure 19-25). A gap junction consists of many 
such connexon pairs in parallel, forming a sort of molecular sieve. Not only does 
this sieve provide a communication channel between cells, but it also provides a 
form of cell-cell adhesion that supplements the cadherin- and claudin-mediated 
adhesions we discussed earlier. 

Figure 19-23 Gap junctions as seen in the electron microscope. (A) Thin-section
and (B) freeze-fracture electron micrographs of a large and a small gap-junction
plaque between fibroblasts in culture. In (B), each gap junction is seen as
a cluster of homogeneous intramembrane particles. Each intramembrane particle
corresponds to a connexon (see Figure 19-25). (From N.B. Gilula, in Cell
Communication [R.P. Cox, ed.], pp. 1-29. New York: Wiley, 1974.)

Figure 19-24 Determining the size of a gap-junction channel. When fluorescent
molecules of various sizes are injected into one of two cells coupled by gap junctions,
molecules with a molecular weight (MW) of less than about 1000 daltons can pass into
the other cell, but larger molecules cannot. Thus, the coupled cells share their small
molecules (such as inorganic ions, sugars, amino acids, nucleotides, vitamins, and the
intracellular signaling molecules cyclic AMP and inositol trisphosphate) but not their
macromolecules (proteins, nucleic acids, and polysaccharides).

Gap junctions in different tissues can have different properties because they
are formed from different combinations of connexins, creating channels that dif- 
fer in permeability and regulation. Most cell types express more than one type of 
connexin, and two different connexin proteins can assemble into a heteromeric 
connexon, with its own distinct properties. Moreover, adjacent cells expressing 
different connexins can form intercellular channels in which the two aligned half- 
channels are different (see Figure 19-25B). 
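
The vocabulary used for these combinations (homomeric versus heteromeric connexons, homotypic versus heterotypic channels, as in Figure 19-25B) can be made concrete with a minimal sketch. The function names and the labels "Cx26" and "Cx32" are illustrative assumptions, and the sketch compares only which connexins are present, ignoring how the six subunits are arranged around the pore.

```python
from collections import Counter
from typing import Sequence


def connexon_kind(subunits: Sequence[str]) -> str:
    """Classify one connexon (hemichannel) by its six connexin subunits."""
    if len(subunits) != 6:
        raise ValueError("a connexon is built from six connexin subunits")
    # One connexin type throughout -> homomeric; a mixture -> heteromeric.
    return "homomeric" if len(set(subunits)) == 1 else "heteromeric"


def channel_kind(connexon_a: Sequence[str], connexon_b: Sequence[str]) -> str:
    """Classify an intercellular channel formed by two aligned connexons."""
    # Validate both hemichannels first (raises if either is malformed).
    connexon_kind(connexon_a)
    connexon_kind(connexon_b)
    # Simplification: two connexons count as "the same" if they contain the
    # same number of each connexin type, regardless of subunit order.
    same = Counter(connexon_a) == Counter(connexon_b)
    return "homotypic" if same else "heterotypic"


if __name__ == "__main__":
    pure_26 = ["Cx26"] * 6
    pure_32 = ["Cx32"] * 6
    mixed = ["Cx26", "Cx32"] * 3
    print(connexon_kind(pure_26), channel_kind(pure_26, pure_26))
    print(connexon_kind(mixed), channel_kind(mixed, mixed))
    print(connexon_kind(pure_32), channel_kind(pure_26, pure_32))
```

Note that the two classifications are independent: a channel made of two identical heteromeric connexons is still homotypic, while two different homomeric connexons give a heterotypic channel.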

Like conventional ion channels (discussed in Chapter 11), individual gap- 
junction channels do not remain open all the time; instead, they flip between open 
and closed states. These changes are triggered by a variety of stimuli, including the 
voltage difference between the two connected cells, the membrane potential of 
each cell, and various chemical properties of the cytoplasm, including the pH and 
concentration of free Ca²⁺. Some subtypes of gap junctions can also be regulated
by extracellular signals such as neurotransmitters. We are only just beginning to 
understand the physiological functions and structural basis of these various gat- 
ing mechanisms. 

Each gap-junctional plaque is a dynamic structure that can readily assem- 
ble, disassemble, or be remodeled, and it can contain a cluster of a few to many 
thousands of connexons (see Figure 19-23B). Studies with fluorescently labeled 
connexins in living cells show that new connexons are continually added around 
the periphery of an existing junctional plaque, while old connexons are removed 
from the middle of it and destroyed (Figure 19-26). This turnover is rapid: the 
connexin molecules have a half-life of only a few hours. 
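
To get a feel for what a half-life of a few hours implies, the sketch below estimates the fraction of the original connexin molecules still present after the 4- and 8-hour incubations used in Figure 19-26. It rests on two assumptions that the text does not state: that loss follows simple first-order (exponential) kinetics, and that the half-life is 2 hours, a value chosen purely for illustration.

```python
def fraction_remaining(hours: float, half_life_hours: float) -> float:
    """Fraction of an initial pool left after `hours`, assuming exponential decay."""
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    if hours < 0:
        raise ValueError("elapsed time cannot be negative")
    # Each half-life that passes halves the remaining pool.
    return 0.5 ** (hours / half_life_hours)


# Hypothetical value; the text says only "a few hours".
ASSUMED_HALF_LIFE_H = 2.0

if __name__ == "__main__":
    for t in (4, 8):
        left = fraction_remaining(t, ASSUMED_HALF_LIFE_H)
        print(f"after {t} h: {left:.1%} of the original connexins remain")
```

Under these assumptions only a small minority of the original molecules survive the longer incubation, which is qualitatively consistent with the shrinking central patch of old connexins described for the labeling experiment.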

The mechanism of removal of old connexons from the middle of the plaque 
is not known, but the route of delivery of new connexons to its periphery seems 
clear: they are inserted into the plasma membrane by exocytosis, like other inte- 
gral membrane proteins, and then diffuse in the plane of the membrane until 
they bump into the periphery of a connexon plaque and become trapped. This 
has a corollary: the plasma membrane away from the gap junction should contain 
connexons—hemichannels—that have not yet paired with their counterparts on 
another cell. It is thought that these unpaired hemichannels are normally held
in a closed conformation, preventing the cell from losing its small molecules by
leakage through them. But there is also evidence that in some circumstances they
can open and serve as channels for the release of small signal molecules.

Figure 19-25 Gap junctions. (A) A drawing of the interacting plasma membranes
of two adjacent cells connected by gap junctions. Each lipid bilayer is shown as
a pair of red sheets. Protein assemblies called connexons (green), each of which
is formed by six connexin subunits, penetrate the apposed lipid bilayers. Two
connexons join across the intercellular gap to form a continuous aqueous
channel connecting the two cells. (B) The organization of connexins into connexons,
and connexons into intercellular channels. The connexons can be homomeric or
heteromeric, and the intercellular channels can be homotypic or heterotypic. (C) The
high-resolution structure of a homomeric gap-junction channel, determined by x-ray
crystallography of human connexin 26. In this view, we are looking down on the pore,
formed from six connexin subunits. The structure illustrates the general features of
the channel and suggests a pore size of about 1.4 nm, as predicted from studies
of gap-junction permeability with molecules of various sizes (see Figure 19-24).
(PDB code: 2ZW3.)

Figure 19-26 Connexin turnover at a gap junction. Cells were transfected
with a slightly modified connexin gene, coding for a connexin with a short
amino acid tag containing four cysteines in the sequence Cys-Cys-X-X-Cys-Cys
(where X denotes an arbitrary amino acid). This tetracysteine tag
can bind strongly to certain small fluorescent dye molecules, which can be
added to the culture medium and will readily enter cells by diffusing across
the plasma membrane. In the experiment shown, a green dye was added
first to label all the connexin molecules in the cells, and the cells were then
washed and incubated for 4 or 8 hours. At the end of this time, a red dye
was added to the medium and the cells were washed again and fixed.
Connexin molecules already present at the beginning of the experiment are
labeled green (and take up no red dye because their tetracysteine tags are
already saturated with green dye), while connexins synthesized subsequently,
during the 4- or 8-hour incubation, are labeled red. The fluorescence images
show gap junctions between pairs of cells treated in this way. The central
part of the gap-junction plaque is green, indicating that it consists of old
connexin molecules, while the periphery is red, indicating that it consists
of connexins synthesized during the previous 4 or 8 hours. The longer the
time of incubation, the smaller the green central patch of old molecules, and
the larger the peripheral ring of new molecules that have been recruited to
replace the old ones. (From G. Gaietta et al., Science 296:503-507, 2002.
With permission from AAAS.)


In Plants, Plasmodesmata Perform Many of the Same Functions 
as Gap Junctions 


The tissues of a plant are organized on different principles from those of an ani- 
mal. This is because plant cells are imprisoned within tough cell walls composed 
of an extracellular matrix rich in cellulose and other polysaccharides, as we dis- 
cuss later. The cell walls of adjacent cells are firmly cemented to those of their 
neighbors, which eliminates the need for anchoring junctions to hold the cells 
in place. But a need for direct cell-cell communication remains. Thus, plant cells 
have only one class of intercellular junctions, plasmodesmata. Like gap junc- 
tions, they directly connect the cytoplasms of adjacent cells. 

In plants, the cell wall between a typical pair of adjacent cells is at least 0.1 µm
thick, and so a structure very different from a gap junction is required to medi- 
ate communication across it. Plasmodesmata solve the problem. With a few spe- 
cialized exceptions, every living cell in a higher plant is connected to its living 
neighbors by these structures, which form fine cytoplasmic channels through the 
intervening cell walls. As shown in Figure 19-27A, the plasma membrane of one 
cell is continuous with that of its neighbor at each plasmodesma, which connects 
the cytoplasms of the two cells by a roughly cylindrical channel with a diameter 
of 20-40 nm. 

Running through the center of the channel in most plasmodesmata is a nar- 
rower cylindrical structure, the desmotubule, which is continuous with elements 
of the smooth endoplasmic reticulum (ER) in each of the connected cells (Fig- 
ure 19-27B-D). Between the outside of the desmotubule and the inner face of 
the cylindrical channel formed by plasma membrane is an annulus of cytosol 
through which small molecules can pass from cell to cell. As each new cell wall 
is assembled during the cytokinesis phase of cell division, plasmodesmata are 
created within it. They form around elements of smooth ER that become trapped 
across the developing cell plate (discussed in Chapter 17). They can also be 
inserted de novo through preexisting cell walls, where they are commonly found 
in dense clusters called pit fields. When no longer required, plasmodesmata can 
be removed. 

In spite of the radical difference in structure between plasmodesmata and gap
junctions, they seem to function in remarkably similar ways. Evidence obtained
by injecting tracer molecules of different sizes suggests that plasmodesmata allow
the passage of molecules with a molecular weight of less than about 800, which
is similar to the molecular-weight cutoff for gap junctions. As with gap junctions,
transport through plasmodesmata is regulated. Dye-injection experiments, for
example, show that there can be barriers to the movement of even
low-molecular-weight molecules between certain cells, or groups of cells, that are
connected by apparently normal plasmodesmata; the mechanisms that restrict
communication in these cases are not understood.

Figure 19-27 Plasmodesmata. (A) The cytoplasmic channels of
plasmodesmata pierce the plant cell wall and connect cells in a plant
together. (B) Each plasmodesma is lined with plasma membrane that is
common to the two connected cells. It usually also contains a fine tubular
structure, the desmotubule, derived from smooth endoplasmic reticulum.
(C) Electron micrograph of a longitudinal section of a plasmodesma from
a water fern. The plasma membrane lines the pore and is continuous from
one cell to the next. Endoplasmic reticulum and its association with the
central desmotubule can also be seen. (D) A similar plasmodesma seen in
cross section. (C and D, from R. Overall, J. Wolfe and B.E.S. Gunning, in
Protoplasma 111, pp. 134-150. Heidelberg: Springer-Verlag, 1982.)
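
The two size cutoffs quoted in this chapter (about 1000 daltons for gap junctions and about 800 for plasmodesmata) can be compared in a small sketch. The molecular weights below are rounded approximations supplied for illustration and are not taken from the text, and treating each cutoff as a sharp threshold is a deliberate simplification that ignores molecular shape, charge, and the regulation of the channels.

```python
# Approximate cutoffs quoted in the text, in daltons.
GAP_JUNCTION_CUTOFF_DA = 1000
PLASMODESMA_CUTOFF_DA = 800

# Rounded molecular weights (assumed values for illustration only).
APPROX_MW_DA = {
    "glucose": 180,
    "cyclic AMP": 329,
    "ATP": 507,
    "insulin": 5800,
}


def can_pass(molecular_weight_da: float, cutoff_da: float) -> bool:
    """True if a molecule falls below the size cutoff (hard-threshold model)."""
    return molecular_weight_da < cutoff_da


if __name__ == "__main__":
    for name, mw in APPROX_MW_DA.items():
        gap = can_pass(mw, GAP_JUNCTION_CUTOFF_DA)
        plasmo = can_pass(mw, PLASMODESMA_CUTOFF_DA)
        print(f"{name:>10} (~{mw} Da): gap junction={gap}, plasmodesma={plasmo}")
```

In this simple model, small metabolites and second messengers pass through both kinds of channel, whereas a small protein such as insulin is excluded from both, in line with the observation that coupled cells share small molecules but not macromolecules.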

Selectins Mediate Transient Cell-Cell Adhesions in the
Bloodstream


We now complete our overview of cell-cell junctions and adhesion by briefly 
describing some of the more specialized adhesion mechanisms used in some tis- 
sues. In addition to those we have already discussed, at least three other super- 
families of cell-cell adhesion proteins are important: the integrins, the selectins, 
and the adhesive immunoglobulin (Ig) superfamily members. We shall discuss 
integrins in more detail later: their main function is in cell-matrix adhesion, but a 
few of them mediate cell-cell adhesion in specialized circumstances. Ca²⁺ dependence
provides one simple way to distinguish among these classes of adhesion
proteins experimentally. Selectins, like cadherins and integrins, require Ca²⁺ for
their adhesive function; Ig superfamily members do not. 

Selectins are cell-surface carbohydrate-binding proteins (lectins) that medi- 
ate a variety of transient cell-cell adhesion interactions in the bloodstream. Their 
main role, in vertebrates at least, is in governing the traffic of white blood cells 
into normal lymphoid organs and any inflamed tissues. White blood cells lead a 
nomadic life, roving between the bloodstream and the tissues, and this necessi- 
tates special adhesive behavior. The selectins control the binding of white blood 
cells to the endothelial cells lining blood vessels, thereby enabling the blood cells 
to migrate out of the bloodstream into a tissue. 

Each selectin is a transmembrane protein with a conserved lectin domain that 
binds to a specific oligosaccharide on another cell (Figure 19-28A). There are at 
least three types: L-selectin on white blood cells, P-selectin on blood platelets and 
on endothelial cells that have been locally activated by an inflammatory response, 
and E-selectin on activated endothelial cells. In a lymphoid organ, such as a lymph 
node or the spleen, the endothelial cells express oligosaccharides that are
recognized by L-selectin on lymphocytes, causing the lymphocytes to loiter and
become trapped. At sites of inflammation, the roles are reversed: the endothe- 
lial cells switch on expression of selectins that recognize the oligosaccharides on 
white blood cells and platelets, flagging the cells down to help deal with the local 
emergency. Selectins do not act alone, however; they collaborate with integrins, 
which strengthen the binding of the blood cells to the endothelium. The cell-cell 
adhesions mediated by both selectins and integrins are heterophilic—that is, the 
binding is to a molecule of a different type: selectins bind to specific oligosaccha- 
rides on glycoproteins and glycolipids, while integrins bind to specific Ig-family 
proteins. 

Selectins and integrins act in sequence to let white blood cells leave the blood- 
stream and enter tissues (Figure 19-28B). The selectins mediate a weak adhesion 
because the binding of the lectin domain of the selectin to its carbohydrate ligand 
is of low affinity. This allows the white blood cell to adhere weakly and revers- 
ibly to the endothelium, rolling along the surface of the blood vessel, propelled 
by the flow of blood. The rolling continues until the blood cell activates its integ- 
rins. As we discuss later, these transmembrane molecules can be switched into an 
adhesive conformation that enables them to latch onto specific macromolecules 
external to the cell—in the present case, proteins on the surfaces of the endothe- 
lial cells. Once it has attached in this way, the white blood cell escapes from the 
bloodstream into the tissue by crawling out of the blood vessel between adjacent 
endothelial cells. 


Members of the Immunoglobulin Superfamily Mediate
Ca²⁺-Independent Cell-Cell Adhesion


The chief endothelial cell proteins that are recognized by the white blood cell inte- 
grins are called ICAMs (intercellular cell adhesion molecules) or VCAMs (vascular 
cell adhesion molecules). They are members of another large and ancient family 
of cell-surface molecules—the immunoglobulin (Ig) superfamily. These contain 
one or more extracellular Ig-like domains that are characteristic of antibody mol- 
ecules. They have many functions outside the immune system that are unrelated 
to immune defenses. 

While ICAMs and VCAMs on endothelial cells both mediate heterophilic
binding to integrins, many other Ig superfamily members appear to mediate
homophilic binding. An example is the neural cell adhesion molecule (NCAM),
which is expressed by various cell types, including most nerve cells, and can take
different forms, generated by alternative splicing of an RNA transcript produced
from a single gene (Figure 19-29). Some forms of NCAM carry an unusually large
quantity of sialic acid (with chains containing hundreds of repeating sialic acid
units). By virtue of their negative charge, the long polysialic acid chains can
interfere with cell adhesion (because like charges repel one another); thus, these
forms of NCAM can serve to inhibit adhesion, rather than cause it.

Figure 19-28 The structure and function of selectins. (A) The structure
of P-selectin. The selectin attaches to the actin cytoskeleton through adaptor
proteins that are still poorly characterized. (B) How selectins and integrins
mediate the cell-cell adhesions required for a white blood cell to migrate out of
the bloodstream into a tissue. First, selectins on endothelial cells bind to
oligosaccharides on the white blood cell, so that it becomes loosely attached and
rolls along the vessel wall. Then the white blood cell activates a cell-surface
integrin called LFA1, which binds to a protein called ICAM1 (belonging to
the Ig superfamily) on the membrane of the endothelial cell. The white blood cell
adheres to the vessel wall and then crawls out of the vessel by a process that
requires another immunoglobulin superfamily member called PECAM1 (or CD31), not
shown (Movie 19.2). EGF, epidermal growth factor.

A cell of a given type generally uses an assortment of different adhesion pro- 
teins to interact with other cells, just as each cell uses an assortment of differ- 
ent receptors to respond to the many soluble extracellular signal molecules in its 
environment. Although cadherins and Ig superfamily members are frequently 
expressed on the same cells, the adhesions mediated by cadherins are much 
stronger, and they are largely responsible for holding cells together, segregating 
cell collectives into discrete tissues, and maintaining tissue integrity. Molecules 
such as NCAM seem to contribute more to the fine-tuning of these adhesive inter- 
actions during development and regeneration, playing a part in various special- 
ized adhesive phenomena, such as that discussed for blood cells and endothelial 
cells. Thus, while mutant mice that lack N-cadherin die early in development, 
those that lack NCAM develop relatively normally but show some mild abnormal- 
ities in the development of certain specific tissues, including parts of the nervous 
system. 


Summary 


In epithelia, as well as in some other types of tissue, cells are directly attached to one 
another through strong cell-cell adhesions, mediated by transmembrane proteins 
called cadherins, which are anchored intracellularly to the cytoskeleton. Cadherins 
generally bind to one another homophilically: the head of one cadherin molecule 
binds to the head of a similar cadherin on an opposite cell. This selectivity enables 
mixed populations of cells of different types to sort out from one another according 
to the specific cadherins they express, and it helps to control cell rearrangements 
during development. 

The “classical” cadherins at adherens junctions are linked to the actin cytoskel- 
eton by intracellular adaptor proteins called catenins. These form an anchoring 
complex on the intracellular tail of the cadherin molecule, and are involved not 
only in physical anchorage but also in the detection of and response to tension and 
other regulatory signals at the junction. 

Tight junctions seal the gaps between cells in epithelia, creating a barrier to the 
diffusion of molecules across the cell sheet and also helping to separate the popula- 
tions of proteins in the apical and basolateral plasma membrane domains of the 
epithelial cell. Claudins are the major transmembrane proteins forming tight
junctions. Intracellular scaffold proteins organize the claudins and other junctional
proteins into a complex protein network that is linked to the actin cytoskeleton. 


Figure 19-29 Two members of the 

Ig superfamily of cell-cell adhesion 
molecules. NCAM is expressed on 
neurons and many other cell types, and 
mediates homophilic binding. ICAM is 
expressed on endothelial cells and some 
other cell types and binds heterophilically 
to an integrin on white blood cells. Both 
NCAM and ICAM are glycoproteins, but 
their attached carbohydrate chains are 
not shown. 


The cells of many animal tissues are coupled by gap junctions, which take the 
form of plaques of clustered connexons, which usually allow molecules smaller than 
about 1000 daltons to pass directly from the inside of one cell to the inside of the 
next. Cells connected by gap junctions share many of their inorganic ions and other 
small molecules and are therefore chemically and electrically coupled. 

Three additional classes of transmembrane adhesion proteins mediate more 
transient cell-cell adhesion: selectins, immunoglobulin (Ig) superfamily members, 
and integrins. Selectins are expressed on white blood cells, blood platelets, and 
endothelial cells; they bind heterophilically to carbohydrate groups on cell surfaces, 
helping to mediate the adhesive interactions between these cells. Ig superfamily pro- 
teins also play a part in these interactions, as well as in many other adhesive pro- 
cesses; some of them bind homophilically, some heterophilically. Integrins, though 
they mainly serve to attach cells to the extracellular matrix, can also mediate cell- 
cell adhesion by binding to specific Ig superfamily proteins. 


THE EXTRACELLULAR MATRIX OF ANIMALS 


Tissues are not made up solely of cells. They also contain a remarkably complex 
and intricate network of macromolecules constituting the extracellular matrix. 
This matrix is composed of many different proteins and polysaccharides that are 
secreted locally and assembled into an organized meshwork in close association 
with the surfaces of the cells that produce them. 

The classes of macromolecules constituting the extracellular matrix in differ- 
ent animal tissues are broadly similar, but variations in the relative amounts of 
these different classes of molecules and in the ways in which they are organized 
give rise to an amazing diversity of materials. The matrix can become calcified to 
form the rock-hard structures of bone or teeth, or it can form the transparent sub- 
stance of the cornea, or it can adopt the ropelike organization that gives tendons 
their enormous tensile strength. It forms the jelly in a jellyfish. Covering the body 
of a beetle or a lobster, it forms a rigid carapace. Moreover, the extracellular matrix 
is more than a passive scaffold to provide physical support. It has an active and 
complex role in regulating the behavior of the cells that touch it, inhabit it, or crawl 
through its meshes, influencing their survival, development, migration, prolifera- 
tion, shape, and function. 

In this section, we describe the major features of the extracellular matrix in 
animal tissues, with an emphasis on vertebrates. We begin with an overview of the 
major classes of macromolecules in the matrix, after which we turn to the struc- 
ture and function of the basal lamina, the thin layer of specialized extracellular 
matrix that lies beneath all epithelial cells. In the sections that follow, we then 
describe the varied types of cell-matrix junctions through which cells are con- 
nected to the matrix. 


The Extracellular Matrix Is Made and Oriented by the Cells Within It 


The macromolecules that constitute the extracellular matrix are mainly produced 
locally by cells in the matrix. As we discuss later, these cells also help to organize 
the matrix: the orientation of the cytoskeleton inside the cell can control the ori- 
entation of the matrix produced outside. In most connective tissues, the matrix 
macromolecules are secreted by cells called fibroblasts (Figure 19-30). In certain 
specialized types of connective tissues, such as cartilage and bone, however, they 
are secreted by cells of the fibroblast family that have more specific names: chon- 
droblasts, for example, form cartilage, and osteoblasts form bone. 

Figure 19-30 Fibroblasts in connective
tissue. This scanning electron micrograph
shows tissue from the cornea of a rat.
The extracellular matrix surrounding the
fibroblasts is here composed largely
of collagen fibrils. The glycoproteins,
hyaluronan, and proteoglycans, which
normally form a hydrated gel filling the
interstices of the fibrous network, have
been removed by enzyme and acid
treatment. (Courtesy of T. Nishida.)

The extracellular matrix is constructed from three major classes of macromolecules:
(1) glycosaminoglycans (GAGs), which are large and highly charged
polysaccharides that are usually covalently linked to protein in the form of proteoglycans;
(2) fibrous proteins, which are primarily members of the collagen family;
and (3) a large class of noncollagen glycoproteins, which carry conventional
asparagine-linked oligosaccharides (described in Chapter 12). All three classes of
macromolecule have many members and come in a great variety of shapes and
sizes (Figure 19-31). Mammals are thought to have almost 300 matrix proteins,
including about 36 proteoglycans, about 40 collagens, and over 200 glycoproteins,
which usually contain multiple subdomains and self-associate to form multimers.
Add to this the large number of matrix-associated proteins and enzymes that can
modify matrix behavior by cross-linking, degradation, or other mechanisms, and
one begins to see that the matrix is an almost infinitely variable material. Each tissue
contains its own unique blend of matrix components, resulting in an extracellular
matrix that is specialized for the needs of that tissue.

The proteoglycan molecules in connective tissue typically form a highly 
hydrated, gel-like “ground substance” in which collagens and glycoproteins are 
embedded. The polysaccharide gel resists compressive forces on the matrix while 
permitting the rapid diffusion of nutrients, metabolites, and hormones between 
the blood and the tissue cells. The collagen fibers strengthen and help organize 
the matrix, while other fibrous proteins, such as the rubberlike elastin, give it resil- 
ience. Finally, the many matrix glycoproteins help cells migrate, settle, and differ- 
entiate in the appropriate locations. 


Glycosaminoglycan (GAG) Chains Occupy Large Amounts of 
Space and Form Hydrated Gels 


Glycosaminoglycans (GAGs) are unbranched polysaccharide chains composed 
of repeating disaccharide units. One of the two sugars in the repeating disaccha- 
ride is always an amino sugar (N-acetylglucosamine or N-acetylgalactosamine), 
which in most cases is sulfated. The second sugar is usually a uronic acid (gluc- 
uronic or iduronic). Because there are sulfate or carboxyl groups on most of their 
sugars, GAGs are highly negatively charged (Figure 19-32). Indeed, they are the 
most anionic molecules produced by animal cells. Four main groups of GAGs 
are distinguished by their sugars, the type of linkage between the sugars, and the 
number and location of sulfate groups: (1) hyaluronan, (2) chondroitin sulfate and 
dermatan sulfate, (3) heparan sulfate, and (4) keratan sulfate. 

Polysaccharide chains are too stiff to fold into compact globular structures, 
and they are strongly hydrophilic. Thus, GAGs tend to adopt highly extended con- 
formations that occupy a huge volume relative to their mass (Figure 19-33), and 
they form hydrated gels even at very low concentrations. The weight of GAGs in 
connective tissue is usually less than 10% of the weight of proteins, but GAG chains 
fill most of the extracellular space. Their high density of negative charges attracts a 
cloud of cations, especially Na+, that are osmotically active, causing large amounts 
of water to be sucked into the matrix. This creates a swelling pressure, or turgor, 
that enables the matrix to withstand compressive forces (in contrast to collagen 
fibrils, which resist stretching forces). The cartilage matrix that lines the knee 
joint, for example, can support pressures of hundreds of atmospheres in this way. 

Defects in the production of GAGs can affect many different body systems. In 
one rare human genetic disease, for example, there is a severe deficiency in the 
synthesis of dermatan sulfate disaccharide. The affected individuals have a short 
stature, a prematurely aged appearance, and generalized defects in their skin, 
joints, muscles, and bones.

Figure 19-31 The comparative shapes
and sizes of some of the major
extracellular matrix macromolecules.
Protein is shown in green, and
glycosaminoglycan (GAG) in red.

Figure 19-32 The repeating disaccharide
sequence of a heparan sulfate
glycosaminoglycan (GAG) chain. These
chains can consist of as many as 200
disaccharide units, but are typically less
than half that size. There is a high density
of negative charges along the chain due to
the presence of both carboxyl and sulfate
groups. The molecule is shown here with
its maximal number of sulfate groups.
In vivo, the proportion of sulfated and
nonsulfated groups is variable. Heparin
typically has >70% sulfation, while heparan
sulfate has <50%.


Hyaluronan Acts as a Space Filler During Tissue Morphogenesis
and Repair 


Hyaluronan (also called hyaluronic acid or hyaluronate) is the simplest of the 
GAGs (Figure 19-34). It consists of a regular repeating sequence of up to 25,000 
disaccharide units, is found in variable amounts in all tissues and fluids in adult 
animals, and is especially abundant in early embryos. Hyaluronan is not a typical 
GAG because it contains no sulfated sugars, all its disaccharide units are identi- 
cal, its chain length is enormous, and it is not generally linked covalently to any 
core protein. Moreover, whereas other GAGs are synthesized inside the cell and 
released by exocytosis, hyaluronan is spun out directly from the cell surface by an 
enzyme complex embedded in the plasma membrane. 
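
To get a feel for the scale implied by a chain of up to 25,000 disaccharide units, a rough estimate may help. The following is only a sketch: the mass per disaccharide (about 400 daltons) and the contour length per disaccharide (about 1 nm) are assumed round values that are not given in the text, and the function name is purely illustrative.

```python
# Rough size estimate for a single hyaluronan chain.
# ASSUMPTIONS (not from the text): ~400 Da and ~1 nm per disaccharide unit.
DISACCHARIDE_MASS_DA = 400.0
DISACCHARIDE_LENGTH_NM = 1.0


def hyaluronan_size(n_disaccharides):
    """Return (mass in daltons, contour length in micrometers)."""
    mass_da = n_disaccharides * DISACCHARIDE_MASS_DA
    length_um = n_disaccharides * DISACCHARIDE_LENGTH_NM / 1000.0
    return mass_da, length_um


mass_da, length_um = hyaluronan_size(25_000)
print(f"mass ~ {mass_da:.1e} Da, contour length ~ {length_um:.0f} um")
```

Under these assumptions, a maximal chain would have a mass on the order of 10^7 daltons and, if fully extended, a length of tens of micrometers.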

Hyaluronan is thought to have a role in resisting compressive forces in tissues 
and joints. It is also important as a space filler during embryonic development, 
where it can be used to force a change in the shape of a structure, as a small quan- 
tity expands with water to occupy a large volume. Hyaluronan synthesized locally 
from the basal side of an epithelium can deform the epithelium by creating a cell- 
free space beneath it, into which cells subsequently migrate. In the developing 
heart, for example, hyaluronan synthesis helps in this way to drive formation of 
the valves and septa that separate the heart’s chambers. Similar processes occur 
in several other organs. When cell migration ends, the excess hyaluronan is gener- 
ally degraded by the enzyme hyaluronidase. Hyaluronan is also produced in large 
quantities during wound healing, and it is an important constituent of joint fluid, 
in which it serves as a lubricant. 


Proteoglycans Are Composed of GAG Chains Covalently Linked 
to a Core Protein 


Except for hyaluronan, all GAGs are covalently attached to protein as proteogly- 
cans, which are produced by most animal cells. Membrane-bound ribosomes 
make the polypeptide chain, or core protein, of a proteoglycan, which is then 
threaded into the lumen of the endoplasmic reticulum. The polysaccharide chains 
are mainly assembled on this core protein in the Golgi apparatus before delivery 
to the exterior of the cell by exocytosis. First, a special linkage tetrasaccharide is 
attached to a serine side chain on the core protein to serve as a primer for polysac- 
charide growth; then, one sugar at a time is added by specific glycosyl transferases 
(Figure 19-35). While still in the Golgi apparatus, many of the polymerized sugars 
are covalently modified by a sequential and coordinated series of reactions. Epi- 
merizations alter the configuration of the substituents around individual carbon 
atoms in the sugar molecule; sulfations increase the negative charge.

Proteoglycans are clearly distinguished from other glycoproteins by the nature,
quantity, and arrangement of their sugar side chains. By definition, at least one of
the sugar side chains of a proteoglycan must be a GAG. Whereas glycoproteins
generally contain relatively short, branched oligosaccharide chains that contribute
only a small fraction of their weight, proteoglycans can contain as much as
95% carbohydrate by weight, mostly in the form of long, unbranched GAG chains,
each typically about 80 sugars long.

Figure 19-33 The relative dimensions
and volumes occupied by various
macromolecules. Several proteins, a
glycogen granule, and a single hydrated
molecule of hyaluronan are shown.

Figure 19-34 The repeating disaccharide
sequence in hyaluronan, a relatively
simple GAG. This ubiquitous molecule in
vertebrates consists of a single long chain
of up to 25,000 sugar monomers. Note the
absence of sulfate groups.

In principle, proteoglycans have the potential for almost limitless heterogene- 
ity. Even a single type of core protein can carry highly variable numbers and types 
of attached GAG chains. Moreover, the underlying repeating sequence of disac- 
charides in each GAG can be modified by a complex pattern of sulfate groups. The 
core proteins, too, are diverse, though many of them share some characteristic 
domains such as the LINK domain, involved in binding to GAGs. 

Proteoglycans can be huge. The proteoglycan aggrecan, for example, which is 
a major component of cartilage, has a mass of about 3 × 10^6 daltons with over
100 GAG chains. Other proteoglycans are much smaller and have only 1-10 GAG 
chains; an example is decorin, which is secreted by fibroblasts and has a single 
GAG chain (Figure 19-36). Decorin binds to collagen fibrils and regulates fibril 
assembly and fibril diameter; mice that cannot make decorin have fragile skin 
that has reduced tensile strength. The GAGs and proteoglycans of these various 
types can associate to form even larger polymeric complexes in the extracellu- 
lar matrix. Molecules of aggrecan, for example, assemble with hyaluronan in 
cartilage matrix to form aggregates that are as big as a bacterium (Figure 19-37). 
Moreover, besides associating with one another, GAGs and proteoglycans asso- 
ciate with fibrous matrix proteins such as collagen and with protein meshworks 
such as the basal lamina, creating extremely complex composites (Figure 19-38). 
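
The figures quoted for aggrecan can be cross-checked with a little arithmetic. The sketch below assumes an average amino acid residue mass of about 110 daltons (a common rule of thumb, not a value from the text). It takes the core protein length (almost 3000 amino acids) and the number of monomers per aggregate (about 100) from the captions of Figures 19-36 and 19-37. The variable names are illustrative.

```python
# Back-of-the-envelope mass budget for aggrecan and an aggrecan aggregate.
AGGRECAN_MASS_DA = 3e6          # from the text: about 3 x 10^6 daltons
CORE_PROTEIN_RESIDUES = 3000    # Figure 19-36: almost 3000 amino acids
AVG_RESIDUE_MASS_DA = 110.0     # ASSUMPTION: rule-of-thumb average residue mass
MONOMERS_PER_AGGREGATE = 100    # Figure 19-37: about 100 aggrecan monomers

core_mass_da = CORE_PROTEIN_RESIDUES * AVG_RESIDUE_MASS_DA
carbohydrate_fraction = 1.0 - core_mass_da / AGGRECAN_MASS_DA
# Hyaluronan and link proteins are ignored in this estimate.
aggregate_mass_da = MONOMERS_PER_AGGREGATE * AGGRECAN_MASS_DA

print(f"core protein ~ {core_mass_da:.2e} Da")
print(f"carbohydrate fraction ~ {carbohydrate_fraction:.0%}")
print(f"aggregate ~ {aggregate_mass_da:.0e} Da")
```

The estimate of roughly 90% carbohydrate is in line with the statement that proteoglycans can be as much as 95% carbohydrate by weight, and the aggregate comes out on the order of 10^8 daltons.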

Not all proteoglycans are secreted components of the extracellular matrix.
Some are integral components of plasma membranes and have their core protein
either inserted across the lipid bilayer or attached to the lipid bilayer by a
glycosylphosphatidylinositol (GPI) anchor. Among the best-characterized plasma
membrane proteoglycans are the syndecans, which have a membrane-spanning
core protein whose intracellular domain is thought to interact with the actin cytoskeleton
and with signaling molecules in the cell cortex. Syndecans are located
on the surface of many types of cells, including fibroblasts and epithelial cells. In
fibroblasts, syndecans can be found in cell-matrix adhesions, where they modulate
integrin function by interacting with fibronectin on the cell surface and with
cytoskeletal and signaling proteins inside the cell. As we discuss later, syndecan
and other proteoglycans also interact with soluble peptide growth factors, influencing
their effects on cell growth and proliferation.

Figure 19-35 The linkage between a
GAG chain and its core protein in a
proteoglycan molecule. A specific link
tetrasaccharide is first assembled on a
serine side chain. The rest of the GAG
chain, consisting mainly of a repeating
disaccharide unit, is then synthesized,
with one sugar being added at a time.
In chondroitin sulfate, the disaccharide
is composed of D-glucuronic acid and
N-acetyl-D-galactosamine; in heparan
sulfate, it is either D-glucuronic acid
or L-iduronic acid and N-acetyl-D-
glucosamine; in keratan sulfate, it is
D-galactose and N-acetyl-D-glucosamine.

Figure 19-36 Examples of a small
(decorin) and a large (aggrecan)
proteoglycan found in the extracellular
matrix. The figure compares these two
proteoglycans with a typical secreted
glycoprotein molecule, pancreatic
ribonuclease B. All three are drawn to
scale. The core proteins of both aggrecan
and decorin contain oligosaccharide chains
as well as the GAG chains, but these are
not shown. Aggrecan typically consists of
about 100 chondroitin sulfate chains and
about 30 keratan sulfate chains linked to
a serine-rich core protein of almost 3000
amino acids. Decorin “decorates” the
surface of collagen fibrils, hence its name.
Labels in the figure: ribonuclease (MW ~15,000); short, branched
oligosaccharide side chain; polypeptide chain.


Collagens Are the Major Proteins of the Extracellular Matrix 


The collagens are a family of fibrous proteins found in all multicellular animals.
They are secreted in large quantities by connective-tissue cells, and in smaller
quantities by many other cell types. As a major component of skin and bone, collagens
are the most abundant proteins in mammals, where they constitute 25% of
the total protein mass.

The primary feature of a typical collagen molecule is its long, stiff, triple-stranded
helical structure, in which three collagen polypeptide chains, called
α chains, are wound around one another in a ropelike superhelix (Figure 19-39).

Figure 19-37 An aggrecan aggregate
from fetal bovine cartilage. (A) An
electron micrograph of an aggrecan
aggregate shadowed with platinum. Many
free aggrecan molecules are also visible.
(B) A drawing of the giant aggrecan
aggregate shown in (A). It consists of about
100 aggrecan monomers (each like the
one shown in Figure 19-36) noncovalently
bound through the N-terminal domain of
the core protein to a single hyaluronan
chain. A link protein binds both to the core
protein of the proteoglycan and to the
hyaluronan chain, thereby stabilizing the
aggregate. The link proteins are members
of a family of hyaluronan-binding proteins,
some of which are cell-surface proteins.
The molecular mass of such a complex can
be 10^8 daltons or more, and it occupies a
volume equivalent to that of a bacterium,
which is about 2 × 10^-12 cm^3. (A, courtesy
of Lawrence Rosenberg.)

Figure 19-38 Proteoglycans in the
extracellular matrix of rat cartilage.
The tissue was rapidly frozen at -196°C,
and fixed and stained while still frozen
(a process called freeze substitution) to
prevent the GAG chains from collapsing. In
this electron micrograph, the proteoglycan
molecules are seen to form a fine
filamentous network in which a single
striated collagen fibril is embedded.
The more darkly stained parts of the
proteoglycan molecules are the core
proteins; the faintly stained threads are
the GAG chains. (Reproduced from
E.B. Hunziker and R.K. Schenk, J. Cell
Biol. 98:277-282, 1984. With permission
from The Rockefeller University Press.)

Figure 19-39 The structure of a typical collagen molecule. (A) A model of
part of a single collagen α chain, in which each amino acid is represented by
a sphere. The chain is about 1000 amino acids long. It is arranged as a left-
handed helix, with three amino acids per turn and with glycine as every third
amino acid. Therefore, an α chain is composed of a series of triplet Gly-X-Y
sequences, in which X and Y can be any amino acid (although X is commonly
proline and Y is commonly hydroxyproline, a form of proline that is chemically
modified during collagen synthesis in the cell). (B) A model of part of a
collagen molecule, in which three α chains, each shown in a different color,
are wrapped around one another to form a triple-stranded helical rod. Glycine
is the only amino acid small enough to occupy the crowded interior of the
triple helix. Only a short length of the molecule is shown; the entire molecule
is 300 nm long. (From a model by B.L. Trus.)


Collagens are extremely rich in proline and glycine, both of which are important 
in the formation of the triple-stranded helix. 
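
The Gly-X-Y rule described for collagen chains lends itself to a simple check in code. The following minimal sketch tests whether a one-letter amino acid sequence has glycine at every third position, and estimates the length of the helical rod formed by chains of a given size. The function names, the toy sequences, and the axial rise of about 0.29 nm per residue are assumptions introduced here for illustration; they are not taken from the text.

```python
# Check the Gly-X-Y pattern of a collagen-like sequence and estimate rod length.
# ASSUMPTION: approximate axial rise per residue in the collagen triple helix.
RISE_PER_RESIDUE_NM = 0.29


def is_gly_x_y(sequence):
    """True if the sequence is a whole number of triplets, each starting with glycine (G)."""
    if len(sequence) == 0 or len(sequence) % 3 != 0:
        return False
    return all(sequence[i] == "G" for i in range(0, len(sequence), 3))


def rod_length_nm(n_residues):
    """Approximate length of a triple-helical rod whose chains have n_residues each."""
    return n_residues * RISE_PER_RESIDUE_NM


# Hypothetical toy sequences; "O" is used here to stand for hydroxyproline.
print(is_gly_x_y("GPOGPAGEK"))     # glycine at positions 1, 4, 7
print(is_gly_x_y("GPOAPOGPO"))     # alanine replaces the second glycine
print(round(rod_length_nm(1000)))  # chain of about 1000 amino acids
```

With the assumed rise per residue, a chain of about 1000 amino acids gives a rod a little under 300 nm long.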

The human genome contains 42 distinct genes coding for different collagen α
chains. Different combinations of these genes are expressed in different tissues.
Although in principle thousands of types of triple-stranded collagen molecules
could be assembled from various combinations of the 42 α chains, only a limited
number of triple-helical combinations are possible, and roughly 40 types of col- 
lagen molecules have been found. Type I is by far the most common, being the 
principal collagen of skin and bone. It belongs to the class of fibrillar collagens, 
or fibril-forming collagens: after being secreted into the extracellular space, they 
assemble into higher-order polymers called collagen fibrils, which are thin struc- 
tures (10-300 nm in diameter) many hundreds of micrometers long in mature 
tissues, where they are clearly visible in electron micrographs (Figure 19-40; see 
also Figure 19-38). Collagen fibrils often aggregate into larger, cablelike bundles, 
several micrometers in diameter, that are visible in the light microscope as colla- 
gen fibers. 
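
The earlier statement that thousands of triple-stranded molecules could in principle be assembled from 42 α chains can be made concrete by counting. The sketch below counts the ways of choosing three chains from 42 when the same chain may be used more than once and the order of the strands is ignored. Treating a molecule as an unordered combination is a simplifying assumption made here, not something the text specifies.

```python
from math import comb

N_ALPHA_CHAINS = 42  # from the text: 42 distinct collagen alpha-chain genes


def unordered_trimers(n_chains):
    """Number of multisets of size 3 drawn from n_chains chain types."""
    return comb(n_chains + 2, 3)


homotrimers = N_ALPHA_CHAINS                # all three chains identical
total = unordered_trimers(N_ALPHA_CHAINS)   # any chain composition
print(homotrimers, total)
```

Set against a count of more than 13,000 conceivable compositions, the roughly 40 collagen types actually found show how restrictive triple-helix assembly is.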

Collagen types IX and XII are called fibril-associated collagens because they 
decorate the surface of collagen fibrils. They are thought to link these fibrils to 
one another and to other components in the extracellular matrix. Type IV is a 
network-forming collagen, forming a major part of basal laminae, while type VII 
molecules form dimers that assemble into specialized structures called anchor- 
ing fibrils. Anchoring fibrils help attach the basal lamina of multilayered epithelia 
to the underlying connective tissue and therefore are especially abundant in the 
skin. There are also a number of “collagen-like” proteins containing short colla- 
gen-like segments. These include collagen type XVII, which has a transmembrane 
domain and is found in hemidesmosomes, and type XVIII, the core protein of a 
proteoglycan in basal laminae. 

Figure 19-40 A fibroblast surrounded by
collagen fibrils in the connective tissue
of embryonic chick skin. In this electron
micrograph, the fibrils are organized into
bundles that run approximately at right
angles to one another. Therefore, some
bundles are oriented longitudinally, whereas
others are seen in cross section. The
collagen fibrils are produced by fibroblasts.
(From C. Ploetz, E.I. Zycband and
D.E. Birk, J. Struct. Biol. 106:73-81, 1991.
With permission from Elsevier.)

TABLE 19-2

Class | Type | Polymerized form | Tissue distribution | Mutant phenotype
Fibril-forming (fibrillar) | I | Fibril | Bone, skin, tendons, ligaments, cornea, internal organs (accounts for 90% of body collagen) | Severe bone defects, fractures (osteogenesis imperfecta)
Fibril-forming (fibrillar) | II | Fibril | Cartilage, intervertebral disc, notochord, vitreous humor of the eye | Cartilage deficiency, dwarfism (chondrodysplasia)
Fibril-forming (fibrillar) | III | Fibril | Skin, blood vessels, internal organs | Fragile skin, loose joints, blood vessels prone to rupture (Ehlers-Danlos syndrome)
Fibril-forming (fibrillar) | V | Fibril (with type I) | As for type I | Fragile skin, loose joints, blood vessels prone to rupture
Fibril-forming (fibrillar) | XI | Fibril (with type II) | As for type II | Myopia, blindness
Fibril-associated | IX | Lateral association with type II fibrils | Cartilage | Osteoarthritis
Network-forming | IV | Sheetlike network | Basal lamina | Kidney disease (glomerulonephritis), deafness
Proteoglycan core protein | XVIII | Nonfibrillar | Basal lamina | Myopia, detached retina, hydrocephalus

Note that types I, IV, V, IX, and XI are each composed of two or three types of α chains (distinct, nonoverlapping sets in each
case), whereas types II, III, VII, XVII, and XVIII are composed of only one type of α chain each.

Many proteins appear to have evolved by repeated duplications of an original
DNA sequence, giving rise to a repetitive pattern of amino acids. The genes that
encode the α chains of most of the fibrillar collagens provide a good example: they
are very large (up to 44 kilobases in length) and contain about 50 exons. Most of
the exons are 54, or multiples of 54, nucleotides long, suggesting that these collagens
originated through multiple duplications of a primordial gene containing 54
nucleotides and encoding exactly six Gly-X-Y repeats (see Figure 19-39).
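
The arithmetic behind this inference is easy to verify: three nucleotides encode one amino acid, and three amino acids make one Gly-X-Y triplet. The short sketch below (the function name is illustrative) converts an exon length into a number of Gly-X-Y repeats and rejects lengths that do not correspond to a whole number of repeats.

```python
NUCLEOTIDES_PER_CODON = 3
RESIDUES_PER_TRIPLET = 3  # one Gly-X-Y repeat


def gly_x_y_repeats(exon_length_nt):
    """Number of complete Gly-X-Y repeats encoded by an exon of the given length."""
    unit = NUCLEOTIDES_PER_CODON * RESIDUES_PER_TRIPLET  # 9 nucleotides per repeat
    if exon_length_nt % unit != 0:
        raise ValueError("exon does not encode a whole number of Gly-X-Y repeats")
    return exon_length_nt // unit


for length in (54, 108, 162):
    print(length, gly_x_y_repeats(length))
```

An exon of 54 nucleotides thus encodes 18 amino acids, or exactly six Gly-X-Y repeats, and any multiple of 54 encodes a multiple of six repeats.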

Table 19-2 provides additional details for some of the collagen types discussed 
in this chapter. 


Secreted Fibril-Associated Collagens Help Organize the Fibrils 


In contrast to GAGs, which resist compressive forces, collagen fibrils form struc- 
tures that resist tensile forces. The fibrils have various diameters and are orga- 
nized in different ways in different tissues. In mammalian skin, for example, they 
are woven in a wickerwork pattern so that they resist tensile stress in multiple 
directions; leather consists of this material, suitably preserved. In tendons, collagen
fibrils are organized in parallel bundles aligned along the major axis of tension.
In mature bone and in the cornea, they are arranged in orderly plywoodlike
layers, with the fibrils in each layer lying parallel to one another but nearly at right
angles to the fibrils in the layers on either side. The same arrangement occurs in
tadpole skin (Figure 19-41).

The connective-tissue cells themselves determine the size and arrangement of
the collagen fibrils. The cells can express one or more genes for the different types
of fibrillar collagen molecules. But even fibrils composed of the same mixture of
collagens have different arrangements in different tissues. How is this achieved?
Part of the answer is that cells can regulate the disposition of the collagen
molecules after secretion by guiding collagen fibril formation near the plasma
membrane. In addition, cells can influence this organization by secreting, along
with their fibrillar collagens, different kinds and amounts of other matrix macromolecules.
In particular, they secrete the fibrous protein fibronectin, as we
discuss later, and this precedes the formation of collagen fibrils and helps guide
their organization.

Figure 19-41 Collagen fibrils in the tadpole skin. This electron micrograph
shows the plywoodlike arrangement of the fibrils: successive layers of fibrils
are laid down nearly at right angles to each other. This organization is also
found in mature bone and in the cornea. (Courtesy of Jerome Gross.)

Fibril-associated collagens, such as types IX and XII collagens, are thought 
to be especially important in organizing collagen fibrils. They differ from fibril- 
lar collagens in the following ways. First, their triple-stranded helical structure is 
interrupted by one or two short nonhelical domains, which makes the molecules 
more flexible than fibrillar collagen molecules. Second, they do not aggregate 
with one another to form fibrils in the extracellular space. Instead, they bind in 
a periodic manner to the surface of fibrils formed by the fibrillar collagens. Type 
IX molecules bind to type-II-collagen-containing fibrils in cartilage, the cornea, 
and the vitreous of the eye (Figure 19-42), whereas type XII molecules bind to 
type-I-collagen-containing fibrils in tendons and various other tissues. 

Fibril-associated collagens are thought to mediate the interactions of collagen 
fibrils with one another and with other matrix macromolecules to help determine 
the organization of the fibrils in the matrix. 


Cells Help Organize the Collagen Fibrils They Secrete by Exerting 
Tension on the Matrix 


Cells interact with the extracellular matrix mechanically as well as chemically, 
and studies in culture suggest that the mechanical interaction can have dramatic 
effects on the architecture of connective tissue. Thus, when fibroblasts are mixed 
with a meshwork of randomly oriented collagen fibrils that form a gel in a culture 
dish, the fibroblasts tug on the meshwork, drawing in collagen from their sur- 
roundings and thereby causing the gel to contract to a small fraction of its initial 
volume. By similar activities, a cluster of fibroblasts surrounds itself with a capsule 
of densely packed and circumferentially oriented collagen fibers. 

If two small pieces of embryonic tissue containing fibroblasts are placed far 
apart on a collagen gel, the intervening collagen becomes organized into a com- 
pact band of aligned fibers that connect the two explants (Figure 19-43). The 
fibroblasts subsequently migrate out from the explants along the aligned collagen 
fibers. Thus, the fibroblasts influence the alignment of the collagen fibers, and the 
collagen fibers in turn affect the distribution of the fibroblasts. 

Fibroblasts may have a similar role in organizing the extracellular matrix inside 
the body. First they synthesize the collagen fibrils and deposit them in the correct 
orientation. Then they work on the matrix they have secreted, crawling over it and 
tugging on it so as to create tendons and ligaments and the tough, dense layers of 
connective tissue that surround and bind together most organs.

Figure 19-42 Type IX collagen. (A) Type IX collagen molecules binding
in a periodic pattern to the surface of a fibril containing type II collagen.
(B) Electron micrograph of a rotary-shadowed type-II-collagen-containing
fibril in cartilage, decorated by type IX collagen molecules. (C) An individual
type IX collagen molecule. (B and C, from L. Vaughan et al., J. Cell Biol.
106:991-997, 1988. With permission from The Rockefeller University Press.)

Figure 19-43 The shaping of the
extracellular matrix by cells. This
micrograph shows a region between two
pieces of embryonic chick heart (rich in
fibroblasts as well as heart muscle cells)
that were cultured on a collagen gel for
4 days. A dense tract of aligned collagen
fibers has formed between the explants,
presumably as a result of the fibroblasts
in the explants tugging on the collagen.
(From D. Stopak and A.K. Harris, Dev. Biol.
90:383-398, 1982. With permission from
Academic Press.)


Elastin Gives Tissues Their Elasticity


Many vertebrate tissues, such as skin, blood vessels, and lungs, need to be both 
strong and elastic in order to function. A network of elastic fibers in the extra- 
cellular matrix of these tissues gives them the resilience to recoil after transient 
stretch (Figure 19-44). Elastic fibers are at least five times more extensible than a 
rubber band of the same cross-sectional area. Long, inelastic collagen fibrils are 
interwoven with the elastic fibers to limit the extent of stretching and prevent the 
tissue from tearing. 

The main component of elastic fibers is elastin, a highly hydrophobic protein 
(about 750 amino acids long), which, like collagen, is unusually rich in proline and 
glycine but, unlike collagen, is not glycosylated. Soluble tropoelastin (the biosyn- 
thetic precursor of elastin) is secreted into the extracellular space and assembled 
into elastic fibers close to the plasma membrane, generally in cell-surface infold- 
ings. After secretion, the tropoelastin molecules become highly cross-linked to 
one another, generating an extensive network of elastin fibers and sheets. 

The elastin protein is composed largely of two types of short segments that 
alternate along the polypeptide chain: hydrophobic segments, which are responsible 
for the elastic properties of the molecule; and alanine- and lysine-rich α-helical 
segments, which are cross-linked to adjacent molecules by covalent attachment 
of lysine residues. Each segment is encoded by a separate exon. There is 
still uncertainty concerning the conformation of elastin molecules in elastic fibers 
and how the structure of these fibers accounts for their rubberlike properties. 
However, it seems that parts of the elastin polypeptide chain, like the polymer 
chains in ordinary rubber, adopt a loose “random coil” conformation, and it is the 
random coil nature of the component molecules cross-linked into the elastic fiber 
network that allows the network to stretch and recoil like a rubber band (Figure 
19-45). 

Elastin is the dominant extracellular matrix protein in arteries, comprising 
50% of the dry weight of the largest artery—the aorta (see Figure 19-44). Muta- 
tions in the elastin gene causing a deficiency of the protein in mice or humans 
result in narrowing of the aorta and other arteries and excessive proliferation of 
smooth muscle cells in the arterial wall. Apparently, the normal elasticity of an 
artery is required to restrain the proliferation of these cells. 

Elastic fibers do not consist solely of elastin. The elastin core is covered with a 
sheath of microfibrils, each of which has a diameter of about 10 nm. The microfi- 
brils appear before elastin in developing tissues and seem to provide scaffolding 
to guide elastin deposition. Arrays of microfibrils are elastic in their own right, 
and in some places they persist in the absence of elastin: they help to hold the 
lens in its place in the eye, for example. Microfibrils are composed of a number 
of distinct glycoproteins, including the large glycoprotein fibrillin, which binds to 
Figure 19-44 Elastic fibers. These 
scanning electron micrographs show 
(A) a low-power view of a segment of a 
dog’s aorta and (B) a high-power view 
of the dense network of longitudinally 
oriented elastic fibers in the outer layer 
of the same blood vessel. All the other 
components have been digested away with 
enzymes and formic acid. (From K.S. Haas 
et al., Anat. Rec. 230:86-96, 1991. With 
permission from Wiley-Liss.) 
elastin and is essential for the integrity of elastic fibers. Mutations in the fibrillin 
gene result in Marfan’s syndrome, a relatively common human disorder. In the 
most severely affected individuals, the aorta is prone to rupture; other common 
effects include displacement of the lens and abnormalities of the skeleton and 
joints. Affected individuals are often unusually tall and lanky: Abraham Lincoln is 
suspected to have had the condition. 


Fibronectin and Other Multidomain Glycoproteins Help Organize 
the Matrix 


In addition to proteoglycans, collagens, and elastic fibers, the extracellular matrix 
contains a large and varied assortment of glycoproteins that typically have mul- 
tiple domains, each with specific binding sites for other matrix macromolecules 
and for receptors on the surface of cells (Figure 19-46). These proteins therefore 
contribute to both organizing the matrix and helping cells attach to it. Like the 
proteoglycans, they also guide cell movements in developing tissues, by serving 
Figure 19-45 Stretching a network of 
elastin molecules. The molecules are 
joined together by covalent bonds (red) 
to generate a cross-linked network. In 
this model, each elastin molecule in the 
network can extend and contract in a 
manner resembling a random coil, so that 
the entire assembly can stretch and recoil 
like a rubber band. 


Figure 19-46 Complex glycoproteins 
of the extracellular matrix. Many matrix 
glycoproteins are large scaffold proteins 
containing multiple copies of specific 
protein-interaction domains. Each domain 
is folded into a discrete globular structure, 
and many such domains are arrayed 
along the protein like beads on a string. 
This diagram shows four representative 
proteins among the roughly 200 matrix 
glycoproteins that are found in mammals. 
Each protein contains multiple repeat 
domains, with the names listed in the key 
at the bottom. Fibronectin, for example, 
contains numerous copies of three different 
fibronectin repeats (types I-III, labeled 
here as FN1, FN2, and FN3). Two type 
III repeats near the C-terminus contain 
important binding sites for cell-surface 
integrins, whereas other FN repeats are 
involved in binding fibrin, collagen, and 
heparin, as indicated (see Figure 19-47). 
Other matrix proteins contain repeated 
sequences resembling those of epidermal 
growth factor (EGF), a major regulator 
of cell growth and proliferation; these 
repeats might serve a similar signaling 
function in matrix proteins. Other proteins 
contain domains, such as the insulin-like 
growth factor-binding protein (IGFBP) 
repeat, that bind and regulate the function 
of soluble growth factors. To add more 
structural diversity, many of these proteins 
are encoded by RNA transcripts that 
can be spliced in different ways, adding 
or removing exons, such as those in 
fibronectin. Finally, the scaffolding and 
regulatory functions of many matrix 
proteins are further expanded by assembly 
into multimeric forms, as shown at the 
right: fibronectin forms dimers linked 
at the C-termini, whereas tenascin and 
thrombospondin form N-terminally linked 
hexamers and trimers, respectively. 
Other domains include four repeats from 
thrombospondin (TSPN, TSP1, TSP3, 
TSP_C). VWC, von Willebrand type C; 
FBG, fibrinogen-like. (Adapted from 
R.O. Hynes and A. Naba, Cold Spring 
Harb. Perspect. Biol. 4:a004903, 2012.) 
as tracks along which cells can migrate or as repellents that keep cells out of 
forbidden areas. They can also bind and thereby influence the function of peptide 
growth factors and other small molecules produced by nearby cells. 

The best-understood member of this class of matrix proteins is fibronec- 
tin, a large glycoprotein found in all vertebrates and important for many cell- 
matrix interactions. Mutant mice that are unable to make fibronectin die early in 
embryogenesis because their endothelial cells fail to form proper blood vessels. 
The defect is thought to result from abnormalities in the interactions of these cells 
with the surrounding extracellular matrix, which normally contains fibronectin. 

Fibronectin is a dimer composed of two very large subunits joined by disulfide 
bonds at their C-terminal ends. Each subunit contains a series of small repeated 
domains, or modules, separated by short stretches of flexible polypeptide chain 
(Figure 19-47). Each domain is usually encoded by a separate exon, suggesting 
that the fibronectin gene, like the genes encoding many matrix proteins, evolved 
by multiple exon duplications. In the human genome, there is only one fibronectin 
gene, containing about 50 exons of similar size, but the transcripts can be spliced 
in different ways to produce multiple fibronectin isoforms (see Figure 19-46). 
The major repeat domain in fibronectin is called the type III fibronectin repeat, 
which is about 90 amino acids long and occurs at least 15 times in each subunit. 
This repeat is among the most common of all protein domains in vertebrates. 
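
As a rough back-of-the-envelope check on how much of each fibronectin subunit the type III repeat accounts for, the figures quoted in the text can be combined as in the sketch below. The inputs are the approximate values given here ("about 90", "at least 15") and in the legend to Figure 19-47 ("almost 2500"), so the result is only an order-of-magnitude estimate; the variable names are ours.

```python
# Approximate values taken from the text; all three are rounded figures.
TYPE_III_REPEAT_LENGTH = 90   # "about 90 amino acids long"
TYPE_III_REPEAT_COUNT = 15    # "at least 15 times in each subunit"
SUBUNIT_LENGTH = 2500         # "almost 2500 amino acids long" (Figure 19-47)

# Residues contributed by type III repeats, using the lower bound on the count.
residues_in_type_iii = TYPE_III_REPEAT_LENGTH * TYPE_III_REPEAT_COUNT

# Share of one subunit made up of type III repeats (a lower-bound estimate).
fraction = residues_in_type_iii / SUBUNIT_LENGTH

print(residues_in_type_iii)   # 1350
print(f"{fraction:.0%}")      # 54%
```

On these numbers, type III repeats would make up somewhat more than half of each subunit, consistent with the description of this repeat as the major domain of the protein.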


Fibronectin Binds to Integrins 


One way to analyze a complex multifunctional protein molecule such as fibronec- 
tin is to synthesize individual regions of the protein and test their ability to bind 
other proteins. By these and other methods, it was possible to show that one 
region of fibronectin binds to collagen, another to proteoglycans, and another 
to specific integrins on the surface of various types of cells (see Figure 19-47B). 
Synthetic peptides corresponding to different segments of the integrin-binding 
domain were then used to show that binding depends on a specific tripeptide 
sequence (Arg-Gly-Asp, or RGD) that is found in one of the type III repeats (see 
Figure 19-47C). Even very short peptides containing this RGD sequence can com- 
pete with fibronectin for the binding site on cells, thereby inhibiting the attach- 
ment of the cells to a fibronectin matrix. 
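
The idea of locating a short binding motif within a protein sequence can be illustrated computationally. The sketch below scans a sequence written in one-letter amino acid code for the tripeptide RGD. The function name and the example sequence are invented for illustration, and finding the motif in a sequence does not by itself show that the protein binds an integrin.

```python
def find_motif(sequence: str, motif: str = "RGD") -> list[int]:
    """Return the 1-based start positions of every occurrence of motif."""
    seq = sequence.upper()
    motif = motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)          # convert 0-based index to 1-based
        start = seq.find(motif, start + 1)   # keep scanning past this hit
    return positions


# Made-up sequence for illustration only (one-letter amino acid code).
example_sequence = "MKTAYRGDLLQWRGDE"
print(find_motif(example_sequence))  # [6, 13]
```

A scan of this kind only nominates candidate sites; as in the experiments described above, binding still has to be tested directly, for example by competition with short synthetic peptides.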

Several extracellular proteins besides fibronectin also have an RGD sequence 
that mediates cell-surface binding. Many of these proteins are components of 
the extracellular matrix, while others are involved in blood clotting. Peptides 
Figure 19-47 The structure of 
a fibronectin dimer. (A) Electron 
micrographs of individual fibronectin dimer 
molecules shadowed with platinum; red 
arrows mark the joined C-termini. (B) The 
two polypeptide chains are similar but 
generally not identical (being made from 
the same gene but from differently spliced 
mRNAs). They are joined by two disulfide 
bonds near the C-termini. Each chain is 
almost 2500 amino acids long and is folded 
into multiple domains (see Figure 19-46). 
As indicated, some domains are specialized 
for binding to a particular molecule. For 
simplicity, not all of the known binding 
sites are shown. (C) The three-dimensional 
structure of the ninth and tenth type III 
fibronectin repeats, as determined by 
x-ray crystallography. Both the Arg-Gly-Asp 
(RGD) and the “synergy” sequences 
shown in red are important for binding to 
integrins on cell surfaces. (A, from J. Engel 
et al., J. Mol. Biol. 150:97-120, 1981. With 
permission from Academic Press; C, from 
Daniel J. Leahy, Annu. Rev. Cell Dev. Biol. 
13:363-393, 1997. With permission from 
Annual Reviews.) 
containing the RGD sequence have been useful in the development of 
anti-clotting drugs. Some snakes use a similar strategy to cause their victims to bleed: 
they secrete RGD-containing anti-clotting proteins called disintegrins into their 
venom. 

The cell-surface receptors that bind RGD-containing proteins are members of 
the integrin family, which we describe in detail later. Each integrin specifically 
recognizes its own small set of matrix molecules, indicating that tight binding 
requires more than just the RGD sequence. Moreover, RGD sequences are not the 
only sequence motifs used for binding to integrins: many integrins recognize and 
bind to other motifs instead. 


Tension Exerted by Cells Regulates the Assembly of Fibronectin 
Fibrils 


Fibronectin can exist both in a soluble form, circulating in the blood and other 
body fluids, and as insoluble fibronectin fibrils, in which fibronectin dimers are 
cross-linked to one another by additional disulfide bonds and form part of the 
extracellular matrix. Unlike fibrillar collagen molecules, however, which can 
self-assemble into fibrils in a test tube, fibronectin molecules assemble into 
fibrils only on the surface of cells, and only where those cells possess appropri- 
ate fibronectin-binding proteins—in particular, integrins. The integrins provide a 
linkage from the fibronectin outside the cell to the actin cytoskeleton inside it. The 
linkage transmits tension to the fibronectin molecules—provided that they also 
have an attachment to some other structure—and stretches them, exposing cryp- 
tic binding sites in the fibronectin molecules (Figure 19-48). This allows them 
to bind directly to one another and to recruit additional fibronectin molecules 
to form a fibril (Figure 19-49). This dependence on tension and interaction with 
cell surfaces ensures that fibronectin fibrils assemble where there is a mechanical 
need for them and not in inappropriate locations such as the bloodstream. 

Many other extracellular matrix proteins contain multiple copies of the type 
III fibronectin repeat (see Figure 19-46), and it is possible that tension exerted 
on these proteins also uncovers cryptic binding sites and thereby influences their 
behavior. 


The Basal Lamina Is a Specialized Form of Extracellular Matrix 


Thus far in this section we have reviewed the general principles underlying the 
structure and function of the major classes of extracellular matrix components. 
We now describe how some of these components are assembled into a specialized 
type of extracellular matrix called the basal lamina (also known as the basement 
membrane). This exceedingly thin, tough, flexible sheet of matrix molecules is an 
essential underpinning of all epithelia. Although small in volume, it has a criti- 
cal role in the architecture of the body. Like the cadherins, it seems to be one of 
the defining features common to all multicellular animals, and it seems to have 
appeared very early in their evolution. The major molecular components of the 
basal lamina are among the most ancient extracellular matrix macromolecules. 
Basal laminae are typically 40-120 nm thick. A sheet of basal lamina not only 
lies beneath epithelial cells but also surrounds individual muscle cells, fat cells, 


Figure 19-48 Tension-sensing by 
fibronectin. Some type III fibronectin 
repeats are thought to unfold when 
fibronectin is stretched. The unfolding 
exposes cryptic binding sites that interact 
with other fibronectin molecules resulting 
in the formation of fibronectin filaments like 
those shown in Figure 19-49. (From 
V. Vogel and M. Sheetz, Nat. Rev. Mol. Cell 
Biol. 7:265-275, 2006. With permission 
from Macmillan Publishers Ltd.) 





Figure 19-49 Organization of fibronectin 
into fibrils at the cell surface. This 
fluorescence micrograph shows the front 
end of a migrating mouse fibroblast. 
Extracellular fibronectin is stained green 
and intracellular actin filaments are stained 
red. The fibronectin is initially present as 
small dotlike aggregates near the leading 
edge of the cell. It accumulates at focal 
adhesions (sites of anchorage of actin 
filaments, discussed later) and becomes 
organized into fibrils parallel to the actin 
filaments. Integrin molecules spanning the 
cell membrane link the fibronectin outside 
the cell to the actin filaments inside it (see 
Figure 19-55). Tension exerted on the 
fibronectin molecules through this linkage is 
thought to stretch them, exposing binding 
sites that promote fibril formation. (Courtesy 
of Roumen Pankov and Kenneth Yamada.) 
and Schwann cells (which wrap around peripheral nerve cell axons to form 
myelin). The basal lamina thus separates these cells and epithelia from the under- 
lying or surrounding connective tissue and forms the mechanical connection 
between them. In other locations, such as the kidney glomerulus, a basal lam- 
ina lies between two cell sheets and functions as a selective filter (Figure 19-50). 
Basal laminae have more than simple structural and filtering roles, however. They 
are able to determine cell polarity; influence cell metabolism; organize the pro- 
teins in adjacent plasma membranes; promote cell survival, proliferation, or dif- 
ferentiation; and serve as highways for cell migration. 

The mechanical role is nevertheless essential. In the skin, for example, the epi- 
thelial outer layer—the epidermis—depends on the strength of the basal lamina 
to keep it attached to the underlying connective tissue—the dermis. In people 
with genetic defects in certain basal lamina proteins or in a special type of col- 
lagen that anchors the basal lamina to the underlying connective tissue, the epi- 
dermis becomes detached from the dermis. This causes a blistering disease called 
junctional epidermolysis bullosa, a severe and sometimes lethal condition. 


Laminin and Type IV Collagen Are Major Components of the Basal 
Lamina 


The basal lamina is synthesized by the cells on each side of it: the epithelial cells 
contribute one set of basal lamina components, while cells of the underlying bed 
of connective tissue (called the stroma, Greek for “bedding” ) contribute another 
set (Figure 19-51). Although the precise composition of the mature basal lamina 
varies from tissue to tissue and even from region to region in the same lamina, it 
Figure 19-50 Three ways in which 
basal laminae are organized. Basal 
laminae (yellow) surround certain cells 
(such as skeletal muscle cells), underlie 
epithelia, and are interposed between two 
cell sheets (as in the kidney glomerulus). 
Note that, in the kidney glomerulus, both 
cell sheets have gaps in them, and the 
basal lamina has a filtering as well as a 
supportive function, helping to determine 
which molecules will pass into the urine 
from the blood. The filtration also depends 
on other protein-based structures, called 
slit diaphragms, that span the intercellular 
gaps in the epithelial sheet. 


Figure 19-51 The basal lamina in 
the cornea of a chick embryo. In this 
scanning electron micrograph, some of 
the epithelial cells have been removed to 
expose the upper surface of the matlike 
basal lamina. A network of collagen fibrils in 
the underlying connective tissue interacts 
with the lower face of the lamina. (Courtesy 
of Robert Trelstad.) 
typically contains the glycoproteins laminin, type IV collagen, and nidogen, along 
with the proteoglycan perlecan. Other common basal lamina components are 
fibronectin and type XVIII collagen (an atypical member of the collagen family, 
forming the core protein of a proteoglycan). 

Laminin is the primary organizer of the sheet structure, and, early in devel- 
opment, basal laminae consist mainly of laminin molecules. Laminins comprise 
a large family of proteins, each composed of three long polypeptide chains (α, β, 
and γ) held together by disulfide bonds and arranged in the shape of an asymmetric 
bouquet, like a bunch of three flowers whose stems are twisted together 
at the foot but whose heads remain separate (Figure 19-52). These heterotrimers 
can self-assemble in vitro into a network, largely through interactions between 
their heads, although interaction with cells is needed to organize the network into 
an orderly sheet. Since there are several isoforms of each type of chain, and these 
can associate in different combinations, many different laminins can be produced, 
creating basal laminae with distinctive properties. The laminin γ1 chain is, 
however, a component of most laminin heterotrimers; mice lacking it die during 
embryogenesis because they are unable to make basal laminae. 
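
To give a feel for how much diversity this combinatorial assembly could generate, the sketch below enumerates every conceivable α/β/γ combination, using the chain counts given in the legend to Figure 19-52 (five α, four β, and three γ chain types) and the naming scheme described there. It assumes, purely for illustration, that any α chain can pair with any β and any γ chain. The text says only that "various combinations" assemble, so the total should be read as a theoretical upper bound, not as the number of laminins actually found in tissues.

```python
from itertools import product

# Chain-type counts as given in the legend to Figure 19-52.
ALPHA_CHAINS = range(1, 6)   # five types of alpha chain
BETA_CHAINS = range(1, 5)    # four types of beta chain
GAMMA_CHAINS = range(1, 4)   # three types of gamma chain


def laminin_name(alpha: int, beta: int, gamma: int) -> str:
    """Build a name from the subunit numbers, e.g. (1, 1, 1) -> 'laminin-111'."""
    return f"laminin-{alpha}{beta}{gamma}"


# Illustrative assumption: every alpha/beta/gamma combination is allowed.
all_names = [
    laminin_name(a, b, g)
    for a, b, g in product(ALPHA_CHAINS, BETA_CHAINS, GAMMA_CHAINS)
]

print(len(all_names))               # 60
print("laminin-332" in all_names)   # True
```

The isoforms named in the figure legend (laminin-111, laminin-332, laminin-211, and laminin-411) all fall within this set of 60 hypothetical heterotrimers.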

Type IV collagen is a second essential component of mature basal laminae, 
and it, too, exists in several isoforms. Like the fibrillar collagens that constitute 
the bulk of the protein in connective tissues such as bone or tendon, type IV col- 
lagen molecules consist of three separately synthesized long protein chains that 
twist together to form a ropelike superhelix; however, they differ from the fibrillar 
collagens in that the triple-stranded helical structure is interrupted in more than 
20 regions, allowing multiple bends. Type IV collagen molecules interact via their 
terminal domains to assemble extracellularly into a flexible, feltlike network that 
gives the basal lamina tensile strength. 

Laminin and type IV collagen interact with other basal lamina components, 
such as the glycoprotein nidogen and the proteoglycan perlecan, resulting in a 
highly cross-linked network of proteins and proteoglycans (Figure 19-53). The 
laminin molecules that generate the initial sheet structure first join to each other 
while bound to receptors on the surface of the cells that produce laminin. The 
cell-surface receptors are primarily members of the integrin family, but another 
important type of laminin receptor is dystroglycan, a proteoglycan with a core pro- 
tein that spans the cell membrane, dangling its GAG chains in the extracellular 
space. Together, these receptors organize basal lamina assembly: they hold the 
laminin molecules by their feet, leaving the laminin heads positioned to interact 
so as to form a two-dimensional network. This laminin network then coordinates 
the assembly of the other basal lamina components. 


Basal Laminae Have Diverse Functions 


In the kidney glomerulus, an unusually thick basal lamina acts as one of the layers 
of a molecular filter, helping to prevent the passage of macromolecules from the 
blood into the urine as urine is formed (see Figure 19-50). The proteoglycan in 
Figure 19-52 The structure of laminin. 
(A) The best-understood family member 
is laminin-111, shown here with some 
of its binding sites for other molecules 
(yellow boxes). Laminins are multidomain 
glycoproteins composed of three 
polypeptides (α, β, and γ) that are 
disulfide-bonded into an asymmetric crosslike 
structure. Each of the polypeptide chains 
is more than 1500 amino acids long. Five 
types of α chains, four types of β chains, 
and three types of γ chains are known, and 
various combinations of these subunits can 
assemble to form a large variety of different 
laminins, which are named according 
to numbers assigned to each of their 
three subunits: laminin-111, for example, 
contains α1, β1, and γ1 subunits. Each 
isoform tends to have a specific tissue 
distribution: laminin-332 is found in skin, 
laminin-211 in muscle, and laminin-411 in 
endothelial cells of blood vessels. Through 
their binding sites for other proteins, laminin 
molecules play a central part in organizing 
basal laminae and anchoring them to 
cells. (B) Electron micrographs of laminin 
molecules shadowed with platinum. 
(B, from J. Engel et al., J. Mol. Biol. 
150:97-120, 1981. With permission 
from Academic Press.) 
the basal lamina is important for this function: when its GAG chains are removed 
by specific enzymes, the filtering properties of the lamina are destroyed. Type IV 
collagen also has a role: in a human hereditary kidney disorder (Alport syndrome), 
mutations in a type IV collagen gene result in an irregularly thickened and dys- 
functional glomerular filter. Laminin mutations, too, can disrupt the function of 
the kidney filter, but in a different way—by interfering with the differentiation of 
the cells that contact it and support it. 

The basal lamina can act as a selective barrier to the movement of cells, as 
well as a filter for molecules. The lamina beneath an epithelium, for example, usu- 
ally prevents fibroblasts in the underlying connective tissue from making contact 
with the epithelial cells. It does not, however, stop macrophages, lymphocytes, or 
nerve processes from passing through it, using specialized protease enzymes to 
cut a hole for their transit. The basal lamina is also important in tissue regenera- 
tion after injury. When cells in tissues such as muscles, nerves, and epithelia are 
damaged or killed, the basal lamina often survives and provides a scaffold along 
which regenerating cells can migrate. In this way, the original tissue architecture 
is readily reconstructed. 

A particularly striking example of the role of the basal lamina in regeneration 
comes from studies of the neuromuscular junction, the site where the nerve ter- 
minals of a motor neuron form a chemical synapse with a skeletal muscle cell 
(discussed in Chapter 11). In vertebrates, the basal lamina that surrounds the 
muscle cell separates the nerve and muscle cell plasma membranes at the syn- 
apse, and the synaptic region of the lamina has a distinctive chemical character, 
Figure 19-53 A model of the molecular 
structure of a basal lamina. (A) The 
basal lamina is formed by specific 
interactions (B) between the proteins 
laminin, type IV collagen, and nidogen, 
and the proteoglycan perlecan. Arrows 
in (B) connect molecules that can bind 
directly to each other. There are various 
isoforms of type IV collagen and laminin, 
each with a distinctive tissue distribution. 
Transmembrane laminin receptors 
(integrins and dystroglycan) in the 
plasma membrane are thought to 
organize the assembly of the basal lamina; 
only the integrins are shown. (Based on 
H. Colognato and P.D. Yurchenco, Dev. 
Dyn. 218:213-234, 2000. With permission 
from Wiley-Liss.) 
with special isoforms of type IV collagen and laminin and a proteoglycan called 
agrin. After a nerve or muscle injury, the basal lamina at the synapse has a central 
role in reconstructing the synapse at the correct location (Figure 19-54). Defects 
in components of the basal lamina at the synapse are responsible for some forms 
of muscular dystrophy, in which muscles develop normally but then degenerate 
later in life. 


Cells Have to Be Able to Degrade Matrix, as Well as Make It 


The ability of cells to degrade and destroy extracellular matrix is as important as 
their ability to make it and bind to it. Rapid matrix degradation is required in pro- 
cesses such as tissue repair, and even in the seemingly static extracellular matrix 
of adult animals there is a slow, continuous turnover, with matrix macromol- 
ecules being degraded and resynthesized. This allows bone, for example, to be 
remodeled so as to adapt to changes in the stresses on it. 

From the point of view of individual cells, the ability to cut through matrix is 
crucial in two ways: it enables them to divide while embedded in matrix, and it 
enables them to travel through it. Cells in connective tissues generally need to be 
able to stretch out in order to divide. If a cell lacks the enzymes needed to degrade 
the surrounding matrix, it is strongly inhibited from dividing, as well as being hin- 
dered from migrating. 

Localized degradation of matrix components is also required wherever cells 
have to escape from confinement by a basal lamina. It is needed during normal 
branching growth of epithelial structures such as glands, for example, to allow the 
population of epithelial cells to increase, and needed also when white blood cells 
migrate across the basal lamina of a blood vessel into tissues in response to infec- 
tion or injury. Matrix degradation is important both for the spread of cancer cells 
through the body and for their ability to proliferate in the tissues that they invade 
(discussed in Chapter 20). 

In general, matrix components are degraded by extracellular proteolytic 
enzymes (proteases) that act close to the cells that produce them. Many of these 
proteases belong to one of two general classes. The largest group, with about 50 
members in vertebrates, is the matrix metalloproteases, which depend on bound 
Ca²⁺ or Zn²⁺ for activity. The second group is the serine proteases, which have a 
highly reactive serine in their active site. Together, metalloproteases and serine 


Figure 19-54 Regeneration experiments 
demonstrating the special character 
of the junctional basal lamina at a 
neuromuscular junction. If a frog muscle 
and its motor nerve are destroyed, 
the basal lamina around each muscle 
cell remains intact and the sites of the 
old neuromuscular junctions are still 
recognizable. When the nerve, but not the 
muscle, is allowed to regenerate (upper 
right), the junctional basal lamina directs 
the regenerating nerve to the original 
synaptic site. When the muscle, but not 
the nerve, is allowed to regenerate (lower 
right), the junctional basal lamina causes 
newly made acetylcholine receptors (blue) 
to accumulate at the original synaptic site. 
These experiments show that the junctional 
basal lamina controls the localization of 
synaptic components on both sides of the 
lamina. Some of the molecules responsible 
for these effects have been identified. 
Motor neuron axons, for example, deposit 
agrin in the junctional basal lamina, where 
it regulates the assembly of acetylcholine 
receptors and other proteins in the 
junctional plasma membrane of the muscle 
cell. Reciprocally, muscle cells deposit a 
particular isoform of laminin in the junctional 
basal lamina, and this molecule is likely to 
interact with specific ion channels on the 
presynaptic membrane of the neuron. 
proteases cooperate to degrade matrix proteins such as collagen, laminin, and 
fibronectin. Some metalloproteases, such as the collagenases, are highly specific, 
cleaving particular proteins at a small number of sites. In this way, the structural 
integrity of the matrix is largely retained, while the limited amount of proteoly- 
sis that occurs is sufficient for cell migration. Other metalloproteases may be less 
specific, but, because they are anchored to the plasma membrane, they can act 
just where they are needed; it is this type of matrix metalloprotease that is crucial 
for a cell’s ability to divide when embedded in matrix. 

Clearly, the activities of the proteases that degrade the matrix must be tightly 
controlled, if the fabric of the body is not to collapse in a heap. Numerous mech- 
anisms are therefore employed to ensure that matrix proteases are activated only 
at the correct time and place. Protease activity is generally confined to the cell 
surface by specific anchoring proteins, by membrane-associated activators, and 
by the production of specific protease inhibitors in regions where protease activ- 
ity is not needed. 


Matrix Proteoglycans and Glycoproteins Regulate the Activities of 
Secreted Proteins 


The physical properties of extracellular matrix are important for its fundamental 
roles as a scaffold for tissue structure and as a substrate for cell anchorage and 
migration. The matrix also has an important impact on cell signaling. Cells com- 
municate with each other by secreting signal molecules that diffuse through the 
extracellular fluid to influence other cells (discussed in Chapter 15). En route 
to their targets, the signal molecules encounter the tightly woven meshwork of 
the extracellular matrix, which contains a high density of negative charges and 
protein-interaction domains that can interact with the signal molecules, thereby 
altering their function in a variety of ways. 

The highly charged heparan sulfate chains of proteoglycans, for example, 
interact with numerous secreted signal molecules, including fibroblast growth 
factors (FGFs) and vascular endothelial growth factor (VEGF), which (among 
other effects) stimulate a variety of cell types to proliferate. By providing a dense 
array of growth factor binding sites, proteoglycans are thought to generate large 
local reservoirs of these factors, limiting their diffusion and focusing their actions 
on nearby cells. Similarly, proteoglycans might help generate steep growth fac- 
tor gradients in an embryo, which can be important in the patterning of tissues 
during development. FGF activity can also be enhanced by proteoglycans, which 
oligomerize the FGF molecules, enabling them to cross-link and activate their 
cell-surface receptors more effectively. 

The importance of proteoglycans as regulators of the distribution and activ- 
ity of signal molecules is illustrated by the severe developmental defects that can 
occur when specific proteoglycans are inactivated by mutation. In Drosophila, 
for example, the function of several signal proteins during development is gov- 
erned by interactions with the membrane-associated proteoglycans Dally and 
Dally-like. These members of the glypican family are thought to concentrate sig- 
nal proteins in specific locations and act as co-receptors that collaborate with the 
conventional cell-surface receptor proteins; as a result, they promote signaling in 
the correct location and prevent it in the wrong locations. In the Drosophila ovary, 
for example, Dally is partly responsible for the restricted localization and func- 
tion of a signaling protein called Dpp, which blocks differentiation of the germ- 
line stem cells: when the gene encoding Dally is mutated, Dpp activity is greatly 
reduced and oocyte development is abnormal. 

Several matrix proteins also interact with signal proteins. The type IV collagen 
of basal laminae interacts with Dpp in Drosophila, for example. Fibronectin con- 
tains a type III fibronectin repeat that interacts with VEGF, and another domain 
that interacts with another growth factor called hepatocyte growth factor (HGF), 
thereby promoting the activities of these factors. As discussed earlier, many matrix 
glycoproteins contain extensive arrays of binding domains, and the arrangement 
of these domains is likely to influence the presentation of signal proteins to their 
target cells (see Figure 19-46).

Finally, many matrix glycoproteins contain domains that bind directly to specific
cell-surface receptors, thereby generating signals that influence the behavior
of the cells, as we describe in the next section.


Summary 


Cells are embedded in an intricate extracellular matrix, which not only binds the 
cells together but also influences their survival, development, shape, polarity, and 
migratory behavior. The matrix contains various protein fibers interwoven in a net- 
work of glycosaminoglycan (GAG) chains. GAGs are negatively charged polysac- 
charide chains that (except for hyaluronan) are covalently linked to protein to form 
proteoglycan molecules. GAGs attract water and occupy a large volume of extracel- 
lular space. Proteoglycans are also found on the surface of cells, where they often 
function as co-receptors to help cells respond to secreted signal proteins. Fiber-form- 
ing proteins give the matrix strength and resilience. The fibrillar collagens (types I, 
II, III, V, and XI) are ropelike, triple-stranded helical molecules that aggregate into
long fibrils in the extracellular space, thereby providing tensile strength. They also 
form structures to which cells can be anchored, often via large multidomain glyco- 
proteins, such as laminin and fibronectin, that bind to integrins on the cell surface. 
Elasticity is provided by elastin molecules, which form an extensive cross-linked 
network of fibers and sheets that can stretch and recoil. 

The basal lamina is a specialized form of extracellular matrix that underlies 
epithelial cells or is wrapped around certain other cell types, such as muscle cells. 
Basal laminae are organized on a framework of laminin molecules, which are 
linked together by their side-arms and bind to integrins and other receptors in the 
basal plasma membrane of overlying epithelial cells. Type IV collagen molecules, 
together with the protein nidogen and the large heparan sulfate proteoglycan per- 
lecan, assemble into a sheetlike mesh that is an essential component of all mature 
basal laminae. Basal laminae provide mechanical support for epithelia; they form 
the interface and attachment between epithelia and connective tissue; they serve as 
filters in the kidney; they act as barriers to keep cells in their proper compartments; 
they influence cell polarity and cell differentiation; and they guide cell migration 
during development and tissue regeneration. 


CELL-MATRIX JUNCTIONS 


Cells make extracellular matrix, organize it, and degrade it. The matrix in its turn 
exerts powerful influences on the cells. The influences are exerted chiefly through 
transmembrane cell adhesion proteins that act as matrix receptors. These proteins 
tie the matrix outside the cell to the cytoskeleton inside it, but their role goes far 
beyond simple passive mechanical attachment. Through them, components of 
the matrix can affect almost any aspect of a cell’s behavior. The matrix receptors 
have a crucial role in epithelial cells, mediating their interactions with the basal 
lamina beneath them. They are no less important in connective-tissue cells, medi- 
ating the cells’ interactions with the matrix that surrounds them. 

Several types of molecules can function as matrix receptors or co-receptors, 
including the transmembrane proteoglycans. But the principal receptors on ani- 
mal cells for binding most extracellular matrix proteins are the integrins. Like the 
cadherins and the key components of the basal lamina, integrins are part of the 
fundamental architectural toolkit that is characteristic of multicellular animals. 
The members of this large family of homologous transmembrane adhesion mol- 
ecules have a remarkable ability to transmit signals in both directions across the 
plasma membrane. The binding of a matrix component to an integrin can send a 
message into the interior of the cell, and conditions in the cell interior can send a 
signal outward to control binding of the integrin to the matrix. Tension applied to 
an integrin can cause it to tighten its grip on intracellular and extracellular struc- 
tures, and loss of tension can loosen its hold, so that molecular signaling com- 
plexes fall apart on either side of the membrane. In this way, integrins can serve 
not only to transmit mechanical and molecular signals, but also to convert one 
type of signal into the other. 


Integrins Are Transmembrane Heterodimers That Link the 
Extracellular Matrix to the Cytoskeleton 


There are many varieties of integrins, but they all conform to a common plan. 
An integrin molecule is composed of two noncovalently associated glycoprotein
subunits called α and β. Both subunits span the cell membrane, with short
intracellular C-terminal tails and large N-terminal extracellular domains (Figure 
19-55). The extracellular domains bind to specific amino acid sequence motifs 
in extracellular matrix proteins or, in some cases, in proteins on the surfaces of 
other cells. The best-understood binding site for integrins is the RGD sequence 
mentioned earlier (see Figure 19-47), which is found in fibronectin and other 
extracellular matrix proteins. Some integrins bind a Leu-Asp-Val (LDV) sequence 
in fibronectin and other proteins. Additional integrin-binding sequences, as yet 
poorly defined, exist in laminins and collagens. 

Humans contain 24 types of integrins, formed from the products of 8 different
β-chain genes and 18 different α-chain genes, dimerized in different combinations.
Each integrin dimer has distinctive properties and functions. Moreover,
because the same integrin molecule in different cell types can have different 
ligand-binding specificities, it seems that additional cell-type-specific factors can 
interact with integrins to modulate their binding activity. The binding of integrins 
to their matrix ligands is also affected by the concentration of Ca²⁺ and Mg²⁺ in the
extracellular medium, reflecting the presence of divalent cation-binding domains
in the α and β subunits. The divalent cations can influence both the affinity and
the specificity of the binding of an integrin to its extracellular ligands. 
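
The gap between conceivable and observed subunit pairings can be made concrete with a little arithmetic. The Python sketch below uses only the three numbers quoted above (18 α-chain genes, 8 β-chain genes, 24 known dimers). The variable names are illustrative, and the calculation assumes, purely for the sake of comparison, that any α chain could in principle pair with any β chain.

```python
# Numbers taken from the text above.
N_ALPHA_GENES = 18      # human integrin α-chain genes
N_BETA_GENES = 8        # human integrin β-chain genes
N_OBSERVED_DIMERS = 24  # integrin types found in humans

# Assumption for illustration only: every α chain could pair with every β chain.
possible_pairings = N_ALPHA_GENES * N_BETA_GENES
fraction_observed = N_OBSERVED_DIMERS / possible_pairings

print(f"conceivable α-β pairings: {possible_pairings}")
print(f"fraction actually observed: {fraction_observed:.2f}")
```

Under this assumption only about one in six of the conceivable pairings is actually found, which suggests that subunit pairing is selective rather than free.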

The intracellular portion of an integrin dimer binds to a complex of several 
different proteins, which together form a linkage to the cytoskeleton. For all but 
one of the 24 varieties of human integrins, this intracellular linkage is to actin fila- 
ments. These linkages depend on proteins that assemble at the short cytoplasmic 
tails of the integrin subunits (see Figure 19-55). A large adaptor protein called 
talin is a component of the linkage in many cases, but numerous additional pro- 
teins are also involved. Like the actin-linked cell-cell junctions formed by cad- 
herins, the actin-linked cell-matrix junctions formed by integrins may be small, 
inconspicuous, and transient, or large, prominent, and durable. Examples of the 
latter are the focal adhesions that form when fibroblasts have sufficient time to 
establish strong attachments to the rigid surface of a culture dish, and the
myotendinous junctions that attach muscle cells to their tendons.


Figure 19-55 The subunit structure of an active integrin molecule, linking
extracellular matrix to the actin cytoskeleton. The N-terminal heads of the
integrin chains attach directly to an extracellular protein such as fibronectin;
the C-terminal intracellular tail of the integrin β subunit binds to adaptor
proteins that interact with filamentous actin. The best-understood adaptor is a
giant protein called talin, which contains a string of multiple domains for
binding actin and other proteins, such as vinculin, that help reinforce and
regulate the linkage to actin filaments. One end of talin binds to a specific
site on the integrin β subunit cytoplasmic tail; other regulatory proteins, such
as kindlin, bind at another site on the tail.

Figure 19-56 Hemidesmosomes. (A) Hemidesmosomes spot-weld epithelial cells to
the basal lamina, linking laminin [...] disease, called bullous pemphigoid, is an
autoimmune disease in which the immune system develops antibodies against
collagen XVII or BP230.


In epithelia, the most prominent cell-matrix attachment sites are the hemides- 
mosomes, where a specific type of integrin anchors the cells to laminin in the 
basal lamina. Here, uniquely, the intracellular attachment is to keratin interme- 
diate filaments, via the intracellular adaptor proteins plectin and BP230 (Figure 
19-56). 


Integrin Defects Are Responsible for Many Genetic Diseases 


Although there is some overlap in the activities of the different integrins—at least 
five bind laminin, for example—it is the diversity of integrin functions that is more 
remarkable. Table 19-3 lists some varieties of integrins and the problems that 
result when individual integrin α or β chains are defective.

TABLE 19-3

Integrin | Ligand* | Distribution | Phenotype when α subunit is mutated | Phenotype when β subunit is mutated
α5β1 | Fibronectin | Ubiquitous | Death of embryo; defects in blood vessels, somites, neural crest | Early death of embryo (at implantation)
α6β1 | Laminin | Ubiquitous | Severe skin blistering; defects in other epithelia also | Early death of embryo (at implantation)
α7β1 | Laminin | Muscle | Muscular dystrophy; defective myotendinous junctions | Early death of embryo (at implantation)
αLβ2 (LFA1) | Ig superfamily counterreceptors (ICAM1) | White blood cells | Impaired recruitment of leukocytes | Leukocyte adhesion deficiency (LAD); impaired inflammatory responses; recurrent life-threatening infections
αIIbβ3 | Fibrinogen | Platelets | Bleeding; no platelet aggregation (Glanzmann’s disease) | Bleeding; no platelet aggregation (Glanzmann’s disease); mild osteopetrosis
α6β4 | Laminin | Hemidesmosomes in epithelia | Severe skin blistering; defects in other epithelia also | Severe skin blistering; defects in other epithelia also

*Not all ligands are listed.


The β1 subunit forms dimers with at least 12 distinct α subunits and is found
on almost all vertebrate cells: α5β1 is a fibronectin receptor and α6β1 is a laminin
receptor on many types of cells. Mutant mice that cannot make any β1 integrins
die early in embryonic development. Mice that are only unable to make the α7
subunit (the partner for β1 in muscle) survive but develop muscular dystrophy (as
do mice that cannot make the laminin ligand for the α7β1 integrin).

The β2 subunit forms dimers with at least four types of α subunit and is
expressed exclusively on the surface of white blood cells, where it has an essential
role in enabling these cells to fight infection. The β2 integrins mainly mediate
cell-cell rather than cell-matrix interactions, binding to specific ligands on another
cell, such as an endothelial cell. The ligands are members of the Ig superfamily of
cell-cell adhesion molecules. We have already described an example earlier in the
chapter: an integrin of this class (αLβ2, also known as LFA1) on white blood cells
enables them to attach firmly to the Ig family protein ICAM1 on vascular endothelial
cells at sites of infection (see Figure 19-28B). People with the genetic disease
called leukocyte adhesion deficiency fail to synthesize functional β2 subunits. As
a consequence, their white blood cells lack the entire family of β2 receptors, and
they suffer repeated bacterial infections.

The β3 integrins are found on blood platelets (as well as various other cells),
and they bind several matrix proteins, including the blood clotting factor fibrinogen.
Platelets have to interact with fibrinogen to mediate normal blood clotting,
and humans with Glanzmann’s disease, who are genetically deficient in β3 integrins,
suffer from defective clotting and bleed excessively.


Integrins Can Switch Between an Active and an Inactive 
Conformation 


A cell crawling through a tissue—a fibroblast or a macrophage, for example, or an 
epithelial cell migrating along a basal lamina—has to be able both to make and to 
break attachments to the matrix, and to do so rapidly if it is to travel quickly.
Similarly, a circulating white blood cell has to be able to switch on or off its tendency to
bind to endothelial cells in order to crawl out of a blood vessel at a site of inflam- 
mation. Furthermore, if force is to be applied where it is needed, the making and 
breaking of the extracellular attachments in all these cases has to be coupled to 
the prompt assembly and disassembly of cytoskeletal attachments inside the cell. 
The integrin molecules that span the membrane and mediate the attachments 
cannot simply be passive, rigid objects with sticky patches at their two ends. They 
must be able to switch between an active state, where they readily form attach- 
ments, and an inactive state, where they do not. 

Structural studies, using a combination of electron microscopy and x-ray
crystallography, suggest that integrins exist in multiple structural conformations that
reflect different states of activity (Figure 19-57). In the inactive state, the external
segments of the integrin dimer are folded together into a compact structure that
cannot bind matrix proteins. In this state, the cytoplasmic tails of the dimer are
hooked together, preventing their interaction with cytoskeletal linker proteins.
In the active state, the two integrin subunits are unhooked at the membrane to
expose the intracellular binding sites for cytoplasmic adaptor proteins, and the
external domains unfold and extend, like a pair of legs, to expose a high-affinity
matrix-binding site at the tips of the subunits. Thus, the switch from inactive
to active states depends on a major conformational change that simultaneously
exposes the external and internal ligand-binding sites at the ends of the integrin
molecule. External matrix binding and internal cytoskeleton linkages are thereby
coupled.


Figure 19-57 Integrins exist in two major activity states. Inactive (folded)
and active (extended) structures of an integrin molecule, based on data from x-ray
crystallography and other methods.

Figure 19-58 Activation of integrins by intracellular signaling. Signals received from outside the cell can act through various
intracellular mechanisms to stimulate integrin activation. In platelets, as illustrated here, the extracellular signal protein thrombin
activates a G-protein-coupled receptor on the cell surface, thereby initiating a signaling pathway that leads to activation of
Rap1, a member of the monomeric GTPase family. Activated Rap1 interacts with the protein RIAM, which then recruits talin to
the plasma membrane. Together with another protein called kindlin, talin interacts with the integrin β chain to trigger integrin
activation. Talin then interacts with adaptor proteins such as vinculin, resulting in the formation of an actin linkage (see Figure
19-55).

Talin regulation depends in part on an interaction between its flexible C-terminal rod domain and the N-terminal head domain
that contains the integrin-binding site. This interaction is thought to maintain talin in an inactive state when it is free in the
cytoplasm. When talin is recruited by RIAM to the plasma membrane, the talin head domain interacts with a phosphoinositide
called PI(4,5)P2 (not shown here, but see Figure 15-28), resulting in dissociation of the rod domain. Talin unfolds to expose its
binding sites for integrin and other proteins.


Switching between the inactive and active states is regulated by a variety of 
mechanisms that vary, depending on the needs of the cell. In some cases, activa- 
tion occurs by an “outside-in” mechanism: the binding of an external matrix pro- 
tein, such as the RGD sequence of fibronectin, can drive some integrins to switch 
from the low-affinity inactive state to the high-affinity active state. As a result, 
binding sites for talin and other cytoplasmic adaptor proteins are exposed on the 
tail of the β chain. The binding of these adaptor proteins then leads to attachment
of actin filaments to the intracellular end of the integrin molecule (see Figure 
19-55). In this way, when the integrin catches hold of its ligand outside the cell, 
the cell reacts by tying the integrin molecule to the cytoskeleton, so that force can 
be applied at the point of cell attachment. 

The chain of cause and effect can also operate in reverse, from inside to out- 
side. This “inside-out” integrin-activation process generally depends on intra- 
cellular regulatory signals that stimulate the ability of talin and other proteins to 
interact with the β chain of the integrin. Talin competes with the integrin α chain
for its binding site on the tail of the β chain. Thus, when talin binds to the β chain,
it blocks the intracellular α-β linkage, allowing the two legs of the integrin
molecule to spring apart.

The regulation of “inside-out” integrin activation is particularly well under- 
stood in platelets, where an extracellular signal protein called thrombin binds 
to a specific G-protein-coupled receptor (GPCR) on the cell surface and thereby 
activates an intracellular signaling pathway that leads to integrin activation (Fig- 
ure 19-58). It is likely that similar signaling pathways govern integrin activation in 
numerous other cell types. 


Integrins Cluster to Form Strong Adhesions 


Integrins, like other cell adhesion molecules, differ from cell-surface receptors for 
hormones and for other extracellular soluble signal molecules in that they usually 
bind their ligand with lower affinity and are present at a 10-100-fold higher con- 
centration on the cell surface. The Velcro principle, mentioned earlier in the con- 
text of cadherin adhesion (see Figure 19-6C), operates here too. Following their 
activation, integrins cluster together to create a dense plaque in which many inte- 
grin molecules are anchored to cytoskeletal filaments. The resulting protein struc- 
ture can be remarkably large and complex, as seen in the focal adhesion made by 
a fibroblast on a fibronectin-coated surface in a culture dish.
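
The Velcro principle can be illustrated with a toy calculation. The Python sketch below is not a model taken from this chapter. It assumes, purely for illustration, that each integrin-ligand bond in a cluster is engaged independently and with the same probability at any instant, and asks how likely it is that every bond in the cluster has let go at once. The function name and the probability value are hypothetical.

```python
def prob_cluster_detached(p_bond_engaged: float, n_bonds: int) -> float:
    """Probability that none of n independent bonds is engaged.

    Toy assumption: every bond is engaged with the same probability,
    independently of all the others.
    """
    if not 0.0 <= p_bond_engaged <= 1.0:
        raise ValueError("p_bond_engaged must lie between 0 and 1")
    if n_bonds < 0:
        raise ValueError("n_bonds must be non-negative")
    return (1.0 - p_bond_engaged) ** n_bonds


# Hypothetical weak bond: engaged only 10% of the time.
P_ENGAGED = 0.1
for n in (1, 10, 100):
    print(n, prob_cluster_detached(P_ENGAGED, n))
```

With these illustrative numbers a single bond is free most of the time, whereas the chance that a hundred such bonds are all free together is very small. That is the sense in which many weak attachments add up to a strong one.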

The assembly of mature cell-matrix junctional complexes depends on the 
recruitment of dozens of different scaffolding and signaling proteins. Talin is a 
major component of many cell-matrix complexes, but numerous other proteins 
also make important contributions. These include the integrin-linked kinase (ILK) 
and its binding partners PINCH and parvin, which together form a trimeric complex
that serves as an organizing hub at many junctions. Cell-matrix junctions
also employ several actin-binding proteins, such as vinculin, zyxin, VASP, and
α-actinin, to promote the assembly and organization of actin filaments. Another
critical component of many cell-matrix junctions is the focal adhesion kinase 
(FAK), which interacts with multiple components in the junction and serves an 
important function in signaling, as we describe next. 


Extracellular Matrix Attachments Act Through Integrins to Control 
Cell Proliferation and Survival 


Like other transmembrane cell adhesion proteins, integrins do more than just cre- 
ate attachments. They also activate intracellular signaling pathways and thereby 
allow control of almost any aspect of the cell’s behavior according to the nature of 
the surrounding matrix and the state of the cell’s attachments to it. 

Many cells will not grow or proliferate in culture unless they are attached to 
extracellular matrix; nutrients and soluble growth factors in the culture medium 
are not enough. For some cell types, including epithelial, endothelial, and mus- 
cle cells, even cell survival depends on such attachments. When these cells lose 
contact with the extracellular matrix, they undergo apoptosis. This dependence of 
cell growth, proliferation, and survival on attachment to a substratum is known as 
anchorage dependence, and it is mediated mainly by integrins and the intracel- 
lular signals they generate. Mutations that disrupt or override this form of control, 
allowing cells to escape from anchorage dependence, occur in cancer cells and 
play a major part in their invasive behavior. 

Our understanding of anchorage dependence has come mainly from studies 
of cells living on the surface of matrix-coated culture dishes. For connective-tissue 
cells that are normally surrounded by matrix on all sides, this is a far cry from the 
natural environment. Walking over a two-dimensional plain is very different from 
clambering through a three-dimensional jungle. The types of contacts that cells 
make with a rigid substratum are not the same as those, much less well studied, 
that they make with the deformable web of fibers of the extracellular matrix, and 
there are substantial differences in cell behavior in the two contexts. Nevertheless, 
it is likely that the same basic principles apply. Both in vitro and in vivo, intracel- 
lular signals generated at cell-matrix adhesion sites are crucial for cell prolifera- 
tion and survival. 


Integrins Recruit Intracellular Signaling Proteins at Sites of 
Cell-Matrix Adhesion 


The mechanisms by which integrins signal into the cell interior are complex, 
involving several pathways, and integrins and conventional signaling receptors 
often influence one another and work together to regulate cell behavior, as we 
have already emphasized. The Ras/MAP kinase pathway (see Figure 15-49), for
example, can be activated both by conventional signaling receptors and by integrins,
but cells often need both kinds of stimulation of this pathway at the same
time to give sufficient activation to induce cell proliferation. Integrins and con- 
ventional signaling receptors also cooperate to promote cell survival (discussed 
in Chapters 15 and 18). 

One of the best-studied modes of integrin signaling depends on a cytoplas- 
mic protein tyrosine kinase called focal adhesion kinase (FAK). In studies of cells 
cultured on plastic dishes, focal adhesions are often prominent sites of tyrosine 
phosphorylation (Figure 19-59), and FAK is one of the major tyrosine-phosphor- 
ylated proteins found at these sites. When integrins cluster at cell-matrix contacts, 
FAK is recruited to the integrin β subunit by intracellular adaptor proteins such
as talin or paxillin (which binds to one type of integrin α subunit). The clustered
FAK molecules phosphorylate each other on a specific tyrosine, creating a phos- 
photyrosine docking site for members of the Src family of cytoplasmic tyrosine 
kinases. In addition to phosphorylating other proteins at the adhesion sites, these 
kinases then phosphorylate FAK on additional tyrosines, creating docking sites 
for a variety of additional intracellular signaling proteins. In this way, outside-in 
signaling from integrins, via FAK and Src family kinases, is relayed into the cell in 
much the same way as receptor tyrosine kinases generate signals (as discussed in 
Chapter 15). 


Cell-Matrix Adhesions Respond to Mechanical Forces 


Like the cell-cell junctions we described earlier, cell-matrix junctions can sense 
and respond to the mechanical forces that act on them. Most cell-matrix junc- 
tions, for example, are connected to a contractile actin network that tends to pull 
the junctions inward, away from the matrix. When cells are attached to a rigid 
matrix that strongly resists such pulling forces, the cell-matrix junction is able to 
sense the resulting high tension and trigger a response in which it recruits addi- 
tional integrins and other proteins to increase the junction’s ability to withstand 
that tension. Cell attachment to a relatively soft matrix generates less tension 
and therefore a less robust response. These mechanisms allow cells to sense and 
respond to differences in the rigidity of extracellular matrices in different tissues. 

We saw earlier that mechanotransduction at cadherin-based cell-cell junctions
likely depends on junctional proteins that change their structure when
the junction is stretched by tension (see Figure 19-12). The same is true for cell- 
matrix junctions. The long C-terminal tail domain of talin, for example, includes 
a large number of binding sites for the actin-regulatory protein vinculin. Many of 
these sites are hidden inside folded protein domains but are exposed when those
domains are unfolded by stretching the protein (Figure 19-60). The N-terminal
end of talin binds integrin and the C-terminal end binds actin (see Figure 19-55);
thus, when actin filaments are pulled by myosin motors inside the cell, the resulting
tension stretches the talin rod, thereby exposing vinculin-binding sites. The
vinculin molecules then recruit and organize additional actin filaments. Tension
thereby increases the strength of the junction.


Figure 19-59 Tyrosine phosphorylation at focal adhesions. A fibroblast cultured
on a fibronectin-coated substratum and stained with fluorescent antibodies: actin
filaments are stained green and activated proteins that contain phosphotyrosine
are red, giving orange where the two components overlap. The actin filaments
terminate at focal adhesions, where the cell attaches to the substratum by
means of integrins. Proteins containing phosphotyrosine are also concentrated at
these sites, reflecting the local activation of FAK and other protein kinases. Signals
generated at such adhesion sites help regulate cell division, growth, and survival.
(Courtesy of Keith Burridge.)


Summary 


Integrins are the principal cell-surface receptors used by animal cells to bind to the 
extracellular matrix: they function as transmembrane linkers between the extracel- 
lular matrix and the cytoskeleton. Most integrins connect to actin filaments, while 
those at hemidesmosomes bind to intermediate filaments. Integrin molecules are 
heterodimers, and the binding of extracellular matrix ligands or intracellular acti- 
vator proteins such as talin results in a dramatic conformational switch from an 
inactive to an active state. This creates an allosteric coupling between binding to 
matrix outside the cell and binding to the cytoskeleton inside it, allowing the inte- 
grin to convey signals in both directions across the plasma membrane. Complex 
assemblies of proteins become organized around the intracellular tails of activated 
integrins, producing intracellular signals that can influence almost any aspect of 
cell behavior, from proliferation and survival, as in the phenomenon of anchorage 
dependence, to cell polarity and guidance of migration. Integrin-based cell-matrix 
junctions are also capable of mechanotransduction: they can sense and respond to 
mechanical forces acting across the junction. 


THE PLANT CELL WALL 


Each cell in a plant deposits, and is in turn completely enclosed by, an elaborate 
extracellular matrix called the plant cell wall. It was the thick cell walls of cork, vis- 
ible in a primitive microscope, that in 1663 enabled Robert Hooke to distinguish 
and name cells for the first time. The walls of neighboring plant cells, cemented
together to form the intact plant (Figure 19-61), are generally thicker, stronger,
and, most important of all, more rigid than the extracellular matrix produced by
animal cells. In evolving relatively rigid walls, which can be up to many micrometers
thick, early plant cells forfeited the ability to crawl about and adopted a
sedentary lifestyle that has persisted in all present-day plants.


Figure 19-60 Talin is a tension sensor at cell-matrix junctions. Tension
across cell-matrix junctions stimulates the local recruitment of vinculin and
other actin-regulatory proteins, thereby strengthening the junction’s attachment
to the cytoskeleton. The experiments presented here tested the hypothesis that
tension is sensed by the talin adaptor protein that links integrins to actin filaments
(see Figure 19-55). (A) The long, flexible, C-terminal region of talin is divided into a
series of folded domains, some of which contain vinculin-binding sites (dark green
lines) that are thought to be hidden and therefore inaccessible. One domain near
the N-terminus, for example, comprises a folded bundle of 12 α helices containing
five vinculin-binding sites. (B) This experiment tested the hypothesis that
tension stretches the 12-helix domain, thereby exposing vinculin-binding sites.
A fragment of talin containing this domain was attached to an apparatus in which the
domain could be stretched, as shown here. The fragment was labeled at its N-terminus
with a tag that sticks to the surface of a glass slide on a microscope stage. The
C-terminal end of the fragment was bound to a tiny magnetic bead, so the talin
fragment could be stretched using a small magnetic electrode. The solution around
the protein contained fluorescently tagged vinculin proteins. After the talin protein was
stretched, excess vinculin solution was washed away, and the microscope was
used to determine if any fluorescent vinculin proteins were bound to the talin protein.
In the absence of stretching (top), most talin molecules did not bind vinculin. When
the protein was stretched (bottom), two or three vinculin molecules were bound (only
one is shown here for clarity). (Adapted from A. del Rio et al., Science 323:638-641,
2009.)


The Composition of the Cell Wall Depends on the Cell Type 


All cell walls in plants have their origin in dividing cells, as the cell plate forms 
during cytokinesis to create a new partition wall between the daughter cells 
(discussed in Chapter 17). The new cells are usually produced in special regions 
called meristems, and they are generally small in comparison with their final size. 
To accommodate subsequent cell growth, the walls of the newborn cells, called 
primary cell walls, are thin and extensible, although tough. Once cell growth 
stops, the wall no longer needs to be extensible: sometimes the primary wall is 
retained without major modification, but, more commonly, a rigid secondary cell 
wall is produced by depositing new layers of matrix inside the old ones. These new 
layers generally have a composition that is significantly different from that of the 
primary wall. The most common additional polymer in secondary walls is lignin, 
a complex network of covalently linked phenolic compounds found in the walls of 
the xylem vessels and fiber cells of woody tissues. 

Although the cell walls of higher plants vary in both composition and orga- 
nization, they are all constructed, like animal extracellular matrices, using a 
structural principle common to all fiber-composites, including fiberglass and 
reinforced concrete. One component provides tensile strength, while another, 
in which the first is embedded, provides resistance to compression. While the 
principle is the same in plants and animals, the chemistry is different. Unlike the 


Figure 19-61 Plant cell walls. (A) Electron 
micrograph of the root tip of a rush, 
showing the organized pattern of cells 
that results from an ordered sequence of 
cell divisions in cells with relatively rigid 
cell walls. In this growing tissue, the cell 
walls are still relatively thin, appearing as 
fine black lines between the cells in the 
micrograph. (B) Section of a typical cell wall 
separating two adjacent plant cells. The 
two dark transverse bands correspond to 
plasmodesmata that span the wall (see 
Figure 19-27). (A, courtesy of C. Busby 
and B. Gunning, Eur. J. Cell Biol. 21:214- 
223, 1980. With permission from Elsevier; 
B, courtesy of Jeremy Burgess.) 


THE PLANT CELL WALL 


animal extracellular matrix, which is rich in protein and other nitrogen-contain- 
ing polymers, the plant cell wall is made almost entirely of polymers that contain 
no nitrogen, including cellulose and lignin. For a sedentary organism that depends 
on CO₂, H₂O, and sunlight, these two abundant biopolymers represent “cheap,” 
carbon-based structural materials, helping to conserve the scarce fixed nitrogen 
available in the soil that generally limits plant growth. Thus trees, for example, 
make a huge investment in the cellulose and lignin that comprise the bulk of their 
biomass. 

In the cell walls of higher plants, the tensile fibers are made from the poly- 
saccharide cellulose, the most abundant organic macromolecule on Earth, tightly 
linked into a network by cross-linking glycans. In primary cell walls, the matrix 
in which the cross-linked cellulose network is embedded is composed of pectin, 
a highly hydrated network of polysaccharides rich in galacturonic acid. Second- 
ary cell walls contain additional molecules to make them rigid and permanent; 
lignin, in particular, forms a hard, waterproof filler in the interstices between the 
other components. All of these molecules are held together by a combination of 
covalent and noncovalent bonds to form a highly complex structure, whose com- 
position, thickness, and architecture depend on the cell type. 

The plant cell wall thus has a “skeletal” role in supporting the structure of the 
plant as a whole, a protective role as an enclosure for each cell individually, and 
a transport role, helping to form channels for the movement of fluid in the plant. 
When plant cells become specialized, they generally adopt a specific shape and 
produce specially adapted types of walls, according to which the different types 
of cells in a plant can be recognized and classified. We focus here, however, on 
the primary cell wall and the molecular architecture that underlies its remarkable 
combination of strength, resilience, and plasticity, as seen in the growing parts of 
a plant. 


The Tensile Strength of the Cell Wall Allows Plant Cells to Develop 
Turgor Pressure 


The aqueous extracellular environment of a plant cell consists of the fluid con- 
tained in the walls that surround the cell. Although the fluid in the plant cell wall 
contains more solutes than does the water in the plant’s external milieu (for exam- 
ple, soil), it is still hypotonic in comparison with the cell interior. This osmotic 
imbalance causes the cell to develop a large internal hydrostatic pressure, or tur- 
gor pressure, which pushes outward on the cell wall, just as an inner tube pushes 
outward on a tire. The turgor pressure increases just to the point where the cell is 
in osmotic equilibrium, with no net influx of water despite the salt imbalance. The 
turgor pressure generated in this way may reach 10 or more atmospheres, about 
five times that in the average car tire. This pressure is vital to plants because it is the 
main driving force for cell expansion during growth, and it provides much of the 
mechanical rigidity of living plant tissues. Compare the wilted leaf of a dehydrated 
plant, for example, with the turgid leaf of a well-watered one. It is the mechanical 
strength of the cell wall that allows plant cells to sustain this internal pressure. 
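
As a rough check on the magnitude quoted above, the van 't Hoff relation for dilute solutions (osmotic pressure = concentration difference × R × T) links a turgor pressure to the solute concentration difference that would balance it. The sketch below is an idealization and not a description of any real cell: it assumes ideal dilute solutions and a temperature of 25°C, and the function names are hypothetical.

```python
# Idealized van 't Hoff estimate: osmotic pressure = delta_C * R * T.
# Assumptions (not from the text): ideal dilute solution, T = 298.15 K.

R_L_ATM = 0.082057  # gas constant in L·atm/(mol·K)


def osmotic_pressure_atm(delta_osmolar: float, temp_k: float = 298.15) -> float:
    """Pressure (atm) that balances a solute concentration difference (osmol/L)."""
    return delta_osmolar * R_L_ATM * temp_k


def concentration_for_pressure(pressure_atm: float, temp_k: float = 298.15) -> float:
    """Solute concentration difference (osmol/L) needed to sustain a pressure (atm)."""
    return pressure_atm / (R_L_ATM * temp_k)


if __name__ == "__main__":
    # A turgor pressure of 10 atm, the figure quoted in the text.
    delta_c = concentration_for_pressure(10.0)
    print(f"Concentration difference for 10 atm: {delta_c:.2f} osmol/L")
```

Under these assumptions, a turgor pressure of 10 atmospheres corresponds to a solute concentration difference of roughly 0.4 osmol/L between the cell interior and the wall fluid.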


The Primary Cell Wall Is Built from Cellulose Microfibrils Interwoven 
with a Network of Pectic Polysaccharides 


Cellulose gives the primary cell wall tensile strength. Each cellulose molecule con- 
sists of a linear chain of at least 500 glucose residues that are covalently linked to 
one another to form a ribbonlike structure, which is stabilized by hydrogen bonds 
within the chain (Figure 19-62). In addition, hydrogen bonds between adjacent 
cellulose molecules cause them to stick together in overlapping parallel arrays, 
forming bundles of about 40 cellulose chains, all of which have the same polarity. 
These highly ordered crystalline aggregates, many micrometers long, are called 
cellulose microfibrils, and they have a tensile strength comparable to that of 
steel. Sets of microfibrils are arranged in layers, or lamellae, with each microfibril 
about 20-40 nm from its neighbors and connected to them by long cross-linking 


Figure 19-62 Cellulose. Cellulose 
molecules are long, unbranched chains of 
β1,4-linked glucose units. Each glucose 
residue is inverted with respect to its 
neighbors, and the resulting disaccharide 
repeat occurs hundreds of times in a single 
cellulose molecule. About 16 individual 
cellulose molecules assemble to form 
a strong, hydrogen-bonded cellulose 
microfibril. 


1084 Chapter 19: Cell Junctions and the Extracellular Matrix 






[Figure labels: middle lamella, primary cell wall, pectin, cellulose microfibril, plasma membrane, cross-linking glycan] 


glycan molecules, which are attached by hydrogen bonds to the surface of the 
microfibrils. The primary cell wall consists of several such lamellae arranged in a 
plywoodlike network (Figure 19-63). 

The cross-linking glycans are a heterogeneous group of branched polysaccha- 
rides that bind tightly to the surface of each cellulose microfibril and thereby help 
to cross-link the microfibrils into a complex network. There are many classes of 
cross-linking glycans, but they all have a long linear backbone composed of one 
type of sugar (glucose, xylose, or mannose) from which short side chains of other 
sugars protrude. It is the backbone sugar molecules that form hydrogen bonds 
with the surface of cellulose microfibrils, cross-linking them in the process. Both 
the backbone and the side-chain sugars vary according to the plant species and 
its stage of development. 

Coextensive with this network of cellulose microfibrils and cross-linking gly- 
cans is another cross-linked polysaccharide network based on pectins (see Figure 
19-63). Pectins are a heterogeneous group of branched polysaccharides that con- 
tain many negatively charged galacturonic acid units. Because of their negative 
charge, pectins are highly hydrated and associated with a cloud of cations, resem- 
bling the glycosaminoglycans of animal cells in the large amount of space they 
occupy (see Figure 19-33). When Ca²⁺ is added to a solution of pectin molecules, 
it cross-links them to produce a semirigid gel (it is pectin that is added to fruit 
juice to make jam set). Certain pectins are particularly abundant in the middle 
lamella, the specialized region that cements together the walls of adjacent cells 
(see Figure 19-63); here, Ca²⁺ cross-links are thought to help hold cell wall 
components together. Although covalent bonds also play a part in linking the 
components, very little is known about their nature. Regulated separation of cells at 
the middle lamella underlies such processes as the ripening of tomatoes and the 
abscission (detachment) of leaves in the fall. 

In addition to the two polysaccharide-based networks that form the bulk of all 
plant primary cell walls, proteins are present, contributing up to about 5% of the 
wall’s dry mass. Many of these proteins are enzymes, responsible for wall turn- 
over and remodeling, particularly during growth. Another class of wall proteins, 
like collagen, contains high levels of hydroxyproline. These proteins are thought 
to strengthen the wall, and they are produced in greatly increased amounts as a 
local response to attack by pathogens. From the genome sequence of Arabidopsis, 
it has been estimated that more than 700 genes are required to synthesize, assem- 
ble, and remodel the plant cell wall. 


Figure 19-63 Scale model of a portion 
of a primary plant cell wall showing the 
two major polysaccharide networks. The 
orthogonally arranged layers of cellulose 
microfibrils (green) are tied into a network 
by the cross-linking glycans (red) that form 
hydrogen bonds with the microfibrils. This 
network is coextensive with a network of 
pectin polysaccharides (blue). The network 
of cellulose and cross-linking glycans 
provides tensile strength, while the pectin 
network resists compression. Cellulose, 
cross-linking glycans, and pectin are 
typically present in roughly equal amounts 
in a primary cell wall. The middle lamella 

is especially rich in pectin, and it cements 
adjacent cells together. 
Oriented Cell Wall Deposition Controls Plant Cell Growth 


Once a plant cell has left the meristem where it is generated, it can grow dramat- 
ically, commonly by more than a thousand times in volume. The manner of this 
expansion determines the final shape of each cell, and hence the final form of the 
plant as a whole. Turgor pressure inside the cell drives the expansion, but it is the 
behavior of the cell wall that governs its direction and extent. Complex wall-re- 
modeling activities are required, as well as the deposition of new wall materials. 
Because of their crystalline structure, the individual cellulose microfibrils in the 
wall are unable to stretch, and this gives them a crucial role in the process. For the 
cell wall to stretch or deform, the microfibrils must either slide past one another 
or become more widely separated, or both. The orientation of the microfibrils in 
the innermost layers of the wall governs the direction in which the cell expands. 
Cells in plants therefore anticipate their future morphology by controlling the ori- 
entation of the cellulose microfibrils that they deposit in the wall (Figure 19-64). 
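
To see why the direction of expansion matters so much for cell shape, compare a thousandfold increase in volume shared equally among all three axes with the same increase confined to a single axis. The sketch below is purely geometric: the cell is idealized as a rectangular box, and both the box model and the function name are illustrative assumptions.

```python
def scale_factors(volume_fold: float, growing_axes: int) -> tuple:
    """Linear scale factor along each of three axes for a box-shaped cell.

    The volume increases volume_fold times; growth is shared equally among
    the first growing_axes axes (1, 2, or 3) and the other axes stay fixed.
    """
    if growing_axes not in (1, 2, 3):
        raise ValueError("growing_axes must be 1, 2, or 3")
    per_axis = volume_fold ** (1.0 / growing_axes)
    return tuple(per_axis if i < growing_axes else 1.0 for i in range(3))


if __name__ == "__main__":
    print(scale_factors(1000.0, 3))  # expansion equal in all directions
    print(scale_factors(1000.0, 1))  # elongation along a single axis
```

In this idealization, uniform expansion lengthens every side about tenfold, whereas expansion restricted to one axis produces a cell a thousand times longer with an unchanged cross section.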

Unlike most other matrix macromolecules, which are made in the endoplas- 
mic reticulum and Golgi apparatus and are secreted, cellulose is spun out from 
the surface of the cell by a plasma-membrane-bound enzyme complex (cellulose 
synthase), which uses as its substrate the sugar nucleotide UDP-glucose supplied 
from the cytosol. Each enzyme complex, or rosette, has a sixfold symmetry (see 
Figure 19-65) and contains the protein products of three separate cellulose syn- 
thase (CESA) genes. Each CESA protein is essential for the production of a cellu- 
lose microfibril. Three CESA genes are required for primary cell wall synthesis and 
a different three for secondary cell wall synthesis. 

As they are being synthesized, the nascent cellulose chains assemble into 
microfibrils. These are spun out on the extracellular surface of the plasma mem- 
brane, forming a layer, or lamella, in which all the microfibrils have more or less 
the same alignment (see Figure 19-63). Each new lamella is deposited internally 
to the previous one, so that the wall consists of concentrically arranged lamellae, 
with the oldest on the outside. The most recently deposited microfibrils in elon- 
gating cells commonly lie perpendicular to the axis of cell elongation, although 
the orientation of the microfibrils in the outer lamellae that were laid down earlier 
may be different (see Figure 19-64B and C). 
Figure 19-64 Cellulose microfibrils 
influence the direction of cell elongation. 
(A) The orientation of cellulose microfibrils in 
the primary cell wall of an elongating carrot 
cell is shown in this electron micrograph of 
a shadowed replica from a rapidly frozen 
and deep-etched cell wall. The cellulose 
microfibrils are aligned parallel to one 
another and perpendicular to the axis of 
cell elongation. The microfibrils are cross- 
linked by, and interwoven with, a complex 
web of matrix molecules (compare with 
Figure 19-63). (B, C) The cells in (B) and 
(C) start off with identical shapes (shown 
here as cubes) but with different net 
orientations of cellulose microfibrils in their 
walls. Although turgor pressure is uniform 
in all directions, cell wall loosening allows 
each cell to elongate only in a direction 
perpendicular to the orientation of the 
innermost layer of microfibrils, which have 
great tensile strength. Cell expansion 
occurs in concert with the insertion of new 
wall material. The final shape of an organ, 
such as a shoot, is determined in part by 
the direction in which its component cells 
can expand. (A, courtesy of Brian Wells and 
Keith Roberts.) 
Microtubules Orient Cell Wall Deposition 


An important clue to the mechanism that dictates microfibril orientation came 
from observations of the microtubules in plant cells. These are frequently arranged 
in the cortical cytoplasm with the same orientation as the cellulose microfibrils 
that are currently being deposited in the cell wall in that region. These cortical 
microtubules form a cortical array close to the cytosolic face of the plasma mem- 
brane, held there by poorly characterized proteins. The congruent orientation of 
the cortical array of microtubules (lying just inside the plasma membrane) and 
cellulose microfibrils (lying just outside) is seen in many types and shapes of plant 
cells and is present during both primary and secondary cell wall deposition, sug- 
gesting a causal relationship. 

This suggestion can be tested by treating a plant tissue with a microtubule-de- 
polymerizing drug so as to disassemble the entire system of cortical microtu- 
bules. The consequences for subsequent cellulose deposition, however, are not 
as straightforward as might be expected. The drug treatment does not disrupt the 
production of new cellulose microfibrils, and in some cases cells can continue to 
deposit new microfibrils in the preexisting orientation. Any developmental switch 
in the orientation of the microfibril pattern that would normally occur between 
successive lamellae, however, is invariably blocked. It seems that a preexisting 
orientation of microfibrils can be propagated even in the absence of microtu- 
bules, but any change in the deposition of cellulose microfibrils requires that 
intact microtubules be present to determine the new orientation. 

These observations are consistent with the following model. The cellulose-syn- 
thesizing rosettes embedded in the plasma membrane spin out long cellulose 
molecules. As the synthesis of cellulose molecules and their self-assembly into 
microfibrils proceeds, the distal end of each microfibril presumably forms indi- 
rect cross-links to the previous layer of wall material, orienting the new microfi- 
bril in parallel with the old ones as it becomes integrated into the texture of the 
wall. Since the microfibril is stiff, the rosette at its growing, proximal end has to 
move as it deposits the new material. Traveling in the plane of the membrane, 
the rosette moves in the direction defined by the way in which the far end of the 
microfibril is anchored in the existing wall. In this way, each layer of microfibrils 
would tend to be spun out from the membrane in the same orientation as the 
layer laid down previously, with the rosettes following the direction of the preex- 
isting oriented microfibrils outside the cell. Oriented microtubules inside the cell, 
however, can force a change in the direction in which the rosettes move: they can 
create boundaries in the plasma membrane that act like the banks of a canal to 
constrain rosette movement (Figure 19-65). In this view, cellulose synthesis can 
occur independently of microtubules; but it is constrained spatially when cortical 
microtubules are present to define membrane microdomains within which the 
enzyme complex can move. 
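
The logic of this model can be restated as a simple decision rule. The sketch below is a toy restatement of the paragraph above, not a simulation of the real mechanism: orientations are given as angles in degrees, and the function names and the use of None to represent a depolymerized microtubule array are illustrative assumptions.

```python
from typing import List, Optional


def next_lamella_orientation(previous_deg: float,
                             microtubule_deg: Optional[float]) -> float:
    """Orientation of the next lamella of cellulose microfibrils.

    If a cortical microtubule array is present, it sets the direction;
    otherwise the rosettes follow the previously deposited layer.
    """
    if microtubule_deg is None:
        return previous_deg
    return microtubule_deg


def deposit(initial_deg: float,
            microtubule_history: List[Optional[float]]) -> List[float]:
    """Orientations of successive lamellae, oldest first."""
    orientations = [initial_deg]
    for mt in microtubule_history:
        orientations.append(next_lamella_orientation(orientations[-1], mt))
    return orientations


if __name__ == "__main__":
    # Microtubules switch from 90 to 0 degrees after two layers.
    print(deposit(90.0, [90.0, 90.0, 0.0, 0.0]))
    # Drug-treated cell: no microtubules, so the switch is blocked.
    print(deposit(90.0, [None, None, None, None]))
```

The rule reproduces both observations from the drug experiments: deposition continues in the old orientation without microtubules, but a change of orientation occurs only when an oriented microtubule array is present.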
Figure 19-65 One model of how the 
orientation of newly deposited cellulose 
microfibrils might be determined by the 
orientation of cortical microtubules. 
(A) The large cellulose synthase complexes, 
or rosettes, are integral membrane 
proteins that continuously synthesize 
cellulose microfibrils on the outer face of 
the plasma membrane. The distal ends 
of the stiff microfibrils become integrated 
into the texture of the wall, and their 
elongation at the proximal end pushes the 
synthase complex along in the plane of 
the membrane. Because the cortical array 
of microtubules is attached to the plasma 
membrane in a way that confines this 
complex to defined membrane channels, 
the orientation of these microtubules, 
when they are present, determines the 
axis along which the new microfibrils 
are laid down. (B, C) Two electron 
micrographs show the tight association of 
the cortical microtubules with the plasma 
membrane. One shows the microtubules 
in cross section while the other shows a 
microtubule in longitudinal section. Both 
emphasize the constant gap of about 20 
nm between membrane and microtubule; 
the connecting molecules responsible 
remain obscure. (B and C, courtesy of 
Andrew Staehelin.) 


In this way, plant cells can change their direction of expansion by a sudden 
change in the orientation of their cortical array of microtubules. Because plant 
cells cannot move (being constrained by their walls), the entire morphology of 
a multicellular plant presumably depends on a coordinated, highly patterned 
deployment of cortical microtubule orientations during plant development. It is 
not known how these orientations are controlled, although it has been shown that 
the microtubules can reorient rapidly in response to extracellular stimuli, includ- 
ing plant growth regulators such as ethylene and auxins (discussed in Chapter 15). 

Microtubules are not, however, the only cytoskeletal elements that influence 
wall deposition. Local foci of cortical actin filaments can also direct the deposition 
of new wall material at specific sites on the cell surface, contributing to the elabo- 
rate final shaping of many differentiated plant cells. 


Summary 


Plant cells are surrounded by a tough extracellular matrix, or cell wall, which is 
responsible for many of the unique features of a plant’s lifestyle. The wall is com- 
posed of a network of cellulose microfibrils and cross-linking glycans, embedded in 
a highly cross-linked matrix of pectin polysaccharides. In secondary cell walls, lig- 
nin may be deposited to make them waterproof, hard, and woody. A cortical array 
of microtubules can control the orientation of newly deposited cellulose microfibrils, 
which in turn determine the direction of cell expansion and therefore the final 
shape of the cell and, ultimately, of the plant as a whole. 


WHAT WE DON’T KNOW 


• What are the regulatory mechanisms that control the rearrangement of 
cell-cell junctions in epithelia during early development? What roles do 
mechanical force and tension play in these rearrangements? 

• How do extracellular matrix proteins and carbohydrates influence the 
localization and actions of extracellular signal molecules or their cell-surface 
receptors? 

• How do intracellular adaptor proteins coordinate the activation of integrin 
proteins and their interactions with cytoskeletal components and their 
response to changes in mechanical force acting on cell-matrix junctions? 

• Given that extracellular matrix molecules have the ability to present 
ordered arrays of signals to cells, might the exact spatial relationships 
between such signals carry a message beyond that of the individual signals 
themselves? 


PROBLEMS 


Which statements are true? Explain why or why not. 


19-1 Given the numerous processes inside cells that 
are regulated by changes in Ca²⁺ concentration, it seems 
likely that Ca²⁺-dependent cell-cell adhesions are also 
regulated by changes in Ca²⁺ concentration. 


19-2 Tight junctions perform two distinct functions: 
they seal the space between cells to restrict paracellular 
flow and they fence off plasma membrane domains to 
prevent the mixing of apical and basolateral membrane 
proteins. 


19-3 The elasticity of elastin derives from its high content 
of α helices, which act as molecular springs. 


19-4 Integrins can convert mechanical signals into 
intracellular molecular signals. 


Discuss the following problems. 


19-5 Comment on the following (1922) quote from 
Warren Lewis, who was one of the pioneers of cell biology. 
“Were the various types of cells to lose their stickiness for 
one another and for the supporting extracellular matrix, 
our bodies would at once disintegrate and flow off into the 
ground in a mixed stream of cells.” 


19-6 Cell adhesion molecules were originally identified 
using antibodies raised against cell-surface components 
to block cell aggregation. In the adhesion-blocking 
assays, the researchers found it necessary to use antibody 
fragments, each with a single binding site (so-called Fab 
fragments), rather than intact IgG antibodies, which are 
Y-shaped molecules with two identical binding sites. The 
Fab fragments were generated by digesting the IgG antibodies 
with papain, a protease, to separate the two binding 
sites (Figure Q19-1). Why do you suppose it was necessary 
to use Fab fragments to block cell aggregation? 


[Figure labels: sites for antigen binding; sites of papain cleavage; IgG antibody, digested with papain, yields Fab fragments] 

Figure Q19-1 Production of Fab fragments from IgG antibodies by 
digestion with papain (Problem 19-6). 


19-7 The food-poisoning bacterium Clostridium perfringens 
makes a toxin that binds to members of the claudin 
family of proteins, which are the main constituents of 
tight junctions. When the C-terminus of the toxin is bound 
to a claudin, the N-terminus can insert into the adjacent 
cell membrane, forming holes that kill the cell. The por- 
tion of the toxin that binds to the claudins has proven to be 
a valuable reagent for investigating the properties of tight 
junctions. MDCK cells are a common choice for studies 
of tight junctions because they can form an intact epithe- 
lial sheet with high transepithelial resistance. MDCK cells 
express two claudins: claudin-1, which is not bound by 
the toxin, and claudin-4, which is. 

When an intact MDCK epithelial sheet is incubated 
with the C-terminal toxin fragment, claudin-4 
disappears, becoming undetectable within 24 hours. In 
the absence of claudin-4, the cells remain healthy and 
the epithelial sheet appears intact. The mean number 
of strands in the tight junctions that link the cells also 
decreases over 24 hours from about four to about two, and 
they are less highly branched. A functional assay for the 
integrity of the tight junctions shows that transepithelial 
resistance decreases dramatically in the presence of the 
toxin, but the resistance can be restored by washing out 
the toxin (Figure Q19-2A). Curiously, the toxin produces 
these effects only when it is added to the basolateral side 
of the sheet; it has no effect when added to the apical surface 
(Figure Q19-2B). 


Figure Q19-2 Effects of Clostridium toxin on the barrier function 
of MDCK cells (Problem 19-7). (A) Addition of toxin from the 
basolateral side of the epithelial sheet. (B) Addition of toxin from 
the apical side of the epithelial sheet. For a given voltage, a higher 
resistance (ohms cm²) gives less paracellular current. 


A. How can it be that two tight-junction strands 
remain, even though all of the claudin-4 has disappeared? 
B. Why do you suppose the toxin works when it is 
added to the basolateral side of the epithelial sheet, but 
not when added to the apical side? 


19-8 It is not an easy matter to assign particular func- 
tions to specific components of the basal lamina, since 
the overall structure is a complicated composite material 
with both mechanical and signaling properties. Nidogen, 
for example, cross-links two central components of the 
basal lamina by binding to the laminin γ-1 chain and to 
type IV collagen. Given such a key role, it was surprising 
that mice with a homozygous knockout of the gene for 
nidogen-1 were entirely healthy, with no abnormal phenotype. 
Similarly, mice homozygous for a knockout of the 
gene for nidogen-2 also appeared completely normal. By 
contrast, mice that were homozygous for a defined mutation 
in the gene for laminin γ-1, which eliminated just the 
binding site for nidogen, died at birth with severe defects 
in lung and kidney formation. The mutant portion of the 
laminin γ-1 chain is thought to have no other function 
than to bind nidogen, and does not affect laminin struc- 
ture or its ability to assemble into the basal lamina. How 
would you explain these genetic observations, which are 
summarized in Table Q19-1? What would you predict 
would be the phenotype of a mouse that was homozygous 
for knockouts of both nidogen genes? 





TABLE Q19-1 

Protein        Genetic defect                          Phenotype 
Nidogen-1      Gene knockout (−/−)                     No abnormal phenotype 
Nidogen-2      Gene knockout (−/−)                     Apparently normal 
Laminin γ-1    Nidogen binding-site deletion (+/−) 
Laminin γ-1    Nidogen binding-site deletion (−/−)     Dead at birth 

+/− stands for heterozygous, −/− stands for homozygous. 


19-9 Discuss the following statement: “The basal lam- 
ina of muscle fibers serves as a molecular bulletin board, 
in which adjoining cells can post messages that direct the 
differentiation and function of the underlying cells.” 


19-10 The affinity of integrins for matrix components can 
be modulated by changes to their cytoplasmic domains: 
a process known as inside-out signaling. You have identified 
a key region in the cytoplasmic domains of αIIbβ3 
integrin that seems to be required for inside-out signaling 
(Figure Q19-3). Substitution of alanine for either D723 
in the β chain or R995 in the α chain leads to a high level 
of spontaneous activation, under conditions where the 
wild-type chains are inactive. Your advisor suggests that 
you convert the aspartate in the β chain to an arginine 
(D723R) and the arginine in the α chain to an aspartate 
(R995D). You compare all three α chains (R995, R995A, 
and R995D) against all three β chains (D723, D723A, and 
D723R). You find that all pairs have a high level of spontaneous 
activation, except D723 vs R995 (the wild type) and 
D723R vs R995D, which have low levels. Based on these 
results, how do you think the αIIbβ3 integrin is held in its 
inactive state? 


[Figure labels: extracellular space, plasma membrane, cytoplasm; residue 723 on the β3 chain and residue 995 on the αIIb chain; COOH termini] 


Figure Q19-3 Schematic representation of αIIbβ3 integrin 
(Problem 19-10). The D723 and R995 residues are indicated. 
(From P.E. Hughes et al., J. Biol. Chem. 271:6571-6574, 1996. 
With permission from American Society for Biochemistry and 
Molecular Biology.) 


19-11 The glycosaminoglycan polysaccharide chains 
that are linked to specific core proteins to form the pro- 
teoglycan components of the extracellular space are 
highly negatively charged. How do you suppose these 
negatively charged polysaccharide chains help to estab- 
lish a hydrated gel-like environment around the cell? How 
would the properties of these molecules differ if the poly- 
saccharide chains were uncharged? 




19-12 At body temperature, L-aspartate in proteins race- 
mizes to D-aspartate at an appreciable rate. Most pro- 
teins in the body have a very low level of D-aspartate, if it 
can be detected at all. Elastin, however, has a fairly high 
level of D-aspartate. Moreover, the amount of D-aspartate 
increases in direct proportion to the age of the person from 
whom the sample was taken. Why do you suppose that 
most proteins have little if any D-aspartate, while elastin 
has levels of D-aspartate that increase steadily with age? 


19-13 Your boss is coming to dinner! All you have for a 
salad is some wilted, day-old lettuce. You vaguely recall 
that there is a trick to rejuvenating wilted lettuce, but you 
cannot remember what it is. Should you soak the lettuce in 
salt water, soak it in tap water, or soak it in sugar water, or 
maybe just shine a bright light on it and hope that photo- 
synthesis will perk it up? 
Cancer 


About one in five of us will die of cancer, but that is not why we devote a chapter 
to this disease. Cancer cells break the most basic rules of cell behavior by which 
multicellular organisms are built and maintained, and they exploit every kind of 
opportunity to do so. These transgressions help to reveal what the normal rules 
are and how they are enforced. As a result, cancer research helps to illuminate 
the fundamentals of cell biology—especially cell signaling (Chapter 15), the cell 
cycle and cell growth (Chapter 17), programmed cell death (apoptosis, Chapter 
18), and the control of tissue architecture (Chapters 19 and 22). Of course, with 
a deeper understanding of these normal processes, we also gain a deeper under- 
standing of the disease and better tools to treat it. 

In this chapter, we first consider what cancer is and describe the natural 
history of the disease from a cellular standpoint. We then discuss the molecu- 
lar changes that make a cell cancerous. And we end the chapter by considering 
how our enhanced understanding of the molecular basis of cancer is leading to 
improved methods for its prevention and treatment. 


CANCER AS A MICROEVOLUTIONARY PROCESS 


The body of an animal operates as a society or ecosystem, whose individual mem- 
bers are cells that reproduce by cell division and organize themselves into collabo- 
rative assemblies called tissues. This ecosystem is very peculiar, however, because 
self-sacrifice—as opposed to survival of the fittest—is the rule. Ultimately, all of 
the somatic cell lineages in animals are committed to die: they leave no progeny 
and instead dedicate their existence to the support of the germ cells, which alone 
have a chance of continued survival (discussed in Chapter 21). There is no mys- 
tery in this, for the body is a clone derived from a fertilized egg, and the genome 
of the somatic cells is the same as that of the germ-cell lineage that gives rise to 
sperm or eggs. By their self-sacrifice for the sake of the germ cells, the somatic 
cells help to propagate copies of their own genes. 

Thus, unlike free-living cells such as bacteria, which compete to survive, the 
cells of a multicellular organism are committed to collaboration. To coordinate 
their behavior, the cells send, receive, and interpret an elaborate set of extracel- 
lular signals that serve as social controls, directing cells how to act (discussed in 
Chapter 15). As a result, each cell behaves in a socially responsible manner—rest- 
ing, growing, dividing, differentiating, or dying—as needed for the good of the 
organism. 

Molecular disturbances that upset this harmony mean trouble for a multicellular 
society. In a human body with more than 10¹⁴ cells, billions of cells experience 
mutations every day, potentially disrupting the social controls. Most dangerously, 
a mutation may give one cell a selective advantage, allowing it to grow and 
divide slightly more vigorously and survive more readily than its neighbors and 
in this way to become a founder of a growing mutant clone. A mutation that pro- 
motes such selfish behavior by individual members of the cooperative can jeop- 
ardize the future of the whole enterprise. Over time, repeated rounds of mutation, 
competition, and natural selection operating within the population of somatic 
cells can cause matters to go from bad to worse. These are the basic ingredients 
of cancer: it is a disease in which an individual mutant clone of cells begins by 
prospering at the expense of its neighbors. In the end—as the clone grows, evolves, 
and spreads—it can destroy the entire cellular society (Movie 20.1). 

In this section, we discuss the development of cancer as a microevolutionary 
process that takes place within the course of a human life-span in a subpopulation 
of cells in the body. But the process depends on the same principles of mutation 
and natural selection that have driven the evolution of living organisms on Earth 
for billions of years. 


Cancer Cells Bypass Normal Proliferation Controls and Colonize 
Other Tissues 


Cancer cells are defined by two heritable properties: (1) they reproduce in defi- 
ance of the normal restraints on cell growth and division, and (2) they invade 
and colonize territories normally reserved for other cells. It is the combination of 
these properties that makes cancers particularly dangerous. An abnormal cell that 
grows (increases in mass) and proliferates (divides) out of control will give rise to 
a tumor, or neoplasm—literally, a new growth. As long as the neoplastic cells have 
not yet become invasive, however, the tumor is said to be benign. For most types 
of such neoplasms, removing or destroying the mass locally usually achieves a 
complete cure. A tumor is considered a true cancer if it is malignant; that is, when 
its cells have acquired the ability to invade surrounding tissue. Invasiveness is an 
essential characteristic of cancer cells. It allows them to break loose, enter blood 
or lymphatic vessels, and form secondary tumors called metastases at other sites 
in the body (Figure 20-1). In general, the more widely a cancer spreads, the harder 
it becomes to eradicate. It is generally metastases that kill the cancer patient. 

Cancers are traditionally classified according to the tissue and cell type from 
which they arise. Carcinomas are cancers arising from epithelial cells, and they 
are by far the most common cancers in humans. They account for about 80% of 
cases, perhaps because most of the cell proliferation in adults occurs in epithe- 
lia. In addition, epithelial tissues are the most likely to be exposed to the various 
forms of physical and chemical damage that favor the development of cancer. 
Sarcomas arise from connective tissue or muscle cells. Cancers that do not fit in 
either of these two broad categories include the various leukemias and lympho- 
mas, derived from white blood cells and their precursors (hemopoietic cells), as 
well as cancers derived from cells of the nervous system. Figure 20-2 shows the 
types of cancers that are common in the United States, together with their inci- 
dence and death rates. Each broad category has many subdivisions according to 
the specific cell type, the location in the body, and the microscopic appearance of 
the tumor. 

In parallel with the set of names for malignant tumors, there is a related set of 
names for benign tumors: an adenoma, for example, is a benign epithelial tumor 
with a glandular organization; the corresponding type of malignant tumor is an 
adenocarcinoma (Figure 20-3). Similarly, a chondroma and a chondrosarcoma 
are, respectively, benign and malignant tumors of cartilage. 

Most cancers have characteristics that reflect their origin. Thus, for example, 
the cells of a basal-cell carcinoma, derived from a keratinocyte stem cell in the skin, 
generally continue to synthesize cytokeratin intermediate filaments, whereas the 
cells of a melanoma, derived from a pigment cell in the skin, will often (but not 
always) continue to make pigment granules. Cancers originating from different 
cell types are, in general, very different diseases. Basal-cell carcinomas of the skin, 
for example, are only locally invasive and rarely metastasize, whereas melanomas 
can become much more malignant and often form metastases. Basal-cell carcino- 
mas are readily cured by surgery or local irradiation, whereas malignant melano- 
mas, once they have metastasized widely, are usually fatal. 

Later, we shall see that there is also a different way to classify cancers, one that 
cuts across the traditional classification by site of origin: we can classify them in 
terms of the mutations that make the tumor cells cancerous. The final section 
of the chapter will show how this information can be crucial to the design and 
choice of treatments. 





Figure 20-1 Metastasis. Malignant tumors 
typically give rise to metastases, making 
the cancer hard to eradicate. Shown in 
this fusion image is a whole-body scan of 
a patient with metastatic non-Hodgkin's 
lymphoma (NHL). The background image 
of the body’s tissues was obtained by CT 
(computed x-ray tomography) scanning. 
Overlaid on this image, a PET (positron 
emission tomography) scan reveals the 
tumor tissue (yellow), detected by its 
unusually high uptake of radioactively 
labeled fluorodeoxyglucose (FDG). High 
FDG uptake occurs in cells with unusually 
active glucose uptake and metabolism, 
which is a characteristic of cancer cells 
(see Figure 20-12). The yellow spots 
in the abdominal region reveal multiple 
metastases. (Courtesy of S. Gambhir.) 


Most Cancers Derive from a Single Abnormal Cell 


Even when a cancer has metastasized, we can usually trace its origins to a single 
primary tumor, arising in a specific organ. The primary tumor is thought to derive 
by cell division from a single cell that initially experienced some heritable change. 
Subsequently, additional changes accumulate in some of the descendants of this 
cell, allowing them to outgrow, out-divide, and often outlive their neighbors. By 
the time it is first detected, a typical human cancer will have been developing for 
many years and will already contain a billion cancer cells or more (Figure 20-4). 
Tumors will usually also contain a variety of other cell types; for example, fibro- 
blasts will be present in the supporting connective tissue associated with a car- 
cinoma, in addition to inflammatory and vascular endothelial cells. How can we 
be sure that the cancer cells are the clonal descendants of a single abnormal cell? 

One way of proving clonal origin is through molecular analysis of the chro- 
mosomes in tumor cells. In almost all patients with chronic myelogenous leuke- 
mia (CML), for example, we can distinguish the leukemic white blood cells from 
the patient’s normal cells by a specific chromosomal abnormality: the so-called 
Philadelphia chromosome, created by a translocation between the long arms of 
chromosomes 9 and 22 (Figure 20-5). When the DNA at the site of translocation 
is cloned and sequenced, it is found that the site of breakage and rejoining of the 
translocated fragments is identical in all the leukemic cells in any given patient, 
but that this site differs slightly (by a few hundred or thousand base pairs) from 
one patient to another. This is the expected result if, and only if, the cancer in each 
patient arises from a unique accident occurring in a single cell. We will see later 

Figure 20-2 Cancer incidence and 
mortality in the United States. The total 
number of new cases diagnosed in 2012 
in the United States was 1,665,540, and 
total cancer deaths were 585,720. Note 
that deaths reflect cases diagnosed at 
many different times and that somewhat 
less than half of the people who develop 
cancer die of it. In the world as a whole, the 
five most common cancers are those of the 
lung, stomach, breast, colon/rectum, and 
uterine cervix (included in the figure under 
the heading of reproductive tract), and the 
total number of new cancer cases recorded 
per year is just over 6 million. Skin cancers 
other than melanomas are not included in 
these figures, since almost all are cured 
easily and many are unrecorded. 

The data for the United Kingdom are 
similar. However, incidences are different 
in some other parts of the world, reflecting 
widespread exposures to different 
infectious agents and environmental toxins. 
(Data from American Cancer Society, 
Cancer Facts and Figures, 2014.) 


Figure 20-3 Benign versus malignant 
tumors. A benign glandular tumor (pink 
cells; an adenoma) remains inside the basal 
lamina (yellow) that marks the boundary 
of the normal structure (a duct, in this 
example). In contrast, a malignant glandular 
tumor (red cells; an adenocarcinoma) can 
develop from a benign tumor cell, and 
it destroys the integrity of the tissue, as 
shown. There are many different forms that 
such tumors may take. 

Figure 20-4 The growth of a typical human tumor, such as a tumor of 
the breast. The diameter of the tumor is plotted on a logarithmic scale. Years 
may elapse before the tumor becomes noticeable. The doubling time of a 
typical breast tumor, for example, is about 100 days. However, particularly 
virulent tumors may grow much more rapidly. 
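
As a rough check on these numbers, the time needed for a single founder cell to reach a billion descendants can be estimated from the doubling time. The sketch below assumes steady exponential growth with a fixed 100-day doubling time, the figure quoted for a typical breast tumor; the function name and the constant-rate simplification are illustrative assumptions, and real tumors grow far less regularly.

```python
import math

def years_to_reach(cell_count, doubling_time_days=100.0, founders=1):
    """Years for `founders` cells to grow to `cell_count` cells,
    assuming every doubling takes the same time (an idealization)."""
    doublings = math.log2(cell_count / founders)
    return doublings * doubling_time_days / 365.25

doublings = math.log2(1e9)     # about 30 doublings from 1 cell to 10^9 cells
years = years_to_reach(1e9)    # roughly 8 years at 100 days per doubling
print(f"{doublings:.1f} doublings, about {years:.1f} years")
```

Under these assumptions, a tumor spends most of its history below the size at which it can be detected, which is consistent with the years of silent growth described here.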


how this particular translocation promotes the development of CML by creating a 
novel hybrid gene encoding a protein that promotes cell proliferation. 

Many other lines of evidence, from a variety of cancers, point to the same con- 
clusion: most cancers originate from a single aberrant cell. 


Cancer Cells Contain Somatic Mutations 


If a single abnormal cell is to give rise to a tumor, it must pass on its abnormal- 
ity to its progeny: the aberration has to be heritable. Thus, the development of 
a clone of cancer cells depends on genetic changes. The tumor cells contain 
somatic mutations: they have one or more shared detectable abnormalities in 
their DNA sequence that distinguish them from the normal cells surrounding the 
tumor, as in the example of CML just described. (The mutations are called somatic 
because they occur in the soma, or body cells, not in the germ line.) Cancers are 
also driven by epigenetic changes—persistent, heritable changes in gene expres- 
sion that result from modifications of chromatin structure without alteration of 
the cell’s DNA sequence. But somatic mutations that alter DNA sequence appear 
to be a fundamental and universal feature, and cancer is in this sense a genetic 
disease. 

Factors that cause genetic changes tend to provoke the development of can- 
cer. Thus, carcinogenesis (the generation of cancer) can be linked to mutagenesis 
(the production of a change in the DNA sequence). This correlation is particularly 
clear for two classes of external agents: (1) chemical carcinogens (which typically 
cause simple local changes in the nucleotide sequence), and (2) radiation such as 
x-rays (which typically cause chromosome breaks and translocations) or ultravio- 
let (UV) light (which causes specific DNA base alterations). 

As would be expected, people who have inherited a genetic defect in one of 
several DNA repair mechanisms, causing their cells to accumulate mutations at an 
elevated rate, run a heightened risk of cancer. Those with the disease xeroderma 
pigmentosum, for example, have defects in the system that repairs DNA damage 
induced by UV light, and they have a greatly increased incidence of skin cancers. 


A Single Mutation Is Not Enough to Change a Normal Cell into a 
Cancer Cell 


An estimated 10¹⁶ cell divisions occur in a normal human body in the course of 
a typical lifetime; in a mouse, with its smaller number of cells and its shorter 
life-span, the number is about 10¹². Even in an environment that is free of mutagens, 
mutations would occur spontaneously at an estimated rate of about 10⁻⁶ mutations 
per gene per cell division—a value set by fundamental limitations on the accuracy 
of DNA replication and repair (see pp. 237-238). Thus, in a typical lifetime, every 
single gene is likely to have undergone mutation on about 10¹⁰ separate occasions 
in a human, or on about 10⁶ occasions in a mouse. Among the resulting mutant 
cells, we might expect a large number that have sustained deleterious mutations 
in genes that regulate cell growth and division, causing the cells to disobey the 
normal restrictions on cell proliferation. From this point of view, the problem of 
cancer seems to be not why it occurs, but why it occurs so infrequently. 
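
The arithmetic behind this estimate is a single multiplication: lifetime cell divisions times the mutation rate per gene per division. The short sketch below reproduces it using the values given in this paragraph (10¹⁶ divisions in a human, 10¹² in a mouse, 10⁻⁶ mutations per gene per division); the variable and function names are illustrative.

```python
MUTATION_RATE = 1e-6  # mutations per gene per cell division (value from the text)

# Estimated cell divisions in a typical lifetime (values from the text)
lifetime_divisions = {"human": 1e16, "mouse": 1e12}

def expected_hits_per_gene(divisions, rate=MUTATION_RATE):
    """Expected number of separate occasions on which a given gene
    mutates somewhere in the body over a lifetime."""
    return divisions * rate

hits = {species: expected_hits_per_gene(n)
        for species, n in lifetime_divisions.items()}
for species, n in hits.items():
    print(f"{species}: about {n:.0e} mutations per gene per lifetime")
```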

Clearly, if a mutation in a single gene were enough to convert a typical healthy 
cell into a cancer cell, we would not be viable organisms. Many lines of evidence 
indicate that the development of a cancer typically requires that a substantial 
number of independent, rare genetic and epigenetic accidents occur in the lin- 
eage that emanates from a single cell. One such indication comes from epidemi- 
ological studies of the incidence of cancer as a function of age (Figure 20-6). If a 

Figure 20-5 The translocation between 
chromosomes 9 and 22 responsible 
for chronic myelogenous leukemia. 
The normal structures of chromosomes 
9 and 22 are shown at the left. When a 
translocation occurs between them at the 
indicated site, the result is the abnormal 
pair at the right. The smaller of the two 
resulting abnormal chromosomes (22q⁻) is 
called the Philadelphia chromosome, after 
the city where the abnormality was first 
recorded. 

Figure 20-6 Cancer incidence as a function of age. The number of newly 
diagnosed cases of colon cancer in women in England and Wales in 1 year 
is plotted as a function of age at diagnosis, relative to the total number 
of individuals in each age group. The incidence of cancer rises steeply 
as a function of age. If only a single mutation were required to trigger the 
cancer and this mutation had an equal chance of occurring at any time, the 
incidence of this cancer would be the same at all ages. Analyses of this type 
suggest that the development of a solid tumor instead requires five to eight 
independent accidents (“hits”) that occur randomly over time. This calculation 
assumes that the mutation rate remains constant as a cancer evolves, where 
in fact it often increases (see p. 1097). (Data from C. Muir et al., Cancer 
Incidence in Five Continents, Vol. V. Lyon: International Agency for Research 
on Cancer, 1987.) 


single mutation were responsible for cancer, occurring with a fixed probability per 
year, the chance of developing cancer in any given year of life should be indepen- 
dent of age. In fact, for most types of cancer, the incidence rises steeply with age— 
as would be expected if cancer is caused by a progressive, random accumulation 
of a set of mutations in a single lineage of cells. 
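
The contrast between a one-hit and a multi-hit disease can be made concrete with a toy calculation. The sketch below is a simplified multi-hit model in the spirit of classic multistage analyses, not the analysis used to produce Figure 20-6. It assumes that a cell lineage needs a fixed number of independent "hits", each occurring at the same constant rate per year; the rate of 10⁻³ per year and the function names are arbitrary illustrative choices.

```python
import math

def incidence(age_years, hits, rate_per_year=1e-3):
    """Rate at which a lineage acquires the last of `hits` independent
    events at a given age, assuming each event occurs at a constant rate."""
    decay = math.exp(-rate_per_year * age_years)
    p_one = 1.0 - decay  # chance that any one given hit has occurred by this age
    # Derivative of p_one**hits with respect to age
    return hits * p_one ** (hits - 1) * rate_per_year * decay

def fold_increase(hits, young=40, old=80):
    """How many times higher the incidence is at age `old` than at `young`."""
    return incidence(old, hits) / incidence(young, hits)

for k in (1, 2, 6):
    print(f"{k} hit(s): incidence at 80 is {fold_increase(k):.1f}x that at 40")
```

With one hit, the incidence is nearly flat with age; with six hits, it rises more than twentyfold between ages 40 and 80 in this toy model, because the incidence grows roughly as age raised to the power (hits − 1).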

As discussed later, these indirect arguments have now been confirmed by sys- 
tematically sequencing the genomes of the tumor cells from individual cancer 
patients and cataloging the mutations that they contain. 


Cancers Develop Gradually from Increasingly Aberrant Cells 


For those cancers known to have a specific external cause, the disease does 
not usually become apparent until long after exposure to the causal agent. The 
incidence of lung cancer, for example, does not begin to rise steeply until after 
decades of heavy smoking (Figure 20-7). Similarly, the incidence of leukemias 
in Hiroshima and Nagasaki did not show a marked rise until about 5 years after 
the explosion of the atomic bombs, and industrial workers exposed for a limited 
period to chemical carcinogens do not usually develop the cancers characteristic 
of their occupation until 10, 20, or even more years after the exposure. During 
this long incubation period, the prospective cancer cells undergo a succession 
of changes, and the same presumably applies to cancers where the initial genetic 
lesion has no such obvious external cause. 

The concept that the development of a cancer requires a gradual accumulation 
of mutations in a number of different genes helps to explain the well-known phe- 
nomenon of tumor progression, whereby an initial mild disorder of cell behavior 
evolves gradually into a full-blown cancer. Chronic myelogenous leukemia again 
provides a clear example. It begins as a disorder characterized by a nonlethal 
overproduction of white blood cells and continues in this form for several years 
before changing into a much more rapidly progressing illness that usually ends 
in death within a few months. In the early chronic phase, the leukemic cells are 
distinguished mainly by the chromosomal translocation (the Philadelphia chro- 
mosome) mentioned previously, although there may well be other, less visible 

Figure 20-7 Smoking and the onset of 
lung cancer. A major increase in cigarette 
smoking (red line) has caused a dramatic 
rise in lung cancer deaths (green line), with 
a lag time of about 35 years. Because 
global cigarette smoking peaked in 1990, 
global lung cancer deaths are expected to 
decline after a similar lag. (Data from R.N. 
Proctor, Nat. Rev. Cancer 1:82-86, 2001). 

genetic or epigenetic changes. In the subsequent acute phase, cells that show not 
only the translocation but also several other chromosomal abnormalities over- 
run the hemopoietic (blood-forming) system. It appears that cells from the initial 
mutant clone have undergone further mutations that make them proliferate even 
more vigorously, so that they come to outnumber both the normal blood cells and 
their ancestors with the primary chromosomal translocation. 

Carcinomas and other solid tumors evolve in a similar way (Figure 20-8). 
Although many such cancers in humans are not diagnosed until a relatively late 
stage, in some cases it is possible to observe the earlier steps and, as we shall see 
later, to relate them to specific genetic changes. 


Tumor Progression Involves Successive Rounds of Random 
Inherited Change Followed by Natural Selection 


From all the evidence, therefore, it seems that cancers arise by a process in which 
an initial population of slightly abnormal cells—descendants of a single abnor- 
mal ancestor—evolve from bad to worse through successive cycles of random 
inherited change followed by natural selection. Correspondingly, tumors grow in 
fits and starts, as additional advantageous inherited changes arise and the cells 
bearing them flourish. Tumor progression involves a large element of chance and 
usually takes many years, which may be why the majority of us will die of causes 
other than cancer. 

At each stage of progression, some individual cell acquires an additional muta- 
tion or epigenetic change that gives it a selective advantage over its neighbors, 
making it better able to thrive in its environment—an environment that, inside 
a tumor, may be harsh, with low levels of oxygen, scarce nutrients, and the nat- 
ural barriers to growth presented by the surrounding normal tissues. The larger 
the number of tumor cells, the higher the chance that at least one of them will 
undergo a change that favors it over its neighbors. Thus, as the tumor grows, pro- 
gression accelerates. The offspring of the best-adapted cells continue to divide, 
eventually producing the dominant clones in the developing lesion (Figure 20-9). 
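
The link between tumor size and the pace of progression follows from simple probability. As a minimal sketch, suppose each cell independently has a small chance p of acquiring a particular advantageous change in one round of division; the chance that at least one of N cells does so is then 1 − (1 − p)^N. The value p = 10⁻⁶ used below is borrowed from the per-gene mutation rate quoted earlier purely for illustration, and the function name is hypothetical.

```python
import math

def chance_of_new_variant(n_cells, p_per_cell=1e-6):
    """P(at least one of n_cells acquires a given change in one round of
    division), assuming cells change independently with probability p_per_cell."""
    # 1 - (1 - p)^n, computed with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(n_cells * math.log1p(-p_per_cell))

for n in (10**3, 10**6, 10**9):
    print(f"{n:>13,d} cells: {chance_of_new_variant(n):.3f}")
```

In this sketch, a clone of a thousand cells rarely produces the variant in any one round, whereas a clone of a billion cells is almost certain to, which is why progression tends to accelerate as the tumor grows.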

Just as in the evolution of plants and animals, a kind of speciation often occurs: 
the original cancer cell lineage can diversify to give many genetically different vig- 
orous subclones of cells. These may coexist in the same mass of tumor tissue; or 
they may migrate and colonize separate environments suited to their individual 
quirks, where they settle, thrive, and progress as independently evolving metas- 
tases. As new mutations arise within each tumor mass, different subclones may 
gain an advantage and come to predominate, only to be overtaken by others or 
outgrown by their own sub-subclones. The increasing genetic diversity as a cancer 
progresses is one of the chief factors that make cures difficult. 





Figure 20-8 Stages of progression in the 
development of cancer of the epithelium 
of the uterine cervix. Pathologists use 
standardized terminology to classify the 
types of disorders they see, so as to guide 
the choice of treatment. (A) In a stratified 
squamous epithelium, dividing cells are 
confined to the basal layer. (B) In this 
low-grade intraepithelial neoplasia (right 
half of image), dividing cells can be found 
throughout the lower third of the epithelium; 
the superficial cells are still flattened and 
show signs of differentiation, but this is 
incomplete. (C) In high-grade intraepithelial 
neoplasia, cells in all the epithelial layers 
are proliferating and exhibit defective 
differentiation. (D) True malignancy begins 
when the cells move through or destroy the 
basal lamina that underlies the basal layer 
of epithelium and invade the underlying 
connective tissue. (Photographs courtesy 
of Andrew J. Connolly.) 

Figure 20-9 Clonal evolution. In this schematic diagram, a tumor develops 
through repeated rounds of mutation and proliferation, giving rise eventually 
to a clone of fully malignant cancer cells. At each step, a single cell 
undergoes a mutation that either enhances cell proliferation or decreases 
cell death, so that its progeny become the dominant clone in the tumor. 
Proliferation of each clone hastens the occurrence of the next step of tumor 
progression by increasing the size of the cell population that is at risk of 
undergoing an additional mutation. The final step depicted here is invasion 
through the basement membrane, an initial step in metastasis. In reality, there 
are more than the three steps shown here, and a combination of genetic and 
epigenetic changes is involved. Not shown here is the fact that, over time, a 
variety of competing subclones will often arise in a tumor. As we will discuss 
later, this heterogeneity complicates cancer therapies (see Figure 20-30). 


Human Cancer Cells Are Genetically Unstable 


Most human cancer cells accumulate genetic changes at an abnormally rapid 
rate and are said to be genetically unstable. The extent of this instability and its 
molecular origins differ from cancer to cancer and from patient to patient, as we 
shall discuss in a later section. The basic phenomenon was evident even before 
modern molecular analyses. For example, the cells of many cancers show grossly 
abnormal sets of chromosomes, with duplications, deletions, and translocations 
that are visible at mitosis (Figure 20-10). When the cells are maintained in cul- 
ture, these patterns of chromosomal disruption can often be seen to evolve rap- 
idly and in a seemingly haphazard way. And for many years, pathologists have 
used an abnormal appearance of the cell nucleus to identify and classify cancer 
cells in tumor biopsies; in particular, cancer cells can contain an unusually large 
amount of heterochromatin—a condensed form of interphase chromatin that 
silences genes (see pp. 194-195). This suggested that epigenetic changes of chro- 
matin structure can also contribute to the cancer cell phenotype, as recently con- 
firmed by molecular analysis. 

The genetic instability observed in cancer cells can arise from defects in the 
ability to repair DNA damage or to correct replication errors of various kinds. 
These alterations lead to changes in DNA sequence and produce rearrangements 
such as DNA translocations and duplications. Also common are defects in chro- 
mosome segregation during mitosis, which provide another possible source of 
chromosome instability and changes in karyotype. 

From an evolutionary perspective, none of this should be a surprise: anything 
that increases the probability of random changes in gene function heritable from 
one cell generation to the next—and that is not too deleterious—is likely to speed 
the evolution of a clone of cells toward malignancy, thereby causing this property 
to be selected for during tumor progression. 

Figure 20-10 Chromosomes from a 
breast tumor displaying abnormalities 
in structure and number. Chromosomes 
were prepared from a breast tumor cell in 
metaphase, spread on a glass slide, and 
stained with (A) a general DNA stain or 
(B) a combination of fluorescently labeled 
DNA molecules that color each normal 
human chromosome differently (see Figure 
4-10). The staining (displayed in false color) 
shows multiple translocations, including a 
doubly translocated chromosome (white 
arrow) that is made up of two pieces of 
chromosome 8 (green-brown) and a piece 
of chromosome 17 (purple). The karyotype 
also contains 48 chromosomes, instead 
of the normal 46. (Courtesy of Joanne 
Davidson and Paul Edwards.) 


Cancer Cells Display an Altered Control of Growth 


Mutability and large cell population numbers create the opportunities for muta- 
tions to occur, but the driving force for development of a cancer has to come from 
some sort of selective advantage possessed by the mutant cells. Most obviously, 
a mutation or epigenetic change can confer such an advantage by increasing the 
rate at which a clone of cells proliferates or by enabling it to continue proliferating 
when normal cells would stop. Cancer cells that can be grown in culture, or cul- 
tured cells artificially engineered to contain the types of mutations encountered 
in cancers, typically show a transformed phenotype. They are abnormal in their 
shape, their motility, their responses to growth factors in the culture medium, 
and, most characteristically, in the way they react to contact with the substratum 
and with one another. Normal cells will not divide unless they are attached to the 
substratum; transformed cells will often divide even if held in suspension. Normal 
cells become inhibited from moving and dividing when the culture reaches con- 
fluence (where the cells are touching one another); transformed cells continue 
moving and dividing even after confluence, and so pile up in layer upon layer in 
the culture dish (Figure 20-11). In addition, transformed cells no longer require 
all of the positive signals from their surroundings that normal cells require. 

Their behavior in culture gives a hint of the ways in which cancer cells may 
misbehave in their natural environment, embedded in a tissue. But cancer cells in 
the body show other peculiarities that mark them out from normal cells, beyond 
those just described. 


Cancer Cells Have an Altered Sugar Metabolism 


Given sufficient oxygen, normal adult tissue cells will generally fully oxidize 
almost all the carbon in the glucose they take up to CO₂, which is lost from the 
body as a waste product. A growing tumor needs nutrients in abundance to pro- 
vide the building blocks to make new macromolecules. Correspondingly, most 
tumors have a metabolism more similar to that of a growing embryo than to that 
of normal adult tissue. Tumor cells consume glucose avidly, importing it from the 
blood at a rate that can be as much as 100 times higher than neighboring nor- 
mal cells. Moreover, only a small fraction of this imported glucose is used for 
production of ATP by oxidative phosphorylation. Instead, a great deal of lactate 
is produced, and many of the remaining carbon atoms derived from glucose are 
diverted for use as raw materials for synthesis of the proteins, nucleic acids, and 
lipids required for tumor growth (Figure 20-12). 

This tendency of tumor cells to de-emphasize oxidative phosphorylation even 
when oxygen is plentiful, while at the same time taking up large quantities of 
glucose, can be shown to promote cancer cell growth and is called the Warburg 

Figure 20-11 Loss of contact inhibition 
by cancer cells in cell culture. Most 
normal cells stop proliferating once they 
have carpeted the dish with a single layer 
of cells: proliferation seems to depend 
on contact with the dish, and to be 
inhibited by contacts with other cells—a 
phenomenon known as “contact inhibition.” 
Cancer cells, in contrast, usually disregard 
these restraints and continue to grow, so 
that they pile up on top of one another, 
as shown (Movie 20.2). (A) Schematic 
drawing. (B and C) Light micrographs of 
normal (B) and transformed (C) fibroblasts. 
(B and C, courtesy of Lan Bo Chen.) 

effect—so named because Otto Warburg first noticed the phenomenon in the early 
twentieth century. It is this abnormally high glucose uptake that allows tumors to 
be selectively imaged in whole-body scans (see Figure 20-1), thereby providing a 
way to monitor cancer progression and responses to treatment. 


Cancer Cells Have an Abnormal Ability to Survive Stress and DNA 
Damage 


In a large multicellular organism, there are powerful safety mechanisms that 
guard against the trouble that can be caused by damaged and deranged cells. For 
example, internal disorder gives rise to danger signals in the faulty cell, activat- 
ing protective devices that can eventually lead to apoptosis (see Chapter 18). To 
survive, cancer cells require additional mutations to elude or break through these 
defenses against cellular misbehavior. 

Cancer cells are found to contain mutations that drive the cell into an abnormal 
state, where metabolic processes may be unbalanced and essential cell compo- 
nents may be produced in ill-matched proportions. States of this type, where the 
cell’s homeostatic mechanisms are inadequate to cope with an imposed distur- 
bance, are loosely referred to as states of cell stress. As one example, chromosome 
breakage and other forms of DNA damage are commonly observed during the 
development of cancer, reflecting the genetic instability that cancer cells display. 
Thus, to survive and divide without limit, a prospective cancer cell must accumu- 
late mutations that disable the normal safety mechanisms that would otherwise 
induce a cell that is stressed, in this or in other ways, to commit suicide. In fact, 
one of the most important properties of many types of cancer cells is that they fail 
to undergo apoptosis when a normal cell would do so (Figure 20-13). 

While cancer cells tend to avoid apoptosis, this does not mean that they rarely 
die. On the contrary, in the interior of a large solid tumor, cell death often occurs 
on a massive scale: living conditions are difficult, with severe competition among 
the cancer cells for oxygen and nutrients. Many die, but typically much more by 
necrosis than by apoptosis (Figure 20-14). The tumor grows because the cell birth 
rate outpaces the cell death rate, but often by only a small margin. For this reason, 
the time that a tumor takes to double in size can be far longer than the cell-cycle 
time of the tumor cells. 
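
A small calculation shows how a narrow margin between cell birth and cell death stretches the doubling time. The sketch below assumes a simple continuous-growth model in which cells divide once per cell cycle and die at a fixed fraction of the birth rate; the one-day cell cycle and the 99% death-to-birth ratio are illustrative values, not measurements.

```python
import math

def tumor_doubling_time(cell_cycle_days, death_to_birth_ratio):
    """Population doubling time when cells divide once per cell cycle and
    die at a fixed fraction of the birth rate (continuous-growth sketch)."""
    birth_rate = math.log(2) / cell_cycle_days      # per day
    death_rate = death_to_birth_ratio * birth_rate  # per day
    net_rate = birth_rate - death_rate
    if net_rate <= 0:
        raise ValueError("the tumor grows only if births outpace deaths")
    return math.log(2) / net_rate

no_death = tumor_doubling_time(1.0, 0.0)     # equals the cell-cycle time
high_death = tumor_doubling_time(1.0, 0.99)  # about 100 times longer
print(f"{no_death:.0f} day without cell death, {high_death:.0f} days with 99% loss")
```

In this model, cells that divide daily but lose 99 of every 100 newborn cells double their numbers only every 100 days or so, which is comparable to the doubling time quoted earlier for a typical breast tumor.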


Human Cancer Cells Escape a Built-in Limit to Cell Proliferation 


Many normal human cells have a built-in limit to the number of times they can 
divide when stimulated to proliferate in culture: they permanently stop dividing 
Figure 20-12 The Warburg effect in
tumor cells reflects a dramatic change in 
glucose uptake and sugar metabolism. 
(A) Cells that are not proliferating will 
normally oxidize nearly all of the glucose 
that they import from the blood to produce 
ATP through the oxidative phosphorylation 
that takes place in their mitochondria. Only 
when deprived of oxygen will these cells 
generate most of their ATP from glycolysis, 
converting the pyruvate produced to lactate 
in order to regenerate the NAD+ that they
need to keep glycolysis going (see Figure
2-47). (B) Tumor cells, by contrast, will
generally produce abundant lactate even
in the presence of oxygen. This results
from a greatly increased rate of glycolysis
that is fed by a very large increase in the
rate of glucose import. In this way, tumor
cells resemble the rapidly proliferating
cells in embryos (and during tissue repair),
which likewise require for biosynthesis a 
large supply of the small-molecule building 
blocks that can be produced from imported 
glucose (see also Figure 20-26). 
after a certain number of population doublings (25-50 for human fibroblasts,
for example). This cell-division-counting mechanism is termed replicative 
cell senescence, and it generally depends on the progressive shortening of the 
telomeres at the ends of chromosomes, a process that eventually changes their 
structure (discussed in Chapter 17). As discussed in Chapter 5, the replication of 
telomere DNA during S phase depends on the enzyme telomerase, which main- 
tains a special telomeric DNA sequence that promotes the formation of protein 
cap structures to protect chromosome ends. Because many proliferating human 
cells (stem cells being an exception) are deficient in telomerase, their telomeres 
shorten with every division, and their protective caps deteriorate, creating a DNA 
damage signal. Eventually, the altered chromosome ends can trigger a permanent 
cell-cycle arrest, causing a normal cell to die. 
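
The counting mechanism described above can be sketched as a simple loop. The telomere lengths, the loss per division, and the critical length used here are hypothetical round numbers, chosen only so that the result falls within the range of doublings quoted for human fibroblasts; the function name is likewise an assumption made for illustration.

```python
def doublings_until_senescence(initial_length_bp, loss_per_division_bp,
                               critical_length_bp, telomerase_active=False):
    """Count population doublings before telomeres reach a critical length.

    All lengths are hypothetical illustration values in base pairs.
    Returns None if telomerase keeps the telomeres from shortening,
    meaning that this mechanism imposes no limit on division.
    """
    if loss_per_division_bp <= 0:
        raise ValueError("loss per division must be positive")
    if telomerase_active:
        return None  # telomeres are maintained, so no counting takes place
    length = initial_length_bp
    doublings = 0
    # Keep dividing while one more division leaves the cap intact.
    while length - loss_per_division_bp >= critical_length_bp:
        length -= loss_per_division_bp
        doublings += 1
    return doublings

# Hypothetical cell: 10,000 bp telomeres, 100 bp lost per division,
# arrest triggered once the telomeres would fall below 6,000 bp.
print(doublings_until_senescence(10_000, 100, 6_000))  # 40
print(doublings_until_senescence(10_000, 100, 6_000, telomerase_active=True))  # None
```

The `telomerase_active` flag stands in for any mechanism that keeps chromosome ends from shortening; with it set, the built-in limit disappears.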

Human cancer cells avoid replicative cell senescence in one of two ways. They 
can maintain the activity of telomerase as they proliferate, so that their telomeres 
do not shorten or become uncapped, or they can evolve an alternative mechanism
based on homologous recombination (called ALT) for elongating their chromo- 
some ends. Regardless of the strategy used, the result is that the cancer cells con- 
tinue to proliferate under conditions when normal cells would stop. 


The Tumor Microenvironment Influences Cancer Development 


While the cancer cells in a tumor are the bearers of dangerous mutations and 
are often grossly abnormal, the other cells in the tumor—especially those of the 
supporting connective tissue, or stroma—are far from passive bystanders. The 
Figure 20-13 Both increased cell
division and decreased apoptosis can 
contribute to tumorigenesis. In normal 
tissues, apoptosis balances cell division 
to maintain homeostasis (see Movie 18.1). 
During the development of cancer, either 
an increase in cell division or an inhibition 
of apoptosis can lead to the increased 
cell numbers important for tumorigenesis. 
The cells fated to undergo apoptosis are 
gray in this diagram. Both an increase in 
cell division and a decrease in apoptosis 
normally contribute to tumor growth. 


Figure 20-14 Cross-section of a colon 
adenocarcinoma that has metastasized 
to the lung. This tissue slice shows 
well-differentiated colorectal cancer cells 
forming cohesive glands in the lung.
The metastasis has central pink areas of
necrosis where dying cancer cells have 
outgrown their blood supply. Such anoxic 
regions are common in the interior of large 
tumors. (Courtesy of Andrew J. Connolly.) 
development of a tumor relies on a two-way communication between the tumor
cells and the tumor stroma, just as the normal development of epithelial organs 
relies on communication between epithelial cells and mesenchymal cells (dis- 
cussed in Chapter 22). 

The stroma provides a framework for the tumor. It is composed of normal con- 
nective tissue containing fibroblasts and inflammatory white blood cells, as well 
as the endothelial cells that form blood and lymphatic vessels with their attendant 
pericytes and smooth muscle cells (Figure 20-15). As a carcinoma progresses, the 
cancer cells induce changes in the stroma by secreting signal proteins that alter 
the behavior of the stromal cells, as well as proteolytic enzymes that modify the 
extracellular matrix. The stromal cells in turn act back on the tumor cells, secret- 
ing signal proteins that stimulate cancer cell growth and division as well as pro- 
teases that further remodel the extracellular matrix. In these ways, the tumor and 
its stroma evolve together, like weeds and the ecosystem that they invade, and 
the tumor becomes dependent on its particular stromal cells. Experiments using 
mice indicate that the growth of some transplanted carcinomas depends on the 
tumor-associated fibroblasts, and that normal fibroblasts will not do. Such
environmental requirements help to protect us from cancer, as we discuss next in
considering the critical phenomenon called metastasis.


Cancer Cells Must Survive and Proliferate in a Foreign Environment 


To kill us, cancer cells generally need to spread and multiply at new sites in
the body, a process called metastasis. This is the most deadly, and the least
understood, aspect of cancer, being responsible for 90% of cancer-associated
deaths. By spreading through the body, a cancer becomes almost impossible to
eradicate by either surgery or local irradiation. Metastasis is itself a
multi-step process: the cancer cells first have to invade local tissues and vessels, move
through the circulation, leave the vessels, and then establish new cellular colonies 
at distant sites (Figure 20-16). Each of these events is complex, and most of the 
molecular mechanisms involved are not yet clear. 

For a cancer cell to become dangerous, it must break free of constraints that 
keep normal cells in their proper places and prevent them from invading neigh- 
boring tissues. Invasiveness is thus one of the defining properties of malignant 
tumors, which show a disorganized pattern of growth and ragged borders, with 
extensions into the surrounding tissue (see, for example, Figure 20-8). Although 
the underlying molecular changes are not well understood, invasiveness almost 
certainly requires a disruption of the adhesive mechanisms that normally keep 
cells tethered to their proper neighbors and to the extracellular matrix. For carci- 
nomas, this change resembles the epithelial-mesenchymal transition (EMT) that 
occurs in some epithelial tissues during normal development (see p. 1042). 

The next step in metastasis—the establishment of colonies in distant organs— 
begins with entry into the circulation: the invasive cancer cells must penetrate the 
Figure 20-15 The tumor
microenvironment plays a role in
tumorigenesis. Tumors consist of
many cell types, including cancer cells,
endothelial cells, pericytes (vascular
smooth muscle cells), fibroblasts,
and inflammatory white blood cells.
Communication among these and other 
cell types plays an important part in tumor 
development. Note, however, that only the 
cancer cells are thought to be genetically 
abnormal in a tumor. 
Figure 20-16 Steps in the process of metastasis. This example illustrates the spread of a tumor from an organ such as the
bladder to the liver. Tumor cells may enter the bloodstream directly by crossing the wall of a blood vessel, as diagrammed here, 
or, more commonly perhaps, by crossing the wall of a lymphatic vessel that ultimately discharges its contents (lymph) into the 
bloodstream. Tumor cells that have entered a lymphatic vessel often become trapped in lymph nodes along the way, giving rise 
to lymph-node metastases. 

Studies in animals show that typically far fewer than one in every thousand malignant tumor cells that enter the bloodstream 
will colonize a new tissue so as to produce a detectable tumor at a new site. 


wall of a blood or lymphatic vessel. Lymphatic vessels, being larger and having 
more flimsy walls than blood vessels, allow cancer cells to enter in small clumps; 
such clumps may then become trapped in lymph nodes, giving rise to lymph- 
node metastases. The cancer cells that enter blood vessels, in contrast, seem to 
do so singly. With modern techniques for sorting cells according to their surface 
properties, it has become possible in some cases to detect these circulating tumor 
cells (CTCs) in samples of blood from cancer patients, even though they are only a 
minute fraction of the total blood-cell population. These cells, in principle at least, 
provide a useful sample of the tumor-cell population for genetic analysis. 

Of the cancer cells that enter the lymphatics or bloodstream, only a tiny pro- 
portion succeed in making their exit, settling in new sites, and surviving and pro- 
liferating there as founders of metastases. Experiments show that fewer than one 
in thousands, perhaps one in millions, manage this feat. The final step of coloni- 
zation seems to be the most difficult: like the Vikings who landed on the inhospi- 
table shores of Greenland, the migrant cells may fail to survive in the alien envi- 
ronment; or they may only thrive there for a short while to found a little colony—a 
micrometastasis—that then dies out (Movie 20.3). 

Many cancers are discovered before they have managed to found metastatic 
colonies and can be cured by destruction of the primary tumor. But on occasion, 
an undetected micrometastasis will remain dormant for many years, only to
reveal its presence by erupting into growth to form a large secondary tumor long 
after the primary tumor has been removed. 


Many Properties Typically Contribute to Cancerous Growth 


Clearly, to produce a cancer, a cell must acquire a range of aberrant properties—a 
collection of subversive new skills—as it evolves. Different cancers require differ- 
ent combinations of these properties. Nevertheless, cancers all share some com- 
mon features. By definition, they all ignore or misinterpret normal social controls 
so as to proliferate and spread where normal cells would not. These defining prop- 
erties are commonly combined with other features that help the miscreants to 
arise and thrive. A list of the key attributes of cancer cells in general would include 
the following, all of which we have just discussed: 


1. They grow (biosynthesize) when they should not, aided by a metabolism 
shifted from oxidative phosphorylation toward aerobic glycolysis. 


2. They go through the cell-division cycle when they should not. 


3. They escape from their home tissues (that is, they are invasive) and survive 
and proliferate in foreign sites (that is, they metastasize). 


4. They have abnormal stress responses, enabling them to survive and con- 
tinue dividing in conditions of stress that would arrest or kill normal cells, 
and they are less prone than normal cells to commit suicide by apoptosis. 


5. They are genetically and epigenetically unstable. 


6. They escape replicative cell senescence, either by producing telomerase or 
by acquiring another way of stabilizing their telomeres.

In the next section of the chapter, we examine the mutations and molecular
mechanisms that underlie these and other properties of cancer cells. 


Summary 


Cancer cells, by definition, grow and proliferate in defiance of normal controls (that 
is, they are neoplastic) and are able to invade surrounding tissues and colonize dis- 
tant organs (that is, they are malignant). By giving rise to secondary tumors, or 
metastases, they become difficult to eradicate by surgery or local irradiation. Can- 
cers are thought to originate from a single cell that has experienced an initial muta- 
tion, but the progeny of this cell must undergo many further changes, requiring 
additional mutations and epigenetic events, to become cancerous. Tumor progres- 
sion usually takes many years and reflects the operation of a Darwinian-like pro- 
cess of evolution, in which somatic cells undergo mutation and epigenetic changes 
accompanied by natural selection. 

Cancer cells acquire a variety of special properties as they evolve, multiply, and 
spread. Their mutant genomes enable them to grow and divide in defiance of the 
signals that normally keep cell proliferation under tight control. As part of the evo- 
lutionary process of tumor progression, cancer cells acquire a collection of addi- 
tional abnormalities, including defects in the controls that permanently stop cell 
division or induce apoptosis in response to cell stress or DNA damage, and in the 
mechanisms that normally keep cells from straying from their proper place. All of 
these changes increase the ability of cancer cells to survive, grow, and divide in their 
original tissue and then to metastasize, founding new colonies in foreign environ- 
ments. The evolution of a tumor also depends on other cells present in the tumor 
microenvironment, collectively called stromal cells, that the cancer attracts and 
manipulates. 

Since many changes are needed to confer this collection of asocial behaviors, it is 
not surprising that most cancer cells are genetically and/or epigenetically unstable. 
This instability is thought to be selected for in the clones of aberrant cells that are 
able to produce tumors, because it greatly accelerates the accumulation of the
further genetic and epigenetic changes that are required for tumor progression.


CANCER-CRITICAL GENES: HOW THEY ARE FOUND
AND WHAT THEY DO 


As we have seen, cancer depends on the accumulation of inherited changes in 
somatic cells. To understand it at a molecular level we need to identify the muta- 
tions and epigenetic changes involved and to discover how they give rise to can- 
cerous cell behavior. Finding the relevant cells is often easy; they are favored by 
natural selection and call attention to themselves by giving rise to tumors. But 
how do we identify those genes with the cancer-promoting changes among all 
the other genes in the cancerous cells? A typical cancer depends on a whole set 
of mutations and epigenetic changes—usually a somewhat different set in each 
individual patient. In addition, a given cancer cell will also contain a large num- 
ber of somatic mutations that are accidental by-products—so-called passengers 
rather than drivers—of its genetic instability, and it can be difficult to distinguish 
these meaningless changes from those changes that have a causative role in the 
disease. Despite these difficulties, many of the genes that are repeatedly altered 
in human cancers have been identified over the past 40 years. We will call such 
genes, for want of a better term, cancer-critical genes, meaning all genes whose 
alteration contributes to the causation or evolution of cancer by driving tumori- 
genesis. 

In this section, we shall first discuss how cancer-critical genes are identified. 
We shall then examine their functions and the parts they play in conferring on 
cancer cells the properties outlined in the first part of the chapter. We shall end 
the section by discussing colon cancer as an extended example, showing how a 
succession of changes in cancer-critical genes enables a tumor to evolve from one 
pattern of bad behavior to another that is worse. 


The Identification of Gain-of-Function and Loss-of-Function 
Cancer Mutations Has Traditionally Required Different Methods 


Cancer-critical genes are grouped into two broad classes, according to whether the 
cancer risk arises from too much activity of the gene product or too little. Genes of 
the first class, in which a gain-of-function mutation can drive a cell toward cancer, 
are called proto-oncogenes; their mutant, overactive or overexpressed forms are 
called oncogenes. Genes of the second class, in which a loss-of-function muta- 
tion can contribute to cancer, are called tumor suppressor genes. In either case, 
the mutation may lead toward cancer directly (by causing cells to proliferate 
when they should not) or indirectly—for example, by causing genetic or epigen- 
etic instability and so hastening the occurrence of other inherited changes that 
directly stimulate tumor growth. Those genes whose alteration results in genomic 
instability represent a subclass of cancer-critical genes that are sometimes called 
genome maintenance genes. 

As we shall see, mutations in oncogenes and tumor suppressor genes can 
have similar effects in promoting the development of cancer; overproduction of a 
signal for cell proliferation, for example, can result from either kind of mutation. 
Thus, from the point of view of a cancer cell, oncogenes and tumor suppressor 
genes—and the mutations that affect them—are flip sides of the same coin. The 
techniques that led to the discovery of these two categories of genes, however, are 
quite different. 

The mutation of a single copy of a proto-oncogene that converts it to an onco- 
gene has a dominant, growth-promoting effect on a cell (Figure 20-17A). Thus, we 
can identify the oncogene by its effect when it is added—by DNA transfection, for 
example, or through infection with a viral vector—to the genome of a suitable type 
of tester cell or experimental animal. In the case of the tumor suppressor gene, on 
the other hand, the cancer-causing alleles produced by the change are generally 
recessive: often (but not always) both copies of the normal gene must be removed 
or inactivated in the diploid somatic cell before an effect is seen (Figure 20-17B). 
This calls for a different experimental approach, one focusing on discovering what 
is missing in the cancer cell. 
Figure 20-17 Cancer-critical mutations
We begin by discussing some examples of each class of cancer-critical genes
to illustrate basic principles. These examples are chosen also for their historical 
importance: the experiments that led to their discovery—at different times and by 
different methods—marked turning points in the understanding of cancer. 


Retroviruses Can Act as Vectors for Oncogenes That Alter Cell 
Behavior 


The search for the genetic causes of human cancer took a devious route, begin- 
ning with clues that came from the study of tumor viruses. Although viruses are 
involved only in a minority of human cancers, a set of viruses that infect animals 
provided critical early tools for studying cancer. 

One of the first animal viruses to be implicated in cancer was discovered over 
100 years ago in chickens, when an infectious agent that causes connective-tissue 
tumors, or sarcomas, was characterized as a virus—the Rous sarcoma virus. Like 
all the other RNA tumor viruses discovered since, it is a retrovirus. When it infects 
a cell, its RNA genome is copied into DNA by reverse transcription, and the DNA 
is inserted into the host genome, where it can persist and be inherited by subse- 
quent generations of cells. Something in the DNA inserted by the Rous sarcoma 
virus made the host cells cancerous, but what was it? The answer was a surprise. 
It turned out to be a piece of DNA that was unnecessary for the virus’s own sur- 
vival or reproduction; instead, it was a passenger, a gene called v-Src, that the virus 
had picked up on its travels. v-Src was unmistakably similar, but not identical, to a 
gene—c-Src—that was discovered in the normal vertebrate genome. c-Src had evi- 
dently been caught up accidentally by the retrovirus from the genome of a previ- 
ously infected host cell, and it had undergone mutation in the process to become 
an oncogene (v-Src). 

This Nobel Prize-winning finding was followed by a flood of discoveries of 
other viral oncogenes carried by retroviruses that cause cancer in nonhuman ani- 
mals. Each such oncogene turned out to have a counterpart proto-oncogene in 
the normal vertebrate genome. As was the case for Src, these other oncogenes 
generally differed from their normal counterparts, either in structure or in level of 
expression. But how did this relate to typical human cancers, most of which are 
not infectious and in which retroviruses play no part?


Different Searches for Oncogenes Converged on the Same
Gene—Ras 


In an attempt to answer the above question, other researchers searched directly 
for oncogenes in the genomes of human cancer cells. They did this by searching 
for DNA fragments from cancer cells that could provoke uncontrolled prolifera- 
tion when introduced into noncancerous cell lines. As tester cells for the assay, 
cell lines derived from mouse fibroblasts were used. These cells had been pre- 
viously selected for their ability to proliferate indefinitely in culture, and they 
are thought to already contain alterations that take them part of the way toward 
malignancy. For this reason, the addition of a single oncogene can sometimes be 
enough to produce a dramatic effect. 

When DNA was extracted from the human tumor cells, broken into fragments, 
and introduced into the cultured cells, occasional colonies of abnormally prolifer- 
ating cells began to appear in the culture dish. These cells showed a transformed 
phenotype, outgrowing the untransformed cells in the culture and piling up in 
layer upon layer (see Figure 20-11). Each colony was a clone originating from a 
single cell that had incorporated a DNA fragment that drove cancerous behavior. 
This fragment, which carried markers of its human origin, could be isolated from 
the transformed cultured mouse cells. And once isolated and sequenced, it could 
be recognized: it contained a human version of a gene already known from study 
of a retrovirus that caused tumors in rats—an oncogene called v-Ras. 

The newly discovered oncogene was clearly derived by mutation from a nor- 
mal human gene, one of a small family of proto-oncogenes called Ras. This dis- 
covery in the early 1980s of the same oncogene in human tumor cells and in an 
animal tumor virus was electrifying. The implication that cancers are caused by 
mutations in a limited number of cancer-critical genes transformed our under- 
standing of the molecular biology of cancer. 

As discussed in Chapter 15, normal Ras proteins are monomeric GTPases that 
help transmit signals from cell-surface receptors to the cell interior (see Movie 
15.7). The Ras oncogenes isolated from human tumors contain point mutations 
that create a hyperactive Ras protein that cannot shut itself off by hydrolyzing its 
bound GTP to GDP. Because this makes the protein hyperactive, its effect is dom- 
inant—that is, only one of the cell’s two gene copies needs to change to have an 
effect. One or another of the three human Ras family members is mutated in per- 
haps 30% of all human cancers. Ras genes are thus among the most important of 
all cancer-critical genes. 


Genes Mutated in Cancer Can Be Made Overactive in Many Ways 


Figure 20-18 summarizes the types of accidents that can convert a proto-on- 
cogene into an oncogene. (1) A small change in DNA sequence such as a point 
Figure 20-18 The types of accidents that can convert a proto-oncogene into an oncogene.
mutation or deletion may produce a hyperactive protein when it occurs within a
protein-coding sequence, or lead to protein overproduction when it occurs within 
a regulatory region for that gene. (2) Gene amplification events, such as those that 
can be caused by errors in DNA replication, may produce extra gene copies; this 
can lead to overproduction of the protein. (3) A chromosomal rearrangement— 
involving the breakage and rejoining of the DNA helix—may either change the 
protein-coding region, resulting in a hyperactive fusion protein, or alter the con- 
trol regions for a gene so that a normal protein is overproduced. 

As one example, the receptor for the extracellular signal protein epidermal 
growth factor (EGF) can be activated by a deletion that removes part of its
extracellular domain, causing it to be active even in the absence of EGF (Figure 20-19). It
thus produces an inappropriate stimulatory signal, like a faulty doorbell that rings 
even when nobody is pressing the button. Mutations of this type are frequently 
found in the most common type of human brain tumor, called glioblastoma. 

As another example, the Myc protein, which acts in the nucleus to stimulate 
cell growth and division (see Chapter 17), generally contributes to cancer by 
being overproduced in its normal form. In some cases, the gene is amplified— 
that is, errors of DNA replication lead to the creation of large numbers of gene 
copies in a single cell. Or a point mutation can stabilize the protein, which nor- 
mally turns over very rapidly. More commonly, the overproduction appears to 
be due to a change in a regulatory element that acts on the gene. For example, a 
chromosomal translocation can inappropriately bring powerful gene regulatory 
sequences next to the Myc protein-coding sequence, so as to produce unusually 
large amounts of Myc mRNA. Thus, in Burkitt’s lymphoma, a translocation brings 
the Myc gene under the control of sequences that normally drive the expression of 
antibody genes in B lymphocytes. As a result, the mutant B cells tend to proliferate 
excessively and form a tumor. Different specific chromosome translocations are 
common in other cancers. 


Studies of Rare Hereditary Cancer Syndromes First Identified 
Tumor Suppressor Genes 


Identifying a gene that has been inactivated in the genome of a cancer cell requires 
a different strategy from finding a gene that has become hyperactive: one cannot, 
for example, use a cell transformation assay to identify something that simply is 
not there. The key insight that led to the discovery of the first tumor suppressor 
gene came from studies of a rare type of human cancer, retinoblastoma, which 
arises from cells in the retina of the eye that are converted to a cancerous state by 
an unusually small number of mutations. As often happens in biology, the discov- 
ery arose from examination of a special case, but it turned out to reveal a gene of 
widespread importance. 

Retinoblastoma occurs in childhood, and tumors develop from neural pre- 
cursor cells in the immature retina. About one child in 20,000 is afflicted. One 
form of the disease is hereditary, and the other is not. In the hereditary form, 
Figure 20-19 Mutation of the epidermal
growth factor (EGF) receptor can make 
it active even in the absence of EGF, and 
consequently oncogenic. Only one of the 
possible types of activating mutations is 
illustrated here. 
multiple tumors usually arise independently, affecting both eyes; in the
nonhereditary form, only one eye is affected, and by only one tumor. A few individuals
with retinoblastoma have a visibly abnormal karyotype, with a deletion of a spe- 
cific band on chromosome 13 that, if inherited, predisposes an individual to the 
disease. Deletions of this same region are also encountered in tumor cells from 
some patients with the nonhereditary disease, which suggested that the cancer 
was caused by loss of a critical gene in that location. 

Using the location of this chromosomal deletion, it was possible to clone and 
sequence the Rb gene. It was then discovered that those who suffer from the 
hereditary form of the disease have a deletion or loss-of-function mutation pres- 
ent in one copy of the Rb gene in every somatic cell. These cells are predisposed 
to becoming cancerous, but do not do so if they retain one good copy of the gene. 
The retinal cells that are cancerous are defective in both copies of Rb because of a 
somatic event that has eliminated the function of the previously good copy. 

In patients with the nonhereditary form of the disease, by contrast, the non- 
cancerous cells show no defect in either copy of Rb, while the cancerous cells have 
become defective in both copies. These nonhereditary retinoblastomas are very 
rare because they require two independent events that inactivate the same gene 
on two chromosomes in a single retinal cell lineage (Figure 20-20). 
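
The contrast between the two forms of the disease can be made concrete with a back-of-the-envelope calculation. The sketch below assumes that each good copy of Rb is inactivated independently, with the same small probability in each retinal precursor cell lineage; the function name, the number of lineages, and the probability are made-up illustration values, not measurements.

```python
def expected_tumors(n_lineages, p_hit, inherited_defective_copies):
    """Expected number of cell lineages that lose both copies of Rb.

    Assumes each remaining good copy is inactivated independently with
    probability p_hit per lineage (a deliberate simplification).
    """
    if inherited_defective_copies not in (0, 1):
        raise ValueError("expected 0 (nonhereditary) or 1 (hereditary)")
    # One somatic event is needed if a defective copy was inherited, else two.
    hits_needed = 2 - inherited_defective_copies
    return n_lineages * p_hit ** hits_needed

n, p = 1_000_000, 1e-6  # hypothetical values
print(expected_tumors(n, p, inherited_defective_copies=1))  # hereditary form
print(expected_tumors(n, p, inherited_defective_copies=0))  # nonhereditary form
```

With these invented numbers, a person who inherits one defective copy would expect about one tumor, whereas a person with two good copies would expect about one in a million. This is the pattern described above: multiple independent tumors in the hereditary form, and a single, very rare tumor in the nonhereditary form.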

The Rb gene is also missing in several common types of sporadic cancer, 
including carcinomas of lung, breast, and bladder. These more common cancers 
arise by a more complex series of genetic changes than does retinoblastoma, and 
they make their appearance much later in life. But in all of them, it seems, loss of 
Rb function is frequently a major step in the progression toward malignancy. 

The Rb gene encodes the Rb protein, which is a universal regulator of the cell 
cycle present in almost all cells of the body (see Figure 17-61). It acts as one of 
the main brakes on progress through the cell-division cycle, and its loss can allow 
cells to enter the cell cycle inappropriately, as we discuss later. 


Both Genetic and Epigenetic Mechanisms Can Inactivate Tumor 
Suppressor Genes 


For tumor suppressor genes, it is their inactivation that is dangerous. This inacti- 
vation can occur in many ways, with different combinations of mishaps serving 
to eliminate or cripple both gene copies. The first copy may, for example, be lost 
by a small chromosomal deletion or inactivated by a point mutation. The second 
copy is commonly eliminated by a less specific and more probable mechanism: 


Figure 20-20 The genetic mechanisms 
that cause retinoblastoma. In the 
hereditary form, all cells in the body lack 
one of the normal two functional copies of 
the Rb tumor suppressor gene, and tumors 
occur where the remaining copy is lost
or inactivated by a somatic event (either
mutation or epigenetic silencing). In the 
nonhereditary form, all cells initially contain 
two functional copies of the gene, and the 
tumor arises because both copies are lost 
or inactivated through the coincidence of 
two somatic events in a single line of cells. 
the chromosome carrying the remaining normal copy may be lost from the cell
through errors in chromosome segregation; or the normal gene, along with 
neighboring genetic material, may be replaced by a mutant version through 
either a mitotic recombination event or a gene conversion that accompanies it (see 
p. 286). 

Figure 20-21 summarizes the range of ways in which the remaining good copy 
of a tumor suppressor gene can be lost through a DNA sequence change, using the 
Rb gene as an example. It is important to note that, except for the point mutation 
mechanism illustrated at the far right, these pathways all produce cells that carry 
only a single type of DNA sequence in the chromosomal region containing their 
Rb genes—a sequence that is identical to the sequence in the original mutant 
chromosome. 

Epigenetic changes provide another important way to permanently inactivate 
a tumor suppressor gene. Most commonly, the gene may become packaged into 
heterochromatin and/or the C nucleotides in CG sequences in its promoter may 
become methylated in a heritable manner (see pp. 404-405). These mechanisms 
can irreversibly silence the gene in a cell and in all of its progeny. Analysis of
methylation patterns in cancer genomes shows that epigenetic gene silencing is a
frequent event in tumor progression, and epigenetic mechanisms are now thought
to help inactivate several different tumor suppressor genes in most human can- 
cers (Figure 20-22). 


Systematic Sequencing of Cancer Cell Genomes Has Transformed 
Our Understanding of the Disease 


Methods such as those we have described above shone a spotlight on a set of can- 
cer-critical genes that were identified in a piecemeal fashion. Meanwhile, the rest 
of the cancer cell genome remained in darkness: it was a mystery how many other 
mutations might lurk there, of what types, in which varieties of cancer, at what 
frequencies, with what variations from patient to patient, and with what conse- 
quences. With the sequencing of the human genome and the dramatic advances 
in DNA sequencing technology (see Panel 8-1, pp. 478-481), it has become pos- 
sible to see the whole picture—to view cancer cell genomes in their entirety. This 
transforms our understanding of the disease. 

Cancer cell genomes can be scanned systematically in several different ways. 
At one extreme—the most costly, but no longer prohibitively so—one can deter- 
mine a tumor’s complete genome sequence. More cheaply, one can focus just on 
the 21,000 or so genes in the human genome that code for protein (the so-called 
exome), looking for mutations in the cancer cell DNA that alter the amino acid 
sequence of the product or prevent its synthesis (Figure 20-23). There are also 
efficient techniques to survey the genome for regions that have undergone
deletion or duplication, without the need for complete sequence information.
The genome can be scanned for epigenetic changes. And finally, alterations in
levels of gene expression can be systematically determined by analysis of mRNAs
(see Figure 7-3). These approaches generally involve comparing cancer cells with
normal controls, ideally noncancerous cells originating in the same tissue and
from the same patient.


Figure 20-21 Six ways of losing the
remaining good copy of a tumor
suppressor gene through a change in
DNA sequences. A cell that is defective
in only one of its two copies of a tumor
suppressor gene (for example, the Rb
gene) usually behaves as a normal,
healthy cell; the diagrams below show how
this cell may lose the function of the other
gene copy as well and thereby progress
toward cancer. A seventh possibility,
frequently encountered with some tumor
suppressors, is that the gene may be
silenced by an epigenetic change, without
alteration of the DNA sequence,
as illustrated in Figure 20-22. (After
W.K. Cavenee et al., Nature 305:779-784,
1983. With permission from Macmillan
Publishers Ltd.)


Figure 20-22 The pathways leading to
loss of tumor suppressor gene function
in cancer involve both genetic and
epigenetic changes. (A) As indicated, the
changes that silence tumor suppressor
genes can occur in any order. Both DNA
methylation and the packaging of a gene
into condensed chromatin can prevent
its expression in a way that is inherited
when a cell divides (see Figure 4-44).
(B) The frequency of gene silencing by
hypermethylation observed in four different
types of cancer. The five genes listed at the
top can all function as tumor suppressor
genes; BRCA1 and hMLH1 affect genome
stability and are in the subclass known as
genome maintenance genes. ND, no data.
(Adapted from M. Esteller et al., Cancer
Res. 61:3225-3229, 2001.)


Figure 20-23 The distinct types of DNA
sequence changes found in oncogenes
compared to tumor suppressor
genes. In this diagram, mutations that
change an amino acid are denoted by
blue arrowheads, whereas mutations
that truncate the polypeptide chain are
marked by yellow arrowheads. (A) As in
this example, oncogene mutations can
be detected by the fact that the same
nucleotide change is repeatedly found
among the missense mutations in a
gene. (B) For tumor suppressor genes, by
contrast, nonsense mutations that abort
protein synthesis by creating stop codons
predominate. (Adapted from B. Vogelstein
et al., Science 339:1546-1558, 2013.)


Many Cancers Have an Extraordinarily Disrupted Genome


Cancer genome analysis reveals, first of all, the scale of gross genetic disruption 
in cancer cells. This varies greatly from one type of cancer and one cancer patient 
to another, both in severity and in character. In some cases, the karyotype—the 
set of chromosomes as they appear at mitosis—is normal or nearly so, but many 
point mutations are detected in individual genes, suggesting a failure of the 
repair mechanisms that normally correct local errors in the replication or main- 
tenance of DNA sequences. Often, however, the karyotype is severely disordered, 
with many chromosome breaks and rearrangements. In some breast cancers, for 
example, genome sequencing reveals an astonishing scene of genetic chaos (Fig- 
ure 20-24), with hundreds of chromosome breaks and translocations, resulting in 
many deletions, duplications, and amplifications of parts of the genome. In such 
cells, the normal machinery for avoidance or repair of DNA double-strand breaks 
is evidently somehow defective, destabilizing the genome by giving rise to broken 
chromosomes whose fragments then rejoin in random combinations. From the 
pattern of changes, one can infer that this disruptive process has occurred repeat- 
edly during the evolution of the tumor, with a progressive increase of genetic dis- 
order. Breast cancers showing the most extreme chromosome disorder are usu- 
ally hard to treat and have a gloomy prognosis. 

One survey of more than 3000 individual cancer specimens showed that on 
average 24 separate blocks of genetic material were duplicated in each tumor, 
amounting to 17% of the normal genome, and 18 blocks were deleted, amount- 
ing to 16% of the normal genome. Many of these changes were found repeatedly, 
suggesting that they contain cancer-critical genes whose loss (tumor suppressor 
genes) or gain (oncogenes) confers a selective advantage. 
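
As a rough worked example of what these averages imply, the short Python sketch below converts the quoted percentages into a mean size per altered block. It assumes a reference genome of 3.2 billion nucleotide pairs and treats the survey averages as if they applied to a single typical tumor, which is a simplification; the function name is ours.

```python
# Rough arithmetic on the survey averages quoted in the text.
GENOME_BP = 3.2e9  # assumed reference size: 3.2 billion nucleotide pairs


def mean_block_size_bp(fraction_of_genome, n_blocks, genome_bp=GENOME_BP):
    """Average size of one altered block, in nucleotide pairs."""
    return fraction_of_genome * genome_bp / n_blocks


dup_block = mean_block_size_bp(0.17, 24)  # duplicated: 17% of genome in 24 blocks
del_block = mean_block_size_bp(0.16, 18)  # deleted: 16% of genome in 18 blocks

print(f"mean duplicated block: {dup_block / 1e6:.1f} million nucleotide pairs")
print(f"mean deleted block:    {del_block / 1e6:.1f} million nucleotide pairs")
```

On these assumptions, a typical altered block spans tens of millions of nucleotide pairs, which is large enough to contain many genes.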

Whole-genome analysis also helps to explain some cancers that seem, at first 
sight, to be exceptions to the general rules. An example is retinoblastoma, with 
its early onset during childhood. If cancers in general require an accumulation of 
many genetic changes and are thus diseases of old age, what makes retinoblas- 
toma different? Whole-genome sequencing confirms that in retinoblastoma, the 
tumor cells contain loss-of-function mutations in the Rb gene; but, astonishingly, 
they contain practically no mutations or genome rearrangements that affect any 
other oncogene or tumor suppressor gene. Instead, they contain many epigenetic 
modifications, which alter the level of expression of many known cancer-critical 
genes—as many as 15 in one well-analyzed case. 


Many Mutations in Tumor Cells are Merely Passengers 


Cancer cells generally contain many mutations in addition to gross chromosome 
abnormalities: point mutations can be scattered over the genome as a whole at 
a rate of about one per million nucleotide pairs, in addition to the abnormalities
attributed to chromosome breakage and rejoining. Systematic surveys of the
protein-coding genes in common solid tumors (such as those of the breast, colon,
brain, or pancreas) have revealed that an average of 33 to 66 genes have
undergone somatic mutation affecting the sequence of their protein product. Mutations
in noncoding regions of the genome are much more numerous, as one would
expect from the much larger fraction of the genome that noncoding DNA
represents. But they are considerably more difficult to interpret.


Figure 20-24 The chromosomal
rearrangements in breast cancer
cells. The results of an extensive DNA
sequencing analysis performed on two
different primary tumors are displayed
as “Circos plots.” In each plot, the
reference DNA sequences of the 22
autosomes and single sex chromosome
(X) of a normal human female (3.2 billion
nucleotide pairs) are aligned end-to-end
to form a circle. Colored lines within
the circle are then used to indicate the
chromosome alterations found in the
particular primary tumor. As indicated,
purple lines connect sites at which two
different chromosomes have become
joined to create an interchromosomal
rearrangement, while green lines connect
the sites of rearrangements found within a
single chromosome. The intrachromosomal
rearrangements can be seen to
predominate, and most join neighboring
sections of DNA that were originally located
within 2 million nucleotide pairs of each
other. The increases in copy number,
shown in blue, reveal the amplified DNA
sequences (see the highly amplified regions
indicated). (Adapted from P.J. Stephens et
al., Nature 462:1005-1010, 2009.)

The high frequency of mutations testifies to the genetic instability of many 
cancer cells, but it leaves us with a difficult problem. How can we discover which 
of the mutations are drivers of cancer—that is, causal factors in the development 
of the disease—and which are merely passengers—mutations that happen to 
have occurred in the same cell as the driver mutations, thanks to genetic insta- 
bility, but are irrelevant to the development of the disease? A simple criterion is 
based on frequency of occurrence. Driver mutations affecting a gene that plays a 
part in the disease will be seen repeatedly, in many different patients. In contrast, 
passenger mutations, occurring at more-or-less random locations in the genome 
and conferring no selective advantage on the cancer cell, are unlikely to be found 
in the same genes in different patients. 
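
The frequency criterion can be sketched in a few lines of Python. This is a minimal illustration rather than a published method: the patient identifiers, the placeholder names GENE_A to GENE_E, and the 40% cutoff are hypothetical choices made only to show the counting step.

```python
from collections import Counter


def recurrence_by_gene(tumors):
    """Count, for each gene, how many tumors carry at least one mutation in it."""
    counts = Counter()
    for mutated_genes in tumors.values():
        counts.update(set(mutated_genes))  # count each gene once per tumor
    return counts


def candidate_drivers(tumors, min_fraction=0.2):
    """Genes mutated in at least `min_fraction` of tumors (arbitrary cutoff)."""
    n_tumors = len(tumors)
    counts = recurrence_by_gene(tumors)
    return sorted(g for g, c in counts.items() if c / n_tumors >= min_fraction)


# Hypothetical toy data: patient id -> genes with somatic coding mutations.
toy_tumors = {
    "patient_1": ["APC", "K-Ras", "GENE_A"],
    "patient_2": ["APC", "p53", "GENE_B"],
    "patient_3": ["APC", "K-Ras", "GENE_C"],
    "patient_4": ["p53", "GENE_D", "GENE_D"],
    "patient_5": ["APC", "GENE_E"],
}

print(candidate_drivers(toy_tumors, min_fraction=0.4))
```

In this toy data set, the placeholder genes each appear in a single patient and fall below the cutoff, as passengers would be expected to, whereas the recurrently mutated genes are retained as candidates.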

Figure 20-25 shows the results of an analysis of this sort for a large sample 
of colorectal cancers. The different sites in the genome are laid out on a two-di- 
mensional array, with chromosome serial number along one axis and position 
within each chromosome along the other. The frequency with which mutations 
are encountered is shown by height above this plane, creating a mutation “land- 
scape” with mountains (sites where mutations are found in a large proportion of 
the tumors in the sample), hills (where mutations are found less frequently but 
still more often than would be expected for a random scattering over the genome), 
and hillocks (sites of occasional mutations, occurring at a frequency no higher 
than would be expected for mutations scattered at random in each individual 
tumor). The mountains and the hills are strong candidates to be the sites of driver 
mutations—in other words, sites of cancer-critical genes; the hillocks are likely to 
correspond to passengers. Indeed, many of the mountains and hills turn out to be 
sites of known oncogenes or tumor suppressor genes, whereas the hillocks mostly 
correspond to genes that have no known or probable role in causation of cancer. 
Of course, some hillocks may correspond to genes that are mutated in only a few 
rare patients but are nevertheless cancer-critical for them. 
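
The distinction between mountains, hills, and hillocks can be expressed as a simple statistical sketch. The Python below assumes that, if mutations were scattered at random, each tumor would carry a mutation at a given site with a fixed background probability, so that the number of mutated tumors follows a binomial distribution. The background probability, the 20% "mountain" cutoff, and the significance level are hypothetical parameters; a real analysis would also need to allow for gene length, local mutation rate, and the large number of sites tested.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def classify_site(k, n, background_p, mountain_fraction=0.2, alpha=0.001):
    """Label a site from k mutated tumors out of n (all cutoffs are assumptions)."""
    if k / n >= mountain_fraction:
        return "mountain"  # mutated in a large proportion of tumors
    if binomial_tail(k, n, background_p) < alpha:
        return "hill"      # rarer, but more often than random scatter predicts
    return "hillock"       # consistent with random scatter


# Hypothetical example: 100 tumors, 1% background chance per tumor per site.
for k in (40, 8, 1):
    print(k, classify_site(k, n=100, background_p=0.01))
```

As the text notes, a site labeled a hillock by a test of this kind may still be cancer-critical in the rare patients who carry it.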


About One Percent of the Genes in the Human Genome Are 
Cancer-Critical 


From studies such as the one just described, it is estimated that the number of 
driver mutations for an individual case of cancer (the sum of meaningful epigen- 
etic and genetic changes in both coding sequences and regulatory regions) is typ- 
ically on the order of 10, explaining why cancer progression generally involves an 
increase in genetic and/or epigenetic instability that enhances the rate of such 
changes.


Figure 20-25 The mutation landscape in
colorectal cancer. In this two-dimensional
representation of the human genome,
the green surface depicts the 22 human
autosomes plus the X sex chromosome
as being laid out side-by-side in numerical
order from left to right, with the DNA
sequence of each chromosome running
from back to front. The mountains
represent the locations of genes
mutated with high frequency in different,
independent tumors. As indicated, these
are suspected driver mutations in the
adenomatous polyposis coli (APC), K-Ras,
p53, phosphoinositide 3-kinase (PIK3CA),
and ubiquitin ligase (FBXW7) proteins.
(Adapted from L.D. Wood et al., Science
318:1108-1113, 2007.)


By compiling the data for different types of cancer, each with its own range
of identified driver mutations, we can develop a comprehensive catalog of genes 
that are strongly suspected to be cancer-critical. Current estimates put the total 
number of such genes at about 300, about 1% of the genes in the human genome. 
These cancer-critical genes are amazingly diverse. Their products include 
secreted signal proteins, transmembrane receptors, GTP-binding proteins, pro- 
tein kinases, transcription regulators, chromatin modifiers, DNA repair enzymes, 
cell-cell adhesion molecules, cell-cycle controllers, apoptosis regulators, scaffold 
proteins, metabolic enzymes, components of the RNA splicing machinery, and 
more besides. All these are susceptible to mutations that can contribute, in one 
way or another, in one tissue or another, to the evolution of cells with the cancer- 
ous properties that we listed earlier on page 1103. 

Clearly, the molecular changes that cause cancer are complex. As we now 
explain, however, the complexity is not quite as daunting as it may initially seem. 


Disruptions in a Handful of Key Pathways Are Common to Many 
Cancers 


Some genes, like Rb and Ras, are mutated in many cases of cancer and in cancers 
of many different types. The involvement of genes such as Rb and Ras in cancer 
is no surprise, now that we understand their normal functions: they control fun- 
damental processes of cell division and growth. But even these common culprits 
feature in considerably less than half of individual cases. What is happening to 
the control of these processes in the many cases of cancer where, for example, Rb 
is intact or Ras is not mutated? What part do mutations in the hundreds of other 
cancer-critical genes play in the development of the disease? With our increas- 
ing knowledge of the normal functions of the genes in the human genome, it is 
becoming easier to see patterns in the cataloged driver mutations and to give 
some simplifying answers to these questions. 

Glioblastoma—the commonest type of human brain tumor—provides a good 
example. Analysis of the genomes of tumor cells from 91 patients identified a total 
of at least 79 genes that were mutated in more than one individual. The normal 
functions of most of these genes were known or could be guessed, allowing them 
to be assigned to specific biochemical or regulatory pathways. Three functional 
groupings stood out, accounting for a total of 21 of the recurrently mutated genes. 
One of these groupings consisted of genes in the Rb pathway (that is, Rb itself, 
along with genes that directly regulate Rb); this pathway governs initiation of the 
cell-division cycle. Another consisted of genes in the same regulatory subnetwork 
as Ras—a more loosely defined system of genes referred to as the RTK/Ras/PI3K 
pathway, after three of its core components; this pathway serves to transmit sig- 
nals for cell growth and cell division from the cell exterior into the heart of the 
cell. The third grouping consisted of genes in a pathway regulating responses to 
stress and DNA damage—the p53 pathway. We shall have more to say about each 
of these pathways below. 

Out of all tumors, 74% had identifiable mutations in all three pathways. If one 
were to trace these three pathways further upstream and include all the com- 
ponents, known and unknown, on which they depend, this percentage would 
almost certainly be even higher. In other words, in almost every case of glioblas- 
toma, there are mutations that disrupt each of three fundamental controls: the 
control of cell growth, the control of cell division, and the control of responses to 
stress and DNA damage. 

Strikingly, in any given tumor-cell clone, there is a strong tendency for no more 
than one gene to be mutated in each pathway. Evidently, what matters for tumor 
evolution is the disruption of the control mechanism, and not the genetic means 
by which that is achieved. Thus, for example, in a patient whose tumor cells have 
no mutation in Rb itself, there is generally a mutation in some other component 
of the Rb pathway, producing a similar biological effect. 
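
The pathway-level view can be illustrated with a small Python sketch that groups each tumor's mutated genes by pathway, then asks how often all three pathways are hit and how often a hit pathway carries exactly one mutated gene. The gene-to-pathway table is a heavily abbreviated illustration, and the tumors are invented toy data; neither reproduces the glioblastoma study.

```python
# Abbreviated, illustrative gene-to-pathway assignment (not the study's full list).
PATHWAY_OF = {
    "Rb": "Rb", "CDKN2A": "Rb", "CDK4": "Rb",
    "EGFR": "RTK/Ras/PI3K", "PTEN": "RTK/Ras/PI3K", "Ras": "RTK/Ras/PI3K",
    "p53": "p53", "MDM2": "p53",
}


def pathway_hits(mutated_genes, pathway_of=PATHWAY_OF):
    """Map each affected pathway to the set of its genes mutated in one tumor."""
    hits = {}
    for gene in set(mutated_genes):
        pathway = pathway_of.get(gene)
        if pathway is not None:  # genes outside the table are ignored
            hits.setdefault(pathway, set()).add(gene)
    return hits


def summarize(tumors, pathway_of=PATHWAY_OF):
    """Return (fraction of tumors with every pathway hit,
    fraction of pathway hits that involve exactly one mutated gene)."""
    all_pathways = set(pathway_of.values())
    n_all = n_hits = n_single = 0
    for genes in tumors.values():
        hits = pathway_hits(genes, pathway_of)
        if set(hits) == all_pathways:
            n_all += 1
        for gene_set in hits.values():
            n_hits += 1
            n_single += len(gene_set) == 1
    return n_all / len(tumors), (n_single / n_hits if n_hits else 0.0)


# Hypothetical toy tumors: id -> mutated genes.
toy = {
    "t1": ["CDKN2A", "EGFR", "p53"],
    "t2": ["Rb", "PTEN", "MDM2"],
    "t3": ["CDK4", "EGFR", "PTEN", "p53"],
    "t4": ["EGFR", "p53", "GENE_X"],
}

frac_all_three, frac_single_gene = summarize(toy)
print(frac_all_three, frac_single_gene)
```

A high second value corresponds to the tendency described above: once one component of a pathway is disabled, further mutations in the same pathway are rarely found.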

Similar patterns are seen in other types of cancers. A survey of many specimens 
of the major variety of ovarian cancer, for example, identified 67% of patients as
having mutations in the Rb pathway, 45% in the Ras/PI3K pathway (defined more
narrowly than in the glioblastoma study), and more than 96% in the p53 path- 
way. Allowing for additional pathway components not included in the analysis, 
it seems that most cases of this type of cancer, too, have mutations disrupting the 
same three controls, leading to misregulated cell growth, misregulated cell prolif- 
eration, and abnormal disregard of stress and DNA damage. It seems that these 
three fundamental controls are subverted in one way or another in virtually every 
type of cancer. 

We have devoted an entire chapter to the cell cycle and growth controls (Chap- 
ter 17). Some important details of the other two control pathways are reviewed 
next. 


Mutations in the PI3K/Akt/mTOR Pathway Drive Cancer Cells to
Grow


Cell proliferation is not simply a matter of progression through the cell cycle; it 
also requires cell growth, which involves complex anabolic processes through 
which the cell synthesizes all the necessary macromolecules from small-mole- 
cule precursors. If a cell divides inappropriately without growing first, it will get 
smaller at each division and will ultimately die or become too small to divide. 
Cells appear to require two separate signals to grow and divide (Figure 20-26). 
Cancer depends, therefore, not only on a loss of restraints on cell-cycle progres- 
sion, but also on disrupted control of cell growth. 
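
The consequence of dividing without growing can be made concrete with a little arithmetic, sketched here in Python. The model is deliberately crude: it assumes that each division splits the cell into two equal halves and that growth between divisions multiplies cell volume by a fixed factor. The function and parameter names are ours.

```python
def volume_after_divisions(initial_volume, n_divisions, growth_factor_per_cycle=1.0):
    """Volume of one daughter cell after n symmetric divisions.

    growth_factor_per_cycle is the factor by which a cell enlarges before each
    division: 1.0 means no growth, 2.0 means it doubles (balanced growth).
    """
    return initial_volume * (growth_factor_per_cycle / 2.0) ** n_divisions


# No growth: volume halves at every division.
print(volume_after_divisions(1.0, 10))       # 1/1024 of the starting volume
# Balanced growth: doubling before each division keeps cell size constant.
print(volume_after_divisions(1.0, 10, 2.0))
```

Under these assumptions, ten divisions without growth would leave each cell with about a thousandth of its starting volume, which is why proliferation requires a growth signal as well as a signal to divide.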

The phosphoinositide 3-kinase (PI 3-kinase)/Akt/mTOR intracellular signal- 
ing pathway is critical for cell growth control. As described in Chapter 15, various 
extracellular signal proteins, including insulin and insulin-like growth factors,
normally activate this pathway. In cancer cells, however, the pathway is activated
by mutation so that the cell can grow in the absence of such signals. The resulting
abnormal activation of the protein kinases Akt and mTOR not only stimulates
protein synthesis (see Figure 17-64), but also greatly increases both glucose uptake
and the production of the acetyl CoA in the cytosol required for cell lipid
synthesis, as outlined in Figure 20-26B.


Figure 20-26 Cells seem to require two types of signals to proliferate. (A) In order to multiply successfully, most normal cells are suspected to
require both extracellular signals that drive cell-cycle progression (shown here as blue mitogen) and extracellular signals that drive cell growth (shown
here as red growth factor). How mitogens activate the Rb pathway to drive entry into the cell cycle is described in Figure 17-61. (B) Diagram of
the signaling system containing Akt that drives cell growth through greatly stimulating glucose uptake and utilization, including a conversion of the
excess citric acid produced from sugar intermediates in mitochondria into the acetyl CoA that is needed in the cytosol for lipid synthesis and new
membrane production. As indicated, protein synthesis is also increased. This system becomes abnormally activated early in tumor progression. TCA
cycle indicates the tricarboxylic acid cycle (citric acid cycle).

The abnormal activation of the PI 3-kinase/Akt/mTOR pathway, which nor- 
mally occurs early in the process of tumor progression, helps to explain the exces- 
sive rate of glycolysis that is observed in tumor cells, known as the Warburg effect, 
as discussed earlier (see Figure 20-12). As expected from our previous discussion, 
cancers can activate this pathway in many different ways. Thus, for example, a 
growth factor receptor can become abnormally activated, as in Figure 20-19. Also 
very common in cancers is the loss of the PTEN phosphatase, an enzyme that nor- 
mally suppresses the PI 3-kinase/Akt/mTOR pathway by dephosphorylating the 
PI(3,4,5)P3 molecules that the PI 3-kinase forms (see pp. 859-861). PTEN is thus
a common tumor-suppressor gene. 

Of course, mutation is not the only way to overactivate the pathway: high levels 
of insulin in the circulation can have a similar effect. This may explain why the 
risk of cancer is significantly increased, by a factor of two or more, in people who 
are obese or have type 2 diabetes. Their insulin levels are abnormally high, driv- 
ing cancer cell growth without need of mutation in the PI 3-kinase/Akt/mTOR 
pathway. 


Mutations in the p53 Pathway Enable Cancer Cells to Survive and 
Proliferate Despite Stress and DNA Damage 


That cancer cells must break the normal rules governing cell growth and cell divi- 
sion is obvious: that is part of the definition of cancer. It is not so obvious why 
cancer cells should also be abnormal in their response to stress and DNA dam- 
age, and yet this too is an almost universal feature. The gene that lies at the center 
of this response, the p53 gene, is mutated in about 50% of all cases of cancer— 
a higher proportion than for any other known cancer-critical gene. When we 
include with p53 the other genes that are closely involved in its function, we find 
that most cases of cancer harbor mutations in the p53 pathway. Why should this 
be? To answer, we must first consider the normal function of this pathway. 

In contrast to Rb, most cells in the body have very little p53 protein under nor- 
mal conditions: although the protein is synthesized, it is rapidly degraded. More- 
over, p53 is not essential for normal development. Mice in which both copies of 
the gene have been deleted or inactivated typically appear normal in all respects 
except one—they universally develop cancer before 10 months of age. These 
observations suggest that p53 has a function that is required only in special cir- 
cumstances. In fact, cells raise their concentration of p53 protein in response to 
a whole range of conditions that have only one obvious thing in common: they 
are, from the cell’s point of view, pathological, putting the cell in danger of death 
or serious injury. These conditions include DNA damage, putting the cell at risk 
from a faulty genome; telomere loss or shortening (see p. 1016), also dangerous 
to the integrity of the genome; hypoxia, depriving the cell of the oxygen it needs 
to keep its metabolism going; osmotic stress, causing the cell to swell or shrivel; 
and oxidative stress, generating dangerous levels of highly reactive free radicals. 

Yet another form of stress that can activate the p53 pathway arises, it seems, 
when regulatory signals are so intense or uncoordinated as to drive the cell 
beyond its normal limits and into a danger zone where its mechanisms of control 
and coordination break down, as in an engine driven badly or too fast. The p53 
concentration rises, for example, when Myc is overexpressed to oncogenic levels. 

All these circumstances call for desperate action, which may take either of two 
forms: the cell can block any further progress through the division cycle in order 
to take time out to repair or recover from the pathological condition; or it can 
accept that it must die, and do so in a way that minimizes damage to the organism.
A good death, from this point of view, is a death by apoptosis. In apoptosis,
the cell is phagocytosed by its neighbors and its contents are efficiently recycled.
A bad death is a death by necrosis. In necrosis, the cell bursts or disintegrates and 
its contents are spilled into the extracellular space, inducing inflammation. 

The p53 pathway, therefore, behaves as a sort of antenna, sensing the pres- 
ence of a wide range of dangerous conditions, and when any are detected, trig- 
gering appropriate action—either a temporary or permanent arrest of cell cycling 
(senescence), or suicide by apoptosis (Figure 20-27). These responses serve to 
prevent deranged cells from proliferating. Cancer cells are indeed generally 
deranged, and their survival and proliferation thus depend on inactivation of the 
p53 pathway. If the p53 pathway were active in them, they would be halted in their 
tracks or die (Movie 20.4). 

The p53 protein performs its job mainly by acting as a transcription regulator 
(see Movie 17.8). Indeed, the most common mutations observed in p53 in human 
tumors are in its DNA-binding domain, where they cripple the ability of p53 to 
bind to its DNA target sequences. Because p53 binds to DNA as a tetramer, a sin- 
gle mutant subunit within a tetrameric complex can be enough to block its func- 
tion. Thus, mutations in p53 can have a dominant negative effect, causing loss of 
p53 function even when the cell also contains a wild-type version of the gene. For 
this reason, in contrast with other tumor suppressor genes such as Rb, the devel- 
opment of cancer does not always require that both copies of p53 be knocked out. 
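
A simple calculation shows why a single mutant allele can have such a strong effect. The Python sketch below assumes that mutant and wild-type subunits are made in known proportions, that they assemble into tetramers at random, and that any tetramer containing one or more mutant subunits is inactive. These are idealizations: the text says only that a single mutant subunit can be enough to block function. The function name is ours.

```python
from fractions import Fraction


def fraction_fully_wild_type(mutant_share, subunits=4):
    """Fraction of complexes built only from wild-type subunits, random mixing."""
    return (1 - Fraction(mutant_share)) ** subunits


# Heterozygous cell with equal expression of both alleles (an assumption):
active = fraction_fully_wild_type(Fraction(1, 2))
print(active)         # 1/16
print(float(active))  # 0.0625
```

On these assumptions, a cell with one mutant and one wild-type copy of p53 would retain only one-sixteenth of its tetramers in a fully wild-type form, rather than the one-half that might naively be expected.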

As discussed in Chapter 17, the p53 protein exerts its inhibitory effects on the 
cell cycle, in part at least, by inducing the transcription of p21, which encodes a 
protein that binds to and inhibits the cyclin-dependent kinase (Cdk) complexes 
required for progression through the cell cycle. By blocking the kinase activity of 
these Cdk complexes, the p21 protein prevents the cell from progressing through 
S phase and replicating its DNA. 

The mechanism by which p53 induces apoptosis includes stimulation of the
expression of many pro-apoptotic genes, as described in Chapter 18.


Genome Instability Takes Different Forms in Different Cancers 


If the p53 pathway is functional, a cell with unrepaired DNA damage will stop 
dividing or die; it cannot proliferate. Mutations in the p53 pathway are, therefore, 
generally present in cancer cells showing genome instability—which is to say, 
the majority. But how does this genome instability originate? Here too, cancer 
genome studies are illuminating. 

In ovarian cancers, for example, chromosome breaks, translocations, and 
deletions are very common, and these aberrations correlate with a high frequency 
of mutations and epigenetic silencing in the genes needed for repair of DNA
double-strand breaks by homologous recombination, especially Brca1 and Brca2 (see
pp. 281-282). In a subset of colorectal cancers with DNA mismatch repair defects, 
on the other hand, one instead finds many point mutations scattered throughout 
the genome (see pp. 250-251). In both kinds of cancer, the genome is commonly 
destabilized, but different types of mutations can bring this about. 


Figure 20-27 Modes of action of the 
p53 tumor suppressor. The p53 protein 
is a cellular stress sensor. In response to 
hyperproliferative signals, DNA damage, 
hypoxia, telomere shortening, and various 
other stresses, the p53 levels in the cell 
rise. As indicated, this may either arrest 
cell cycling in a way that allows the cell to 
adjust and survive, trigger cell suicide by 
apoptosis, or cause cell “senescence” — an 
irreversible cell-cycle arrest that stops 
damaged cells from dividing.


Cancers of Specialized Tissues Use Many Different Routes to
Target the Common Core Pathways of Cancer 


Mutations in core components of the machinery that regulates cell growth, divi- 
sion, and survival, such as Rb, Ras, PTEN, or p53, are not the only way to pervert 
the control of these processes. Specialized tissues depend on a variety of path- 
ways, as discussed in Chapter 15, to relay environmental signals to the core con- 
trol machinery, and each pathway lays the cells open to subversion in a different 
set of ways. Thus, in different cancers, we can find examples of driver mutations 
in practically all the major signaling pathways through which cells communicate 
during development and tissue maintenance (discussed in Chapters 21 and 22). 

In glioblastoma, for example, most patients have mutations in one or other of 
a set of cell-surface receptor tyrosine kinases, especially the EGF receptor men- 
tioned earlier (linking into the Ras/PI3K pathway), suggesting that the cells from 
which the cancer originates are normally controlled by this route. The cells of the 
prostate gland, on the other hand, respond to the androgen hormone testoster- 
one, and in prostate cancer, components of the androgen receptor signaling path- 
way (a variety of nuclear hormone receptor signaling; see Chapter 15) are often 
mutated. In the normal gut lining, Wnt signaling is critical, and Wnt pathway 
mutations are present in most colorectal cancers. Pancreatic cancers generally 
have mutations in the transforming growth factor-β (TGFβ) signaling pathway.
Activating mutations in the Notch pathway are present in more than 50% of T cell 
acute lymphocytic leukemias, and so on. 

Cells are generally regulated by several different types of external signals that 
must act in combination, representing a “fail-safe” control mechanism that pro- 
tects the organism as a whole from cancer. These signals are different in different 
tissues. As expected, therefore, the corresponding cancers often have mutations 
in several signaling pathways concurrently. This is true of the examples we have 
just listed, which commonly have mutations in other signaling pathways in addi- 
tion to the ones that we have singled out. 


Studies Using Mice Help to Define the Functions of Cancer-Critical 
Genes 


The ultimate test of a gene’s role in cancer has to come from investigations in the 
intact, mature organism. The most favored organism for such studies, apart from 
humans themselves, is the mouse. To explore the function of a candidate onco- 
gene or tumor suppressor gene, one can make a transgenic mouse that overex- 
presses it or a knockout mouse that lacks it. Using the techniques described in 
Chapter 8, one can engineer mice in which the misexpression or deletion of the 
gene is restricted to a specific set of cells, or in which expression of the gene can 
be switched on at will at a chosen point in time, or both, to see whether and how 
tumors develop. Moreover, to follow the growth of tumors from day to day in the 
living organism, the cells of interest can be genetically marked and made visible 
by expression of a fluorescent or luminescent reporter (Figure 20-28). In these 
ways, one can begin to clarify the part that each cancer-critical gene plays in can- 
cer initiation or progression. 


[Figure 20-28 image: panels labeled by age in days (103, 144, 266, 372), with metastases marked]


Figure 20-28 Monitoring tumor growth
and metastasis in a mouse with a
luminescent reporter. A mouse was
genetically engineered in a way that allows
both copies of its PTEN tumor suppressor
gene to be inactivated in the prostate
gland, simultaneously with the prostate-specific
activation of a gene engineered
to produce the enzyme luciferase (derived
from fireflies). After an injection of luciferin
(the substrate molecule for luciferase) into
the mouse’s bloodstream, the cells in the
prostate emit light and can be detected
by their bioluminescence in a live mouse,
as seen in the 67-day-old animal at the
left. Cells lacking the PTEN phosphatase
enzyme contain elevated amounts of the
Akt activator, PI(3,4,5)P3, and this causes
the prostate cells to proliferate abnormally,
progressing over time to form a cancer. In
this way, the process of metastasis could
be followed in the same animal over the
course of a year. The light intensity in these
experiments is proportional to the number
of prostate-cell descendants, increasing
from light blue to green, to yellow, to red
in this representation. (Adapted from
C.-P. Liao et al., Cancer Res. 67:7525-7588,
2007.)
Figure 20-29 Oncogene collaboration in transgenic mice. The graphs
show the incidence of tumors in three types of transgenic mouse strains, one 
carrying a Myc oncogene, one carrying a Ras oncogene, and one carrying 
both oncogenes. For these experiments, two lines of transgenic mice were 
first generated. One carries an inserted copy of an oncogene created by 
fusing the proto-oncogene Myc with the mouse mammary tumor virus 
regulatory DNA (which then drives Myc overexpression in the mammary 
gland). The other line carries an inserted copy of the Ras oncogene under 
control of the same regulatory element. Both strains of mice develop tumors 
much more frequently than normal, most often in the mammary or salivary 
glands. Mice that carry both oncogenes together are obtained by crossing 
the two strains. These hybrids develop tumors at a far higher rate still,
much greater than the sum of the rates for the two oncogenes separately.
Nevertheless, the tumors arise only after a delay and only from a small 
proportion of the cells in the tissues where the two genes are expressed. 
Further accidental changes, in addition to the two oncogenes, are apparently 
required for the development of cancer. (After E. Sinn et al., Cell 49:465-475, 
1987. With permission from Elsevier.) 


Transgenic mouse studies confirm, for example, that a single oncogene is gen- 
erally not enough to turn a normal cell into a cancer cell. Thus, in mice engineered 
to express a Myc or Ras oncogenic transgene, some of the tissues that express the 
oncogene may show enhanced cell proliferation, and, over time, occasional cells 
will undergo further changes to give rise to cancers. Most cells expressing the 
oncogene, however, do not give rise to cancers. Nevertheless, from the point of 
view of the whole animal, the inherited oncogene is a serious menace because 
it creates a high risk that a cancer will arise somewhere in the body. Mice that 
express both Myc and Ras oncogenes (bred by mating a transgenic mouse carry- 
ing a Myc oncogene with one carrying a Ras oncogene) develop cancers earlier 
and at a much higher rate than either parental strain (Figure 20-29); but, again, 
the cancers originate as scattered, isolated tumors among noncancerous cells. 
Thus, even cells expressing these two oncogenes must undergo further, randomly 
generated changes to become cancerous. This strongly suggests that multiple 
mutations are required for tumorigenesis, as supported by a great deal of other 
evidence discussed earlier. Experiments using mice with deletions of tumor sup- 
pressor genes lead to similar conclusions. 


Cancers Become More and More Heterogeneous as They 
Progress 


From simple histology, looking at stained tissue sections, it is clear that some 
tumors contain distinct sectors, all clearly cancerous, but differing in appear- 
ance because they differ genetically: the cancer cell population is heterogeneous. 
Evidently, within the initial clone of cancerous cells, additional mutations have 
arisen and thrived, creating diverse subclones. Today, the ability to analyze can- 
cer genomes lets us look much deeper into the process. 

One approach involves taking samples from different regions of a primary 
tumor and from the metastases that it has spawned. With modern methods, it is 
even possible to take representative single cells and analyze their genomes. Such 
studies reveal a classic picture of Darwinian evolution, occurring on a time scale 
of months or years rather than millions of years, but governed by the same rules of 
natural selection (Figure 20-30). 

One such investigation compared the genomes of 100 individual cells from dif- 
ferent regions of a primary tumor of the breast. A large fraction—just over half—of 
the chosen cells was genetically normal or nearly so: these were connective-tissue 
cells and other cell types, such as those of the immune system, that were mixed 
up with the cancer cells. The cancer cells themselves were distinguished by their 
severely disrupted genomes. The detailed pattern of gene deletions and ampli- 
fications in each such cell revealed how closely it was related to the others, and 
from this data one could draw up a family tree (Figure 20-30B). In this case, three 
main branches of the tree were seen; that is, the cancer consisted of three major 
subclones. From the shared abnormalities, one could deduce that their last common
ancestor (the presumed founder of the cancer) was already very different
from a normal cell, but that the first split between branches occurred early, when 
the tumor was small. This was followed by a large amount of additional change 
within each branch. A hint of the future could be seen in the smallest of the three 
major subclones: its cells were distinguished by a massive amplification of a Ras 
oncogene. Given more time, perhaps they would have out-competed the other 
cancer cells and taken over the whole tumor. 
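
The idea of reading relatedness from shared deletions and amplifications can be illustrated with a minimal sketch. It is not the method used in the study described above: the cell names, the copy-number values, and the choice of single-linkage clustering with a Manhattan distance are all assumptions made for illustration. Each hypothetical cell is described by its copy number at six genomic segments (2 is the normal diploid value), and cells with the most similar profiles are merged first.

```python
from itertools import combinations

# Hypothetical copy-number profiles: one value per genomic segment, 2 = normal.
profiles = {
    "normal_1": [2, 2, 2, 2, 2, 2],
    "normal_2": [2, 2, 2, 2, 2, 2],
    "cloneA_1": [1, 4, 2, 2, 2, 2],
    "cloneA_2": [1, 4, 2, 3, 2, 2],
    "cloneB_1": [1, 4, 2, 2, 0, 8],  # segment 6 massively amplified
    "cloneB_2": [1, 4, 2, 2, 0, 9],
}


def manhattan(a, b):
    """Sum of absolute copy-number differences between two profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cluster(profiles, n_clusters):
    """Single-linkage agglomerative clustering down to n_clusters groups."""
    clusters = [{name} for name in profiles]

    def gap(pair):
        # Distance between two clusters = distance of their closest members.
        i, j = pair
        return min(
            manhattan(profiles[a], profiles[b])
            for a in clusters[i]
            for b in clusters[j]
        )

    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2), key=gap)
        clusters[i] |= clusters[j]  # merge the two closest groups (i < j)
        del clusters[j]
    return clusters


clones = cluster(profiles, 3)
for members in clones:
    print(sorted(members))
```

In this toy data set both cancer clones share the changes in the first two segments, which is the kind of shared abnormality from which a common ancestor can be inferred, while the changes private to each clone mark the later branches.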

Similar results have been obtained with other cancers. Clearly, cancer cells are 
constantly mutating, multiplying, competing, evolving, and diversifying as they 
exploit new ecological niches and react to the treatments that are used against 
them (Figure 20-30C). Diversification accelerates as they metastasize and colo- 
nize new territories, where they encounter new selection pressures. The longer 
the evolutionary process continues, the harder it becomes to catch them all in the 
same net and kill them. 


The Changes in Tumor Cells That Lead to Metastasis Are Still 
Largely a Mystery 


Perhaps the most significant gap in our understanding of cancer concerns inva- 
siveness and metastasis. For a start, it is not clear exactly what new properties a 
cancer cell must acquire to become metastatic. In some cases, it is possible that 
invasion and metastasis require no further genetic changes beyond those needed 
to violate the normal controls on cell growth, cell division, and cell death. On the 
other hand, it may be that, for some cancers, metastasis requires a large number 
of additional mutations and epigenetic changes. Clues are coming from compar- 
isons of the genomes of cells of primary tumors with the cells of metastases that 
they have spawned. The results appear complex and variable from one cancer to 
another. Nevertheless, some general principles have emerged. 

As we discussed earlier, it is helpful to distinguish three phases of tumor pro- 
gression required for a carcinoma to metastasize (see Figure 20-16). First, the cells 
Figure 20-30 How cancers progress
as a series of subclones. (A) Schematic
illustration of the pattern of mutation and
natural selection in a clone of tumor cells.
(B) A family tree of cancer cells sampled
from different regions of a single breast
tumor, showing how the cells have evolved
and diversified from a common ancestor,
the cancer founder cell. The genome of
each of the indicated 100 cells from a
human breast tumor was sequenced to
produce an evolutionary tree. About half
of these cells were normal cells from the
stroma (blue cells). The red cells have
greatly amplified their K-Ras gene.
Note that many subclones appear to have
died out, including the one that contained
the founder cells for the three subclones
that survive.
(C) A depiction of how driver mutations
are thought to cause cancer progression
over long periods of time, before producing
a large enough clone of proliferating cells to
be detected as a tumor. The data indicate
that driver mutations occur only rarely in
a background of long-lived subclones of
cells that continually accumulate passenger
mutations without gaining a growth
advantage. (A, adapted from M. Greaves,
Semin. Cancer Biol. 20:65-70, 2010;
B, adapted from N. Navin et al., Nature
472:90-94, 2011; C, adapted from
S. Nik-Zainal et al., Cell 149:994-1007,
2012.)
must escape the normal confines of their parent epithelium and begin to invade
the tissue immediately beneath. Second, they must travel via the blood or lymph 
to lodge in distant sites. Third, they must survive there and multiply. It is the first 
and last steps in this sequence that are the most difficult to accomplish for most 
cancers (Figure 20-31). 

The first step, local invasiveness, requires a relaxation of the mechanisms that 
normally hold epithelial cells together. As mentioned earlier, this step resembles 
the normal developmental process known as the epithelial-mesenchymal transi- 
tion (EMT), in which epithelial cells undergo a shift in character, becoming less 
adhesive and more migratory (discussed in Chapter 19). A key part of the EMT 
process involves switching off expression of the E-cadherin gene. The primary 
function of the transmembrane E-cadherin protein is in cell-cell adhesion, bind- 
ing epithelial cells together through adherens junctions (see Figure 19-13). In 
some carcinomas of the stomach and of the breast, E-cadherin has been identified 
as a tumor suppressor gene, and a loss of E-cadherin may promote cancer devel- 
opment by facilitating local invasiveness. 

The initial entry of tumor cells into the circulation is helped by the presence of 
a dense supply of blood vessels and sometimes lymphatic vessels, which tumors 
attract to themselves as they grow larger and become hypoxic in their interior. 
This process, called angiogenesis, is caused by the secretion of angiogenic factors 
that promote the growth of blood vessels, such as vascular endothelial growth 
factor (VEGF; see Figure 22-26). An abnormal fragility and leakiness of the new 
vessels that form may help the cells that have become invasive to enter and then 
move through the circulation with relative ease. 

The remaining steps in metastasis, involving exit from a blood or lymphatic 
vessel and the effective colonization of remote sites, are much harder to study. To 
discover which of the later steps in metastasis present cancer cells with the great- 
est difficulties, one can label the cells with a fluorescent dye or green fluorescent 
protein (GFP), inject them into the bloodstream of a mouse, and then monitor 
their fate (Movie 20.5). In such experiments, one observes that many cells sur- 
vive in the circulation, lodge in small vessels, and exit into the surrounding tis- 
sue, regardless of whether they come from a tumor that metastasizes or one that 
does not. Some cells die immediately after they enter foreign tissue; others survive 
entry into the foreign tissue but fail to proliferate. Still others divide a few times 
and then stop, forming micrometastases containing ten to several thousand cells. 
Very few establish full-blown metastases. 

What, if anything, distinguishes the survivors from the failures? A clue may 
come from the fact that in many types of tumors, the cancer cells show a kind 
of heterogeneity that resembles the heterogeneity seen among the cells of those 
normal tissues that renew themselves continually by a stem-cell strategy, as we 
discuss next. 


A Small Population of Cancer Stem Cells May Maintain Many 
Tumors 


Self-renewing tissues, where cell division continues throughout life, are the breed- 
ing ground for the great majority of human cancers. They include the epidermis 


Figure 20-31 The barriers to metastasis. 
Studies of labeled tumor cells leaving a 
tumor site, entering the circulation, and 
establishing metastases show which steps 
in the metastatic process, outlined in Figure 
20-16, are difficult or “inefficient,” in the 
sense that they are steps in which large 
numbers of cells fail and are lost. It is in 
these difficult steps that cells from highly 
metastatic tumors are observed to have 
much greater success than cells from a 
nonmetastatic source. It seems that the 
ability to escape from the parent tissue, and 
an ability to survive and grow in the foreign 
tissue, are key properties that cells must
acquire to become metastatic. (Adapted 
from A.F. Chambers et al., Breast Cancer 
Res. 2:400-407, 2000. With permission 
from BioMed Central Ltd.) 
(the outer epithelial layer of the skin), the lining of the digestive and reproductive
tracts, and the bone marrow, where blood cells are generated (see Chapter 22). 
In almost all these tissues, renewal depends on the presence of stem cells, which 
divide to give rise to terminally differentiated cells, which do not divide. This cre- 
ates a mixture of cells that are genetically identical and closely related by lineage, 
but are in different states of differentiation. Many tumors seem likewise to consist 
of cells in varied states of differentiation, with different capacities for cell division 
and self-renewal. 

To see the implications, it is helpful to consider how normal stem-cell systems
operate. When a normal stem cell divides, each daughter cell has a choice—it 
can remain a stem cell, or it can commit to a pathway leading to differentiation. 
A stem-cell daughter remains in place to generate more cells in the future. A 
committed daughter typically undergoes some rounds of cell proliferation (as a 
so-called transit amplifying cell) but then stops dividing, terminally differentiates, 
and eventually is discarded and replaced (it may die by apoptosis, with recycling 
of its materials, or be shed from the body). On average, the two fates—stem cell or 
differentiating cell—normally occur with equal probability, so that half the daugh- 
ters of stem-cell divisions take the one path and half take the other. In a healthy 
body, feedback controls regulate the process, adjusting this balance of cell-fate 
choices to correct for any departure from the proper cell population numbers. 
Thus, the number of stem cells remains approximately constant, and the termi- 
nally differentiated cells are continually replaced at a steady rate. Because of the 
divisions undergone by the transit amplifying cells, the stem cells may be vastly 
outnumbered by the cells that are committed to terminal differentiation and have 
lost the capacity for self-renewal. But the stem cells, though few and far between 
and often relatively slowly dividing, carry the whole responsibility for mainte- 
nance of the tissue in the long term. 

Some cancers seem to be organized in a similar way: they consist of rare can- 
cer stem cells capable of dividing indefinitely, together with much larger num- 
bers of dividing transit amplifying cells that are derived from the cancer stem cells 
but have a limited capacity for self-renewal (Figure 20-32). These non-stem cells 
appear to constitute the great majority of the cell population in some tumors. 
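
The arithmetic behind this organization can be checked with a small sketch that follows the example in Figure 20-32: each daughter of a stem cell stays a stem cell with probability p, and a committed daughter runs through 10 further divisions. Everything else is an assumption made for illustration: the function name `simulate`, synchronous division cycles of equal length for both classes of cell, the use of expected (average) cell numbers rather than random draws, and the choice to count only still-dividing transit amplifying cells as non-stem cells.

```python
def simulate(p_stem, generations, ta_divisions=10):
    """Expected cell numbers in a toy stem-cell hierarchy.

    p_stem: probability that a stem-cell daughter remains a stem cell.
    Returns (stem cells, transit amplifying cells that can still divide).
    """
    stem = 1.0
    # transit[k] = cells that have completed k of their ta_divisions divisions
    transit = [0.0] * ta_divisions
    for _ in range(generations):
        # Committed daughters of stem cells enter the hierarchy at stage 0.
        new_transit = [2.0 * (1.0 - p_stem) * stem]
        # Every other stage doubles and advances by one division; cells
        # finishing their last division stop dividing and are not counted.
        new_transit += [2.0 * n for n in transit[:-1]]
        stem = 2.0 * p_stem * stem
        transit = new_transit
    return stem, sum(transit)


# Balanced fates (p = 0.5): stem-cell number stays constant.
stem, non_stem = simulate(0.5, generations=50)
print(non_stem / stem)  # 1023.0, that is 2**10 - 1

# Slight bias toward self-renewal (p = 0.51): slow, steady growth.
stem_biased, non_stem_biased = simulate(0.51, generations=200)
print(stem_biased, non_stem_biased / stem_biased)
```

With balanced fates the non-stem cells outnumber the stem cells by 2^10 - 1 = 1023, which matches the factor of about 1000 quoted in the figure. With a small bias toward self-renewal the stem-cell number creeps upward and the ratio settles somewhat below 1023, so the stem cells remain a tiny minority even though they alone sustain the growth.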


The Cancer Stem-Cell Phenomenon Adds to the Difficulty of 
Curing Cancer 


Evidence for the cancer stem-cell phenomenon comes chiefly from experiments 
in which individual cells from a cancer are tested for their ability to give rise to fresh 
tumors: a standard assay is to implant the cells into an immunodeficient mouse 
(Figure 20-33). It has been known for half a century that there is usually only a 
small chance—typically much less than 1%—that a tumor cell chosen at random 
and tested in this way will generate a new tumor. This by itself does not prove that 


Figure 20-32 Cancer stem cells can be
responsible for tumor growth and yet
remain only a small part of the tumor-cell
population. (A) How stem
cells produce transit amplifying cells.
(B) How a small proportion of cancer stem
cells can maintain a tumor. Suppose, for
example, that each daughter of a cancer
stem cell has a probability slightly greater
than 50% of retaining stem-cell potential
and a probability slightly less than 50% of
becoming a transit amplifying cell that is
committed to a program of cell divisions
that stops after 10 division cycles. While
the number of cancer stem cells will
increase slowly but steadily to give a
growing tumor, the non-stem cells that they
give rise to will always outnumber the stem
cells by a large factor: in this example, by
a factor of about 1000 (assuming that the
cell-division-cycle and survival times for
the two classes of cells are equal).
the tumor cells are heterogeneous: like seeds scattered on difficult ground, each
of them may have only a small chance of finding a spot where it can survive and 
grow. Modern technologies for sorting cells have shown, however, that in some 
cancers at least, the rate of success in founding new tumors is even lower than it 
would otherwise be because the cancer cells are heterogeneous in their state of 
differentiation, and only a small subset of them—the cancer stem cells—have the 
special properties needed for tumor propagation. For example, in several types 
of cancer, including breast cancers and leukemias, one can fractionate the tumor 
cells using monoclonal antibodies that recognize a particular cell-surface marker 
that is present on the normal stem cells in the tissue of origin of the cancer. The 
purified cancer cells expressing this marker are found to have a greatly enhanced 
ability to found new tumors. And the new tumors consist of mixtures of cells that 
express the marker and cells that do not, all generated from the same founder cell 
that expressed the marker. 

Experiments with breast cancer cells have revealed that, instead of following a 
rigid program from stem cell to transit amplifying cell to terminally differentiated 
cell, these cancer cells can randomly switch to and fro—with a certain low transi- 
tion probability—between different states of differentiation that express different 
molecular markers. In one state, they behave like stem cells, dividing slowly but 
capable of founding new tumors; in other states, they behave like transit amplify- 
ing cells, dividing rapidly but unable to found new tumors in a standard transplant 
assay. But a single cell in any of these states—given time in culture, or a congenial 
environment in the body—will give rise to a mixed population that includes all the 
other states as well. 

The cancer stem-cell phenomenon, whatever its basis, implies that even when 
the tumor cells are genetically similar, they are phenotypically diverse. A treat- 
ment that wipes out those in one state is likely to allow survival of others that 
remain a danger. Radiotherapy or a cytotoxic drug, for example, may selectively 
kill off the rapidly dividing cells, reducing the tumor volume to almost nothing, 
and yet spare a few slowly dividing cells that go on to resurrect the disease. This 
greatly adds to the difficulty of cancer therapy, and it is part of the reason why 
treatments that seem at first to succeed often end in relapse and disappointment. 


Colorectal Cancers Evolve Slowly Via a Succession of Visible 
Changes 


At the beginning of this chapter, we saw that most cancers develop gradually from 
a single aberrant cell, progressing from benign to malignant tumors by the accu- 
mulation of a number of independent genetic and epigenetic changes. We have 
discussed what some of these changes are in molecular terms and seen how they 
contribute to cancerous behavior. We now examine one of the common human 
cancers more closely, using it to illustrate and enlarge upon some of the general 
principles and molecular mechanisms we have introduced. We take colorectal 
cancer as our example. 

Colorectal cancers arise from the epithelium lining the colon (the large intes- 
tine) and rectum (the terminal segment of the gut). The organization of this tissue 
is broadly similar to that of the small intestine, discussed in detail in Chapter 22 
(pp. 1217-1221). For both the small and large intestine, the epithelium is renewed 
at an extraordinarily rapid rate, taking about a week to completely replace most of 
the epithelial sheet. In both regions, the renewal depends on stem cells that lie in 
deep pockets of the epithelium, called intestinal crypts. The signals that maintain 
the stem cells and control the normal organization and renewal of the epithelium 
are beginning to be quite well understood, as explained in Chapter 22. Mutations 
that disrupt these signals begin the process of tumor progression for most col- 
orectal cancers (Movie 20.6). 

Colorectal cancers are common, currently causing nearly 60,000 deaths a year 
in the United States, or about 10% of total deaths from cancer. Like most can- 
cers, they are not usually diagnosed until late in life (90% occur after the age of 
55). However, routine examination of normal adults with a colonoscope (a fiber 





Figure 20-33 An immunodeficient 
mouse, as used in transplantation 
assays to test human cancer cells for 
their ability to found new tumors. This 
nude mouse has a mutation that blocks 
development of the thymus and, as a
side effect, robs it of hair. Because it has
practically no T cells, it tolerates grafts of 
cells even from other species. (Courtesy of 
Harlan Sprague Dawley.) 
optic device for viewing the interior of the colon and rectum) often reveals a small
benign tumor, or adenoma, of the gut epithelium in the form of a protruding mass 
of tissue called a polyp (see Figure 22-4). These adenomatous polyps are believed 
to be the precursors of a large proportion of colorectal cancers. Because the pro- 
gression of the disease is usually very slow, there is typically a period of about 
10 years in which the slowly growing tumor is detectable but has not yet turned 
malignant. Thus, when people are screened by colonoscopy in their fifties and the 
polyps are removed through the colonoscope—a quick and easy surgical proce- 
dure—the subsequent incidence of colorectal cancer is much lower: according to 
some studies, less than a quarter of what it would be otherwise. 

In microscopic sections of polyps smaller than 1 cm in diameter, the cells and 
their arrangement in the epithelium usually appear almost normal. The larger the 
polyp, the more likely it is to contain cells that look abnormally undifferentiated 
and form abnormally organized structures. Sometimes, two or more distinct areas 
can be distinguished within a single polyp, with the cells in one area appearing 
relatively normal and those in the other appearing clearly cancerous, as though 
they have arisen as a mutant subclone within the original clone of adenomatous 
cells. At later stages in the disease, some tumor cells become invasive in a small 
fraction of the polyps, first breaking through the epithelial basal lamina, then 
spreading through the layer of muscle that surrounds the gut, and finally metas- 
tasizing to lymph nodes via lymphatic vessels and to liver, lung, and other organs 
via blood vessels. 


A Few Key Genetic Lesions Are Common to a Large Fraction of 
Colorectal Cancers 


What are the mutations that accumulate with time to produce this chain of 
events? Of those genes so far discovered to be involved in colorectal cancer, three 
stand out as most frequently mutated: the proto-oncogene K-Ras (a member of 
the Ras gene family), in about 40% of cases; p53, in about 60% of cases; and the 
tumor suppressor gene Apc (discussed below), in more than 80% of cases. Others 
are involved in smaller numbers of colon cancers, and some of these are listed in 
Table 20-1. 

The role of Apc first came to light through study of certain families show- 
ing a rare type of hereditary predisposition to colorectal cancer, called familial 


TABLE 20-1

Gene | Class | Pathway affected | Human colon cancers (%)
K-Ras | Oncogene | Receptor tyrosine kinase signaling | 40
p53 | Tumor suppressor | Response to stress and DNA damage | 60
TGFβ receptor II | | TGFβ signaling | 10
MLH1 and other DNA mismatch repair genes (often silenced by DNA methylation) | Tumor suppressor (genetic stability) | DNA mismatch repair | 15

1,2 The genes with the same superscript numeral act in the same pathway, and therefore only
one of the components is mutated in an individual cancer.
Figure 20-34 Colon of familial adenomatous polyposis coli patient
compared with normal colon. (A) The normal colon wall is a gently 
undulating but smooth surface. (B) The polyposis colon is completely covered 
by hundreds of projecting polyps, each resembling a tiny cauliflower when 
viewed with the naked eye. (Courtesy of Andrew Wyllie and Mark Arends.) 


adenomatous polyposis coli (FAP). In this syndrome, hundreds or thousands of 
polyps develop along the length of the colon (Figure 20-34). These polyps start 
to appear in early adult life, and if they are not removed, one or more will almost 
always progress to become malignant; the average time from the first detection of 
polyps to the diagnosis of cancer is 12 years. The disease can be traced to a dele- 
tion or inactivation of the tumor suppressor gene Apc, named after the syndrome. 
Individuals with FAP have inactivating mutations or deletions of one copy of the 
Apc gene in all their cells and show loss of heterozygosity in tumors, even in the 
benign polyps. Most patients with colorectal cancer do not have the hereditary 
condition. Nevertheless, in more than 80% of the cases, their cancer cells (but not 
their normal cells) have inactivated both copies of the Apc gene through muta- 
tions acquired during the patient’s lifetime. Thus, by a route similar to that which 
we discussed for retinoblastoma, mutation of the Apc gene was identified as one 
of the central ingredients of colorectal cancer. 

The Apc protein, as we now know, is an inhibitory component of the Wnt
signaling pathway (discussed in Chapter 15). It binds to the β-catenin protein,
another component of the Wnt pathway, and helps to induce the protein’s
degradation. By inhibiting β-catenin in this way, Apc prevents the β-catenin from
migrating to the nucleus, where it would act as a transcriptional regulator to drive
cell proliferation and maintain the stem-cell state (see Figure 15-60). Loss of Apc
results in an excess of free β-catenin and thus leads to an uncontrolled expansion
of the stem-cell population. This causes a massive increase in the number and size
of the intestinal crypts (see Figure 22-4).

When the β-catenin gene was sequenced in a collection of colorectal tumors,
it was discovered that many of the tumors that did not have Apc mutations had
activating mutations in β-catenin instead. Thus, it is excessive activity in the Wnt
signaling pathway that is critical for the initiation of this cancer, rather than any
single oncogene or tumor suppressor gene that the pathway contains.

This being so, why is the Apc gene in particular the most common
culprit in colorectal cancer? The Apc protein is large and it interacts not only with
β-catenin but also with various other cell components, including microtubules.
Loss of Apc appears to increase the frequency of mitotic spindle defects, leading 
to chromosome abnormalities when cells divide. This additional, independent 
cancer-promoting effect could explain why Apc mutations feature so prominently 
in the causation of colorectal cancer. 


Some Colorectal Cancers Have Defects in DNA Mismatch Repair 


In addition to the hereditary disease (FAP) associated with Apc mutations, there 
is a second, more common kind of hereditary predisposition to colon carcinoma
in which the course of events differs from the one we have described for FAP. In 
this more common condition, called hereditary nonpolyposis colorectal cancer 
(HNPCC), the probability of colon cancer is increased without any increase in the 
number of colorectal polyps (adenomas). Moreover, the cancer cells are unusual, 
in that they have a normal (or almost normal) karyotype. The majority of colorec- 
tal tumors in non-HNPCC patients, in contrast, have gross chromosomal abnor- 
malities, with multiple translocations, deletions, and other aberrations, as well as 
having many more chromosomes than normal (Figure 20-35). 

The mutations that predispose HNPCC individuals to colorectal cancer occur 
in one of several genes that code for central components of the DNA mismatch 
repair system. These genes are homologous in structure and function to the MutL 
and MutS genes in bacteria and yeast (see Figure 5-19). Only one of the two copies
of the involved gene is defective, so the repair system is still able to remove
the inevitable DNA replication errors that occur in the patient’s cells. However, as
discussed previously, these individuals are at risk, because the accidental loss or 
inactivation of the remaining good gene copy will immediately elevate the spon- 
taneous mutation rate by a hundredfold or more (discussed in Chapter 5). These 
genetically unstable cells then can presumably speed through the standard pro- 
cesses of mutation and natural selection that allow clones of cells to progress to 
malignancy. 

This particular type of genetic instability produces invisible changes in the 
chromosomes—most notably changes in individual nucleotides and short expan- 
sions and contractions of mono- and dinucleotide repeats such as AAAA... or 
CACACA.... Once the defect in HNPCC patients was recognized, the epigenetic 
silencing or mutation of mismatch repair genes was found in about 15% of the 
colorectal cancers occurring in people with no inherited predisposing mutation. 
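
Changes in repeat length of this kind are easy to see once the same stretch of DNA is compared between normal and tumor cells. The following minimal sketch, which uses made-up sequences and a hypothetical helper named `longest_repeat`, simply counts the longest uninterrupted run of a repeat unit; it is meant only to illustrate what a contraction of a dinucleotide repeat looks like, not to represent any clinical assay.

```python
import re


def longest_repeat(seq, unit):
    """Largest number of consecutive copies of `unit` found in `seq`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


# Hypothetical sequences: the tumor copy has lost two CA units.
normal_dna = "GGT" + "CA" * 6 + "TTG"
tumor_dna = "GGT" + "CA" * 4 + "TTG"

print(longest_repeat(normal_dna, "CA"))  # 6
print(longest_repeat(tumor_dna, "CA"))   # 4
```

A difference in repeat count between the two samples, with the flanking sequence unchanged, is the signature of the small-scale instability described above.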

Thus, the genetic instability found in many colorectal cancers can be acquired 
in at least two ways. The majority of the cancers display a form of chromosomal 
instability that leads to visibly altered chromosomes, whereas in the others the 
instability occurs on a much smaller scale and reflects a defect in DNA mismatch 
repair. Indeed, many carcinomas show either chromosomal instability or defec- 
tive mismatch repair—but rarely both. These findings clearly demonstrate that 
genetic instability is not an accidental by-product of malignant behavior but a 
contributory cause—and that cancer cells can acquire this instability in multiple 
ways. 


The Steps of Tumor Progression Can Often Be Correlated with 
Specific Mutations 


In what order do K-Ras, p53, Apc, and the other identified colorectal cancer-crit- 
ical genes mutate, and what contribution does each of them make to the asocial 
behavior of the cancer cell? There is no single answer, because colorectal cancer 
can arise by more than one route: thus, we know that in some cases, the first muta- 
tion can be in a DNA mismatch repair gene; in others, it can be in a gene regulat- 
ing cell proliferation. Moreover, as previously discussed, a general feature such as 
genetic instability or a tendency to proliferate abnormally can arise in a variety of 
ways, through mutations in different genes. 

Nevertheless, certain sets of mutations are particularly common in colorectal 
cancer, and they occur in a characteristic order. Thus, in most cases, mutations 
inactivating the Apc gene appear to be the first, or at least a very early step, as they 
are detected at the same high frequency in small benign polyps as in large malig- 
nant tumors. Changes that lead to genetic and epigenetic instability are likely also 
to arise early in tumor progression, since they are needed to drive the later steps. 

Activating mutations in the K-Ras gene occur later, as they are rare in small 
polyps but common in larger ones that show disturbances in cell differentiation 
and histological pattern. 

Figure 20-35 Chromosome complements (karyotypes) of colon cancers showing 
different kinds of genetic instability. (A) The karyotype of a typical cancer shows 
many gross abnormalities in chromosome number and structure. Considerable 
variation can also exist from cell to cell (not shown). (B) The karyotype of a tumor 
that has a stable chromosome complement with few chromosomal anomalies; the 
genetic abnormalities are mostly invisible, having been created by defects in DNA 
mismatch repair. All of the chromosomes in this figure were stained as in Figure 4-10, 
the DNA of each human chromosome being marked with a different combination of 
fluorescent dyes. (Courtesy Wael Abdel-Rahman and Paul Edwards.) 

Inactivating mutations in p53 are thought to come later still, as they are rare 
in polyps but common in carcinomas (Figure 20-36). We have seen that loss 
of p53 function allows cancer cells to endure stress and to avoid apoptosis and 
cell-cycle arrest. Additionally, loss of p53 is related to the heightened activation 
of oncogenes such as Ras. Experiments in mice show that an initial low level of 
oncogene activation can give rise to a slowly growing tumor even while p53 is 
functional: genes such as Ras are, after all, part of the normal machinery of growth 
control, and moderate activation is not stressful for a cell and does not call the p53 
protein into play. Progression of a tumor from slow to rapid, malignant growth, 
however, involves activation of oncogenes beyond normal physiological limits to 
a higher, stressful level. If the p53 protein is present and functional, this should 
lead to cell-cycle arrest or death. Only by losing p53 function can the cancer cells 
with hyperactive oncogenes survive and progress. 
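
The ordering argument used above (Apc first, K-Ras later, p53 later still) rests on a 
simple rule: a mutation is placed at the earliest stage of tumor progression at which 
it is already common. A minimal Python sketch of that reasoning follows. The stage 
names, the frequencies, and the 0.4 threshold are illustrative placeholders, not 
measured values, and the function names are hypothetical. 

```python
# Stages of tumor progression, ordered from earliest to latest.
STAGES = ["small polyp", "large polyp", "carcinoma"]

# Hypothetical fraction of lesions carrying each mutation at each stage.
# These numbers are placeholders chosen only to mirror the qualitative
# pattern described in the text; they are not data.
MUTATION_FREQUENCY = {
    "p53": [0.05, 0.10, 0.60],
    "K-Ras": [0.10, 0.50, 0.50],
    "Apc": [0.80, 0.80, 0.80],
}


def earliest_common_stage(frequencies, threshold=0.4):
    """Return the index of the first stage where the mutation is common."""
    for index, frequency in enumerate(frequencies):
        if frequency >= threshold:
            return index
    return len(STAGES)  # never common: sorts after every real stage


def inferred_order(table, threshold=0.4):
    """Order genes by the earliest stage at which their mutation is common."""
    return sorted(table, key=lambda gene: earliest_common_stage(table[gene], threshold))


if __name__ == "__main__":
    for gene in inferred_order(MUTATION_FREQUENCY):
        index = earliest_common_stage(MUTATION_FREQUENCY[gene])
        stage = STAGES[index] if index < len(STAGES) else "not common at any stage"
        print(f"{gene}: first common in {stage}")
```

The sketch captures only the logic of the inference. As the text goes on to stress, 
real tumors differ, and the sequence can vary from case to case. 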

The steps we have just described are only part of the picture. It is important to 
emphasize that each case of colorectal cancer is different, with its own detailed 
combination of mutations, and that even for the mutations that are commonly 
shared, the sequence of occurrence may vary. The same is true for cancers in gen- 
eral. 

Advances in molecular biology have recently provided the tools to find out 
precisely which genes are amplified, deleted, mutated, or misregulated by epi- 
genetic mechanisms in the tumor cells of any given patient. As we discuss in the 
next section, such information promises to become as important for the diagnosis 
and treatment of cancer as was the breakthrough of being able to identify micro- 
organisms for the treatment of infectious diseases. 


Summary 


The molecular analysis of cancer cells reveals two classes of cancer-critical genes: 
oncogenes and tumor suppressor genes. A set of these genes becomes altered by a 
combination of genetic and epigenetic accidents to drive tumor progression. Many 
cancer-critical genes code for components of the social control pathways that reg- 
ulate when cells grow, divide, differentiate, or die. In addition, a subclass of tumor 
suppressors can be categorized as “genome maintenance genes,” because their nor- 
mal role is to help maintain genome integrity. 

The inactivation of the p53 pathway, which occurs in nearly all human cancers, 
allows genetically damaged cells to escape apoptosis and continue to proliferate. 
Inactivation of the Rb pathway also occurs in most human cancers, illustrating how 
fundamental each of these pathways is for protecting us against cancer. 

The sequencing of cancer cell genomes reveals that—except for the cancers of 
childhood—many cancers acquire 10 or so driver mutations over the long course 
of tumor progression, along with a considerably larger number of passenger muta- 
tions of no consequence. The same methods reveal how subclones of cells arise and 
die out as a tumor ages. Tumors thus contain a heterogeneous mixture of cells, 
some—the so-called cancer stem cells—being much more dangerous than others. 

We can often correlate the steps of tumor progression with mutations that 
activate specific oncogenes and inactivate specific tumor suppressor genes, with 
colon cancer providing a good example. But different combinations of mutations 
and epigenetic changes are found in different types of cancer, and even in different 
patients with the same type of cancer, reflecting the random way in which these 
inherited changes arise. Nevertheless, many of the same changes are encoun- 
tered repeatedly, suggesting that there are a limited number of ways to breach our 
defenses against cancer. 

Figure 20-36 Suggested typical sequence of genetic changes underlying the 
development of a colorectal carcinoma. This oversimplified diagram provides a 
general idea of the way mutation and tumor development are related. But many 
other mutations are generally involved, and different colon cancers can progress 
through different sequences of mutations (and/or epigenetic changes). 


CANCER PREVENTION AND TREATMENT: PRESENT 
AND FUTURE 


We can apply the growing understanding of the molecular biology of cancer 
to sharpen our attack on the disease at three levels: prevention, diagnosis, and 
treatment. Prevention is always better than cure, and indeed many cancers can 
be prevented, especially by avoiding smoking. Highly sensitive molecular assays 
promise new opportunities for earlier and more precise diagnosis, with the aim of 
detecting primary tumors while they are still small and have not yet metastasized. 
Cancers caught at these early stages can often be nipped in the bud by surgery or 
radiotherapy, as we saw for colorectal polyps. Nevertheless, full-blown malignant 
disease will continue to be common for many years to come, and cancer treat- 
ments will continue to be needed. 

In this section, we first examine the preventable causes of cancer and then 
consider how advances in our understanding at a molecular level are beginning 
to transform the treatment of the disease. 


Epidemiology Reveals That Many Cases of Cancer Are 
Preventable 


A certain irreducible background incidence of cancer is to be expected regardless 
of circumstances. As discussed in Chapter 5, mutations can never be absolutely 
avoided because they are an inescapable consequence of fundamental limita- 
tions on the accuracy of DNA replication and repair. If a person could live long 
enough, it is inevitable that at least one of his or her cells would eventually accu- 
mulate a set of mutations sufficient for cancer to develop. 

Nevertheless, environmental factors seem to play a large part in determining 
the risk for cancer. This is demonstrated most clearly by a comparison of cancer 
incidence in different countries: for almost every cancer that is common in one 
country, there is another country where the incidence is much lower. Because 
migrant populations tend to adopt the pattern of cancer incidence typical of their 
new host country, the differences are thought to be due mostly to environmental, 
not genetic, factors. From such findings, it has been suggested that 80-90% of can- 
cers should be avoidable, or at least postponable (Figure 20-37). 
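
One way to see how a figure such as 80-90% could be reached from cross-country 
comparisons is to ask what total incidence a population would have if, for each 
cancer, it experienced the lowest rate recorded anywhere. The sketch below does 
that arithmetic. All rates are invented for illustration and the function name is 
hypothetical; this is not presented as the method of any particular study. 

```python
def avoidable_fraction(local_rates, lowest_rates):
    """Fraction of local cancer incidence absent in the lowest-incidence populations.

    Both arguments map a cancer type to an incidence rate in the same units
    (for example, cases per 100,000 people per year).
    """
    local_total = sum(local_rates.values())
    baseline_total = sum(lowest_rates[cancer] for cancer in local_rates)
    return 1 - baseline_total / local_total


# Invented example rates, cases per 100,000 per year.
local = {"lung": 60, "colon": 40, "stomach": 10}
lowest_anywhere = {"lung": 6, "colon": 4, "stomach": 5}

if __name__ == "__main__":
    fraction = avoidable_fraction(local, lowest_anywhere)
    print(f"Potentially avoidable: {fraction:.0%}")
```

The calculation assumes that the lowest rate for each cancer could be achieved in 
the same population at once, which the next paragraph shows is rarely the case. 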

Unfortunately, different cancers have different environmental risk factors, and 
a population that escapes one such danger is usually exposed to another. This is 
not, however, inevitable. There are some human subgroups whose way of life sub- 
stantially reduces the total cancer death rate among individuals of a given age. 
Under the current conditions in the United States and Europe, approximately one 
in five people will die of cancer. But the incidence of cancer among strict Mor- 
mons in Utah—who avoid alcohol, coffee, cigarettes, drugs, and casual sex—is 
only about half the incidence for non-practicing members of the same family or 
for Americans in general. Cancer incidence is also low in certain relatively afflu- 
ent populations in Africa. 

Although such observations on human populations indicate that cancer can 
often be avoided, it has been difficult in most cases—with tobacco as a striking 
exception—to pinpoint the specific environmental factors responsible for these 
large population differences or to establish how they act. Nevertheless, several 
important classes of environmental cancer risk factors have been identified (Fig- 
ure 20-37B). One thinks first of mutagens. But there are also many other influ- 
ences—including the amount of food we eat, the hormones that circulate in our 
bodies, and the irritations, infections, and damage to which we expose our tis- 
sues—that are no less important and favor development of the disease in other 
ways. 


estimated effects of environment and lifestyle on cancer in the United 
States (US). The table shows both the yearly deaths in the US attributable 
to each cancer and the estimated percentage of that cancer that could be 
eliminated through prevention. (B, data from G.A. Colditz, K.Y. Wolin and 
S. Gehlert, Sci. Transl. Med. 4:127rv4, 2012.) 


Sensitive Assays Can Detect Those Cancer-Causing Agents that 
Damage DNA 


Many quite disparate chemicals are carcinogenic when they are fed to experimental 
animals or painted repeatedly on their skin. Examples include a range 
of aromatic hydrocarbons and derivatives of them such as aromatic amines, 
nitrosamines, and alkylating agents such as mustard gas. Although these chemi- 
cal carcinogens are diverse in structure, a large proportion of them have at least 
one shared property—they cause mutations. In one common test for mutagen- 
icity (the Ames test), the carcinogen is mixed with an activating extract prepared 
from rat liver cells (to mimic the biochemical processing that occurs in an intact 
animal). The mixture is then added to a culture of specially designed test bacte- 
ria and the bacterial mutation rate measured. Most of the compounds scored as 
mutagenic by this rapid and convenient assay in bacteria also cause mutations or 
chromosome aberrations when tested on mammalian cells. 
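
The readout of such a test is a comparison of mutant colony counts on plates with 
and without the test compound. A minimal sketch of that comparison follows. The 
colony counts are invented, and the twofold cutoff is a commonly used rule of thumb 
assumed here for illustration rather than something stated in this chapter. 

```python
from statistics import mean


def fold_increase(treated_counts, control_counts):
    """Ratio of mean mutant colonies on treated plates to mean on control plates."""
    return mean(treated_counts) / mean(control_counts)


def scored_as_mutagenic(treated_counts, control_counts, cutoff=2.0):
    """Apply the assumed rule of thumb: a fold increase at or above the cutoff."""
    return fold_increase(treated_counts, control_counts) >= cutoff


# Invented colony counts from replicate plates.
control = [20, 25, 15]     # spontaneous mutants, no compound added
treated = [120, 130, 110]  # compound plus activating liver extract

if __name__ == "__main__":
    print(fold_increase(treated, control))
    print(scored_as_mutagenic(treated, control))
```

The control plates matter because the test bacteria mutate spontaneously at a low 
rate; only an increase over that background counts as evidence of mutagenicity. 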

A few of these carcinogens act directly on DNA. But generally the more potent 
ones are relatively inert chemically; these chemicals become damaging only after 
they have been converted to a more reactive molecule by metabolic processes in 
the liver, catalyzed by a set of intracellular enzymes known as the cytochrome P-450 
oxidases. These enzymes normally help to convert ingested toxins into harmless 
and easily excreted compounds. Unhappily, their activity on certain chemicals 
generates products that are highly mutagenic. Examples of carcinogens activated 
in this way include benzo[a]pyrene, a cancer-causing chemical present in coal tar 
and tobacco smoke, and the fungal toxin aflatoxin B1 (Figure 20-38). 


Fifty Percent of Cancers Could Be Prevented by Changes in 
Lifestyle 


Tobacco smoke is the most important carcinogen in the world today. Even though 
many other chemical carcinogens have been identified, none of these appear to 
be responsible for anything like the same numbers of human cancer deaths. It is 
sometimes thought that the main environmental causes of cancer are the prod- 
ucts of a highly industrialized way of life—the rise in pollution, the enhanced use 
of food additives, and so on—but there is little evidence to support this view. The 
idea may have come in part from the identification of some highly carcinogenic 
materials used in industry, such as 2-naphthylamine and asbestos. Except for the 
increase in cancers caused by smoking, however, age-adjusted death rates for 
most common human cancers have stayed much the same over the past half-cen- 
tury, or, in some cases, have declined significantly (Figure 20-39). Survival rates, 
moreover, have improved. Thirty years ago, less than 50% of patients lived more 
than five years from the time of diagnosis; now, more than two-thirds do so. 

naturally contaminate foods such as tropical peanuts and is an important cause of 
liver cancer in Africa and Asia. 

Except for tobacco, chemical toxins and mutagens are of lesser importance 
as contributory causes of cancer than other factors that are more a matter of per- 
sonal choice. One important factor is the quantity of food we eat: as mentioned 
earlier, the risk of cancer is greatly increased in people who are obese. In fact, it is 
estimated that as many as 50% of all cancers could be avoided by simple, identifi- 
able changes in lifestyle (see Figure 20-37B). 


Viruses and Other Infections Contribute to a Significant Proportion 
of Human Cancers 


Cancer in humans is not an infectious disease, and most human cancers do not 

TABLE 20-2 

Virus                                              | Associated cancer                            | Areas of high incidence 

Papovavirus family 
Papillomavirus (many distinct strains)             | Warts (benign)                               | Worldwide 
                                                   | Carcinoma of the uterine cervix              | Worldwide 

Hepadnavirus family 
Hepatitis-B virus                                  | Liver cancer (hepatocellular carcinoma)      | Southeast Asia, tropical Africa 

Herpesvirus family 
Epstein-Barr virus                                 | Burkitt’s lymphoma (cancer of B lymphocytes) | West Africa, Papua New Guinea 
                                                   | Nasopharyngeal carcinoma                     | Southern China, Greenland 
Human herpesvirus 8                                | Kaposi’s sarcoma                             | Central and Southern Africa 

Retrovirus family 
Human T-cell leukemia virus type I (HTLV-1)        | Adult T-cell leukemia/lymphoma               | Japan, West Indies 
Human immunodeficiency virus (HIV, the AIDS virus) | Kaposi’s sarcoma (via human herpesvirus 8)   | Central and Southern Africa 

Flavivirus family 
Hepatitis-C virus                                  | Liver cancer (hepatocellular carcinoma)      | Worldwide 

For all these viruses, the number of people infected is much larger than the number who 
develop cancer: the viruses must act in conjunction with other factors. As described in the text, 
different viruses contribute to cancer in different ways. 





with hepatitis-C virus, which has infected 170 million people worldwide, is also 
clearly associated with the development of liver cancer. 

The main culprits, as shown in Table 20-2, are the DNA viruses. The DNA 
tumor viruses cause cancer by the most direct route—by interfering with controls 
of the cell cycle and apoptosis. To understand this type of viral carcinogenesis, it 
is important to review the life history of viruses. Many DNA viruses use the host 
cell’s DNA replication machinery to replicate their own genomes. However, to 
produce a large number of infectious virus particles within a single host cell, the 
DNA virus has to commandeer this machinery and drive it hard, breaking through 
the normal constraints on DNA replication and usually killing the host cell in the 
process. Many DNA viruses reproduce only in this way. But some have a second 
option: they can propagate their genome as a quiet, well-behaved passenger in 
the host cell, replicating in parallel with the host cell’s DNA (either integrated into 
the host genome, or as an extrachromosomal plasmid) in the course of ordinary 
cell-division cycles. These viruses will switch between two modes of existence 
according to circumstances, remaining latent and harmless for a long time, but 
then proliferating in occasional cells in a process that kills the host cell and generates 
large numbers of infectious particles. 

Neither of these conditions converts the host cell to a cancerous character, 
nor is it in the interest of the virus to do so. But for viruses with a latent phase, 
accidents can occur that prematurely activate some of the viral proteins that the 
virus would normally use in its replicative phase to allow the viral DNA to repli- 
cate independently of the cell cycle. As described in the example below, this type 
of accident can switch on the persistent proliferation of the host cell itself, leading 
to cancer. 


Cancers of the Uterine Cervix Can Be Prevented by Vaccination 
Against Human Papillomavirus 


The papillomaviruses are a prime example of DNA tumor viruses. They are 
responsible for human warts and are especially important as a cause of carcinoma 
of the uterine cervix: this is the second commonest cancer of women in the world 
as a whole, representing about 6% of all human cancers. Human papillomaviruses 
(HPV) infect the cervical epithelium and maintain themselves in a latent phase 
in the basal layer of cells as extrachromosomal plasmids, which replicate in step 
with the chromosomes. Infectious virus particles are generated through a switch 
to a replicative phase in the outer epithelial layers, as progeny of these cells begin 
to differentiate before being sloughed from the surface. Here, cell division should 
normally stop, but the virus interferes with this cell-cycle arrest so as to allow 
replication of its own genome. Usually, the effect is restricted to the outer layers 
of cells and is relatively harmless, as in a wart. Occasionally, however, a genetic 
accident causes the viral genes that encode the proteins that prevent cell-cycle 
arrest to integrate into the host chromosome and become active in the basal layer, 
where the stem cells of the epithelium reside (see Figure 22-10). This can lead to 
cancer, with the viral genes acting as oncogenes (Figure 20-40). 

The whole process, from initial infection to invasive cancer, is slow, taking 
many years. It involves a long intermediate stage when the affected patch of cervi- 
cal epithelium is visibly disordered but the cells have not yet begun to invade the 
underlying connective tissue—a phenomenon called intraepithelial neoplasia. 
Many such lesions regress spontaneously. Moreover, at this stage, it is still easy to 
cure the condition by destroying or surgically removing the abnormal tissue. For- 
tunately, the presence of such lesions can be detected by scraping off a sample of 
cells from the surface of the cervix and viewing it under the microscope (the “Pap 
smear” technique). 

Better still, a vaccine has now been developed that protects against infection 
with the relevant strains of human papillomavirus. This vaccine, given to girls 
before puberty and thus before they become sexually active, has been shown 
to greatly reduce their risk of ever developing cervical cancer. Because the virus 
spreads through sexual activity, it is now recommended that both young males 
and young females be routinely vaccinated. Mass immunization programs have 
begun in several countries. 


Figure 20-40 How certain papillomaviruses are thought to give rise 
to cancer of the uterine cervix. Papillomaviruses have double-stranded 
circular DNA chromosomes of about 8000 nucleotide pairs. These 
chromosomes are normally stably maintained in the basal cells of the 
epithelium as plasmids (red circles), whose replication is regulated so as to 
keep step with the chromosomes of the host. (A) Normally, the virus perturbs 
the host cell cycle only when the virus is programmed to produce infectious 
progeny, in the outer layers of an epithelium. This is relatively harmless. 
(B) Rare accidents can cause the integration of a fragment of such a plasmid 
into a chromosome of the host, altering the environment of the viral genes in 
the basal cells of an epithelium. This can disrupt the normal control of viral 
gene expression. The unregulated production of certain viral proteins (E6 and 
E7) interferes with the control of cell division in the basal cells, thereby helping 
to generate a cancer (bottom). 


Infectious Agents Can Cause Cancer in a Variety of Ways 


In papillomaviruses, the viral genes that are mainly to blame are called E6 and E7. 
The protein products of these viral oncogenes interact with many host-cell pro- 
teins, but, in particular, they bind to two key tumor suppressor proteins of the host 
cell, putting them both out of action and so permitting the cell to replicate its DNA 
and divide in an uncontrolled way. One of these host proteins is Rb; the other 
is p53. Other DNA tumor viruses use similar mechanisms to inhibit Rb and p53, 
underlining the central importance of inactivating both of these tumor suppres- 
sor pathways if a cell is to escape the normal constraints on proliferation. 

In other cancers, viruses have indirect tumor-promoting actions. The hepati- 
tis-B and C viruses, for example, favor the development of liver cancer by causing 
chronic inflammation (hepatitis), which stimulates an extensive cell division in 
the liver that promotes the eventual evolution of tumor cells. In AIDS, the human 
immunodeficiency virus (HIV) promotes development of an otherwise rare can- 
cer called Kaposi’s sarcoma by destroying the immune system, thereby permit- 
ting a secondary infection with a human herpesvirus (HHV-8) that has a direct 
carcinogenic action. By causing severe inflammation, chronic infection with 
parasites and bacteria can also promote the development of some cancers. For 
example, chronic infection of the stomach with the bacterium Helicobacter pylori, 
which causes ulcers, appears to be a major cause of stomach cancer; dramatic 
falls in the incidence of stomach cancer over the last half-century (see Figure 
20-39) correlate with a decline in the incidence of Helicobacter infections. 


The Search for Cancer Cures Is Difficult but Not Hopeless 


The difficulty of curing a cancer is similar to the difficulty of getting rid of weeds. 
Cancer cells can be removed surgically or destroyed with toxic chemicals or radi- 
ation, but it is hard to eradicate every single one of them. Surgery can rarely ferret 
out every metastasis, and treatments that kill cancer cells are generally toxic to 
normal cells as well. Moreover, unlike normal cells, cancer cells can mutate rap- 
idly and will often evolve resistance to the poisons and irradiation used against 
them. 

In spite of these difficulties, effective cures using anticancer drugs (alone or in 
combination with other treatments) have already been found for some formerly 
highly lethal cancers, including Hodgkin’s lymphoma, testicular cancer, chorio- 
carcinoma, and some leukemias and other cancers of childhood. Even for types 
of cancer where a cure at present seems beyond our reach, there are treatments 
that will prolong life or at least relieve distress. But what prospect is there of doing 
better and finding cures for the most common forms of cancer, which still cause 
great suffering and so many deaths? 


Traditional Therapies Exploit the Genetic Instability and Loss of 
Cell-Cycle Checkpoint Responses in Cancer Cells 


Anticancer therapies need to take advantage of some molecular peculiarity of 
cancer cells that distinguishes them from normal cells. One such property is 
genetic instability, reflecting deficiencies in chromosome maintenance, cell-cy- 
cle checkpoints, and/or DNA repair. Remarkably, the most widely used cancer 
therapies seem to work by exploiting these abnormalities, although this was not 
known by the scientists who first developed the treatments. Ionizing radiation 
and most anticancer drugs damage DNA or interfere with chromosome segrega- 
tion at mitosis, and they preferentially kill cancer cells because cancer cells have a 
diminished ability to survive the damage. Normal cells treated with radiation, for 
example, arrest their cell cycle until they have repaired the damage to their DNA, 
thanks to the cell-cycle checkpoint responses discussed in Chapter 17. Because 
cancer cells generally have defects in their checkpoint responses, they may con- 
tinue to divide after irradiation, only to die after a few days because the genetic 
damage remains unrepaired. More generally, most cancer cells are physiologically 
deranged to a stressful degree: they live dangerously. Even though the cells 
in a tumor have evolved to be unusually tolerant of minor DNA damage, they are 
hypersensitive to the much greater amount of damage that can be created by radi- 
ation and by DNA-damaging drugs. A small increase of genetic damage can be 
enough to tip the balance between proliferation and death. 

Unfortunately, while the molecular defects present in cancer cells often 
enhance their sensitivity to cytotoxic agents, they can also increase their resis- 
tance. For example, where a normal cell might die by apoptosis in response to 
DNA damage, thanks to the stress response mediated by p53, a cancer cell may 
escape apoptosis because its p53 is lacking. Cancers vary widely in their sensitiv- 
ity to cytotoxic treatments, some responding to one drug, some to another, prob- 
ably reflecting the particular kinds of defects that a particular cancer has in DNA 
repair, cell-cycle checkpoints, and the control of apoptosis. 


New Drugs Can Kill Cancer Cells Selectively by Targeting Specific 
Mutations 


Radiotherapy and traditional cytotoxic drugs are rather weakly selective: they hurt 
normal cells as well as the cancer cells, and the safety margin is narrow. The dose 
often cannot be raised high enough to kill all the cancer cells, because this would 
kill the patient, and curative treatments, where achievable, generally require a 
combination of several cytotoxic agents. The side effects can be harsh and hard to 
endure. How can we do better? 

An ideal treatment is one that is cell-lethal in combination with some lesion 
that is present in the cancer cells, but harmless to cells where this lesion is absent. 
Such a treatment is said to be synthetic-lethal (from the original sense of the word 
synthesis, meaning “putting together”): it kills only in partnership with the can- 
cer-specific mutation. As we become increasingly able to pinpoint the specific 
alterations in cancer cells that make them different from their normal neighbors, 
new opportunities for such precisely targeted treatments are coming into view. 
We end this chapter with some examples of new treatments of this type that are 
already being put into practice. 


PARP Inhibitors Kill Cancer Cells That Have Defects in Brca1 or 
Brca2 Genes 


As we have emphasized, the genetic instability of cancer cells makes the cells 
both dangerous and vulnerable—dangerous because of the enhancement in their 
ability to evolve and proliferate, and vulnerable because treatment that leads to 
still more extreme genetic disruption can take them over the brink and kill them. 
In some cancers, genetic instability results from an identified fault in one of the 
many devices on which normal cells depend for DNA repair and maintenance. 
In this case, a drug tailored to block a complementary part of the DNA repair 
machinery can lead to such severe genetic damage that the cancer cells die. 

Detailed studies of the mechanisms for DNA maintenance discussed in Chap- 
ter 5 reveal a surprising amount of apparent redundancy. Thus, knocking out a 
particular pathway for DNA repair is generally less disastrous than one might 
expect, because alternate repair pathways exist. For example, stalled DNA rep- 
lication forks can arise when the fork encounters a single-strand break in a tem- 
plate strand, but cells can avoid the disaster that would otherwise result either by 
directly repairing these single-strand breaks, or, if that fails, repairing the broken 
fork that results by homologous recombination (see Figure 5-50). Suppose that the 
cells in a particular cancer have become genetically unstable by acquiring a muta- 
tion that reduces their ability to repair broken replication forks by homologous 
recombination. Might it be possible to eradicate that cancer by treating it with a 
drug that inhibits the repair of single-strand breaks, thereby greatly increasing the 
number of forks that break? The consequences of such drug treatment might be 
expected to be relatively harmless for normal cells, but lethal for the cancer. 

This strategy appears to work to kill the cells in at least one class of cancers: 
those that have inactivated both copies of either their Brca1 or their Brca2 tumor 
suppressor genes. As described in Chapter 5, Brca2 is an accessory protein that 
interacts with the Rad51 protein (the RecA analog in humans) in the repair of 
DNA double-strand breaks by homologous recombination. Brca1 is another 
protein that is also required for this repair process. Like Rb, the Brca1 and Brca2 
genes were discovered as mutations that predispose humans to cancer; in this 
case, chiefly cancers of the breast and ovaries (though unlike Rb, they seem to be 
involved in only a small proportion of such cancers). Individuals who inherit one 
mutant copy of Brca1 or Brca2 develop tumors that have inactivated the second 
copy of the same gene, presumably because this change makes the cells genetically 
unstable and speeds tumor progression. 

While Brca1 and Brca2 are needed for the repair of DNA double-strand breaks, 
single-strand breaks are repaired by other machinery, involving an enzyme called 
PARP (polyADP-ribose polymerase). This understanding of the basic mecha- 
nisms of DNA repair led to a striking discovery: drugs that block PARP activity kill 
Brca-deficient cells with extraordinary selectivity. At the same time, PARP inhibi- 
tion has very little effect on normal cells; in fact, mice that have been engineered 
to lack PARP1—the major PARP family member involved in DNA repair—remain 
healthy under laboratory conditions. This result suggests that, while the repair 
pathway requiring PARP provides a first line of defense against persistent breaks in 
a DNA strand, these breaks can be repaired efficiently by a genetic recombination 
pathway in normal cells. In contrast, tumor cells that have acquired their genetic 
instability by the loss of Brca1 or Brca2 have lost this second line of defense, and 
they are therefore uniquely sensitive to PARP inhibitors (Figure 20-41). 
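
The synthetic-lethal logic described here can be written as a small truth table. 
The sketch below is a deliberately simplified toy model, assuming that a cell 
survives as long as at least one of the two repair routes still works; the function 
names are hypothetical, and real cells are not all-or-nothing in this way. 

```python
from itertools import product


def cell_survives(single_strand_repair_ok, recombination_repair_ok):
    """Toy rule: the cell survives if at least one repair route is working."""
    return single_strand_repair_ok or recombination_repair_ok


def outcome(brca_functional, parp_inhibited):
    """Combine genotype and drug treatment into a survival outcome."""
    single_strand_repair_ok = not parp_inhibited  # PARP-dependent pathway
    recombination_repair_ok = brca_functional     # Brca1/Brca2-dependent pathway
    if cell_survives(single_strand_repair_ok, recombination_repair_ok):
        return "survives"
    return "dies"


if __name__ == "__main__":
    for brca_functional, parp_inhibited in product([True, False], repeat=2):
        cell = "normal cell" if brca_functional else "Brca-deficient tumor cell"
        drug = "with PARP inhibitor" if parp_inhibited else "no drug"
        print(f"{cell}, {drug}: {outcome(brca_functional, parp_inhibited)}")
```

Of the four combinations, only the Brca-deficient cell treated with the inhibitor 
dies, which is the selectivity the text describes. 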

PARP inhibitors are still under clinical trial, but they have produced some strik- 
ing results, causing tumors to regress in many Brca-deficient patients and delay- 
ing progression of their disease, with relatively few disagreeable side effects. These 
drugs also appear to be applicable to cancers with other mutations that cause 
defects in the cell’s homologous recombination machinery—a small, though sig- 
nificant, proportion of cancer cases. 


Figure 20-41 How a tumor’s genetic 
instability can be exploited for cancer 
therapy. As explained in Chapter 5, the 
maintenance of DNA sequences is so 
critical for life that cells have evolved 
multiple pathways for repairing DNA 
damage and reducing DNA replication 
errors. As illustrated, a DNA replication 
fork will stall whenever it encounters a 
break in a DNA template strand. In this 
example, normal cells have two different 
repair pathways that help them to avoid 
the problem, pathways 1 and 2. They are 
therefore not harmed by treatment with a 
drug that blocks repair pathway 1. But, 
because the inactivation of repair pathway 
2 was selected for during the evolution of 
the tumor cell, the tumor cells are killed by 
the same drug treatment. 

In the actual case that underlies this 
example, the function of repair pathway 1 
(requiring the PARP protein discussed in 
the text) is to remove persistent, accidental 
breaks in a DNA single strand before they 
are encountered by a moving replication 
fork. Pathway 2 is the recombination- 
dependent process (requiring the Brca2 
and Brca1 proteins) for repairing stalled 
replication forks illustrated in Figure 5-50. 
PARP inhibitors have promise for treating 
cancers with defective Brca2 or Brca1 
tumor suppressor genes. 

PARP inhibition provides an example of the type of rational, highly selective 
approach to cancer therapy that is beginning to be possible. Along with other new 
treatments to be discussed below, it raises high hopes for treating many other 
cancers. 


Small Molecules Can Be Designed to Inhibit Specific Oncogenic 
Proteins 


An obvious tactic for treating cancer is to attack a tumor expressing an oncogene 
with a drug designed to specifically block the function of the protein that the 
oncogene produces. But how can such a treatment avoid hurting the normal cells 
that depend on the function of the proto-oncogene from which the oncogene has 
evolved, and why should the drug kill the cancer cells, rather than simply calm 
them down? One answer may lie in the phenomenon of oncogene dependence. 
Once a cancer cell has undergone an oncogenic mutation, it will often undergo 
further mutations, epigenetic changes, or physiological adaptations that make it 
reliant on the hyperactivity of the initial oncogene, just as drug addicts become 
reliant on high doses of their drug. Blocking the activity of the oncogenic protein 
may then kill the cancer cell without significantly harming its normal neighbors. 
Some remarkable successes have been achieved in this way. 

As we saw earlier, chronic myelogenous leukemia (CML) is usually associated 
with a particular chromosomal translocation, visible as the Philadelphia chromo- 
some (see Figure 20-5). This results from chromosome breakage and rejoining 
at the sites of two specific genes, Abl and Bcr. The fusion of these genes creates 
a hybrid gene, called Bcr-Abl, that codes for a chimeric protein consisting of the 
N-terminal fragment of Bcr fused to the C-terminal portion of Abl (Figure 20-42). 
Abl is a tyrosine kinase involved in cell signaling. The substitution of the Bcr frag- 
ment for the normal N-terminus of Abl makes it hyperactive, so that it stimulates 
inappropriate proliferation of the hemopoietic precursor cells that contain it and 
prevents these cells from dying by apoptosis—which many of them would nor- 
mally do. As a result, excessive numbers of white blood cells accumulate in the 
bloodstream, producing CML. 

The chimeric Bcr-Abl protein is an obvious target for therapeutic attack. 
Searches for synthetic drug molecules that can inhibit the activity of tyrosine 
kinases discovered one, called imatinib (trade name Gleevec®), that blocks Bcr- 
Abl (Figure 20-43). When the drug was first given to patients with CML, nearly 
all of them showed a dramatic response, with an apparent disappearance of the 
cells carrying the Philadelphia chromosome in over 80% of patients. The response 
appears relatively durable: after years of continuous treatment, many patients 
have not progressed to later stages of the disease—although imatinib-resistant 
cancers emerge with a probability of about 5% per year during the early years. 
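To get a feel for what a risk of about 5% per year means over time, the rate can be compounded. This is a rough sketch under a strong simplifying assumption, namely that the yearly probability is constant and independent from year to year; the text states the figure only for the early years of treatment, and the function name is hypothetical.

```python
def fraction_resistant(years: int, annual_probability: float = 0.05) -> float:
    """Cumulative fraction of patients whose cancer has become resistant.

    Assumes the same, independent probability of resistance emerging in
    each year, which is a simplification of the clinical picture.
    """
    return 1.0 - (1.0 - annual_probability) ** years


for n in (1, 3, 5):
    print(f"after {n} year(s): {fraction_resistant(n):.1%} resistant")
```

Under this assumption, a little under a quarter of patients would have developed resistance after five years of continuous treatment.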

Figure 20-42 The conversion of the 
Abl proto-oncogene into an oncogene 
in patients with chronic myelogenous 
leukemia. The chromosome translocation 
responsible joins the Bcr gene on 
chromosome 22 to the Abl gene from 
chromosome 9, thereby generating a 
Philadelphia chromosome (see Figure 
20-5). The resulting fusion protein has the 
N-terminus of the Bcr protein joined to 
the C-terminus of the Abl tyrosine protein 
kinase; in consequence, the Abl kinase 
domain becomes inappropriately active, 
driving excessive proliferation of a clone of 
hemopoietic cells in the bone marrow. 

Figure 20-43 How imatinib (Gleevec) blocks the activity of Bcr-Abl protein and halts chronic myelogenous leukemia. 
(A) Imatinib sits in the ATP-binding pocket of the tyrosine kinase domain of Bcr-Abl and thereby prevents Bcr-Abl from 
transferring a phosphate group from ATP onto a tyrosine residue in a substrate protein. This blocks transmission of a signal for 
cell proliferation and survival. (B) The structure of the complex of imatinib (solid blue object) with the tyrosine kinase domain of 
the Abl protein (ribbon diagram), as determined by x-ray crystallography. (C) The chemical structure of the drug. It can be given 
by mouth; it has side effects, but they are usually quite tolerable. (B, from T. Schindler et al., Science 289:1938-1942, 2000. 
With permission from AAAS.) 


Results are not so good for those patients who have already progressed to the 
more acute phase of myeloid leukemia, known as blast crisis, where genetic insta- 
bility has set in and the march of the disease is far more rapid. These patients show 
a response at first and then relapse because the cancer cells develop a resistance 
to imatinib. This resistance is usually associated with secondary mutations in the 
part of the Bcr-Abl gene that encodes the kinase domain, disrupting the ability 
of imatinib to bind to Bcr-Abl kinase. Second-generation inhibitors that function 
effectively against a whole range of imatinib-resistant mutants have now been 
developed. By combining one or more of these new inhibitors with imatinib as 
the initial therapy (see below), it seems that CML—at least in the chronic (early) 
stage—may be on its way to becoming a curable disease. 

Despite the complications with resistance, the extraordinary success of ima- 
tinib is enough to drive home an important principle: once we understand pre- 
cisely what genetic lesions have occurred in a cancer, we can begin to design effec- 
tive rational methods to treat it. This success story has fueled efforts to identify 
small-molecule inhibitors for other oncogenic protein kinases and to use them 
to attack the appropriate cancer cells. Increasing numbers are being developed. 
These include molecules that target the EGF receptor and are currently approved 
for the treatment of some lung cancers, as well as drugs that specifically target the 
B-Raf oncoprotein in melanomas. 

Protein kinases have been relatively easy to inhibit with small molecules 
like imatinib, and many kinase inhibitors are being produced by pharmaceuti- 
cal companies in the hope that they can be effective as drugs for some forms of 
cancer. Many cancers lack an oncogenic mutation in a protein kinase. But most 
tumors contain inappropriately activated signaling pathways, for which a target 
somewhere in the pathway can hopefully be found (Movie 20.7). As an example, 
Figure 20-44 displays some of the anticancer drugs and drug targets that are cur- 
rently being tested for a pathway frequently activated in cancers. 


Many Cancers May Be Treatable by Enhancing the Immune 
Response Against the Specific Tumor 


Cancers have complex interactions with the immune system, and its various com- 
ponents may sometimes help as well as hinder tumor progression. But for more 
than a century it has been a dream of cancer researchers to somehow harness 
the immune system in a controlled and efficient way to exterminate cancer cells, 
just as it exterminates infectious organisms. There are finally signs that this dream 
may one day be realized, at least for some forms of cancer. 

The simplest type of immunological therapy, conceptually at least, is to inject 
the patient with antibodies that target the cancer cells. This approach has had 
some successes. About 25% of breast cancers, for example, express unusually high 
levels of the Her2 protein, a receptor tyrosine kinase related to the EGF receptor 
that plays a part in the normal development of mammary epithelium. A monoclo- 
nal antibody called trastuzumab (trade name Herceptin®) that binds to Her2 and 
inhibits its function slows the growth of breast tumors in humans that overexpress 
Her2, and it is now an approved therapy for these cancers (see Figure 20-44). A 
related approach uses antibodies to deliver poisons to the cancer cells. Antibodies 
against proteins that are abundant on the surface of a particular type of cancer cell 
but rare on normal cells can be armed with a toxin that kills those cells that bind 
the antibody molecule. 

A great deal of current excitement centers around a different type of approach, 
based on the relatively recent recognition that the microenvironment in a tumor is 
highly immunosuppressive. As a result, the cancer victim’s immune system is pre- 
vented from destroying the tumor cells. Recall that, from the thousands of genome 
sequences thus far determined, we know that a typical cancer cell will contain on 
the order of 50 proteins with a mutation that alters an amino acid sequence, most 
of these being “passenger” mutations, as previously explained (see p. 1104). Many 
of these mutant proteins will be recognized by the patient’s immune system as 
foreign but, to allow the cancer cells to survive throughout the course of tumor 
progression, the cancer cells have evolved a set of anti-immune defenses. These 
defenses include the expression on the cancer cell surface of one or more proteins 
that bind to inhibitory receptors on activated T cells. 

Figure 20-44 Some anticancer drugs 
and drug targets in the Ras-MAP-kinase 
signaling pathway. Each of the signaling 
proteins in this diagram has been identified 
as a product of a cancer-critical gene, with 
the exception of Raf1 and Erk. This Ras- 
MAP-kinase signaling pathway is triggered 
by a variety of receptor tyrosine kinases 
(RTKs), including the EGF receptor (see 
Figures 15-47 and 15-49). Those drugs 
that are antibodies end in “mab,” while 
those that are small molecules end in “nib.” 
(Adapted from B. Vogelstein et al., Science 
339:1546-1558, 2013.) 

The normal immune system is subject to complex controls that keep its activity 
within safe bounds and prevent autoimmunity from developing. The inhibitory 
receptors that are expressed on the surface of activated T cells have an important 
normal function: they control the immune response by down-regulating the T 
cell response under appropriate circumstances. But in the context of a tumor, the 
down-regulation is inappropriate, because it prevents the organism from killing 
the cancer cells that are threatening its survival. 

In its attack on infectious organisms, the natural immune system usually elim- 
inates every last trace of infection and maintains this immunity in the long term. 
The challenge is to find ways of recruiting the immune system to attack cancers 
with similar efficiency and specificity, hunting the cancer cells down by virtue of 
the tumor-specific antigens that they express. With this aim, a new type of anti- 
cancer therapy focuses on overcoming the immunosuppressive environment in 
a tumor through the use of specific antibodies that prevent the tumor cells from 
engaging with the inhibitory receptors on T cells. As illustrated in Figure 20-45A, 
blocking the action of the immune suppressors with such treatments should 
unleash an immune attack on the cancer cells. Importantly, multiple antigens 
are recognized as foreign; thus, the cancer cells cannot escape through the muta- 
tional loss of a single antigen, making it difficult for the tumor to escape from the 
T cell attack. 

This is a potentially dangerous strategy. If one provokes the immune system to 
recognize the cancer cells as targets for destruction, there is a risk of autoimmune 
side effects with dire consequences for normal tissues of the body, since the can- 
cer cells and the normal cells are close cousins and share most of their molecular 
features. Nevertheless, several recent successes seem to hold great promise for 
the future. 

One of the many molecules involved in keeping the activity of the normal 
immune system within safe bounds is a protein called CTLA4 (cytotoxic T-lym- 
phocyte-associated protein 4), which functions as an inhibitory receptor on the 
surface of T cells. If the function of CTLA4 is blocked, the T cells become more 
reactive and may mount an attack on cells that they would otherwise leave 
in peace. In particular, the T cells may attack tumor cells that are recognizably 
abnormal but whose presence was previously tolerated. With this in mind, cancer 
immunologists developed a monoclonal antibody, called ipilimumab, that binds 
to CTLA4 and blocks its action. Injected repeatedly into patients with metastatic 
melanoma, this antibody increases their median lifespan by several months and, 
in one large trial, enabled as many as a quarter of them to survive for five years 
or more, far beyond expectations for comparable patients without this 
treatment. Even more promising are recent clinical trials using a combination of two 
antibodies, one against CTLA4 and the other against PD1, a second cell-surface 
receptor on T cells that normally restrains their activity. 

Figure 20-45 Therapies designed 
to remove the immunosuppressive 
microenvironment in tumors. (A) The 
cells in tumors will produce many mutant 
proteins. As described in Chapter 24, 
peptides from these proteins will be 
displayed on MHC complexes on the 
tumor-cell surface and would normally 
activate a T cell response that destroys 
the tumor (see Figure 24-42). However, as 
schematically illustrated, during the course 
of tumor progression, the cancer cells have 
evolved immunosuppressive mechanisms 
that protect them from such killing. (B) The 
cells in tumors often protect themselves 
from immune attack by expressing 
proteins on their surface that bind to and 
thereby activate the inhibitory receptors 
on T cells. As indicated, this makes the 
tumor susceptible to specific antibody 
therapies. In this diagram, two such 
inhibitory receptors are shown, PD1 and a 
hypothetical protein X. Different tumors are 
thought to protect themselves by activating 
different members of a large set of T cell 
inhibitory receptors, some of which are not 
yet well characterized. 

In clinical trials using such techniques, a substantial fraction of the patients 
can respond in a dramatic way, with their cancer being driven into remission 
for years, while the treatment fails to help others with the same type of cancer. 
One possible explanation is that, while most tumors express proteins that protect 
them from T-cell attack, these proteins are different for different tumors. Thus, 
while some tumors will respond dramatically when treated with an antibody that 
blocks a particular immunosuppressive agent, many others will not. If true, one 
can foresee an era of personalized immunotherapy, in which each patient’s tumor 
is molecularly analyzed to determine its particular mechanisms of immunosup- 
pression. The patient would then be treated with a specific cocktail of antibodies 
designed to remove these blocks (see Figure 20-45). 


Cancers Evolve Resistance to Therapies 


High hopes have to be tempered with sobering realities. We have seen that genetic 
instability can provide an Achilles heel that cancer therapies can exploit, but at the 
same time it can make eradicating the disease more difficult by allowing the can- 
cer cells to evolve resistance to therapeutic drugs, often at an alarming rate. This 
applies even to the drugs that target genetic instability itself. Thus, PARP inhibitors 
give valuable remission of illness, but in the long term the disease generally comes 
back. For example, Brca-deficient cancers can sometimes develop resistance to 
PARP inhibitors by undergoing a second mutation in an affected Brca gene that 
restores its function. By then, the cancer is already out of control and it may be too 
late to affect the course of the disease with additional treatments. 

There are many different strategies by which cancers can evolve resistance to 
anticancer drugs. Often, a cancer will be dramatically reduced in size by an initial 
drug treatment, with all of the detectable tumor cells seeming to disappear. But 
months or years later the cancer will reappear in an altered form that is resistant 
to the drug that was at first so successful. In such cases, the initial drug treatment 
has evidently failed to destroy some tiny fraction of cells in the original tumor- 
cell population. These cells have escaped death because they carry a protective 
mutation or epigenetic change, or perhaps simply because they were lurking in 
a protected environment. They eventually regenerate the cancer by continuing to 
proliferate, mutating and evolving still further as they do so. 

In some cases, cells that are exposed to one anticancer drug evolve a resis- 
tance not only to that drug but also to other drugs to which they have never been 
exposed. This phenomenon of multidrug resistance frequently correlates with 
amplification of a part of the genome that contains a gene called Mdr1 or Abcb1. 
This gene encodes a plasma-membrane-bound transport ATPase of the ABC 
transporter superfamily (discussed in Chapter 11), which pumps lipophilic drugs 
out of the cell (see Movie 11.5). The overproduction of this protein (or some of its 
other family members) by a cancer cell can prevent the intracellular accumula- 
tion of many cytotoxic drugs, making the cell insensitive to them. 

In the to-and-fro struggle between advanced metastatic cancer and the ther- 
apist, as current practice stands, the cancer usually wins in the end. Does it have 
to be so? As we discuss below, there is reason to think that by attacking a cancer 
with many weapons at once—instead of using them one after another, each until 
it fails—it may be possible to do much better. 


Combination Therapies May Succeed Where Treatments with 
One Drug at a Time Fail 


Nowadays, cancers caught at an early stage can often be cured by surgery, 
radiation, or drugs. For most cancers that have progressed and metastasized widely, 
however, cure is still beyond us. Treatments such as those described above can 
give valuable remissions, but sooner or later these are typically followed by relapse. 

Nevertheless, for some relatively rare forms of advanced cancer, curative ther- 
apies have been developed. These generally involve a cocktail of several different 
anticancer agents: by trial and error, certain combinations of cytotoxic drugs have 
been found to wipe out the cancer completely. Discovering such combinations 
has hitherto involved a long, hard search. But now, armed with our new tools for 
identifying the specific genetic lesions that cancer cells contain, the prospects are 
better. 

The logic of combination therapies is the same as that behind the current treat- 
ment of HIV-AIDS with a cocktail of three different protease inhibitors: whereas 
there may always be some cells in the initial population carrying the rare muta- 
tions that confer resistance to any one drug treatment, there should be no cell 
carrying the whole set of rare mutations that would confer resistance to several 
different drugs delivered simultaneously. In contrast, sequential drug treatments 
will allow the few cells resistant to the first drug to multiply to large numbers. 
Within this large population of cells resistant to the first drug, a small number 
of cells are likely to have arisen that are resistant to the next drug also; and so on 
(Figure 20-46). 
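The arithmetic behind this argument can be made explicit. The following is a minimal sketch with purely illustrative numbers: a tumor of one billion cells and a frequency of one in a million for resistance to any single drug are assumptions, not values from the text, and the calculation treats resistance to each drug as an independent event (which multidrug resistance of the Mdr1 type, described earlier, would violate).

```python
def expected_resistant_cells(tumor_cells: float,
                             resistance_frequency: float,
                             n_drugs: int) -> float:
    """Expected number of cells already resistant to all n_drugs at once.

    Assumes resistance to each drug arises independently, each with the
    same frequency per cell.
    """
    return tumor_cells * resistance_frequency ** n_drugs


TUMOR_CELLS = 1e9            # hypothetical tumor size
RESISTANCE_FREQUENCY = 1e-6  # hypothetical per-drug resistance frequency

# Simultaneous treatment: a cell must resist every drug at the same time.
for n in (1, 2, 3):
    cells = expected_resistant_cells(TUMOR_CELLS, RESISTANCE_FREQUENCY, n)
    print(f"{n} drug(s) together: about {cells:.3g} resistant cells expected")

# Sequential treatment: survivors of each drug regrow to the full tumor size
# before the next drug is given, so each step faces only a one-drug barrier.
per_step = expected_resistant_cells(TUMOR_CELLS, RESISTANCE_FREQUENCY, 1)
print(f"each sequential step: about {per_step:.3g} resistant cells expected")
```

With these assumed numbers, a single drug leaves on the order of a thousand resistant cells, whereas two or three drugs given together leave an expected number far below one; given one after another, every drug in turn meets a population that already contains resistant survivors.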


We Now Have the Tools to Devise Combination Therapies 
Tailored to the Individual Patient 


Efficient, rational combination drug therapy requires three things. First, we have 
to identify multiple peculiarities of cancer cells that make them vulnerable in ways 
that normal cells are not. Second, we have to devise drugs (or other treatments) 
that target each of these vulnerabilities. Third, we have to match the combination 
of drugs to the specific set of peculiarities present in the cancer cells of the indi- 
vidual patient. 

The first requirement is already partially met: we now have large catalogs of 
cancer-critical genes that are commonly mutated in cancer cells. The second 
requirement is harder, but attainable: we have described some remarkable recent 
successes, and for cancer researchers there is excitement in the air. It is becoming 
increasingly possible to use our growing knowledge of cell and molecular biol- 
ogy to design new drugs against designated targets. At the same time, efficient, 
high-throughput automated methods are available to screen large libraries of 
chemicals for any that may be effective against cells with a given cancer-related 
defect. In such searches, the goal is synthetic lethality: a cell death that occurs 
when and only when a particular drug is put together with a particular cancer 
cell abnormality. Through these and other approaches, the repertoire of precisely 
targeted anticancer drugs is rapidly increasing. 

This brings us to the third requirement: the therapy (the choice of drugs to 
be given in combination) must be tailored to the individual patient. Here, too, 
the prospects are bright. Cancers evolve by a fundamentally random process, 
and each patient is different; but modern methods of genome analysis now let us 
characterize the cells from a tumor biopsy in exhaustive detail so as to discover 
which cancer-critical genes are affected in a particular case. Admittedly, this is 
not straightforward: the tumor cells in an individual patient are heterogeneous 
and do not all contain the same genetic lesions. With increased understanding of 
the pathways of cancer evolution, however, and with the experience gained from 
many different cases, it should become possible to make good guesses at the opti- 
mal therapies to use. 

From the perspective of the patient, the pace of advance in cancer research 
can seem frustratingly slow. Each new drug has to be tested in the clinic, first for 
safety and then for efficacy, before it can be released for general use. And if the 
drug is to be used in combination with others, the combination therapy must then 
go through the same long process. Strict ethical rules constrain the conduct of 
trials, which means that they take time—typically several years. But slow and cau- 
tious steps, taken systematically in the right direction, can lead to great advances. 
There is still far to go, but the examples that we have discussed provide proof of 
principle and grounds for optimism. 

From the cancer research effort, we have learned a great deal of what we know 
about the molecular biology of the normal cell. Now, more and more, we are dis- 
covering how to put that knowledge to use in the battle with cancer itself. 


Summary 


Our growing understanding of the cell biology of cancers has already begun to lead 
to better ways of preventing, diagnosing, and treating these diseases. Anticancer 
therapies can be designed to destroy cancer cells preferentially by exploiting the 
properties that distinguish cancer cells from normal cells, including the cancer cells’ 
dependence on oncogenic proteins and the defects they harbor in their DNA repair 
mechanisms. We now have good evidence that, by increasing our understanding 
of normal cell control mechanisms and exactly how they are subverted in specific 
cancers, we can eventually devise drugs to kill cancers precisely by attacking specific 
molecules critical for the growth and survival of the cancer cells. In addition, great 
progress has recently been made through sophisticated immunological approaches 
to cancer therapy. And, as we become better able to determine which genes are 
altered in the cells of any given tumor, we can begin to tailor treatments more accu- 
rately to each individual patient. 


WHAT WE DON’T KNOW 


• What is required to enable a cancer 
cell to metastasize? 

• How can the molecular analysis of 
an individual tumor be more effectively 
used to design effective therapies to 
kill it? 

• Can we identify general features 
common to all cancer cells (such as 
their production of misfolded, mutated 
proteins) that can be used for the 
targeted destruction of many different 
types of cancers? 

• Can sensitive and reliable blood tests 
be devised to detect cancers very 
early, before they have grown to a size 
where treatment with a single drug will 
generally be defeated by the survival of 
a preexisting resistant variant? 

• How can the observed 
environmental effects on cancer rates 
be exploited to reduce avoidable 
cancers? 

• Can new technologies be devised 
to reveal exactly how a quiescent 
micrometastasis converts to a full- 
blown metastatic tumor? 


PROBLEMS 


Which statements are true? Explain why or why not. 


20-1 The chemical carcinogen dimethylbenz[a]anthracene 
(DMBA) must be an extraordinarily specific mutagen 
since 90% of the skin tumors it causes have an A-to-T alter- 
ation at exactly the same site in the mutant Ras gene. 


20-2 In the cellular regulatory pathways that control 
cell growth and proliferation, the products of oncogenes 
are stimulatory components and the products of tumor 
suppressor genes are inhibitory components. 


20-3 Cancer therapies directed solely at killing the rap- 
idly dividing cells that make up the bulk of a tumor are 
unlikely to eliminate the cancer from many patients. 


20-4 The main environmental causes of cancer are the 
products of our highly industrialized way of life, such as 
pollution and food additives. 


Discuss the following problems. 


20-5 In contrast to colon cancer, whose incidence 
increases dramatically with age, incidence of osteosar- 
coma—a tumor that occurs most commonly in the long 
bones—peaks during adolescence. Osteosarcomas are rel- 
atively rare in young children (up to age 9) and in adults 
(over 20). Why do you suppose that the incidence of osteo- 
sarcoma does not show the same sort of age-dependence 
as colon cancer? 


20-6 Mortality due to lung cancer was followed in 
groups of males in the United Kingdom for 50 years. Figure 
Q20-1 shows the cumulative risk of dying from lung can- 
cer as a function of age and smoking habits for four groups 
of males: those who never smoked, those who stopped at 
age 30, those who stopped at age 50, and those who contin- 
ued to smoke. These data show clearly that individuals can 
substantially reduce their cumulative risk of dying from 
lung cancer by stopping smoking. What do you suppose is 
the biological basis for this observation? 


20-7 A small fraction—2 to 3%—of all cancers, across 
many subtypes, displays a quite remarkable phenome- 
non: tens to hundreds of rearrangements that primarily 
involve a single chromosome, or chromosomal region. 
The breakpoints can be tightly clustered, with several in a 
few kilobases; the junctions of the rearrangements often 
involve segments of DNA that were not originally close 
together on the chromosome. The copy number of various 
segments within the rearranged chromosome was found 
to be either zero, indicating deletion, or one, indicating 
retention. 

You can imagine two ways in which such multi- 
ple, localized rearrangements might happen: a progressive 
rearrangements model with ongoing inversions, deletions, 
and duplications involving a localized area, or a cata- 
strophic model in which the chromosome is shattered into 
fragments that are stitched back together in random order 
by nonhomologous end joining (Figure Q20-2). 

A. Which of the two models in Figure Q20-2 accounts 
more readily for the features of these highly rearranged 
chromosomes? Explain your reasoning. 

B. For whichever model you choose, suggest how 
such multiple rearrangements might arise. (The true 
mechanism is not known.) 

C. Do you suppose such rearrangements are likely 
to be causative events in the cancers in which they are 
found, or are they probably just passenger events that are 
unrelated to the cancer? If you think they could be driver 
events, suggest how such rearrangements might activate 
an oncogene or inactivate a tumor suppressor gene. 


20-8 Virtually all cancer treatments are designed to kill 
cancer cells, usually by inducing apoptosis. However, one 
particular cancer—acute promyelocytic leukemia (APL)— 
has been successfully treated with all-trans-retinoic acid, 
which causes the promyelocytes to differentiate into neu- 
trophils. How might a change in the state of differentiation 
of APL cancer cells help the patient? 


20-9 One major goal of modern cancer therapy is to 
identify small molecules—anticancer drugs—that can 
be used to inhibit the products of specific cancer-critical 
genes. If you were searching for such molecules, would 
you design inhibitors for the products of oncogenes or 
the products of tumor suppressor genes? Explain why you 
would (or would not) select each type of gene. 

Figure Q20-2 Two models to explain the multiple, localized 
chromosome rearrangements found in some cancers (Problem 
20-7). The progressive rearrangements model shows a sequence 
of rearrangements that disrupts the chromosome, generating 
increasingly complex chromosomal configurations. The chromosome 
catastrophe model shows the chromosome being fragmented and 
then reassembled randomly, with some pieces left out. 


20-10 PolyADP-ribose polymerase (PARP) plays a key 
role in the repair of DNA single-strand breaks. In the pres- 
ence of the PARP inhibitor olaparib, single-strand breaks 
accumulate. When a replication fork encounters a sin- 
gle-strand break, it converts it to a double-strand break, 
which in normal cells is then repaired by homologous 
recombination. In cells defective for homologous recom- 
bination, however, inhibition of PARP triggers cell death. 

Patients who have only one functional copy of the 
Brca1 gene, which is required for homologous recombination, 
are at much higher risk for cancer of the breast and 
ovary. Cancers that arise in these tissues in these patients 
can be treated successfully with olaparib. Explain how it is 
that treatment with olaparib kills the cancer cells in these 
patients, but does not harm their normal cells. 


20-11 The Tasmanian devil, a carnivorous Australian 
marsupial, is threatened with extinction by the spread 
of a fatal disease in which a malignant oral-facial tumor 
interferes with the animal’s ability to feed. You have been 
called in to analyze the source of this unusual cancer. It 
seems clear to you that the cancer is somehow spread from 
devil to devil, very likely by their frequent fighting, which 
is accompanied by biting around the face and mouth. To 
uncover the source of the cancer, you isolate tumors from 
11 devils captured in widely separated regions and exam- 
ine them. As might be expected, the karyotypes of the 
tumor cells are highly rearranged relative to that of the 
wild-type devil (Figure Q20-3). Surprisingly, you find that 
the karyotypes from all 11 tumor samples are very similar. 
Moreover, one of the Tasmanian devils has an inversion on 
chromosome 5 that is not present in its facial tumor. How 
do you suppose this cancer is transmitted from devil to 
devil? Is it likely to arise as a consequence of an infection 
by a virus or microorganism? Explain your reasoning. 
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Figure Q20-3 Karyotypes of cells from Tasmanian devils (Problem 
20-11). (A) A Tasmanian devil. (B) Normal karyotype for a male 
Tasmanian devil. The karyotype has 14 chromosomes, including XY. 

(C) Karyotype of cancer cells found in each of the 11 facial tumors 
studied. The karyotype has 13 chromosomes, no sex chromosomes, 
no chromosome 2 pair, one chromosome 6, two chromosomes 1 with 
deleted long arms, and four highly rearranged marker chromosomes 
(M1-M4). (A, reproduced courtesy of Museum Victoria; B and C, from
A.M. Pearse and K. Swift, Nature 439:549, 2006. With permission from 
Macmillan Publishers Ltd.) 
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CHAPTER 


Development of
Multicellular Organisms 


IN THIS CHAPTER

OVERVIEW OF DEVELOPMENT
MECHANISMS OF PATTERN FORMATION
DEVELOPMENTAL TIMING
MORPHOGENESIS
GROWTH
NEURAL DEVELOPMENT

An animal or plant starts its life as a single cell: a fertilized egg, or zygote. During
development, this cell divides repeatedly to produce many different kinds of cells,
arranged in a final pattern of spectacular complexity and precision. The goal of
developmental cell biology is to understand the cellular and molecular mechanisms
that direct this amazing transformation (Movie 21.1).
Plants and animals have very different ways of life, and they use different
developmental strategies; in this chapter, we focus mainly on animals. Four processes
are fundamental to animal development: (1) cell proliferation, which produces
many cells from one; (2) cell-cell interactions, which coordinate the behavior
of each cell with that of its neighbors; (3) cell specialization, or differentiation,
which creates cells with different characteristics at different positions; and (4) cell
movement, which rearranges the cells to form structured tissues and organs
(Figure 21-1). It is on the fourth point that plant development differs radically: plant
cells are unable to migrate or move independently through the embryo because
each one is contained within a cell wall, through which it is cemented to its
neighbors, as discussed in Chapter 19.
In a developing animal embryo, the four fundamental processes are happen- 
ing in a kaleidoscopic variety of ways, as they give rise to different parts of the 
organism. Like the members of an orchestra, the cells in the embryo have to play 
their individual parts in a highly coordinated manner. In the embryo, however, 
there is no conductor—no central authority—to direct the performance. Instead, 
development is a self-assembly process in which the cells, as they grow and prolif- 
erate, organize themselves into increasingly complex structures. Each of the mil- 
lions of cells has to choose for itself how to behave, selectively utilizing the genetic 
instructions in its chromosomes. 
At each stage in its development, the cell is presented with a limited set of 
options, so that its developmental pathway branches repeatedly, reflecting a large 
set of sequential choices. Like the decisions we make in our own lives, the choices 
made by the cell are based on its internal state—which largely reflects its his- 
tory—and on current influences from other cells, especially its close neighbors. 
To understand development, we need to know how each choice is controlled and 
how it depends on previous choices. Beyond that, we need to understand how 
the choices, once made, influence the cell’s chemistry and behavior, and how cell 
behaviors act synergistically to determine the structure and function of the body. 
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Figure 21-1 The four essential cell processes that allow a multicellular organism to be made. 
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As cells become specialized they change not only their chemistry but also their 
shape and their attachments to other cells and to the extracellular matrix. They 
move and rearrange themselves to create the complex architecture of the body, 
with all its tissues and organs, each structured precisely and defined in size. To 
understand this process of form generation, or morphogenesis, we will need to 
take account of the mechanical, as well as the biochemical, interactions between 
the cells. 

At first glance, one would no more expect the worm, the flea, the eagle, and 
the giant squid all to be generated by the same developmental mechanisms than 
one would suppose that the same methods were used to make a shoe and an air- 
plane. Remarkably, however, research in the past 30 years has revealed that much 
of the basic machinery of development is essentially the same in all animals—not 
just in all vertebrates, but in all the major phyla of invertebrates too. Recognizably 
similar, evolutionarily related molecules define the specialized animal cell types, 
mark the differences between body regions, and help create the animal body pat- 
tern. Homologous proteins are often functionally interchangeable between very 
different species. Thus, a human protein produced artificially in a fly, for example, 
can perform the same function as the fly’s own version of that protein (Figure 
21-2). Thanks to an underlying unity of mechanisms, developmental biologists 
have been making great strides toward a coherent understanding of animal devel- 
opment. 

We begin this chapter with an overview of some of the basic mechanisms that 
operate in animal development. We then discuss, in sequence, how cells in the 
embryo diversify to form patterns in space, how the timing of developmental 
events is controlled, how cell movements contribute to morphogenesis, and how 
the size of an animal is regulated. We end by considering the most challenging 
aspect of development—the mechanisms that enable a highly complex nervous 
system to form. 
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site, such as a wing (B) or a leg (C). The 
scanning electron micrographs show 
a patch of eye tissue on the leg of a fly 
resulting from misexpression of Drosophila 
Eyeless (E) and of squid Pax6 (F). The 
homologous protein from a human or 
practically any animal possessing eyes, 
when similarly misexpressed in a transgenic 
fly, has the same effect. The entire eye of a 
normal Drosophila is shown for comparison 
in (A) and (D). (B-C, courtesy of Georg 
Halder; D-F, from S. I. Tomarev, et al. Proc.
Natl Acad. Sci. USA 94:2421-2426, 1997.
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Animals live by eating other organisms. Thus, despite their remarkable diversity, 
animals as different as worms, mollusks, insects, and vertebrates share anatom- 
ical features that are fundamental to this way of life. Epidermal cells form a pro- 
tective outer layer; gut cells absorb nutrients from ingested food; muscle cells 
allow movement; and neurons and sensory cells control behavior. These diverse 
cell types are organized into tissues and organs, forming a sheet of skin covering 
the exterior, a mouth for feeding, and an internal gut tube to digest food—with 
muscles, nerves, and other tissues arranged in the space between the skin and 
the gut tube. Many animals have clearly defined axes—an anteroposterior axis, 
with mouth and brain anterior and anus posterior; a dorsoventral axis, with back 
dorsal and belly ventral; and a left-right axis. In this section, we discuss some fun- 
damental mechanisms underlying animal development, beginning with how the 
basic animal body plan is established. 


Conserved Mechanisms Establish the Basic Animal Body Plan 


The shared anatomical features of animals develop through conserved mecha- 
nisms. After fertilization, the zygote usually divides rapidly, or cleaves, to form 
many smaller cells; during this cleavage, the embryo, which cannot yet feed, does 
not grow. This phase of development is initially driven and controlled entirely by 
the material deposited in the egg by the mother. The embryonic genome remains 
inactive until a point is reached when maternal mRNAs and proteins rather 
abruptly begin to be degraded. The embryo’s genome is activated, and the cells 
cohere to form a blastula—typically a solid or a hollow fluid-filled ball of cells. 
Complex cell rearrangements called gastrulation (from the Greek “gaster,” meaning
“belly”) then transform the blastula into a multilayered structure containing
a rudimentary internal gut (Figure 21-3). Some cells of the blastula remain exter- 
nal, constituting the ectoderm, which will give rise to the epidermis and the ner- 
vous system; other cells invaginate, forming the endoderm, which will give rise to 
the gut tube and its appendages, such as lung, pancreas, and liver. Another group 
of cells moves into the space between ectoderm and endoderm and forms the 
mesoderm, which will give rise to muscles, connective tissues, blood, kidney, and 
various other components. Further cell movements and accompanying cell dif- 
ferentiations create and refine the embryo’s architecture. 
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Figure 21-3 The early stages of 
development, as exemplified by a frog. 
(A) A fertilized egg divides to produce a 
blastula—a sheet of epithelial cells often 
surrounding a cavity. During gastrulation, 
some of the cells tuck into the interior to 
form the mesoderm (green) and endoderm 
(yellow). Ectodermal cells (blue) remain on 
the outside. (B) A cross section through 
the trunk of an amphibian embryo shows 
the basic animal body plan, with a sheet
of ectoderm on the outside, a tube of
endoderm on the inside, and mesoderm 
sandwiched between them. The endoderm 
forms the epithelial lining of the gut, from 
the mouth to the anus. It gives rise not only 
to the pharynx, esophagus, stomach, and 
intestines, but also to many associated 
structures. The salivary glands, liver, 
pancreas, trachea, and lungs, for example, 
all develop from the wall of the digestive 
tract and grow to become systems of 
branching tubes that open into the gut or 
pharynx. The endoderm forms only the
epithelial components of these structures
(the lining of the gut and the secretory
cells of the pancreas, for example). The
supporting muscular and fibrous elements 
arise from the mesoderm. 

The mesoderm gives rise to the 
connective tissues —at first, to the loose 
mesh of cells in the embryo known as 
mesenchyme, and ultimately to cartilage, 
bone, and fibrous tissue, including the 
dermis (the inner layer of the skin). The 
mesoderm also forms the muscles, the 
entire vascular system — including the heart, 
blood vessels, and blood cells—and the 
tubules, ducts, and supporting tissues of 
the kidneys and gonads. The notochord 
forms from the mesoderm and serves 
as the core of the future backbone and 
the source of signals that coordinate the 
development of surrounding tissues. 

The ectoderm will form the epidermis 
(the outer, epithelial layer of the skin) and 
epidermal appendages such as hair, 
sweat glands, and mammary glands. It will 
also give rise to the whole of the nervous 
system, central and peripheral, including 
not only neurons and glia but also the 
sensory cells of the nose, the ear, the eye, 
and other sense organs. (B, after T. Mohun 
et al., Cell 22:9-15, 1980. With permission 
from Elsevier.) 
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The ectoderm, mesoderm, and endoderm formed during gastrulation consti- 
tute the three germ layers of the early embryo. Many later developmental trans- 
formations will produce the elaborately structured organs. But the basic body 
plan and axes set up in miniature during gastrulation are preserved into adult life, 
when the organism may be billions of times larger (Movie 21.2). 


The Developmental Potential of Cells Becomes Progressively 
Restricted 


Concomitant with the refinement of the body plan, the individual cells become 
more and more restricted in their developmental potential. During the blastula 
stages, cells are often totipotent or pluripotent—they have the potential to give 
rise to all or almost all of the cell types of the adult body. The pluripotency is lost 
as gastrulation proceeds: a cell located in the endodermal germ layer, for exam- 
ple, can give rise to the cell types that will line the gut or form gut-derived organs 
such as the liver or pancreas, but it no longer has the potential to form meso- 
derm-derived structures such as skeleton, heart, or kidney. Such a cell is said 
to be determined for an endodermal fate. Thus, cell determination starts early 
and progressively narrows the options as the cell steps through a programmed 
series of intermediate states—guided at each step by its genome, its history, and 
its interactions with neighbors. The process reaches its limit when a cell under- 
goes terminal differentiation to form one of the highly specialized cell types of 
the adult body (Figure 21-4). Although there are cell types in the adult that retain 
some degree of pluripotency, their range of options is generally narrow (discussed 
in Chapter 22). 
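
One way to picture this progressive restriction is as a tree in which each cell state can reach only the terminal cell types that lie below it. The sketch below is a toy illustration, not a real lineage map: the tree and its labels are assumptions that loosely follow the germ-layer derivatives named in the text, and the function name `potential` is hypothetical.

```python
# Toy lineage tree (hypothetical and heavily simplified): each key is a cell
# state, each value lists the states it can still give rise to.
LINEAGE = {
    "blastomere": ["ectoderm", "mesoderm", "endoderm"],
    "ectoderm": ["epidermis", "neuron"],
    "mesoderm": ["muscle", "bone", "kidney"],
    "endoderm": ["gut lining", "liver", "pancreas"],
}


def potential(state):
    """Return the set of terminal cell types still reachable from `state`."""
    children = LINEAGE.get(state)
    if not children:          # no further options: terminally differentiated
        return {state}
    reachable = set()
    for child in children:
        reachable |= potential(child)
    return reachable


for state in ("blastomere", "endoderm", "liver"):
    print(state, "->", sorted(potential(state)))
```

In this picture, determination corresponds to stepping down a branch: the set returned for a cell in the endoderm no longer contains any mesoderm-derived type, and a terminally differentiated cell can reach only itself.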


Cell Memory Underlies Cell Decision-Making 


Underlying the richness and astonishingly complex outcomes of development is 
cell memory (see p. 404). Both the genes a cell expresses and the way it behaves 
depend on the cell’s past, as well as on its present circumstances. The cells of our 
body—the muscle cells, the neurons, the skin cells, the gut cells, and so on—main- 
tain their specialized characters largely because they retain a record of the extra- 
cellular signals their ancestors received during development, rather than because 
they continually receive such instructions from their surroundings. Despite their 
radically different phenotypes, they retain the same complete genome that was 
present in the zygote; their differences arise instead from differential gene expres- 
sion. We have discussed the molecular mechanisms of gene regulation, cell mem- 
ory, cell division, cell signaling, and cell movement in previous chapters. In this 
chapter, we shall see how these basic processes are collectively deployed to create 
an animal. 


Several Model Organisms Have Been Crucial for Understanding
Development


The anatomical features that animals share have undergone many extreme mod- 
ifications in the course of evolution. As a result, the differences between species 
are usually more striking to our human eye than the similarities. But at the level 
of the underlying molecular mechanisms and the macromolecules that mediate 
them, the reverse is true: the similarities among all animals are profound and 
extensive. Through more than half a billion years of evolutionary divergence, all 
animals have retained unmistakably similar sets of genes and proteins that are 
responsible for generating their body plans and for forming their specialized cells 
and organs. 

This astonishing degree of evolutionary conservation was discovered not by 
broad surveys of animal diversity, but through intensive study of a small num- 
ber of representative species—the model organisms discussed in Chapter 1. For 
animal developmental biology, the most important have been the fly Drosophila 
melanogaster, the frog Xenopus laevis, the roundworm Caenorhabditis elegans, 
the mouse Mus musculus, and the zebrafish Danio rerio. In discussing the mecha- 
nisms of development, we shall draw our examples mainly from these few species. 
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Figure 21-4 The lineage from 
blastomere to differentiated cell type. 
As development proceeds, cells become 
more and more specialized. Blastomeres 
have the potential to give rise to most or all 
cell types. Under the influence of signaling 
molecules and gene regulatory factors, 
cells acquire more restricted fates until 
they differentiate into highly specialized cell 
types, such as the pancreatic β-islet cells
that secrete the hormone insulin. 
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Genes Involved in Cell—Cell Communication and Transcriptional 
Control Are Especially Important for Animal Development


What are the genes that animals share with one another but not with other king- 
doms of life? These would be expected to include genes required specifically for 
animal development but not needed for unicellular existence. Comparison of 
animal genomes with the genome of budding yeast—a unicellular eukaryote— 
suggests that three classes of genes are especially important for multicellular 
organization. The first class includes genes that encode proteins used for cell-cell 
adhesion and cell signaling; hundreds of human genes encode signal proteins, 
cell-surface receptors, cell adhesion proteins, or ion channels that are either not 
present in yeast or present in much smaller numbers. The second class includes 
genes encoding proteins that regulate transcription and chromatin structure: 
more than 1000 human genes encode transcription regulators, but only about 250 
yeast genes do so. As we shall see, the development of animals is dominated by 
cell-cell interactions and by differential gene expression. The third class of non- 
coding RNAs has a more uncertain status: it includes genes that encode microR- 
NAs (miRNAs); there are at least 500 of these in humans. Along with the regula- 
tory proteins, they play a significant part in controlling gene expression during 
animal development, but the full extent of their importance is still unclear. The 
loss of individual miRNA genes in C. elegans, where their functions have been well 
studied, rarely leads to obvious phenotypes, suggesting that the roles of miRNAs 
during animal development are often subtle, serving to fine-tune the develop- 
mental machinery rather than to form its core structures. 


Regulatory DNA Seems Largely Responsible for the Differences 
Between Animal Species 


As discussed in Chapter 7, each gene in a multicellular organism is associated 
with many thousands of nucleotides of noncoding DNA that contains regulatory 
elements. These regulatory elements determine when, where, and how strongly 
the gene is to be expressed, according to the transcription regulators and chroma- 
tin structures that are present in the particular cell (Figure 21-5). Consequently, 
a change in the regulatory DNA, even without any change in the coding DNA, can 
alter the logic of the gene-regulatory network and change the outcome of devel- 
opment. 

As discussed in Chapter 4, when we compare the genomes of different ani- 
mal species, we find that evolution has altered the coding and regulatory DNA to 
different extents. The coding DNA, for the most part, has been highly conserved, 
the noncoding regulatory DNA much less so. It seems that changes in regulatory 
DNA are largely responsible for the dramatic differences between one class of 
animals and another (see p. 227). We can view the protein products of the cod- 
ing sequences as a conserved kit of common molecular parts, and the regulatory 
DNA as instructions for assembly: with different instructions, the same kit of parts 
can be used to make a whole variety of different body structures. We will return to 
this important concept later. 


Figure 21-5 Regulatory DNA defines 
the gene expression patterns in 
Small Numbers of Conserved Cell-Cell Signaling Pathways
Coordinate Spatial Patterning 


Spatial patterning of a developing animal requires that cells become different 
according to their positions in the embryo, which means that cells must respond 
to extracellular signals produced by other cells, especially their neighbors. In what 
is probably the commonest mode of spatial patterning, a group of cells starts out 
with the same developmental potential, and a signal from cells outside the group 
then induces one or more members of the group to change their character. This 
process is called inductive signaling. Generally, the inductive signal is limited in 
time and space so that only a subset of the cells capable of responding—the cells 
close to the source of the signal—take on the induced character (Figure 21-6). 
Some inductive signals depend on cell-cell contact; others act over a longer range 
and are mediated by molecules that diffuse through the extracellular medium or 
are transported in the bloodstream (see Figure 15-2). 

Most of the known inductive events in animal development are governed by 
a small number of highly conserved signaling pathways, including transforming 
growth factor-β (TGFβ), Wnt, Hedgehog, Notch, and receptor tyrosine kinase
(RTK) pathways (discussed in Chapter 15). The discovery of the limited vocabu- 
lary that developing cells use for intercellular communication has emerged over 
the past 25 years as one of the great simplifying features of developmental biology. 


Through Combinatorial Control and Cell Memory, Simple Signals 
Can Generate Complex Patterns 


But how can this small number of signaling pathways generate the huge diversity 
of cells and patterns? Three kinds of mechanisms are responsible. First, through 
gene duplication, the basic components of a pathway often come to be encoded 
by small families of closely related homologous genes. This allows for diversity in 
the operation of the pathway, according to which family member is employed in 
a given situation. Notch signaling, for example, may be mediated by Notch1 in
one tissue, but by its homolog Notch4 in another. Second, the response of a cell 
to a given signal protein depends on the other signals that the cell is receiving 
concurrently (Figure 21-7A). As a result, different combinations of signals can 
generate a large variety of different responses. Third, and most fundamental, the 
effect of activating a signaling pathway depends on the previous experiences of 
the responding cell: past influences leave a lasting mark, registered in the state 
of the cell’s chromatin and the selection of transcription regulatory proteins and 
RNA molecules that the cell contains. This cell memory enables cells with different
histories to respond to the same signals differently (Figure 21-7B). Thus, the same
few signaling pathways can be used repeatedly at different times and places with
different outcomes, so as to generate patterns of unlimited complexity.

Figure 21-6 Inductive signaling.

Figure 21-7 Two mechanisms for generating different responses to the
same inductive signal. (A) In combinatorial signaling, the effect of a signal
depends on the presence of other signals received at the same time.
(B) Through cell memory, previous signals (or other events) can leave
a lasting trace that alters the response to the current signal (see Figure 7-54).
The memory trace is represented here in the coloring of the cell nucleus.
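
The second and third mechanisms described above, combinatorial signaling and cell memory, can be caricatured in a few lines of code. This is only a sketch under stated assumptions: the pathway names come from the text, but the `Cell` class, the "primed" mark, and the mapping from signals to fates are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """Toy cell whose response depends on current signals and on its history."""
    memory: set = field(default_factory=set)   # lasting marks left by past signals

    def receive(self, signals):
        """Return a (hypothetical) fate label for a set of concurrent signals."""
        signals = frozenset(signals)
        if "Wnt" in signals and "primed" in self.memory:
            response = "fate C"      # same signal, different history
        elif signals == {"Wnt", "Notch"}:
            response = "fate B"      # combinatorial: two signals together
        elif signals == {"Wnt"}:
            response = "fate A"      # the signal on its own
        else:
            response = "no change"
        if "Hedgehog" in signals:
            self.memory.add("primed")   # past influence leaves a lasting mark
        return response


naive, experienced = Cell(), Cell()
experienced.receive({"Hedgehog"})        # earlier exposure, remembered
print(naive.receive({"Wnt"}))            # response of a cell with no history
print(experienced.receive({"Wnt"}))      # same signal, different outcome
```

The point of the sketch is that a single signal name maps to several outcomes once the lookup also depends on concurrent signals and on stored state, which is why a small signaling vocabulary need not limit the variety of responses.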


Morphogens Are Long-Range Inductive Signals That Exert Graded 
Effects 


Signal molecules often govern simple yes-no choices—one outcome when 
their concentration is high, another when it is low. In many cases, however, the 
responses are more finely graded: a high concentration of a signal molecule may, 
for example, direct cells into one developmental pathway, an intermediate con- 
centration into another, and a low concentration into yet another. 
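
A graded response of this kind amounts to reading a concentration against a set of thresholds. The following minimal sketch assumes two thresholds in arbitrary units; the function name, the threshold values, and the pathway labels are all illustrative and are not taken from any measured system.

```python
def fate_from_concentration(conc, high=0.6, low=0.2):
    """Map a signal concentration (arbitrary units) to one of three pathways.

    The thresholds `high` and `low` are illustrative values, not measurements.
    """
    if conc >= high:
        return "pathway 1"      # high concentration
    if conc >= low:
        return "pathway 2"      # intermediate concentration
    return "pathway 3"          # low concentration


for conc in (0.9, 0.4, 0.05):
    print(conc, "->", fate_from_concentration(conc))
```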

One common way to generate such different concentrations of a signal mole- 
cule is for the molecule to diffuse out from a localized signaling source, creating 
a concentration gradient. Cells at different distances from the source are driven 
to behave in a variety of different ways, according to the signal concentration that 
they experience (Figure 21-8). A signal molecule that imposes a pattern on a 
whole field of cells in this way is called a morphogen. In the simplest case, a spe- 
cialized group of cells produces a morphogen at a steady rate, and the morphogen 
is then degraded as it diffuses away from this source. The speed of diffusion and 
the half-life of the morphogen will together determine the range and steepness of 
its resulting gradient (Figure 21-9). 
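
The dependence of range and steepness on diffusion speed and half-life can be made concrete with a simplified calculation. The sketch below assumes diffusion along a single axis with first-order degradation, for which the steady-state profile is a simple exponential with decay length sqrt(D / k), where k = ln 2 / half-life. Figure 21-9 is calculated for diffusion along two axes, so its curves differ in detail. The values D = 1 μm² sec⁻¹ and a half-life of 170 minutes are taken from that figure; the function names and the unit source flux are assumptions.

```python
import math


def decay_length(diffusion, half_life):
    """Length scale of the steady-state gradient (1D, first-order degradation).

    With degradation rate k = ln(2) / half_life, the steady-state profile is
    c(x) = c0 * exp(-x / lam), where lam = sqrt(diffusion / k).
    """
    k = math.log(2) / half_life
    return math.sqrt(diffusion / k)


def source_concentration(flux, diffusion, half_life):
    """Steady-state concentration at a source releasing `flux` per unit area and time."""
    k = math.log(2) / half_life
    return flux / (2.0 * math.sqrt(diffusion * k))


def steady_state(x, flux, diffusion, half_life):
    """Concentration at distance x from the source once the gradient has settled."""
    lam = decay_length(diffusion, half_life)
    return source_concentration(flux, diffusion, half_life) * math.exp(-abs(x) / lam)


D = 1.0                 # um^2 / s, value quoted in Figure 21-9
T_HALF = 170 * 60       # 170 minutes, expressed in seconds
print(round(decay_length(D, T_HALF)), "um decay length")
```

Under these assumptions the decay length is a little over 100 μm. Tripling the diffusion constant lengthens the gradient but lowers the concentration at the source, whereas tripling the half-life raises the concentration everywhere, in line with the trends described for panels B and C of Figure 21-9.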

This simple mechanism can be modified in various ways. Receptors on the 
surface of cells along the way, for example, may trap the diffusing morphogen and 
cause it to be endocytosed and degraded, shortening its effective half-life. Alter- 
natively, the morphogen may bind to molecules in the extracellular matrix such as 
heparan sulfate proteoglycan (discussed in Chapter 19), thereby greatly reducing 
its diffusion rate. 


Lateral Inhibition Can Generate Patterns of Different Cell Types 


Morphogen gradients, and other kinds of inductive signal, exploit an existing
asymmetry in the embryo to create further asymmetries and differences between
cells: already, at the outset, some cells are specialized to produce the morphogen
and thereby impose a pattern on another class of cells that are sensitive to it. But
what if there is no clear initial asymmetry? Can a regular pattern arise
spontaneously within a set of cells that are initially all alike?

Figure 21-8 Gradient formation and interpretation. A gradient forms by
localized production of an inducer (a morphogen) that diffuses away from
its source. Different concentrations of morphogen (or different durations of
exposure) induce different gene expression patterns and cell fates in responding
cells. Diffusive transport can generate gradients only over short distances, and
morphogens generally act over distances of 1 mm or less.

Figure 21-9 Setting up a signal gradient by diffusion. (A-C) Each graph shows six
successive stages in the buildup of the concentration of a signal molecule that is
produced at a steady rate at the origin, with production starting at time 0. In all
cases, the molecule undergoes degradation as it diffuses away from the source, and
the graphs are calculated on the assumption that diffusion is occurring along two
axes in space (for example, radially from a source in an epithelial sheet). (A) The
pattern of the morphogen assuming that the molecule has a half-life of 170 minutes,
and that it diffuses with an effective diffusion constant of D = 1 μm² sec⁻¹, typical
of a small protein molecule in extracellular tissues. Note that the gradient is
already close to its steady-state form within an hour and that the concentration at
steady state falls off exponentially with distance. (B) A threefold increase in the
diffusion constant of the morphogen extends its range but lowers its concentration
next to the source, whereas (C) a threefold increase in morphogen half-life
increases its concentration throughout the tissue. Effects of the morphogen will
depend not just on its concentration at some critical moment, but also on how
each target cell integrates its response over time. (Courtesy of Patrick Muller.)

Figure 21-10 Genesis of asymmetry through lateral inhibition and
positive feedback. In this example, two cells interact, each producing
a substance X that acts on the other cell to inhibit its production of X, an
effect known as lateral inhibition. An increase of X in one of the cells leads
to a positive feedback that tends to increase X in that cell still further, while
decreasing X in its neighbor. This can create a runaway instability, making
the two cells become radically different. Ultimately, the system comes to rest
in one or the other of two opposite stable states. The final choice of state
represents a form of memory: the small influence that initially directed the
choice is no longer required to maintain it.

The answer is yes. The fundamental principle underlying such de novo pat- 
tern formation is positive feedback: cells can exchange signals in such a way that 
any small initial discrepancy between cells at different sites becomes self-ampli- 
fying, driving the cells toward different fates. This is most clearly illustrated in the 
phenomenon of lateral inhibition, a form of cell-cell interaction that forces close 
neighbors to become different and thereby generates fine-grained patterns of dif- 
ferent cell types. 

Consider a pair of adjacent cells that start off in a similar state. Each of these 
cells can both produce and respond to a certain signal molecule X, with the added 
rule that the stronger the signal a cell receives, the weaker the signal it gener- 
ates (Figure 21-10). If one cell produces more X, the other is forced to produce 
less. This gives rise to a positive feedback loop that tends to amplify any initial 
difference between the two adjacent cells. Such a difference may arise from a 
bias imposed by some present or past external factor, or it may simply originate 
from spontaneous random fluctuations, or “noise”—an inevitable feature of the 
genetic control circuitry in cells (discussed in Chapter 7). In either case, lateral 
inhibition means that if cell 1 makes a little more of X, it will thereby cause cell 2 to 
make less; and because cell 2 makes less X, it delivers less inhibition to cell 1 and 
so allows the production of X in cell 1 to rise higher still; and so on, until a steady 
state is reached where cell 1 produces a lot of X and cell 2 produces very little. In 
the standard case, the signal molecule X acts in the receiving cell by regulating 
gene transcription, and the result is that the two cells are driven along different 
pathways of differentiation. 
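
The runaway amplification described above can be reproduced with a very small simulation. The sketch below is a toy model rather than a description of any real circuit: it assumes that each cell makes X at a rate that falls steeply (a Hill-type function) with the level of X in its neighbor, and that X decays at a constant rate. The function name and the parameter values K and n are illustrative, chosen so that the state in which both cells are equal is unstable.

```python
def lateral_inhibition(x1, x2, steps=5000, dt=0.01, K=0.5, n=4):
    """Euler-integrate two cells that each repress the other's production of X.

    dx1/dt = 1 / (1 + (x2 / K)**n) - x1   (and symmetrically for x2)
    The Hill-type repression term and the values of K and n are assumptions.
    """
    for _ in range(steps):
        p1 = 1.0 / (1.0 + (x2 / K) ** n)   # production in cell 1, inhibited by cell 2
        p2 = 1.0 / (1.0 + (x1 / K) ** n)   # production in cell 2, inhibited by cell 1
        x1, x2 = x1 + dt * (p1 - x1), x2 + dt * (p2 - x2)
    return x1, x2


# Cell 1 starts with a 2% head start over cell 2.
high, low = lateral_inhibition(0.51, 0.50)
print(f"cell 1: {high:.2f}, cell 2: {low:.2f}")
```

Starting from nearly identical levels, the small initial advantage of cell 1 grows until cell 1 produces a lot of X and cell 2 very little. Reversing the initial bias reverses the outcome, and the final state persists without the initial bias, which is the sense in which the choice acts as a memory.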

In almost all tissues, a balanced mixture of different cell types is required. 
Lateral inhibition provides a common way to generate the mixture. As we 
shall see, lateral inhibition is very often mediated by exchange of signals at cell- 
cell contacts via the Notch signaling pathway, driving cell diversification by 
enabling individual cells that express one set of genes to direct their immediate 
neighbors to express a different set, in exactly the way we have described (see also 
Figure 15-58). 


Short-Range Activation and Long-Range Inhibition Can Generate 
Complex Cellular Patterns 


Lateral inhibition mediated by the Notch pathway is not the only example of pat- 
tern generation through positive feedback: there are other ways in which, through 
the same basic principle, a system that starts off homogeneous and symmetrical 
can pattern itself spontaneously, even in the absence of an external morphogen. 
Positive feedback processes mediated by diffusible signal molecules can operate 
over broad arrays of cells to create many types of spatial patterns. Mechanisms of 
this sort are called reaction-diffusion systems. For example, a substance A (a short- 
range activator) may stimulate its own production in the cells that contain it and 
in their immediate neighbors, while also causing these cells to produce a signal 
I (a long-range inhibitor) that diffuses widely and inhibits the production of A in 
cells farther away. If the cells all start the same, but one group gains a slight
advantage by making a little more A than the rest, the asymmetry can be self-amplifying
(Figure 21-11). Such short-range activation combined with long-range inhibition
can account for the formation of clusters of cells within an initially homogeneous
tissue that become specialized as localized signaling centers.

Figure 21-11 Pattern generation by a reaction-diffusion system. From
(A) a uniform field of cells, (B) local positive feedback and (C) long-range
inhibition can (D) generate patterns within the initially uniform field. The
patterns can be complex, resembling the spots of a leopard (as shown) or
the stripes of a zebra; or they can be simple, with creation of a single cluster
of specialized cells that can, for example, go on to serve as the source of a
morphogen gradient.
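
Why the inhibitor must spread farther than the activator can be checked with a standard linear stability calculation for a two-component reaction-diffusion system. The sketch below is illustrative only: the four reaction coefficients describe how small deviations in A and I affect each other near the uniform state, and their values, like the function names, are hypothetical. The code asks whether the uniform state is stable when well mixed yet unstable to a spatial ripple of some wavelength.

```python
import math


def growth_rate(q, jac, d_act, d_inh):
    """Largest real part of the growth rate of a ripple with wavenumber q.

    `jac` = ((f_a, f_i), (g_a, g_i)) holds the linearized reaction terms for
    activator A and inhibitor I near the uniform state (hypothetical values).
    """
    (f_a, f_i), (g_a, g_i) = jac
    a = f_a - d_act * q * q          # diffusion damps each species at rate D * q^2
    d = g_i - d_inh * q * q
    trace, det = a + d, a * d - f_i * g_a
    disc = trace * trace / 4.0 - det
    return trace / 2.0 + (math.sqrt(disc) if disc > 0 else 0.0)


def can_pattern(jac, d_act, d_inh, q_max=5.0, samples=500):
    """True if the uniform state is stable when mixed but some ripple grows."""
    stable_when_mixed = growth_rate(0.0, jac, d_act, d_inh) < 0
    fastest = max(growth_rate(q_max * i / samples, jac, d_act, d_inh)
                  for i in range(1, samples + 1))
    return stable_when_mixed and fastest > 0


# A activates itself and I; I represses A and decays (illustrative numbers).
JAC = ((1.0, -1.0), (2.0, -1.5))
print(can_pattern(JAC, d_act=1.0, d_inh=20.0))   # inhibitor spreads much farther than A
print(can_pattern(JAC, d_act=1.0, d_inh=1.0))    # inhibitor spreads no farther than A
```

With these numbers, a pattern can emerge only in the first case, where the inhibitor diffuses much faster than the activator. When the two spread equally, every ripple decays and the field stays uniform.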


Asymmetric Cell Division Can Also Generate Diversity 


Cell diversification does not always depend on extracellular signals: in some 
cases, daughter cells are born different as a result of an asymmetric cell divi- 
sion, in which some important molecule or molecules are distributed unequally 
between the two daughters. This asymmetric inheritance ensures that the two 
daughter cells develop differently (Figure 21-12). Asymmetric division is a com- 
mon feature of early development, where the fertilized egg already has an internal 
pattern and cleavage of this large cell segregates different determinants into sep- 
arate blastomeres. We shall see that asymmetric division also plays a part in some 
later developmental processes. 


Initial Patterns Are Established in Small Fields of Cells and Refined 
by Sequential Induction as the Embryo Grows 


The signals that organize the spatial pattern of cells in an embryo generally act 
over short distances and govern relatively simple choices. A morphogen, for 
example, typically acts over a distance of less than 1 mm (an effective range for 
diffusion) and directs choices between several developmental options for the 
cells on which it acts. Yet the organs that eventually develop are much larger and 
more complex than this. 

The cell proliferation that follows the initial specification accounts for the size 
increase, while the refinement of the initial pattern is explained by a series of local 
inductions plus other interactions that add successive levels of detail on an ini- 
tially simple sketch. For example, as soon as two types of cells are present in a 
developing tissue, one of them can produce a signal that induces a subset of the 
neighboring cells to specialize in a third way. The third cell type can in turn signal 
back to the other two cell types nearby, generating a fourth and a fifth cell type, 
and so on (Figure 21-13). 

[Figure 21-11 panel labels: (A) uniform field of cells; (B) short-range activator 
(green) in one cell stimulates its own production; (C) long-range inhibitor (red) 
blocks production of activator by other cells in the neighborhood; (D) the 
resulting pattern.] 

Figure 21-12 Two ways of making sister cells different. 

This strategy for generating a progressively more complicated pattern is called 
sequential induction. It is chiefly through sequential inductions that the body 
plan of a developing animal, after being first roughed out in miniature, becomes 
elaborated with finer and finer details as development proceeds. 
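
Sequential induction can be illustrated with a toy model of a single row of cells. The cell-type labels A to E and the rule table below are hypothetical, chosen only to mirror the example in the text in which two cell types give rise to a third, which then signals back to generate a fourth and a fifth.

```python
def induce(cells, rules):
    """Apply one round of induction to a row of cells.

    `rules` maps (signaling type, responding type) to the new type that the
    responding cell adopts when a signaling cell is its immediate neighbor.
    All cells respond to the state of the row at the start of the round.
    """
    updated = list(cells)
    for i, responder in enumerate(cells):
        neighbors = cells[max(i - 1, 0):i] + cells[i + 1:i + 2]
        for signaler in neighbors:
            new_type = rules.get((signaler, responder))
            if new_type is not None:
                updated[i] = new_type
                break
    return updated


# Hypothetical rules: A induces a neighboring B cell to become C; C then
# signals back, turning a neighboring A cell into D and a neighboring B into E.
RULES = {("A", "B"): "C", ("C", "A"): "D", ("C", "B"): "E"}

field = list("AAAABBBB")   # only two cell types to begin with
history = [field]
for _ in range(2):
    field = induce(field, RULES)
    history.append(field)
```

After the first round the row contains three cell types, and after the second it contains five, although each individual rule is a simple local interaction between neighbors.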


Developmental Biology Provides Insights into Disease and Tissue 
Maintenance 


The rapid progress in understanding animal development has been one of the 
great success stories in biology over the last few decades, and it has important 
practical implications. Some 2 to 5% of all human babies are born with anatomical 
abnormalities, such as heart malformations, truncated limbs, cleft palate, or spina 
bifida. Advances in developmental biology help us understand how these defects 
arise, even if we cannot yet prevent or cure most of them. 

Less obvious, but even more important from a practical point of view, is that 
developmental biology provides insights into the workings of cells and tissues 
in the adult body. Developmental processes do not halt at birth; they continue 
throughout life, as tissues are maintained and repaired. The fundamental mecha- 
nisms of cell growth and division, cell-cell signaling, cell memory, cell adhesion, 
and cell movement are involved in adult tissue maintenance and repair—just as 
they are in embryo development. 

Embryos are simpler than adults, and they allow us to analyze such basic pro- 
cesses more easily. Studies of the early Drosophila embryo, for example, were cru- 
cial to the discovery of several conserved signaling pathways, including the Wnt, 
Hedgehog, and Notch pathways. They also provided the key to understanding the 
central role of these pathways in the maintenance of normal adult human tissues 
and in diseases such as cancer. 

In Chapter 22, we shall consider how other developmental mechanisms oper- 
ate in the normal adult body, especially in tissues that are continually renewed by 
means of stem cells—including the gut, skin, and the hematopoietic system. But 
now, we must look more closely at the way in which an early embryo generates its 
spatial pattern of specialized cells, beginning with the transformations that create 
the adult body plan. 


Summary 


Animal development is a self-assembly process, in which the cells of the embryo 
become different from one another and organize themselves into increasingly com- 
plex structures. The process begins with a single large cell—the fertilized egg. This 
cell cleaves to form many smaller cells, producing a blastula. The blastula under- 
goes gastrulation to generate the three germ layers of the embryo—ectoderm, meso- 
derm, and endoderm—consisting of cells determined for different fates. As develop- 
ment continues, the cells become more and more narrowly specialized according 
to their locations and their interactions with one another. Through cell memory, 
these cell-cell interactions, even though transient, can have lasting effects on each 
cell’s internal state. Thus, a succession of simple cues that a cell receives at different 
times can direct it along a complex developmental pathway. At each step, the cell 
becomes further restricted in the range of final states open to it. The process reaches 
its limit when the cell differentiates to form one of the specialized cell types of the 
adult body. 

Figure 21-13 Patterning by sequential induction. A series of inductive interactions 
can generate many types of cells, starting from only a few. 

Differences between developing cells arise in various ways and have to be prop- 
erly coordinated in space. In one common strategy, initially similar cells within 
a group become different by exposure to different levels of an inductive signal or 
morphogen emanating from a source outside the group. Neighboring cells can also 
become different by lateral inhibition, in which a cell signals to its neighbors not to 
follow the same fate. These cell-cell interactions are mediated by a small number of 
highly conserved signaling pathways, which are used repeatedly in different organ- 
isms and at different times during development. Not all cell diversification arises 
by cell-cell interactions, however: daughter cells can be born different as a result of 
asymmetric cell division. 

Regulators of transcription and chromatin structure bind to regulatory DNA 
and determine the fate of each cell. Differences of body plan seem to arise to a large 
extent from differences in the regulatory DNA associated with each gene. This DNA 
has a central role in defining the sequential program of development, calling genes 
into action at specific times and places according to the pattern of gene expression 
that was present in each cell at the previous developmental stage. 

Development has been most thoroughly studied in a handful of model organ- 
isms. But most of the genes and mechanisms thereby identified are used in all ani- 
mals and repeatedly at different stages of development. Thus, insights from worms, 
flies, fish, frogs, and mice deeply inform our understanding of embryology, birth 
defects, and adult tissue maintenance in humans. 


MECHANISMS OF PATTERN FORMATION 


A developing multicellular organism has to create a pattern in fields of cells where 
there was little or none before. Some of the early microscopists imagined the 
entire shape and structure of the human body to be already present in the sperm 
as a “homunculus,” a miniature human; after fertilization, the homunculus would 
simply grow and generate a full-sized human. We now know that this view is 
incorrect and that development is a progression from simple to complex, through 
a gradual refinement of an animal’s anatomy. To see how the whole sequence of 
events of spatial patterning and cell determination is set in train, we must return 
to the egg and the early embryo. 


Different Animals Use Different Mechanisms to Establish Their 
Primary Axes of Polarization 


Surprisingly, the earliest steps of animal development are among the most vari- 
able, even within a phylum. A frog, a chicken, and a mammal, for example, even 
though they develop in similar ways later, make eggs that differ radically in size 
and structure, and they begin their development with different sequences of cell 
divisions and cell specializations. Gastrulation occurs in all animal embryos, but 
the details of its timing, of the associated pattern of cell movements, and of the 
shape and size of the embryo as gastrulation proceeds are highly variable. Like- 
wise, there is great variation in the time and manner in which the primary axes of 
the body become marked out. However, this polarization of the embryo usually 
becomes discernible very early, before gastrulation begins: it is the first step of 
spatial patterning. 

Three axes generally have to be established. The animal-vegetal (A-V) axis, 
in most species, defines which parts are to become internal (through the move- 
ments of gastrulation) and which are to remain external. (The bizarre name dates 
from a century ago and has nothing to do with vegetables.) The anteroposterior 
(A-P) axis specifies the locations of future head and tail. The dorsoventral (D-V) 
axis specifies the future back and belly. 

Figure 21-14 The frog egg and its asymmetries. (A) Side view of a Xenopus egg photographed just before fertilization. 
(B) The asymmetric distribution of molecules inside the egg, and how this changes following fertilization so as to define a 
dorsoventral as well as an animal-vegetal asymmetry. Fertilization, through a reorganization of the microtubule cytoskeleton, 
triggers a rotation of the egg cortex (a layer a few µm deep) through about 30° relative to the core of the egg; the direction of 
rotation is determined by the site of sperm entry. Some components are carried still further to the future dorsal side by active 
transport along microtubules. The resulting dorsal concentration of Wnt11 mRNA leads to dorsal production of the Wnt11 signal 
protein and defines the dorsoventral polarity of the future embryo. Vegetally localized VegT defines the vegetal source of signals 
that will induce endoderm and mesoderm. (A, courtesy of Tony Mills.) 


At one extreme, the egg is spherically symmetrical, and the axes only become 
defined during embryogenesis. The mouse comes close to being an example, with 
little obvious sign of polarity in the egg. Correspondingly, the blastomeres pro- 
duced by the first few cell divisions seem to be all alike and are remarkably adapt- 
able. If the early mouse embryo is split in two, a pair of identical twins can be pro- 
duced—two complete, normal individuals from a single cell. Similarly, if one of 
the cells in a two-cell mouse embryo is destroyed by pricking it with a needle and 
the resulting “half-embryo” is placed in the uterus of a foster mother to develop, 
in many cases a perfectly normal mouse will emerge. 

At the opposite extreme, the structure of the egg defines the future axes of the 
body. This is the case for most species, including insects such as Drosophila, as we 
shall see shortly. Many other organisms lie between the two extremes. The egg of 
the frog Xenopus, for example, has a clearly defined A-V axis even before fertiliza- 
tion: the nucleus near the top defines the animal pole, while the mass of yolk (the 
embryo’s food supply, destined to be incorporated in the gut) toward the bottom 
defines the vegetal pole. Several types of mRNA molecules are already localized 
in the vegetal cytoplasm of the egg, where they produce their protein products. 
After fertilization, these mRNAs and proteins act in and on the cells in the lower 
and middle part of the embryo, giving the cells there specialized characters, both 
by direct effects and by stimulating the production of secreted signal proteins. For 
example, mRNA encoding the transcription regulator VegT is deposited at the 
vegetal pole during oogenesis. After fertilization, this mRNA is translated, and the 
resulting VegT protein activates a set of genes that code for signal proteins that 
induce mesoderm and endoderm, as discussed later. 

The D-V axis of the Xenopus embryo, by contrast, is defined through the act of 
fertilization. Following entry of the sperm, the outer cortex of the egg cytoplasm 
rotates relative to the central core of the egg, so that the animal pole of the cor- 
tex becomes slightly shifted to one side (Figure 21-14). Treatments that block 
the rotation allow cleavage to occur normally but produce an embryo with a cen- 
tral gut and no dorsal structures or D-V asymmetry. Thus, this cortical rotation is 
required to define the D-V axis of the future body by creating the D-V axis of the 
egg. 

It is the site of sperm entry that biases the direction of the cortical rotation in 
Xenopus, perhaps through the centrosome that the sperm brings into the egg— 
inasmuch as the rotation is associated with a reorganization of the microtubules 
nucleated from the centrosome in the egg cytoplasm. The reorganization leads to 
a microtubule-based transport of several cytoplasmic components, including the 
mRNA coding for Wnt11, a member of the Wnt family of signal proteins, moving 
it toward the future dorsal side (see Figure 21-14). This mRNA is soon translated 
and the Wnt11 protein secreted from cells that form in that region of the embryo 
activates the Wnt signaling pathway (see Figure 15-60). This activation is crucial 
for triggering the cascade of subsequent events that will organize the dorsoventral 
axis of the body. (The A-P axis of the embryo will only become clear later, in the 
process of gastrulation.) 

Although different animal species use a variety of different mechanisms to 
specify their axes, the outcome has been relatively well conserved in evolution: 
head is distinguished from tail, back from belly, and gut from skin. It seems that it 
does not much matter what tricks the embryo uses to break the initial symmetry 
and set up this basic body plan. 


Studies in Drosophila Have Revealed the Genetic Control 
Mechanisms Underlying Development 


It is the fly Drosophila, more than any other organism, that has provided the key to 
our present understanding of how genes govern development. Decades of genetic 
study culminated in a large-scale genetic screen, focusing especially on the early 
embryo and searching for mutations that disrupt its pattern. This revealed that 
the key developmental genes fall into a relatively small set of functional classes 
defined by their mutant phenotypes. The discovery of these genes and the subse- 
quent analysis of their functions was a famous tour de force and had a revolution- 
ary impact on all of developmental biology, earning its discoverers a Nobel Prize. 
Some parts of the developmental machinery revealed in this way are conserved 
between flies and vertebrates, some parts not. But the logic of the experimental 
approach and the general strategies of genetic control that it revealed have trans- 
formed our understanding of multicellular development in general. 

To understand how the early developmental machinery operates in Drosoph- 
ila, it is important to note a peculiarity of fly development. Like the eggs of other 
insects, but unlike most vertebrates, the Drosophila egg—shaped like a cucum- 
ber—begins its development with an extraordinarily rapid series of nuclear divi- 
sions without cell division, producing multiple nuclei in a common cytoplasm—a 
syncytium. The nuclei then migrate to the cell cortex, forming a structure called 
the syncytial blastoderm. After about 6000 nuclei have been produced, the plasma 
membrane folds inward between them and partitions them into separate cells, 
converting the syncytial blastoderm into the cellular blastoderm (Figure 21-15). 

We shall see that the initial patterning of the Drosophila embryo depends on 
signals that diffuse through the cytoplasm at the syncytial stage and exert their 
actions on genes in the rapidly dividing nuclei, before the partitioning of the egg 
into separate cells. Here, there is no need for the usual forms of cell-cell signaling; 
neighboring regions of the syncytial blastoderm can communicate by means of 
transcription regulatory proteins that move through the cytoplasm of the giant 
multinuclear cell. 


Egg-Polarity Genes Encode Macromolecules Deposited in the Egg 
to Organize the Axes of the Early Drosophila Embryo 


As in most insects, the main axes of the future body of Drosophila are defined 
before fertilization by a complex exchange of signals between the developing egg, 
or oocyte, and the follicle cells that surround it in the ovary. In the stages before 
fertilization, the anteroposterior and dorsoventral axes of the future embryo 
become defined by four systems of egg-polarity genes that create landmarks 
(either mRNA or protein) in the developing oocyte. Following fertilization, each 
landmark serves as a beacon, providing a signal that organizes the developmental 
process in its neighborhood. 

Figure 21-15 Development of the Drosophila egg from fertilization to the 
cellular blastoderm stage. 

The nature of the genes emerged from studies of mutants in which the pat- 
terning of the embryo was altered. One class of mutations gave embryos with dis- 
rupted polarity—for example, tail-end structures at both ends of the body, with 
no head-end structures. This class of mutations identified the set of egg-polarity 
genes. The egg-polarity gene responsible for the signal that organizes the ante- 
rior end of the embryo is called Bicoid. A deposit of Bicoid mRNA molecules is 
localized, before fertilization, at the anterior end of the egg. Upon fertilization, 
the mRNA is translated to produce Bicoid protein. This protein is an intracellular 
morphogen and transcription regulator that diffuses away from its source to form 
a concentration gradient within the syncytial cytoplasm, with its maximum at the 
head end of the embryo (Figure 21-16). The different concentrations of Bicoid 
along the A-P axis help determine different cell fates by regulating the transcrip- 
tion of genes in the nuclei of the syncytial blastoderm (discussed in Chapter 7). 
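
The shape of such a gradient can be approximated with a standard textbook model that is an assumption here rather than something stated above: a morphogen produced at one end that diffuses and is degraded at a uniform rate settles into an exponentially decaying profile, C(x) = C0 * exp(-x / L), where the decay length L = sqrt(D / k) depends on the diffusion coefficient D and the degradation rate k. The function names and all numerical values in this sketch are illustrative, not measurements of Bicoid.

```python
import math


def decay_length(diffusion_coefficient, degradation_rate):
    """Distance over which the steady-state gradient falls by a factor of e."""
    return math.sqrt(diffusion_coefficient / degradation_rate)


def gradient_profile(source_level, length_scale, positions):
    """Steady-state concentration at each distance from the anterior source."""
    return [source_level * math.exp(-x / length_scale) for x in positions]


# Illustrative values only: positions are fractions of egg length,
# from 0.0 (anterior pole) to 1.0 (posterior pole).
positions = [i / 10 for i in range(11)]
profile = gradient_profile(1.0, 0.2, positions)
```

With these assumed values the concentration is highest at the anterior pole and falls steadily toward the posterior, so that each position along the A-P axis corresponds to a distinct concentration that nuclei could in principle read.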

Of the three other egg-polarity gene systems, two contribute to patterning the 
syncytial nuclei along the A-P axis and one to patterning them along the D-V axis. 
Together with the Bicoid group of genes, and acting in a broadly similar way, their 
gene products mark out three fundamental partitions of body regions—head ver- 
sus rear, dorsal versus ventral, and endoderm versus mesoderm and ectoderm— 
as well as a fourth partition, no less fundamental to the body plan of animals: the 
distinction between germ cells and somatic cells (Figure 21-17). 

Figure 21-16 The Bicoid protein gradient. (A) Bicoid mRNA is deposited at 
the anterior pole during oogenesis. (B) Local translation followed by diffusion 
generates the Bicoid protein gradient. (C) Absence of the Bicoid protein gradient 
in embryos from Bicoid homozygous mutant mothers. (A and B, courtesy of 
Stephen Small.) 

The egg-polarity genes have a further special feature: they are all maternal-effect 
genes, in that it is the mother’s genome rather than the zygote’s genome that 
is critical. For example, a fly whose chromosomes are mutant in both copies of the 
Bicoid gene but who is born from a mother carrying one normal copy of Bicoid 
develops perfectly normally, without any defects in the head pattern. However, if 
that offspring is a female, she cannot deposit any functional Bicoid mRNA into her 
own eggs, which will therefore develop into headless embryos, regardless of the 
father’s genotype. 

The egg-polarity genes act first in a hierarchy of gene systems that define a 
progressively more detailed pattern of body parts. In the next few pages, we begin 
with the molecular mechanisms that pattern the developing Drosophila embryo 
and larva along the A-P axis, before considering the patterning along the D-V axis. 
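
The maternal-effect logic described above for Bicoid can be stated as a very small function. This is only a sketch of the reasoning: the function names and the use of '+' and '-' for functional and mutant copies of the gene are notational assumptions.

```python
def mother_supplies_bicoid(mother_genotype):
    """True if the mother carries at least one functional ('+') copy of Bicoid."""
    return "+" in mother_genotype


def head_pattern(mother_genotype, embryo_genotype):
    """Predict the embryo's head pattern for a maternal-effect gene.

    The embryo's own genotype is accepted but deliberately ignored: only the
    mRNA that the mother deposited in the egg matters for this phenotype.
    """
    if mother_supplies_bicoid(mother_genotype):
        return "normal"
    return "headless"


# A homozygous mutant embryo from a heterozygous mother develops normally...
first_generation = head_pattern(("+", "-"), ("-", "-"))
# ...but the eggs of that homozygous mutant daughter give headless embryos,
# even if the father contributes a functional copy.
second_generation = head_pattern(("-", "-"), ("+", "-"))
```

The phenotype thus appears one generation later than the genotype that causes it, which is the hallmark of a maternal-effect gene.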

Figure 21-17 The organization of the four egg-polarity gradient systems in Drosophila. Nanos is a translational repressor that governs the 
formation of the abdomen. Localized Nanos mRNA is also incorporated into the germ cells as they form at the posterior of the embryo, and Nanos 
protein is necessary for germ-line development. Bicoid protein is a transcriptional activator that determines the head and thoracic regions. Toll and 
Torso are receptor proteins that are distributed all over the membrane but are activated only at the sites indicated by the coloring, through localized 
exposure to the extracellular ligands Spaetzle (the ligand for Toll) and Trunk (the ligand for Torso). Toll activity determines the mesoderm and Torso 
activity determines the formation of terminal structures. 

Figure 21-18 The origins of the Drosophila body segments. (A) At 
3 hours, the embryo (shown in side view) is at the blastoderm stage and 
no segmentation is visible, although a fate map can be drawn showing the 
future segmented regions (color). (B) At 10 hours, all the segments are 
clearly defined (T1: first thoracic segment; A1: first abdominal segment). 
See Movie 21.3. (C) The segments of the Drosophila larva and their 
correspondence with regions in the embryo. (D) The segments of the 
Drosophila adult and their correspondence with regions in the embryo. 


Three Groups of Genes Control Drosophila Segmentation Along 
the A-P Axis 


The body of an insect is divided along its A-P axis into a series of segments. The 
segments are repetitions of a theme with variations: each segment forms highly 
specialized structures, but all built according to a similar fundamental plan (Fig- 
ure 21-18). The gradients of transcription regulators set up along the A-P axis in 
the early embryo by the egg-polarity genes are the prelude to creation of the seg- 
ments. These regulators initiate the orderly transcription of segmentation genes, 
which refine the pattern of gene expression to define the boundaries and ground 
plan of the individual segments. Segmentation genes are expressed by subsets of 
cells in the embryo, and their products are the first components that the embryo’s 
own genome contributes to embryonic development; they are therefore called 
zygotic-effect genes, to distinguish them from the earlier-acting maternal-effect 
genes. Mutations in segmentation genes can alter either the number of segments 
or their basic internal organization. 

The segmentation genes fall into three groups according to their mutant phe- 
notypes (Figure 21-19). It is convenient to think of the three groups as acting in 
sequence, although in reality their functions overlap in time. First to be expressed 
is a set of at least six gap genes, whose products mark out coarse A-P subdivisions 
of the embryo. Mutations in a gap gene eliminate one or more groups of adjacent 
segments: in the mutant Krüppel, for example, the larva lacks eight segments. Next 
comes a set of eight pair-rule genes. Mutations in these genes cause a series of 
deletions affecting alternate segments, leaving the embryo with only half as many 
segments as usual; although all the mutants display this two-segment periodicity, 
they differ in the precise pattern. Finally, there are at least 10 segment-polarity 
genes, in which mutations produce a normal number of segments but with a part 
of each segment deleted and replaced by a mirror-image duplicate of all or part of 
the rest of the segment. 
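
The three classes of mutant phenotype can be caricatured as three different operations on a list of segments. The sketch below is schematic only: the segment names S1 to S14, the four-part pattern "abcd" within each segment, and the position of the deleted block in the gap mutant are hypothetical, and real pair-rule and segment-polarity phenotypes differ from gene to gene.

```python
def gap_mutant(segments, first, count):
    """Delete one contiguous block of `count` segments starting at index `first`."""
    return segments[:first] + segments[first + count:]


def pair_rule_mutant(segments, missing_parity=0):
    """Delete alternate segments (those whose index has the given parity)."""
    return [s for i, s in enumerate(segments) if i % 2 != missing_parity]


def segment_polarity_mutant(patterns):
    """Keep every segment, but replace the posterior half of its internal
    pattern with a mirror image of the anterior half."""
    mutated = []
    for pattern in patterns:
        front = pattern[:len(pattern) // 2]
        mutated.append(front + front[::-1])
    return mutated


names = ["S%d" % i for i in range(1, 15)]     # 14 hypothetical segments
gap_larva = gap_mutant(names, 3, 8)           # a block of eight segments missing
pair_rule_larva = pair_rule_mutant(names)     # every other segment missing
polarity_larva = segment_polarity_mutant(["abcd"] * 14)
```

In this caricature the gap mutant loses a block of neighboring segments, the pair-rule mutant retains half the usual number, and the segment-polarity mutant keeps all 14 segments but with the internal pattern of each changed from "abcd" to the mirror-symmetric "abba".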

In parallel with the segmentation process, a further set of genes—the homeotic 
selector, or Hox, genes—serves to define and preserve the differences between one 
segment and the next, as we describe shortly. 

The phenotypes of the various segmentation mutants suggest that the seg- 
mentation genes form a coordinated system that subdivides the embryo progres- 
sively into smaller and smaller domains along the A-P axis, each distinguished 
by a different pattern of gene expression. Molecular genetics has helped to reveal 
how this system works. 


A Hierarchy of Gene Regulatory Interactions Subdivides the 
Drosophila Embryo 


Like Bicoid, most of the segmentation genes encode transcription regulator pro- 
teins. Their control by the egg-polarity genes and their actions on one another and 
on still other genes can be deciphered by comparing gene expression in normal 
and mutant embryos. By using appropriate probes to detect RNA transcripts or 
their protein products, one can observe genes switch on and off in changing pat- 
terns. By comparing these patterns in different mutants, one can begin to discern 
the logic of the entire gene control system. 

The products of the egg-polarity genes provide the global positional signals in 
the early embryo (see Figure 21-17). The Bicoid protein, as we have seen, acts as 

Figure 21-19 Examples of the phenotypes of mutations affecting egg-polarity genes and the three types of 
segmentation genes. In each case, the areas shaded in green on the normal larva (left) are deleted in the mutant or are 
replaced by mirror-image duplicates of the unaffected regions. (Modified from C. Nüsslein-Volhard and E. Wieschaus, Nature 
287:795-801, 1980. With permission from Macmillan Publishers Ltd.) 


a morphogen and activates different sets of genes at different positions along the 
A-P axis: some gap genes are only activated in regions with high levels of Bicoid, 
others only where levels of Bicoid are lower. After the gap gene products refine 
their positions by mutual repression, they provide a second tier of positional sig- 
nals that act more locally to regulate finer details of patterning. Gap genes act by 
controlling the expression of yet other genes, including the pair-rule genes. The 
pair-rule genes, in turn, collaborate with one another and with the gap genes to 
set up a regular, periodic pattern of expression of the segment-polarity genes, 
which collaborate with one another to define the internal pattern of each individ- 
ual segment (Figure 21-20). 
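
The first tier of this hierarchy, in which different Bicoid levels switch on different gap genes, can be sketched as a simple threshold readout. The two gene names `gap_high` and `gap_low` and both threshold values are hypothetical, and the sketch omits the mutual repression among gap genes that sharpens their boundaries.

```python
def gap_gene_response(bicoid_level, high_threshold=0.5, low_threshold=0.1):
    """Return which hypothetical gap gene a nucleus switches on.

    `gap_high` is activated only where Bicoid is abundant; `gap_low` is
    activated only where Bicoid is present at a lower level.
    """
    if bicoid_level >= high_threshold:
        return "gap_high"
    if bicoid_level >= low_threshold:
        return "gap_low"
    return "none"


# Bicoid levels falling from anterior to posterior (illustrative numbers).
levels = [1.0, 0.7, 0.4, 0.2, 0.05]
domains = [gap_gene_response(level) for level in levels]
```

A smooth gradient is thereby converted into a small number of discrete domains, each of which can then serve as a more local positional signal for the next tier of genes.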

The initial steps in creation of the segmental pattern occur before cellulariza- 
tion of the syncytial blastoderm and are governed by the combinatorial effects of 
transcription regulators, as discussed in detail in Chapter 7 for the regulation of 
the expression of the pair-rule gene Even-skipped (see pp. 394-396). After cellular- 
ization, the segment-polarity genes further subdivide each segment into smaller 
domains. A large subset of the segment-polarity genes codes for components of 
two signaling pathways—the Wnt pathway and the Hedgehog pathway, including 
the secreted signal proteins Wingless (the first-named member of the Wnt fam- 
ily) and Hedgehog. (The Hedgehog pathway was first discovered through study 
of Drosophila segmentation, and it takes its name from the prickly appearance 
of the surface of the Hedgehog mutant embryo.) Wingless and Hedgehog are syn- 
thesized in different bands of cells that serve as signaling centers within each seg- 
ment. The two proteins mutually maintain each other’s expression, while regulat- 
ing the expression of genes such as Engrailed in neighboring cells (Figure 21-21). 
In such a manner, a series of sequential inductions creates a fine-grained pattern 
of gene expression within each segment. 
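
The mutual dependence of the Wingless and Hedgehog signaling centers can be sketched as a two-cell Boolean loop: Engrailed drives Hedgehog in one cell, Hedgehog maintains Wingless in the neighboring cell, and Wingless in turn maintains Engrailed. The function names and the on/off simplification are assumptions made for illustration; real expression levels are graded and the update is not synchronous.

```python
def step(engrailed_on, wingless_on, wingless_gene_functional=True):
    """Advance two neighboring cells by one round of signaling.

    One cell expresses Engrailed (and therefore Hedgehog); its neighbor
    expresses Wingless. Returns the new (engrailed_on, wingless_on) state.
    """
    hedgehog_signal = engrailed_on      # Engrailed drives Hedgehog expression
    wingless_signal = wingless_on and wingless_gene_functional
    next_engrailed = wingless_signal    # Wingless maintains Engrailed next door
    next_wingless = hedgehog_signal     # Hedgehog maintains Wingless next door
    return next_engrailed, next_wingless


def run(rounds, wingless_gene_functional=True):
    """Start with both genes on (set up by earlier cues) and iterate the loop."""
    state = (True, True)
    for _ in range(rounds):
        state = step(*state, wingless_gene_functional=wingless_gene_functional)
    return state


wild_type = run(5)
wingless_mutant = run(5, wingless_gene_functional=False)
```

In this toy loop the wild-type pair of cells sustains expression of both genes indefinitely, whereas removing Wingless function causes Engrailed, and then Wingless expression itself, to be lost within two rounds.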


Egg-Polarity, Gap, and Pair-Rule Genes Create a Transient Pattern 
That Is Remembered by Segment-Polarity and Hox Genes 


The gap genes and pair-rule genes are activated within the first few hours after fer- 
tilization. Their mRNA products initially appear in patterns that only approximate 
the final picture; then, within a short time, this fuzzy initial pattern resolves itself 
into a regular, crisply defined system of stripes. But this pattern itself is unstable 
and transient: as the embryo proceeds through gastrulation and beyond, the 
pattern disintegrates. The genes’ actions, however, have passed on an enduring 
memory of their patterns of expression by inducing the expression of certain 
segment-polarity genes along with Hox genes (discussed shortly). After a period of 
pattern refinement mediated by cell-cell interactions, the expression patterns of 
these new groups of patterning genes are stabilized to provide positional labels that 
serve to maintain the segmental organization of the larva and adult fly. 

The segment-polarity gene Engrailed provides a good example. Its RNA transcripts 
form a series of 14 bands in the cellular blastoderm, each approximately one cell 
wide. These stripes lie immediately posterior to similar stripes of expression of 
another segment-polarity gene, Wingless. As the cells in the developing embryo 
continue to grow, divide, and move, a mutually reinforcing signal between the 
Wingless-expressing cells and the Engrailed-expressing cells maintains narrow 
stripes of their expression (see Figure 21-21). After three cell cycles, 
newly expressed regulators stabilize an Engrailed expression pattern that will last 
throughout the life of the fly, long after the signals that induced and refined it have 
disappeared. The segment borders will form at the posterior edge of each such 
Engrailed stripe (Figure 21-22). 

Figure 21-20 An example of the regulatory hierarchy of egg-polarity, 
segmentation, and Hox genes. As discussed in the text, there are three 
groups of segmentation genes. The photographs show mRNA expression 
patterns of representative examples of genes of each type. (Courtesy of 
Stephen Small.) 

Figure 21-21 Mutual maintenance of Hedgehog and Wingless expression. 
Engrailed is a transcription regulator (blue) that drives the expression of 
Hedgehog. Hedgehog encodes a secreted protein (red) that activates its 
signaling pathway in neighboring cells and thereby drives them to express 
the Wingless gene. In turn, Wingless encodes a secreted protein (green) that 
acts back on neighbors of the Wingless-expressing cell to maintain their 
expression of Engrailed and Hedgehog. As indicated, the same control loop 
repeats along the A-P axis of the fly. (Based on S. Dinardo et al., Curr. 
Opin. Genet. Dev. 4:529-534, 1994.) 

In addition to regulating the segment-polarity genes, the products of pair-rule 
genes collaborate with those of gap genes to induce the precisely localized acti- 
vation of a further set of genes—originally called homeotic selector genes and now 
often called Hox genes, for reasons that will become clear shortly. It is the Hox 
genes that permanently distinguish one segment from another. In the next sec- 
tion, we examine these important genes in detail and consider their role in cell 
memory; we shall see that this role is critical in a wide range of animals, including 
ourselves. 


Hox Genes Permanently Pattern the A-P Axis 


As animal development proceeds, the body becomes more and more complex. 
But again and again, in every species and at every level of organization, we find 
that complex structures are made by repeating a few basic themes, with varia- 
tions. Thus, a limited number of basic differentiated cell types, such as muscle 
cells or fibroblasts, recur with subtle individual variations in different sites. These 
cell types are organized into a limited variety of tissue types, such as muscle 
or tendon, which again are repeated with subtle variations in different regions 
of the body. From the various tissues, organs such as teeth or digits are built— 
molars and incisors, fingers and thumbs and toes—a few basic kinds of structure, 
repeated with variations. 

Wherever we find this phenomenon of modulated repetition, we can break 
down the developmental biologist’s problem into two kinds of questions: what is 
the basic construction mechanism common to all the objects of the given class, 
and how is this mechanism modified to give the observed variations in differ- 
ent animals? The segments of the insect body provide a good example. We have 
thus far sketched the way in which the rudiment of a single body segment is con- 
structed and how cells within each segment become different from one another. 
We now consider how one segment becomes determined, or specified, to be dif- 
ferent from another. 

The first glimpse of the answer to this problem came over 80 years ago, with 
the discovery of a set of mutations in Drosophila that cause bizarre disturbances 
in the organization of the adult fly. In the Antennapedia mutant, for example, legs 
sprout from the head in place of antennae, whereas in the Bithorax mutant, por- 
tions of an extra pair of wings appear where normally there should be the much 
smaller appendages called halteres (Figure 21-23). These mutations transform 
parts of the body into structures appropriate to other positions, and they are 
called homeotic mutations (from the Greek “homoios,” meaning similar) because 
the transformation is between structures of a recognizably similar general type, 
changing one kind of limb, or one kind of segment, into another. It was eventu- 
ally discovered that a whole set of genes, the homeotic selector genes, or Hox 
genes, serve to permanently specify the A-P characters of the whole set of animal 
segments. These genes are all related to one another as members of a multigene 
family. 

There are eight Hox genes in the fly, and they all lie in one or the other of two 
gene clusters known as the Bithorax complex and the Antennapedia complex. 
Figure 21-22 The pattern of expression
of Engrailed, a segment-polarity gene. 
The Engrailed pattern is shown in a 10- 
hour embryo and an adult (whose wings 
have been removed in this preparation). 
The pattern is revealed by constructing 

a strain of Drosophila containing the 
control sequences of the Engrailed gene 
coupled to the coding sequence of the 
reporter LacZ, whose product is detected 
histochemically through the brown product 
generated by immunohistochemistry 
against LacZ (10-hour embryo) or through 
the blue product generated by a reaction 
that LacZ catalyzes (adult). Note that the 
Engrailed pattern, once established, is 
preserved throughout the animal's life.
(Courtesy of Tom Kornberg.) 


Figure 21-23 Homeotic mutations. 
Ultrabithorax, or Ubx, is one of three 
genes in the Bithorax gene complex (a 
Hox gene cluster). Ubx is responsible for 
all of the differences between the second 
and third thoracic segments. (A, B) Ubx 
loss-of-function mutations transform the 
haltere-bearing segment (A) into a wing- 
bearing segment, resulting in four-winged 
flies (B). (C) Ubx gain-of-function in the 
second thoracic segment transforms 

this wing-bearing segment into a haltere- 
bearing segment, resulting in wingless flies. 
(Courtesy of Richard Mann.) 
The genes in the Bithorax complex control the differences among the abdominal
and thoracic segments of the body, while those in the Antennapedia complex con- 
trol the differences among thoracic and head segments. Comparisons with other 
species show that the same genes are present in essentially all animals, includ- 
ing humans. These comparisons also reveal that the Antennapedia and Bithorax 
complexes are the two halves of a single entity, called the Hox complex, that has 
become split in the course of the fly’s evolution, and whose members operate in 
a coordinated way to exert their control over the head-to-tail pattern of the body. 

The products of the Hox genes, the Hox proteins, are transcription regulators, 
all of which possess a highly conserved, 60-amino-acid-long DNA-binding home- 
odomain (see p. 376). The corresponding motif in the DNA sequence is called a 
“homeobox,” from which, by abbreviation, the Hox complex takes its name. There
are many homeobox-containing genes, but only those located in a Hox complex 
are Hox genes. 


Hox Proteins Give Each Segment Its Individuality 


The Hox proteins can be viewed as molecular address labels possessed by the cells 
of each segment: these labels give the cells in each region a positional value—that 
is, an intrinsic character that differs according to a cell’s location. If the address 
labels in a developing Drosophila segment are changed, the segment behaves 
as though it were located somewhere else; if all the Hox genes in an embryo are 
deleted, the body segments in the larva will all be alike. 

To a first approximation, each Hox gene is normally expressed in those regions 
that develop abnormally when that gene is mutated or absent. How does each 
Hox protein give a segment its permanent identity? All the Hox proteins are sim- 
ilar in their DNA-binding regions, but they are very different in the regions that 
interact with the other proteins with which the Hox proteins form transcriptional 
regulatory complexes. The different protein partners act together with the Hox 
proteins to dictate which DNA binding sites will be recognized, as well as whether 
the effect on transcription at those sites will be activation or repression. Acting 
in this way, the Hox proteins modulate the actions of many other transcription 
regulators. Hundreds of genes are under this type of Hox-modulated control, 
including genes for cell-cell signaling, transcriptional regulation, cell polarity, 
cell adhesion, cytoskeletal function, cell growth, and cell death, all conspiring (in 
ways that are not yet understood) to give each segment its distinctive Hox-depen- 
dent character. 


Hox Genes Are Expressed According to Their Order in the 
Hox Complex 


How, then, is the expression of the Hox genes themselves regulated? The coding 
sequences of the eight Hox genes in the Antennapedia and Bithorax complexes in 
Drosophila are interspersed amid a much larger quantity of regulatory DNA. This 
DNA includes binding sites for the products of the egg-polarity and segmentation 
genes, thereby serving as an interpreter of the multiple items of spatial informa- 
tion supplied to it by all these transcription regulators. The net result is that the 
particular set of Hox genes transcribed is appropriate for each location along the 
A-P body axis. 

The pattern of Hox gene expression exhibits a remarkable regularity that sug- 
gests an additional form of control. The sequence in which the genes are ordered 
along the chromosome, in both the Antennapedia and the Bithorax complexes, 
corresponds almost exactly to the order in which they are expressed along the 
A-P axis of the body (Figure 21-24). This hints at some process of gene activation, 
perhaps dependent on chromatin structures that propagate along the Hox com- 
plexes, switching on one Hox gene after another according to their order along the 
chromosome. The most “posterior” of the Hox genes that are expressed in a cell 
generally dominates, driving down expression and activity of the “anterior” genes 
and dictating the character of the segment. The gene regulatory mechanisms 
underlying these phenomena are still not well understood, but their conse-
quences are profound. We shall see that the serial organization of gene expression 
in the Hox complex is a fundamental feature that has been highly conserved in the 
course of animal evolution. 
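
The rule that the most posterior Hox gene expressed in a cell generally dominates can be made concrete with a small sketch. The Python below is only an illustration of that rule, not a model of the underlying mechanism, which remains poorly understood. The gene order follows the chromosomal order shown in Figure 21-32; the function name dominant_hox is a hypothetical helper introduced here.

```python
# The eight Drosophila Hox genes in anterior-to-posterior order
# (chromosomal order as labeled in Figure 21-32).
HOX_ORDER = ["Lab", "Pb", "Dfd", "Scr", "Antp", "Ubx", "AbdA", "AbdB"]


def dominant_hox(expressed):
    """Return the most posterior Hox gene among those expressed, or None.

    `expressed` is any collection of gene names. This hypothetical helper
    encodes only the "posterior dominance" rule of thumb from the text.
    """
    ranked = [gene for gene in HOX_ORDER if gene in expressed]
    return ranked[-1] if ranked else None


if __name__ == "__main__":
    # A cell expressing both Antp and Ubx takes its character from Ubx.
    print(dominant_hox({"Antp", "Ubx"}))
    # Without Ubx, the same cell falls back on Antp.
    print(dominant_hox({"Antp"}))
```

In this toy picture, removing Ubx from a cell that also expresses Antp leaves Antp in charge, which is consistent in spirit with the transformation of the haltere-bearing segment into a wing-bearing one in Ubx loss-of-function mutants (see Figure 21-23).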


Trithorax and Polycomb Group Proteins Enable the Hox 
Complexes to Maintain a Permanent Record of Positional 
Information 


The spatial pattern of expression of the genes in the Hox complex is set up by sig- 
nals acting early in development, but the consequences are long lasting. Although 
the pattern of expression undergoes complex adjustments as development pro- 
ceeds, the Hox complexes serve to stamp each cell and its progeny with a perma- 
nent record of the A-P position that the cell occupied in the early embryo. In this 
way, the cells of each segment are equipped with a long-term memory of their 
location along the A-P axis of the body. This memory trace is somehow imprinted 
on the Hox complexes, and it governs the segment-specific identity not only of the 
larval segments, but also of the structures of the adult fly. 

The molecular mechanism of this memory of positional information relies on 
two types of regulation. One is from the Hox genes themselves: many of the Hox 
proteins autoactivate the transcription of their own genes, thereby helping to keep 
the genes on indefinitely. Another crucial input is from two large, complemen- 
tary sets of proteins, called the Trithorax group and the Polycomb group, which 
stamp the chromatin of the Hox complex with a heritable record of its embryonic 
state of activation or repression. These are key general regulators of chromatin 
structure that can be shown to be critical for cell memory: if genes of the Trithorax 
or Polycomb group are defective, the pattern of expression of the Hox genes is set 
up correctly at first, but it is not correctly maintained as the embryo grows older. 

The two sets of regulators act in opposite ways. Trithorax group proteins are 
needed to maintain the transcription of Hox genes in cells where transcription has 
already been switched on. In contrast, Polycomb group proteins form stable com- 
plexes that bind to the chromatin of the Hox complex and maintain the repressed 
state in cells where Hox genes have not been activated at the critical time (Figure 
21-25). How such changes in chromatin can store developmental cell memory is 
discussed in Chapters 4 and 7. 
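
The opposing roles of the two groups can be summarized as a simple truth table. The sketch below is a deliberately crude Boolean abstraction, assuming that loss of a maintenance group simply reverses the state it normally protects; real embryos show more gradual and variable effects. The function name and its parameters are hypothetical.

```python
def later_state(initially_on, trithorax_ok=True, polycomb_ok=True):
    """Return True if a Hox gene is transcribed later in development.

    initially_on : state set up by the early patterning signals.
    trithorax_ok : whether Trithorax group function is intact.
    polycomb_ok  : whether Polycomb group function is intact.

    Assumption (toy model): an active gene stays on only if the Trithorax
    group works, and a silent gene stays off only if the Polycomb group works.
    """
    if initially_on:
        return trithorax_ok
    return not polycomb_ok


if __name__ == "__main__":
    for initially_on in (True, False):
        for trx in (True, False):
            for pc in (True, False):
                print(initially_on, trx, pc, "->", later_state(initially_on, trx, pc))
```

With Polycomb function removed, every initially silent gene in this sketch ends up switched on, which parallels the mutant in Figure 21-25, where Hox genes become active all along the body axis.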


The D-V Signaling Genes Create a Gradient of the Transcription 
Regulator Dorsal 


As with the patterning along the Drosophila A-P axis just discussed, the patterning 
along the dorsoventral (D-V) axis begins with maternal gene products that define 
Figure 21-24 The patterns of
expression compared to the
chromosomal locations of the
genes of the Hox complex. The
diagram shows the sequence of
genes in each of the two subdivisions
of the chromosomal complex. This
corresponds, with minor deviations,
to the spatial sequence in which
the genes are expressed, shown
in the photograph of a Drosophila
embryo at the so-called germ band
retraction stage, about 10 hours after
fertilization. The embryo has been
stained by in situ hybridization with
differently labeled probes to detect
the mRNA products of different Hox
genes in different colors. (Photograph
courtesy of William McGinnis, adapted
from D. Kosman et al., Science
305:846, 2004. With permission from
AAAS.)
Figure 21-25 The role of genes of the Polycomb group. (A) Photograph of
a wild-type Drosophila embryo. (B) Photograph of a mutant embryo defective 
for the gene Extra sex combs (Esc) and derived from a mother also lacking 
this gene. The gene belongs to the Polycomb group. Essentially all segments 
have been transformed to resemble the most posterior abdominal segment. 
In the mutant, the pattern of expression of the homeotic selector genes, 
which is roughly normal initially, is unstable in such a way that all these genes 
soon become switched on all along the body axis. (From G. Struhl, Nature 
293:36-41, 1981. With permission from Macmillan Publishers Ltd.) 


this axis in the egg (see Figure 21-17), and it then progresses through zygotic gene 
products that further subdivide the D-V axis in the embryo. 

Initially, a protein that is produced by follicle cells underneath the future ven- 
tral region of the embryo leads to the localized activation of a transmembrane 
receptor, called Toll, on the ventral side of the egg membrane. The various mater- 
nal genes required for this process are called D-V egg-polarity genes. (Curiously, 
Drosophila Toll and vertebrate Toll-like proteins also operate in innate immune 
responses, as discussed in Chapter 24.) The localized activation of Toll controls
the distribution of Dorsal, a transcription regulator of the NFκB family discussed
in Chapter 15. The Toll-regulated activity of Dorsal, like that of NFκB, depends on
the translocation of Dorsal from the cytosol, where it is held in an inactive form, 
to the nucleus, where it regulates gene expression (see Figure 15-62). In the newly 
laid egg, both Dorsal mRNA and protein are distributed uniformly in the cytosol. 
After the nuclei in the syncytial blastoderm have migrated to the surface of the 
embryo, but before cellularization (see Figure 21-15), Toll receptor activation on 
the ventral side induces a remarkable redistribution of the Dorsal protein. On the 
dorsal side, the protein remains in the cytosol, but ventrally it becomes concen- 
trated in the nuclei, with a smooth gradient of nuclear localization between these 
two extremes (Figure 21-26). 

Once inside the nucleus, the Dorsal protein acts as a morphogen and turns on 
or off the expression of different sets of genes depending on Dorsal’s concentra- 
tion. The expression of each responding gene depends on its regulatory DNA— 
specifically, on the number and affinity of the binding sites that this DNA contains 
for Dorsal and other transcription regulators. In this way, the regulatory DNA 
interprets the positional signal provided by the nuclear Dorsal protein gradient, 
so as to define a D-V series of territories—distinctive bands of cells that run the 
length of the embryo. Most ventrally—where the nuclear concentration of Dorsal 
protein is highest—it switches on, for example, the expression of a gene called 
Twist, which is specific for mesoderm. Most dorsally, where the nuclear concen- 
tration of Dorsal protein is lowest, the cells switch on a gene called Decapenta- 
plegic (Dpp). And in an intermediate region, where the nuclear concentration of 
Dorsal protein is high enough to repress Dpp but too low to activate Twist, the 
cells switch on another set of genes, including one called Short gastrulation (Sog) 
(Figure 21-27A). 
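
The way regulatory DNA reads out the Dorsal gradient can be pictured as a set of concentration thresholds. The following Python sketch assumes a linear gradient and two arbitrary threshold values; both are illustrative assumptions, not measured quantities, and the function names are hypothetical.

```python
# Hypothetical thresholds on nuclear Dorsal concentration (arbitrary units, 0 to 1).
TWIST_THRESHOLD = 0.7           # assumed level needed to activate Twist
DPP_REPRESSION_THRESHOLD = 0.2  # assumed level needed to repress Dpp


def dv_gene(nuclear_dorsal):
    """Return the representative gene switched on at a given nuclear Dorsal level."""
    if nuclear_dorsal >= TWIST_THRESHOLD:
        return "Twist"  # most ventral territory (mesoderm)
    if nuclear_dorsal >= DPP_REPRESSION_THRESHOLD:
        return "Sog"    # intermediate: Dpp repressed, Twist not activated
    return "Dpp"        # most dorsal: Dorsal too low to repress Dpp


def dv_pattern(n_cells=10):
    """Read out an assumed linear gradient, listed from ventral to dorsal."""
    levels = [1.0 - i / (n_cells - 1) for i in range(n_cells)]
    return [dv_gene(level) for level in levels]


if __name__ == "__main__":
    print(dv_pattern())
```

Raising or lowering either threshold shifts the boundaries between the three bands, which is one way to picture how the number and affinity of Dorsal-binding sites in a gene's regulatory DNA could position its expression domain.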

Products of the genes directly regulated by the Dorsal protein generate in turn 
more local signals, which define finer subdivisions along the D-V axis. These sig- 
nals act during cellularization and take the form of conventional extracellular 
Figure 21-26 The concentration gradient
of Dorsal protein in the nuclei of the 
blastoderm. In wild-type Drosophila 
embryos, the protein is present in the 
dorsal cytoplasm and absent from the 
dorsal nuclei; ventrally, it is depleted in the 
cytoplasm and concentrated in the nuclei. 
In a mutant in which the Toll pathway is 
activated everywhere and not just ventrally, 
Dorsal protein is everywhere concentrated 
in the nuclei; the result is a ventralized 
embryo. Conversely, in a mutant in which 
the Toll signaling pathway is inactivated, 
Dorsal protein everywhere remains in the 
cytoplasm and is absent from the nuclei; 
the result is a dorsalized embryo. (From
S. Roth, D. Stein and C. Nüsslein-Volhard,
Cell 59:1189-1202, 1989. With permission 
from Elsevier.) 


[Figure 21-27 labels: vitelline membrane; perivitelline space; Dpp mRNA; Sog mRNA; Twist mRNA; secreted Dpp protein; secreted Sog protein; extraembryonic tissue; dorsal epidermis; neurogenic ectoderm; ventral.]


signal proteins. In particular, Dpp codes for a secreted TGFβ-family protein,
which forms a local morphogen gradient in the dorsal part of the embryo. Sog 
encodes another secreted protein that is produced by the neurogenic ectoderm 
(which gives rise to the nervous system) and acts as an antagonist of Dpp protein. 
The opposing diffusion gradients of these two signal proteins create a steep gra- 
dient of Dpp activity: the highest Dpp activity levels, in combination with certain 
other factors, cause development of the most dorsal tissue of all—an extraembry- 
onic membrane. Intermediate levels cause development of dorsal epidermis; and 
the absence of Dpp activity allows the development of neurogenic ectoderm (Fig- 
ure 21-27B). 





A Hierarchy of Inductive Interactions Subdivides the Vertebrate 
Embryo 


The molecular genetic analysis of Drosophila development has uncovered how 
a cascade of transcription regulators and signaling pathways subdivides the 
embryo. The same principle of progressive pattern refinement is used during 
the development of all animal embryos, including vertebrates. Remarkably, con- 
servation is not restricted to the general strategy of pattern formation, but also 
extends to many of the molecules involved. 

As mentioned previously, the earliest phases of vertebrate development are 
surprisingly variable, even between closely related species, and it is even hard 
to say precisely how the axes of an early fly embryo correspond to those of an 
early frog or mouse embryo. Nevertheless, we shall see that amid this display of 
evolutionary plasticity, some features of early development turn out to be highly 


Figure 21-27 How morphogen gradients 
guide a patterning process along the 
dorsoventral axis of the Drosophila 
embryo. (A) Initially, a gradient of Dorsal 
protein defines three broad territories
of gene expression, marked here by
the expression of three representative
genes: Dpp, Sog, and Twist. (B) Slightly
later, the cells expressing Dpp and Sog
secrete, respectively, the signal proteins
Dpp (a TGFβ family member) and Sog (an
antagonist of Dpp). These two proteins 
then diffuse and interact with one another 
(and with certain other factors) to create the 
dorsoventral (D-V) territories shown. 
conserved. The same is true of later developmental stages also, often to an aston-
ishing degree. From our own anatomy, it is obvious that we are cousins to birds 
and fish. But looking at molecular mechanisms, we see that we are cousins to flies 
and worms too. 

In the following pages, we discuss how vertebrate embryos are patterned by 
the interplay of signaling molecules and transcription regulators. We begin by dis- 
cussing the formation and patterning of the embryonic axes in amphibians, tak- 
ing the frog Xenopus as our example. We have already broached this topic earlier 
in the chapter. Here, we pick up the thread and draw comparisons with the fly. 

As noted earlier, the origins of the embryonic axes and the three germ layers 
in the frog can be traced back to the blastula (see Figure 21-3A). By labeling indi- 
vidual blastomeres, we can track cells through all their divisions, transformations, 
and migrations and see what they become and where they come from. The pre- 
cursors of ectoderm, mesoderm, and endoderm are arranged in order along the 
animal-vegetal axis of the blastula: the endoderm derives from the most vege- 
tal blastomeres, the ectoderm from the most animal, and the mesoderm from a 
middle set. Within each of these territories, the cells have diverse fates according 
to their positions along the D-V axis of the later embryo. For ectoderm, epider- 
mal precursors are located ventrally, and future neurons are found dorsally; for 
mesoderm, precursors for notochord, muscle, kidney, and blood are arranged 
from dorsal to ventral. All this can be represented by a fate map that shows which 
cell types derive from which regions of the blastula (Figure 21-28). The fate map 
confronts us with the central question: how are the cells in different positions 
driven toward their different fates? We have already explained how maternal fac- 
tors deposited in the developing frog egg define its animal-vegetal axis, and how 
cortical rotation triggered by fertilization defines the orientation of the dorsoven- 
tral axis (see Figure 21-14). But how does the establishment of axes lead on to the 
subdivision of the embryo into the future body parts? 

The maternal gene products lead to the formation of signaling centers on the 
vegetal and dorsal sides of the embryo. The dorsal signaling center in particular 
has a special place in the history of developmental biology. Experiments in the 
early twentieth century identified it as a small cluster of cells, located on the dor- 
sal side of the amphibian embryo, with an extraordinary property: when the cells 
were transplanted to an opposite site, they could trigger a radical reorganization 
of the neighboring tissue, causing it to form a second whole-body axis (Figure 
21-29). The discovery of this signaling center, called the Organizer, led the way 
to a pioneering analysis of the chain of inductive interactions that establish the 
framework of the vertebrate body. 

In contrast to the Drosophila syncytial embryo, the fertilized frog egg under- 
goes rapid cleavage divisions that result in an embryo consisting of thousands of 
cells. Patterning must therefore be mediated by extracellular signal molecules 
Figure 21-28 Blastula fate map in a
frog embryo. The endoderm derives
from the most vegetal blastomeres (yellow),
the ectoderm from the most animal (blue),
and the mesoderm from a middle set
(green) that contributes also to endoderm
and ectoderm. Different cell types
derive from different positions along the
dorsoventral axis.


Figure 21-29 Induction of a secondary 
axis by the Organizer. An amphibian 
embryo receives a graft of a small cluster 
of cells taken from a specific site, called 
the Organizer region, on the dorsal side
of another embryo at the same stage.
Signals from the graft organize the behavior
of neighboring cells of the host embryo,
causing development of a pair of conjoined
(Siamese) twins. See Movie 21.4. [After
J. Holtfreter and V. Hamburger, in Analysis
of Development (B.H. Willier, P.A. Weiss
and V. Hamburger, eds), pp. 230-296.
Philadelphia: Saunders, 1955.] 
that diffuse through the embryo from cell to cell, not by transcription regulators
that move through the cytoplasm of a syncytium. Not surprisingly, the Organizer 
is now known to be a major source of secreted protein signals. 


A Competition Between Secreted Signaling Proteins Patterns the 
Vertebrate Embryo 


The signal molecules that pattern the frog embryo along the animal-vegetal (A-V) 
axis belong to the TGFβ family: they are secreted by a signaling center at the veg-
etal pole and form concentration gradients along the A-V axis. The Nodal protein 
acts over a relatively short range: cells near the vegetal pole are exposed to high 
levels of it and respond by switching on genes that promote the development of 
endoderm; cells further away are exposed to lower levels and activate genes that 
promote the formation of mesoderm. The cells at the vegetal pole that produce 
Nodal also produce a more rapidly diffusing TGFβ-like protein called Lefty, which
antagonizes Nodal. The result is a high ratio of Lefty to Nodal at the animal pole, 
where Lefty predominates and Nodal signaling is blocked; this causes the cells 
there to develop as ectoderm (Figure 21-30A). Thus, a mid-range activation by 
Nodal, combined with a long-range inhibition by Lefty, sets up the pattern of pro- 
genitors along the A-V axis for the three germ layers—endoderm, mesoderm, and 
ectoderm. 
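
The logic of mid-range activation combined with long-range inhibition can be sketched numerically. The Python below assumes that both proteins decay exponentially with distance from the vegetal pole, that Lefty cancels Nodal one-for-one, and that two arbitrary thresholds separate the germ layers. All constants and function names are hypothetical choices made for illustration.

```python
import math

NODAL_LENGTH = 0.3   # assumed decay length of Nodal (fraction of the A-V axis)
LEFTY_LENGTH = 1.0   # assumed longer decay length of faster-diffusing Lefty
LEFTY_SOURCE = 0.2   # assumed Lefty level at the vegetal pole (Nodal source = 1)
ENDODERM_THRESHOLD = 0.4   # assumed net Nodal activity needed for endoderm
MESODERM_THRESHOLD = 0.05  # assumed net Nodal activity needed for mesoderm


def nodal_activity(x):
    """Net Nodal signaling at position x (0 = vegetal pole, 1 = animal pole)."""
    nodal = math.exp(-x / NODAL_LENGTH)
    lefty = LEFTY_SOURCE * math.exp(-x / LEFTY_LENGTH)
    # Toy assumption: each unit of Lefty blocks one unit of Nodal.
    return max(nodal - lefty, 0.0)


def germ_layer(x):
    """Classify a position along the A-V axis by its net Nodal activity."""
    activity = nodal_activity(x)
    if activity >= ENDODERM_THRESHOLD:
        return "endoderm"
    if activity >= MESODERM_THRESHOLD:
        return "mesoderm"
    return "ectoderm"


if __name__ == "__main__":
    for i in range(11):
        x = i / 10
        print(f"x={x:.1f}  activity={nodal_activity(x):.3f}  {germ_layer(x)}")
```

Because Lefty spreads farther than Nodal in this sketch, it outweighs Nodal near the animal pole even though it starts at a lower level, so net signaling there falls to zero and the cells are classified as ectoderm.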

The frog’s dorsal signaling system uses a different set of secreted signals from 
that of the vegetal signaling system to subdivide the germ-layer territories accord- 
ing to location along the D-V axis of the embryo. It exerts its influence by secreting 
two inhibitory signal proteins, called Chordin and Noggin. These antagonize the 
action of bone morphogenetic proteins (BMPs; members of yet another subclass 
of the TGFβ family), which themselves are secreted throughout the embryo. In
this way, Chordin and Noggin form a dorsal-to-ventral gradient that blocks BMP 
signaling on the dorsal side but allows it to remain high on the ventral side (Figure 
21-30B). Ectodermal cells that experience high levels of BMP signaling are driven 
to epidermal fates, whereas cells that experience little or no BMP signaling remain 
neural. 
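
The dorsal system differs from the vegetal one in that the signal (BMP) is roughly uniform while its inhibitors are graded. A minimal Python sketch of that arrangement follows, assuming an exponential inhibitor gradient from the dorsal side and simple sequestration of BMP; every constant and function name here is a hypothetical illustration.

```python
import math

BMP_LEVEL = 1.0         # assumed uniform BMP level (arbitrary units)
INHIBITOR_PEAK = 10.0   # assumed Chordin/Noggin level at the dorsal side
INHIBITOR_LENGTH = 0.3  # assumed decay length of the inhibitors (fraction of D-V axis)
NEURAL_THRESHOLD = 0.5  # assumed BMP activity below which ectoderm remains neural


def bmp_activity(x):
    """BMP signaling at position x (0 = dorsal, 1 = ventral).

    Toy assumption: inhibitors reduce BMP activity by a factor 1 / (1 + inhibitor).
    """
    inhibitor = INHIBITOR_PEAK * math.exp(-x / INHIBITOR_LENGTH)
    return BMP_LEVEL / (1.0 + inhibitor)


def ectoderm_fate(x):
    """High BMP signaling gives epidermis; little or no BMP signaling leaves cells neural."""
    return "epidermal" if bmp_activity(x) >= NEURAL_THRESHOLD else "neural"


if __name__ == "__main__":
    for i in range(11):
        x = i / 10
        print(f"x={x:.1f}  BMP activity={bmp_activity(x):.2f}  {ectoderm_fate(x)}")
```

Setting INHIBITOR_PEAK to zero in this sketch makes every position epidermal, which mirrors the idea that without the dorsal inhibitors BMP signaling would remain high throughout the ectoderm.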

Knowing the signals that specify the three germ layers and various tissue 
types of the vertebrate body, one can reproduce this specification in a culture 
dish. Frog cells taken from the animal-pole region of the embryo, for example, 
Figure 21-30 How Nodal and bone
morphogenetic protein (BMP) signaling
pattern the embryonic axes. Nodal and its 
antagonist Lefty pattern the animal-vegetal 
axis, while BMP and its antagonists Chordin 
and Noggin pattern the dorsoventral axis. 
(A) In the animal pole region, where Nodal 
levels are low relative to Lefty, Lefty blocks 
Nodal from binding to its receptors. In the 
vegetal region, there is an excess of Nodal, 
resulting in Nodal pathway activation. 

(B) Along the dorsoventral axis, BMP is 
widely present but Chordin and Noggin 

are concentrated at the dorsal side: there, 
they bind to BMP and block its binding to 
receptors. The resulting patterns of Nodal 
and BMP activity are illustrated at the 
bottom of the figure. 
will differentiate into blood (a ventral mesodermal tissue) when diverted from
their original fate by exposure to intermediate concentrations of Nodal and high 
concentrations of BMP. Similarly, mouse or human embryonic stem cells can be 
coaxed to generate specific cell types by exposing them in culture to appropriate 
combinations of signal molecules. In this way, the insights gained through studies 
of animal development can be used to generate the cell types needed for regener- 
ative medicine, as we discuss in the next chapter. 


The Insect Dorsoventral Axis Corresponds to the Vertebrate 
Ventral-Dorsal Axis 


The signaling systems that pattern the D-V axis in Drosophila and in vertebrates 
are similar. In Drosophila, as we saw, Dpp and its inhibitor Sog are responsible, 
whereas in vertebrates, BMP and its inhibitors Chordin and Noggin do the job. 
Dpp is a member of the BMP family, while Sog is a homolog of Chordin. Both in 
flies and frogs, high levels of the inhibitors define the region that is neurogenic, 
and high levels of BMP/Dpp activity define the region that is not. These and other 
molecular parallels strongly suggest that this aspect of body patterning has been 
conserved in evolution from insects to vertebrates. Curiously, however, the axis is 
inverted: dorsal in the fly corresponds to ventral in the vertebrate (Figure 21-31). 
At some point in evolution, it seems that the ancestor of one of these classes of 
animals took to living life upside-down. 


Hox Genes Control the Vertebrate A-P Axis 


The conservation of developmental mechanisms between Drosophila and verte- 
brates extends beyond the D-V signaling system. Hox genes are found in almost 
every animal species studied, where they are often grouped in complexes similar 
to the insect Hox complex. In mice and humans, for example, there are four such 
complexes—called the HoxA, HoxB, HoxC, and HoxD complexes—each on a dif- 
ferent chromosome. Individual genes in each complex can be recognized by their 
sequences as counterparts of specific members of the Drosophila set. Indeed, 
mammalian Hox genes can function in Drosophila as partial replacements for the 
corresponding Drosophila Hox genes. It appears that each of the four mamma- 
lian Hox complexes is, roughly speaking, the equivalent of one complete insect 
Hox complex (that is, an Antennapedia complex plus a Bithorax complex) (Figure 
21-32). 

The ordering of the genes within each vertebrate Hox complex is essentially the 
same as in the insect Hox complex, suggesting that all four vertebrate complexes 
originated by duplications of a single primordial complex and have preserved its 
basic organization. Most tellingly, when the expression patterns of the Hox genes 
are examined in the vertebrate embryo, it turns out that the members of each 
complex are expressed in a head-to-tail series along the axis of the body, just as 
they are in Drosophila. As in Drosophila, vertebrate Hox gene expression patterns 
are often aligned with vertebrate segments. This alignment is especially clear in 
the hindbrain (see Figure 21-32), where the segments are called rhombomeres. 

The products of the vertebrate Hox genes, the Hox proteins, specify positional 
values that control the A-P pattern of parts in the hindbrain, neck, and trunk (as 
well as some other parts of the body). As in Drosophila, when a posterior Hox gene 
is artificially expressed in an anterior region, it can convert the anterior tissue to 
Figure 21-31 The vertebrate body plan
as a dorsoventral inversion of the insect 
body plan. Note the correspondence with 
regard to the circulatory system as well as 
the gut and nervous system. In insects, 
the circulatory system is represented by 

a tubular heart and a main dorsal blood 
vessel, which pumps blood out into the 
tissue spaces through one set of apertures 
and receives blood back from the tissues 
through another set. Unlike vertebrates, 
insects have no system of capillary vessels 
to contain the blood as it percolates 
through the tissues. Nevertheless, heart 
development depends on homologous 
genes in vertebrates and insects, 
reinforcing the relationship between the 
two body plans. (After E.L. Ferguson, Curr. 
Opin. Genet. Dev. 6:424-431, 1996. With
permission from Elsevier.) 


[Figure 21-32 labels]
Drosophila Hox complex: Lab, Pb, (Zen), Dfd, Scr, (Ftz), Antp (Antennapedia complex); Ubx, AbdA, AbdB (Bithorax complex)
Ancestral Hox complex: Hox1, Hox2, Hox3, Hox4, Hox5, Hox6 (central), Hox7 (posterior)
Mammalian Hox complexes:
HoxA: A1, A2, A3, A4, A5, A6, A7, A9, A10, A11, A13
HoxB: B1, B2, B3, B4, B5, B6, B7, B8, B9, B13
HoxC: C4, C5, C6, C8, C9, C10, C11, C12, C13
HoxD: D1, D3, D4, D8, D9, D10, D11, D12, D13
Body regions labeled: hindbrain, spinal cord, mesoderm


a posterior character. Conversely, loss of posterior Hox genes allows the posterior 
tissue where they are normally expressed to adopt an anterior character (Figure 
21-33). Because of a redundancy between genes in the four Hox gene clusters, 
the transformations observed in mouse Hox mutants are not always so straight- 
forward as those in the fly, and they are often incomplete. Nonetheless, it seems 
clear that the fly and the mouse use essentially the same molecular machinery to 
impart individual characteristics to successive regions along at least a part of the 
A-P axis. 





Some Transcription Regulators Can Activate a Program That 
Defines a Cell Type or Creates an Entire Organ 


Just as there are genes that regulate pattern formation and segmental identity, 
there are genes whose products act as triggers for the development of a specific 
cell type or even a specific organ, initiating and coordinating the whole complex 
program of gene expression that is required. An example is the MyoD/myogenin 


Figure 21-32 The Hox complexes of an 
insect and a mammal, compared and 
related to body regions. The genes of the 
Antennapedia and Bithorax complexes of 
Drosophila are shown in their chromosomal 
order in the top line. The corresponding 
genes of the four mammalian Hox 
complexes are shown below, also in 
chromosomal order. The gene expression 
domains in fly and mammal are indicated 
in a simplified form by color in the cartoons 
of animals above and below. There is a 
remarkable parallelism. However, the details 
of the patterns depend on developmental 
stage and vary somewhat from one 
mammalian Hox complex to another. Also, 
in many cases, genes shown here as 
expressed in an anterior domain are also 
expressed more posteriorly, overlapping the 
domains of more posterior Hox genes. 

The complexes are thought to have 
evolved as follows: first, in some common 
ancestor of worms, flies, and vertebrates, 
a single primordial homeotic selector gene 
underwent repeated duplication to form 
a series of such genes in tandem: the 
ancestral Hox complex. In the Drosophila 
sublineage, this single complex became 
split into separate Antennapedia and 
Bithorax complexes. Meanwhile, in the 
lineage leading to the mammals, the whole 
complex was repeatedly duplicated to give 
four Hox complexes. The parallelism is not 
perfect because apparently some individual 
genes have been duplicated and others 
lost. Still others have been co-opted for 
different purposes (genes in parentheses in 
the top line) over the time that has elapsed 
since the complexes diverged. (Based on a 
diagram courtesy of William McGinnis.) 
family of transcription regulators that we encountered in Chapter 7. These proteins 
drive cells to differentiate into muscle, expressing muscle-specific actins 
and myosins and all the other specialized cytoskeletal, metabolic, and membrane 
proteins that a muscle cell needs. Analogously, members of the Achaete/Scute 
family of transcription regulators drive cells to become neural progenitors. In 
both these examples, the proteins belong to the basic helix-loop-helix (bHLH) 
class of transcription regulators (see p. 377), and the same is true for many of the 
other proteins that induce the differentiation of particular cell types. These master 
transcription regulators exert their powerful differentiation-inducing activity by 
binding to many different regulatory sites in the genome and thereby controlling 
the expression of large numbers of downstream target genes. In one well-studied 
case, that of an Achaete/Scute family member called Atonal homolog 1 (Atoh1), 
the number of direct target genes in the mouse genome is more than 600. It is 
important to note, however, that even such powerful drivers of cell differentiation 
can have radically different effects according to the context and history of the cells 
in which they act: Atoh1, for example, drives differentiation of certain classes of 
neurons in the brain, of sensory hair cells in the inner ear, and of secretory cells in 
the lining of the gut. 

Other genes encoding transcription regulators can drive the formation and 
assembly of the multiple cell types that constitute an entire organ. A famous 
example is the transcription regulator Eyeless. When it is artificially expressed in a 
patch of cells in the leg precursors of Drosophila, a well-organized eye-like organ 
develops on the leg, with the various eye cell types correctly arranged (see Figure 
7-35B); conversely, loss of the Eyeless gene results in flies that lack eyes. More- 
over, loss of the Eyeless homolog Pax6 in vertebrates likewise leads to loss of eye 
structures. Similar organ-selector proteins are known for foregut, heart, pancreas, 
and other organs. They are all master transcription regulators that directly regu- 
late hundreds of target genes, the products of which then specify and construct 
the different elements of the appropriate organ. However, as in the example of 
Atoh1, they usually exert their specific effect only in combination with the right 
partners, which are only expressed in cells that were appropriately primed during 
their earlier development. 


Notch-Mediated Lateral Inhibition Refines Cellular Spacing 
Patterns 


After the establishment of the basic body plan and the generation of organ precursors, 
many further steps of pattern refinement are required to achieve the adult 
pattern of terminally differentiated cells in tissues and organs. As we discussed 
Figure 21-33 Control of anteroposterior 
pattern by Hox genes in the mouse. 
(A,B) A normal mouse (wild type) has 
about 65 vertebrae, differing in structure 
according to their position along the 
body axis: 7 cervical (neck), 13 thoracic 
(with ribs), 6 lumbar [bracketed by yellow 
asterisks in (B)], 4 sacral [bracketed by red 
asterisks in (B)], and about 35 caudal (tail). 
(A) shows a side view and (B) shows a 
dorsal view; for clarity, the limbs have been 
removed in each picture. 
(C) The HoxA10 gene is normally 
expressed in the lumbar region (together 
with its paralogs HoxC10 and HoxD10); 
here it has been artificially expressed in 
the developing vertebral tissue all along 
the body axis. As a result, the cervical and 
thoracic vertebrae are all converted to a 
lumbar character. (D) Conversely, when 
HoxA10 is removed along with HoxC10 
and HoxD10, vertebrae that should 
normally have a lumbar or sacral character 
take on a thoracic character instead. (A and 
C, from M. Carapuco et al., Genes Dev. 
19:2116-2121, 2005. With permission 
from Cold Spring Harbor Laboratory 
Press; B and D, from D.M. Wellik and 
M.R. Capecchi, Science 301:363-367, 
2003.) 







[Figure 21-34 diagram labels: mechanosensory bristle; dying cell.] 





earlier, lateral inhibition mediated by Notch signaling is crucial for both cell diver- 
sification and fine-grained patterning in an enormous variety of tissues in all ani- 
mals. 

One example is the development of sensory bristles in Drosophila, most easily 
seen on the fly’s back, but also present on most of its other exposed surfaces. Each 
of these is a miniature sense organ, consisting of a sensory neuron and a small set 
of supporting cells. Some bristles respond to chemical stimuli, others to mechan- 
ical stimuli, but they are all constructed in a similar way (Figure 21-34). The pro- 
neural genes Achaete and Scute mentioned earlier mark the patches of epidermis 
within which bristles will form. Mutations that eliminate the expression of these 
genes at some of their usual sites block development of bristles at just those sites, 
and mutations that cause expression in abnormal sites cause bristles to develop 
there. 

The initial cells expressing the proneural genes are called proneural cells, and 
they are primed to take the neurosensory pathway of differentiation, but which 
of the cells will actually do so depends on competitive interactions among them. 
In the first round of these interactions, a single cell within each small group of 
proneural cells is picked to serve as the progenitor of the bristle. This single cell is 
called the sensory mother cell. It becomes distinct from the other cells of the clus- 
ter through lateral inhibition mediated by the Notch signaling pathway. This oper- 
ates in the way we discussed earlier. The cells in the proneural cluster initially all 
express both the transmembrane receptor Notch and its transmembrane ligand 
Delta, along with proteins that regulate the signaling activity of Delta. Wherever 
Delta activates Notch, an inhibitory signal is transmitted that diminishes the 
tendency of the Notch-activated cell to specialize as a sensory mother cell. At first, 
all the cells in the cluster inhibit one another. However, receipt of the signal in a 
given cell diminishes that cell’s ability to fight back by delivering the inhibitory 
Delta signal in return. This creates a competitive situation, from which a single 
cell in each cluster—the future sensory mother cell—eventually emerges as win- 
ner, sending a strong inhibitory signal to its immediate neighbors but receiving 
no such signal in return (Figure 21-35). If a cell that would normally become a 
sensory mother cell is genetically disabled from doing so, a neighboring proneu- 
ral cell, freed from lateral inhibition, will become a sensory mother cell instead. 
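
The competition just described can be illustrated with a minimal numerical sketch for two neighboring cells. It assumes a simple two-variable form in which a cell's Notch activity is raised by its neighbor's Delta, and its own Delta production is suppressed by its own Notch activity. The Hill-type functions, the parameters `a` and `b`, the starting levels, and the function name `lateral_inhibition` are illustrative assumptions, not values or a model taken from this chapter.

```python
def lateral_inhibition(delta_start=(0.08, 0.07), notch_start=(0.35, 0.35),
                       a=0.01, b=100.0, dt=0.01, t_end=100.0):
    """Euler integration of a two-cell Notch-Delta competition (toy model).

    For each cell i with neighbor j:
        d(notch_i)/dt = delta_j**2 / (a + delta_j**2) - notch_i
        d(delta_i)/dt = 1 / (1 + b * notch_i**2)      - delta_i
    All parameters are hypothetical and dimensionless.
    """
    notch = list(notch_start)
    delta = list(delta_start)
    for _ in range(int(round(t_end / dt))):
        new_notch, new_delta = [], []
        for i in (0, 1):
            j = 1 - i  # index of the neighboring cell
            # Notch in cell i is activated by Delta on the neighbor.
            activation = delta[j] ** 2 / (a + delta[j] ** 2)
            # Active Notch in cell i suppresses that cell's own Delta.
            production = 1.0 / (1.0 + b * notch[i] ** 2)
            new_notch.append(notch[i] + dt * (activation - notch[i]))
            new_delta.append(delta[i] + dt * (production - delta[i]))
        notch, delta = new_notch, new_delta
    return notch, delta


if __name__ == "__main__":
    notch, delta = lateral_inhibition()
    print("Notch activity:", [round(x, 3) for x in notch])
    print("Delta level:   ", [round(x, 3) for x in delta])
```

In this sketch the two cells start out nearly equivalent, differing only slightly in Delta. The cell with the small head start ends with high Delta and low Notch activity (the winner), while its neighbor ends in the opposite state, which mirrors the amplification of an initial advantage described in the text.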

The sensory mother cell goes through a short program of further divisions to 
generate the set of cells that form the final bristle. Notch signaling acts repeat- 
edly at successive stages in this program to drive the descendants of the sensory 
mother cell along different pathways and assign them to their various specialized 
fates. However, it does so in conjunction with additional mechanisms that bias 
the outcome of the competition mediated by lateral inhibition. Determinants that 


Figure 21-34 The basic structure of a 
mechanosensory bristle. The lineage of 
the four cells of the bristle—all descendants 
of a single sensory mother cell—is shown 
on the left. The sensory mother cell, once 
it is specified, generates this set of cells 
through a short program of division cycles. 
In each generation of the progeny, lateral 
inhibition operates again to drive the 
newborn cells toward different fates: one 
of the ultimate progeny will become the 
neuron; another, the shaft of the bristle; 
others, supporting cells of various sorts. 
As the sensory mother cell and its progeny 
divide, certain proteins are allocated 
preferentially to one of each pair of 
newborn sister cells, biasing the outcome 
of the lateral-inhibition competition 
mediated by Notch signaling. 
are asymmetrically localized inside the dividing cells have this role in sensory 
bristle development. They are also important in other contexts, as we now discuss. 


Asymmetric Cell Divisions Make Sister Cells Different 


Cell diversification does not always have to depend on extracellular signals: in 
some cases, sister cells are born different as a result of an asymmetric cell division, 
during which some significant set of molecules is divided unequally between 
them. This asymmetrically segregated molecule (or set of molecules) then acts as 
a determinant for one of the cell fates by directly or indirectly altering the pattern 
of gene expression within the daughter cell that receives it (see Figure 21-12). We 
have already encountered the asymmetric segregation of molecules in the context 
of the early frog embryo: VegT RNA is localized in the vegetal region of the fertil- 
ized egg. Following cell division, only vegetal daughter cells will inherit VegT RNA. 

Asymmetric divisions often occur at the beginning of development, but they 
are also encountered at some later stages. As mentioned for the sensory bristle, 
they can set the scene for an exchange of Notch signals between the daughter 
cells, with the signaling occurring after the cells have become separate and rein- 
forcing the differences between them. In the central nervous system, asymmetric 
divisions have a key role in generating the very large numbers of neurons and 
glial cells that are needed. A special class of cells becomes committed as neural 
precursors, but instead of differentiating directly as neurons or glial cells, these 
undergo a long series of asymmetric divisions through which a succession of 
additional neurons and glial cells are added to the population. The process is best 
understood in Drosophila, although there are many hints that something similar 
occurs also in vertebrate neurogenesis. 

In the embryonic central nervous system of Drosophila, the nerve-cell pre- 
cursors, or neuroblasts, are initially singled out from the neurogenic ectoderm 
by a typical lateral-inhibition mechanism that depends on Notch. Each neuro- 
blast then divides repeatedly in an asymmetric fashion (Figure 21-36). At each 
division, one daughter remains as a neuroblast, while the other, which is much 
Figure 21-35 Lateral inhibition. (A) The 
basic mechanism of Notch-mediated 
competitive lateral inhibition, illustrated for 
just two interacting cells. In this diagram, 
the absence of color on proteins or 
effector lines indicates inactivity. (B) The 
outcome of the same process operating 
in a larger patch of cells. At first, all cells in 
the patch are equivalent, expressing both 
the transmembrane receptor Notch and 
its transmembrane ligand Delta. Each cell 
has a tendency to specialize (as a sensory 
mother cell), and each sends an inhibitory 
signal to its neighbors to discourage them 
from also specializing in that way. This 
creates a competitive situation. As soon 
as an individual cell gains any advantage in 
the competition, that advantage becomes 
magnified. The winning cell, as it becomes 
more strongly committed to differentiating 
as a sensory mother cell, also inhibits its 
neighbors more strongly. Conversely, as 
these neighbors lose their capacity to 
differentiate as sensory mothers, they also 
lose their capacity to inhibit other cells from 
doing so. Lateral inhibition thus makes 
adjacent cells follow different fates. 
Although the interaction is thought to be 
normally dependent on cell-cell contacts, 
the future sensory mother cell may be able 
to deliver an inhibitory signal to cells that 
are more than one cell diameter away, for 
example, by sending out long protrusions 
to touch them. 




[Figure 21-36 diagram labels: ectoderm; neuroblast; ganglion mother cell; neuron; glial cell; after 4 more rounds of neuroblast division.] 





smaller, becomes specialized as a ganglion mother cell. Each ganglion mother cell 
will divide only once, giving a pair of neurons, or a neuron plus a glial cell, or a 
pair of glial cells, with Notch-mediated interactions helping to drive the daughters 
along different paths. The neuroblast itself becomes smaller at each division, as it 
parcels out its substance into one ganglion mother cell after another. Eventually, 
typically after about 12 cycles, the process halts, presumably because the neuroblast 
becomes too small to pass the cell-size checkpoint in the cell-division cycle. 
Later, in the larva, neuroblast divisions resume, but now they are accompanied 
by cell growth, permitting the process to continue indefinitely and to generate the 
much larger numbers of neurons and glial cells required in the adult fly. 

Figure 21-36 Neuroblasts and 
asymmetric cell division in the central 
nervous system of a fly embryo. The 
neuroblast originates as a specialized 
ectodermal cell. It is singled out by lateral 
inhibition and emerges from the basal 
(internal) face of the ectoderm. It then 
goes through repeated division cycles, 
dividing asymmetrically to generate a series 
of ganglion mother cells. Each ganglion 
mother cell divides just once to give a 
pair of differentiated daughters (typically a 
neuron plus a glial cell). 
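
The arithmetic of the neuroblast division program can be sketched in a few lines. This is a toy bookkeeping model only: the fraction of volume handed to each ganglion mother cell (`gmc_fraction`), the size threshold (`min_volume`), and the function name are hypothetical, chosen so that the embryonic arrest falls at about 12 cycles as stated in the text.

```python
def neuroblast_divisions(gmc_fraction=0.15, min_volume=0.15,
                         regrow=False, max_cycles=100):
    """Count asymmetric divisions before a neuroblast falls below a size threshold.

    volume is measured relative to the starting neuroblast (1.0).
    regrow=True mimics larval divisions, where growth restores the volume
    after each division; max_cycles only stops the otherwise endless loop.
    """
    volume = 1.0
    divisions = 0
    while volume >= min_volume and divisions < max_cycles:
        volume *= (1.0 - gmc_fraction)  # one ganglion mother cell is budded off
        divisions += 1
        if regrow:
            volume = 1.0                # cell growth between divisions
    return divisions


if __name__ == "__main__":
    embryonic = neuroblast_divisions()
    # Each ganglion mother cell divides once, giving two differentiated cells.
    print("embryonic divisions:", embryonic, "-> progeny:", 2 * embryonic)
    print("larval divisions (capped):", neuroblast_divisions(regrow=True))
```

Under these assumed numbers the shrinking neuroblast stops after 12 divisions, giving 24 differentiated cells, whereas the regrowing neuroblast continues until the arbitrary cap is reached.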


Differences in Regulatory DNA Explain Morphological Differences 


In the preceding sections, we have seen that animals contain the same essential 
cell types, have a similar collection of genes, and share many of the molecular 
mechanisms of pattern formation. But how can we square this with the radical 
differences that we see in the body structures of animals as diverse as a worm, a 
fly, a frog, and a mouse? We asserted earlier, in a general way, that these differ- 
ences usually seem to reflect differences in the regulatory DNA that calls into play 
the components of the conserved basic kit of parts. We must now examine the 
evidence a little more closely. 

When we compare animal species with similar basic body plans—different 
vertebrates, for example, such as fish, birds, and mammals—we find that corre- 
sponding genes usually have similar sets of regulatory elements: the regulatory 
DNA sequences have been well conserved and are recognizably homologous in 
the different animals. The same is true if we compare different species of nema- 
tode worms or insects. But, when we compare vertebrate regulatory regions with 
those of worms or flies, it is hard to see any such resemblance. The protein-cod- 
ing sequences are unmistakably similar, but the corresponding regulatory DNA 
sequences appear mostly very different, suggesting that the differences in body 
plans mainly reflect differences in regulatory DNA. Although variations in the 
proteins themselves also contribute, differences in regulatory DNA would be 
enough to generate radically different tissues and body structures even if the pro- 
teins were the same. 

It is not yet possible to trace the genetic steps that have led to all the spectacu- 
lar diversity of animals. Their lineages have diverged over hundreds of millions of 
years, and in most cases too many changes have occurred for us to be able to say 
that this or that feature results from this or that mutation. The picture is clearer, 
however, for more recent evolutionary events. Studies of both closely related ani- 
mal populations and plant populations whose members have different morphol- 
ogies have revealed that dramatic developmental effects can result from subtle 
changes in regulatory DNA. 

A well-studied example is the morphological diversity found in stickleback 
fish. After the last ice age ended about 10,000 years ago, marine sticklebacks col- 
onized many newly formed freshwater streams and lakes. Marine sticklebacks 
extend sharp spines from their pelvic skeleton. These spines are thought to help 
protect the fish from soft-mouthed fish predators. In contrast, several popula- 
tions of freshwater sticklebacks have lost these spines, usually in lakes that lack 
such predators. The different morphologies reflect differences in control of the 
expression of a transcription regulator called Pitx1. Whereas marine sticklebacks 
express the Pitx1 gene in the pelvic bone precursor cells that will form the spines, 
freshwater sticklebacks have lost this expression as a result of a change at the Pitx1 
locus. These changes do not lie in the coding sequence. Instead, each is a small 
deletion of a block of adjacent regulatory DNA that controls Pitx1 expression spe- 
cifically in the pelvic cells (Figure 21-37). 

The Pitx1 protein has important functions elsewhere in the body, so that the 
DNA sequences that encode this protein must be retained. The regulatory DNA 
responsible for Pitx1 expression at these other sites is also unchanged in the two 
populations of sticklebacks. The evolution of pelvis development in sticklebacks 
shows how the modular nature of regulatory DNA elements that we encountered 
in Chapter 7 (see Figure 7-29) allows independent modification of the different 
parts of the body, even when formation of those body parts depends on the same 
proteins. 

In the recent evolution of plants, changes of body structure can be traced in 
a similar way to changes in regulatory DNA. For example, these account for a 
large part of the dramatic difference between the wild teosinte plant and its mod- 
ern descendant, maize, through some 10,000 years of mutation and selection by 
Native Americans. 


Summary 


Drosophila has been the foremost model organism for the study of the genetics of 
animal development. Its embryonic pattern is initiated by the products of mater- 
nal-effect genes called egg-polarity genes, which operate by setting up graded dis- 
tributions of transcription regulators in the egg and early embryo. The gradient of 
Bicoid protein along the A-P axis, for example, helps initiate the orderly expression 
of gap genes, pair-rule genes, and segment-polarity genes. These three classes of seg- 
mentation genes, through a hierarchy of interactions, become expressed in some 
regions of the embryo and not others, progressively subdividing the embryo along 
the A-P axis into a regular series of repeating modular units called segments. 

Superimposed on the pattern of gene expression that repeats itself in every seg- 
ment, there is a serial pattern of expression of Hox genes that confer on each seg- 
ment a different identity. These genes are grouped in complexes and are arranged in 
a sequence that matches their sequence of expression along the A-P axis of the body. 

Although Hox gene expression is initiated in the embryo, it is subsequently main- 
tained by the action of chromatin-binding proteins of the Polycomb and Trithorax 
group, which stamp the chromatin of the Hox complex with a heritable record of 
its embryonic state of repression or activation, respectively. Hox complexes homol- 
ogous to that of Drosophila are found in virtually every type of animal, where they 
help pattern the A-P axis of the body. 

Signaling gradients are also set up along the dorsoventral (D-V) axis. Initially, 
Toll signaling generates a nuclear gradient of Dorsal protein, which induces an 
extracellular signaling gradient of the TGFβ-family protein Dpp and its antagonist, 
Figure 21-37 Morphological diversity 
in stickleback fish is caused by 
changes in regulatory elements. 
(A-D) Pelvic spines are present in marine 
(A) but not in freshwater (C) populations. 
Correspondingly, Pitx1 is expressed in 
the pelvic area in marine (B) but not in 
freshwater (D) fish. The lack of expression 
in the pelvic area of freshwater populations 
is caused by mutations in an enhancer 
element. Other enhancers and sites of 
expression for Pitx1 are the same in marine 
and freshwater sticklebacks. (Courtesy of 
Michael D. Shapiro.) 
Sog. This creates a gradient of Dpp activity that helps refine the assignment of different 
characters to cells at different positions along the D-V axis. 

In Xenopus, the polarity of the egg and the site of sperm entry set up the embryonic 
axes. A gradient generated by the TGFβ-family protein Nodal induces different 
fates along the animal-vegetal axis, whereas BMP and Chordin—proteins homol- 
ogous to Drosophila Dpp and Sog, respectively—control the patterning of the D-V 
axis. This axis is inverted, so that dorsal in the fly corresponds to ventral in the frog. 

Transcription regulators control the formation of specific cell types. Members 
of the MyoD/myogenin family drive the process of muscle cell determination, coor- 
dinating the many components required, whereas Achaete/Scute transcription 
regulators control neural fate. Other genes encoding such master transcriptional 
regulators can regulate the formation of entire organs. Eyeless, for example, is both 
necessary and sufficient to generate eye structures in Drosophila. 

To refine the anatomical pattern within such an organ, the cells interact locally, 
both by diffusible inductive signals and by short-range mechanisms. Often, the cells 
compete with one another by lateral inhibition. This process results in activation of 
the Notch signaling pathway in one cell and inhibition in its neighbors, generating 
two different cell types. Asymmetric cell divisions, in which daughter cells inherit 
different molecular determinants from the mother cell, provide an additional way 
to organize a fine-grained diversity of cell types. 

Evidence from recent evolutionary events indicates that anatomical changes are 
mostly driven by changes in regulatory DNA sequences that determine when and 
where developmental genes are expressed. How the striking diversity in body struc- 
tures has evolved over longer times remains largely unknown, although it seems 
likely that similar principles apply. 


DEVELOPMENTAL TIMING 


Developmental events unfold over minutes, hours, days, weeks, months, or even 
years, with each organism following its own strict timetable. The cascades of 
inductive interactions and transcriptional regulatory events described earlier take 
time, as signals are transmitted and transcription regulators are synthesized and 
then bind to DNA to activate or repress their target genes. At the beginning of this 
chapter, we compared development with an orchestral performance. There are 
many players, and each must do the right thing at the right time; yet there is no 
leader or conductor to set the tempo and coordinate the timing of all the differ- 
ent events. Each developmental process must thus occur at an appropriate rate, 
tuned by evolution to fit with the timing of other processes in the embryo or in 
the environment. The control of timing is one of the most important problems in 
developmental biology, but also one of the least understood. 


Molecular Lifetimes Play a Critical Part in Developmental Timing 


Developmental processes are complex, but they are built up from simple steps. A 
first challenge is to understand the timing of these steps. How long does it take, 
for example, to switch the expression of a gene on or off? This is not like throwing 
a light switch: it involves delays. First, it takes time to make an mRNA molecule: 
the RNA polymerase must travel the length of the gene, the primary RNA tran- 
script must be spliced and otherwise processed, and the resulting mRNA must 
be exported from the nucleus and delivered to the site where it will be translated. 
This adds up to what one might call the gestation time of the individual molecule. 
Second, it takes time for the individual mRNA molecules to accumulate to their 
fully effective concentration; as explained in Chapter 15, this accumulation time is 
dictated by the average lifetime of the molecules—the longer they last, the higher 
their ultimate concentration, and the longer the time taken to attain it. Similar 
delays occur at the next step, where the mRNA is translated into protein: synthesis 
of each individual protein molecule involves a gestation delay, and attainment of 
an effective concentration of protein molecules involves an accumulation delay 
that depends on the protein’s lifetime. The time for the whole gene switching pro- 
cess is just the sum of the gestation delays and the accumulation delays (basically, 
the molecular lifetimes) for both the mRNA and the protein molecules. Somewhat 
counterintuitively, it is the combined length of these delays, rather than the rate 
of molecular synthesis (the number of molecules synthesized per second), that 
chiefly determines the switching time. 
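
The counterintuitive point about accumulation delays can be checked with a minimal sketch. It assumes the simplest possible kinetics, constant synthesis and first-order decay, so that the concentration C obeys dC/dt = k - C/τ, where k is the synthesis rate and τ the average molecular lifetime. The function name and the numerical values are illustrative assumptions.

```python
import math


def time_to_fraction(synthesis_rate, lifetime, fraction=0.5, dt=0.001):
    """Time for a molecule to reach a given fraction of its steady-state level.

    Integrates dC/dt = synthesis_rate - C / lifetime from C = 0 (Euler method).
    The steady-state level is synthesis_rate * lifetime.
    """
    steady_state = synthesis_rate * lifetime
    conc, t = 0.0, 0.0
    while conc < fraction * steady_state:
        conc += dt * (synthesis_rate - conc / lifetime)
        t += dt
    return t


if __name__ == "__main__":
    for rate in (1.0, 100.0):
        for lifetime in (1.0, 10.0):
            t_half = time_to_fraction(rate, lifetime)
            print(f"rate={rate:6.1f}  lifetime={lifetime:5.1f}  "
                  f"half-rise time={t_half:6.3f}  "
                  f"(lifetime * ln 2 = {lifetime * math.log(2):6.3f})")
```

Under these assumptions, a hundredfold change in synthesis rate raises the final level a hundredfold but leaves the time to half-maximal concentration unchanged at about τ ln 2, whereas a tenfold longer lifetime lengthens that time tenfold.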

The same additive principle applies to long cascades of gene switching, where 
gene A activates gene B, and gene B activates gene C, and so on. It also applies 
in other circumstances, such as in signaling pathways where one protein directly 
regulates the activation of the next. In all these cases, molecular lifetimes, along 
with gestation delays, play a key part in determining the pace of development. 
The lifetimes of mRNA and protein molecules are enormously variable, from a few 
minutes or hours to days or more, explaining much of the variation we see in the 
tempo of developmental events. 

Gene switching delays, however, are not the be-all and end-all of develop- 
mental timing. Development involves many other kinds of delay that contribute 
to timing. Chromatin structure takes time to remodel. Inductive signals take time 
to diffuse across a field of cells (see Figure 21-9). Cells take time to move and rear- 
range themselves in space. Nevertheless, the timing of gene switching plays a fun- 
damental part in developmental timing, as illustrated in an especially clear and 
striking way by a gene-expression oscillator that controls the segmentation of the 
vertebrate body axis, as we now explain. 


A Gene-Expression Oscillator Acts as a Clock to Control 
Vertebrate Segmentation 


The main body axis of all vertebrates has a repetitive, periodic structure, seen in 
the series of vertebrae, ribs, and segmental muscles of the neck, trunk, and tail. 
These segmental structures originate from the mesoderm that lies as a long slab 
on either side of the embryonic midline. This slab becomes broken up into a reg- 
ular repetitive series of separate blocks, or somites—cohesive groups of cells, sep- 
arated by clefts (Figure 21-38A). The somites form (as bilateral pairs) one after 
another, in a regular rhythm, starting in the region of the head and ending in the 
tail. Depending on the species, the final number of somites ranges from fewer than 
40 (in a frog or a zebrafish) to more than 300 (in a snake). 

The posterior, most immature part of the mesodermal slab, called the preso- 
mitic mesoderm, supplies the required cells: as the cells proliferate, this mesoderm 
Figure 21-38 Somite formation in the 
chick embryo. (A) A chick embryo at 40 
hours of incubation. (B) How the temporal 
oscillation of gene expression in the 
presomitic mesoderm becomes converted 
into a spatial alternating pattern of gene 
expression in the formed somites. In the 
posterior part of the presomitic mesoderm, 
each cell oscillates with a cycle time of 90 
minutes. As cells mature and emerge from 
the presomitic region, their oscillation is 
gradually slowed down and finally brought 
to a halt, leaving them in a state that 
depends on the phase of the cycle they 
happen to be in at the critical moment. 
In this way, a temporal oscillation of gene 
expression traces out an alternating spatial 
pattern. (A, from Y.J. Jiang, L. Smithers and 
J. Lewis, Curr. Biol. 8:R868-R871, 1998. 
With permission from Elsevier.) 
retreats tailward, extending the embryo (Figure 21-38B). In the process, it deposits 
a trail of somites formed from cells that group together into blocks as they emerge 
from the anterior end of the presomitic region. The special character of the pre- 
somitic mesoderm is maintained by a combination of fibroblast growth fac- 
tor (FGF) and Wnt signals, produced by a signaling center at the tail end of the 
embryo, and the range of these signals seems to define the length of the preso- 
mitic mesoderm. The somites emerge with clocklike timing, but what determines 
the rhythm of the process? 

In the posterior part of the presomitic mesoderm, the expression of certain 
genes oscillates in time. Snapshots of gene expression taken by fixing embryos 
for analysis at different times in the oscillation cycle reveal what is happening, 
and the oscillations can now also be observed in time-lapse movies of embryos 
containing fluorescent reporters of individual oscillating genes. One new somite 
pair is formed in each oscillation cycle, and, in mutants where the oscillations fail 
to occur, somite segmentation is disrupted: the cells may still break up, belatedly, 
into separate clusters, but they do so in a haphazard, irregular way. The gene-ex- 
pression oscillator controlling regular segmentation is called the segmentation 
clock. The length of one complete oscillation cycle depends on the species: it is 30 
minutes in a zebrafish, 90 minutes in a chick, 120 minutes in a mouse. 

As cells emerge from the presomitic mesoderm to form somites—in other 
words, as they escape from the influence of the FGF and Wnt signals—their oscil- 
lation stops. Some become arrested in one state, some in another, according to 
the phase of the oscillation cycle at the time they leave the presomitic region. In 
this way, the temporal oscillation of gene expression in the presomitic mesoderm 
leaves its trace in a spatially periodic pattern of gene expression in the matur- 
ing mesoderm; this in turn dictates how the tissue will break up into physically 
separate blocks, through effects on the pattern of cell-cell adhesion (see Figure 
21-38B). 
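
The conversion of a temporal oscillation into a spatial pattern can be illustrated
with a minimal sketch. It assumes, purely for illustration, that rows of cells leave
the presomitic mesoderm at a fixed interval and that each cell freezes whatever
phase of the cycle it has reached at that moment. The function names and the
3-minute exit interval are hypothetical; the 30-minute period is the zebrafish
value quoted in the text.

```python
def arrested_phases(n_cells, exit_interval, period):
    """Phase of the cycle (0 to 1) frozen in each cell, in order of exit.

    Cell i is assumed to leave the presomitic mesoderm at time i * exit_interval.
    """
    return [((i * exit_interval) % period) / period for i in range(n_cells)]


def somite_index(cell, exit_interval, period):
    """Which somite a cell joins: one new somite per completed oscillation cycle."""
    return (cell * exit_interval) // period


# Hypothetical numbers: one row of cells exits every 3 minutes, clock period 30 minutes.
phases = arrested_phases(n_cells=30, exit_interval=3, period=30)
cells_per_somite = 30 // 3

print(phases[:cells_per_somite])      # one full sweep through the cycle
print(phases[0] == phases[10])        # the pattern repeats every 10 cell rows
print(somite_index(9, 3, 30), somite_index(10, 3, 30))
```

In this toy picture the spatial wavelength (cell rows per somite) is simply the
clock period divided by the exit interval, so a slower clock or a faster retreat
of the presomitic region would both give larger somites.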

How does the segmentation clock work? The first somite oscillator genes to 
be discovered were Hes genes, which are key components of the Notch signaling 
pathway. They are directly regulated by the activated form of Notch, and they code 
for inhibitory transcription regulators that inhibit the expression of other genes, 
including Delta. As well as regulating other genes, the products of Hes genes can 
directly regulate their own expression, creating a remarkably simple negative 
feedback loop. Autoregulation of certain specific Hes genes (depending on spe- 
cies) is thought to be the basic generator of the oscillations of the somite clock. 
Although the machinery has been modified in various ways in different species, 
the underlying principle seems to be conserved. When the key Hes gene is tran- 
scribed, the amount of Hes protein product builds up until it is sufficient to block 
Hes gene transcription; synthesis of the protein ceases; the protein then decays, 
permitting transcription to begin again; and so on, cyclically (Figure 21-39). The 
period of oscillation, which determines the size of each somite, depends on the 
delay in the feedback loop. This equals the sum of the gestation delays and accu- 
mulation delays (that is, the molecular lifetimes) of the Hes mRNA and protein 
molecules, according to the additive principle discussed earlier. Mathematical 
modeling (see Chapter 8) allows us to relate these basic molecular parameters to 
the cycle time of the segmentation clock: to a first approximation, the cycle period 
is simply equal to twice the total delay in the negative feedback loop, and thus 
twice the sum of the delays occurring at each step of the loop. 
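
The first-approximation rule stated above can be written out as a short
calculation. This is only a sketch of the arithmetic, not the full mathematical
model: the function name is hypothetical, and the delay values are illustrative
numbers chosen so that the total matches the 30-minute zebrafish period, not
measured parameters.

```python
def clock_period(gestation_delays, accumulation_delays):
    """Estimate the cycle period as twice the total delay in the feedback loop.

    gestation_delays: times (minutes) needed to make an mRNA or protein molecule.
    accumulation_delays: molecular lifetimes (minutes) of the mRNA and protein.
    """
    total_delay = sum(gestation_delays) + sum(accumulation_delays)
    return 2.0 * total_delay


# Illustrative values only (minutes); assumptions, not measurements.
gestation = {"mRNA": 7.0, "protein": 2.0}
accumulation = {"mRNA": 3.0, "protein": 3.0}

period = clock_period(gestation.values(), accumulation.values())
print(period)  # 30.0 minutes for these assumed delays
```

Because the delays add, lengthening any single step of the loop, such as the time
needed to produce a mature mRNA, lengthens the predicted period by twice that amount.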

Figure 21-39 Delayed negative feedback giving rise to oscillating gene expression.
(A) A single gene, coding for a transcription regulator that inhibits its own expression, can behave
as an oscillator. For oscillation to occur, there must be a delay (or several delays) in the feedback
circuit, and the lifetimes of the mRNA and protein (which contribute to the delay) must be short
compared with the total delay. The total delay determines the period of oscillation. It is thought that
a feedback circuit like this, based on a pair of redundantly acting genes called Her1 and Her7 in
the zebrafish (or their counterpart, Hes7, in the mouse), is the pacemaker of the segmentation
clock governing somite formation. (B) The predicted oscillation of Her1 and Her7 mRNA and
protein, computed using rough estimates of the feedback circuit parameters appropriate to
this gene in the zebrafish. Concentrations are measured as numbers of molecules per cell. The
predicted period is close to the observed period, which is 30 minutes per somite in the zebrafish
(depending on temperature).

The feedback loop just described is intracellular, and each cell in the presomitic
mesoderm can generate oscillations on its own. But these oscillations at the
single-cell level are somewhat erratic and imprecise, reflecting the fundamentally
noisy, stochastic nature of the control of gene expression, as discussed in Chapter
7. A mechanism is needed to keep all the cells in the presomitic mesoderm that
will form a particular somite oscillating in synchrony. This is achieved through
cell-cell communication via the Notch signaling pathway, to which the Hes genes
are coupled. The gene regulatory circuitry is such that in this context Notch
signaling does not drive neighboring cells to be different, as in lateral inhibition, but
does just the opposite: it keeps them in unison. In mutants where Notch signaling
fails, including mutants defective in Delta or Notch itself, the cells drift out of
synchrony and somite segmentation is again disrupted. This leads to gross deformity
of the vertebral column, an extraordinary display of the consequences of the
noisy temporal control of gene expression at the single-cell level, writ large in the
structure of the vertebrate body as a whole.
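
The synchronizing effect of cell-cell coupling can be illustrated with a generic
model of coupled phase oscillators (a Kuramoto-style model). This is a stand-in
chosen for simplicity, not the Hes/Notch gene circuit itself. Each cell is assumed
to have a slightly different intrinsic clock rate, which here takes the place of
random noise, and all parameter values and names are hypothetical.

```python
import cmath
import math


def synchrony(coupling, n_cells=20, mean_period=30.0, spread=0.02,
              total_time=600.0, dt=0.1):
    """Return the order parameter r after total_time minutes (1 = perfect synchrony).

    coupling: strength of the cell-cell signal (0 mimics loss of Notch signaling).
    spread: half-width (radians per minute) of the cell-to-cell variation in clock rate.
    """
    omega0 = 2 * math.pi / mean_period
    # Evenly spaced intrinsic frequencies stand in for cell-to-cell variability.
    omegas = [omega0 + spread * (2 * i / (n_cells - 1) - 1) for i in range(n_cells)]
    phases = [0.0] * n_cells  # all cells start in step
    for _ in range(int(round(total_time / dt))):
        mean_field = sum(cmath.exp(1j * p) for p in phases) / n_cells
        r, psi = abs(mean_field), cmath.phase(mean_field)
        # Each cell is pulled toward the average phase of the population.
        phases = [p + dt * (w + coupling * r * math.sin(psi - p))
                  for p, w in zip(phases, omegas)]
    return abs(sum(cmath.exp(1j * p) for p in phases) / n_cells)


r_coupled = synchrony(coupling=0.2)    # signaling intact
r_uncoupled = synchrony(coupling=0.0)  # signaling absent
print(round(r_coupled, 3), round(r_uncoupled, 3))
```

With coupling, the cells stay nearly in phase over twenty cycles; without it, the
same cells drift apart and the population-level rhythm is lost, which is the
qualitative behavior described for the Notch pathway mutants.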


Intracellular Developmental Programs Can Help Determine the 
Time-Course of a Cell’s Development 


Although signaling between cells plays an essential part in driving the progress of 
development, this does not mean that cells always need signals from other cells to 
prod them into changing their character as development proceeds. Some of these 
changes are intrinsic to the cell (like the ticking of the segmentation clock) and 
depend on intracellular developmental programs that can operate even when the 
cell is removed from its normal environment. 

The best-understood example is in the development of neural precursor cells, 
or neuroblasts, in the embryonic Drosophila central nervous system. These cells, 
as we saw, are initially singled out from the neurogenic ectoderm of the embryo 
by a typical lateral-inhibition mechanism that depends on Notch, and they then 
proceed through an entirely predictable series of asymmetric cell divisions to 
generate ganglion mother cells that divide to form neurons and glial cells (see 
Figure 21-36). The neuroblast changes its internal state as it goes through its set 
program of divisions, generating different cell types with a reproducible sequence 
and timing. These successive changes in neuroblast specification occur through 
the sequential expression of specific transcription regulators. For example, most 
embryonic neuroblasts sequentially express the transcription regulators Hunch- 
back, Krüppel, Pdm, and Cas in a fixed order (Figure 21-40). When a neuroblast 
divides, the set of transcription regulators expressed at that time is inherited by 
the ganglion mother cell and its neural progeny; thus, the differentiated neural 
cells are endowed with different characters according to their time of birth. 

Remarkably, when neuroblasts are taken from an embryo and maintained in 
culture, isolated from their normal surroundings, they step through much the 
same stereotyped developmental program as if they had been left in the embryo. 
Moreover, many of the neuroblast transitions occur even when cell division is 
blocked. The neuroblasts seem to have a built-in timer that determines when each 
of the transcription regulators is expressed, and this timer can continue to run in 
the absence of cell division. The molecular basis of the timing is largely unknown; 
in part, at least, it must depend on the time taken for gene switching, as described 
above; but it may well also depend on slow progressive changes in chromatin 
structure. These too can serve to measure the passage of time in the embryo. 

Figure 21-40 Temporal patterning of neuroblast fate in Drosophila.
Hunchback, Krüppel, Pdm, and Cas are transcription regulators that are
expressed consecutively in the cell lineage of neuroblasts during development of the
Drosophila nervous system. At successive time steps, correlated with cell division,
the neuroblast switches its pattern of gene expression. Each neuroblast division
produces one daughter that remains a neuroblast and expresses the updated set
of genes, and one ganglion mother cell that maintains the expression of this gene set
and differentiates into specific cell types accordingly. (After B.J. Pearson and
C.Q. Doe, Nature 425:624-628, 2003. With permission from Macmillan Publishers.)


Cells Rarely Count Cell Divisions to Time Their Development


Many specialized cells in animals develop from proliferating progenitor cells 
that stop dividing and terminally differentiate after a limited number of cell divi- 
sions. In these cases, differentiation is coordinated with withdrawal from the cell 
cycle, but it is usually not known how the coordination is achieved. It has often 
been suggested that the cell-division cycle might serve as an intracellular timer 
to control the timing of cell differentiation. The cell cycle would be the ticking 
clock that sets the tempo of other developmental processes, with maturational 
changes in gene expression being dependent on cell-cycle progression. Most of 
the evidence, however, indicates that this tempting idea is wrong. Although there 
are examples where cells change their maturation state with each division and 
the change depends on cell division, this is not the general rule. As we just saw for 
neuroblasts in the Drosophila embryo, cells in developing animals often carry on 
with their normal timetable of maturation and differentiation even when cell divi- 
sion is artificially blocked; necessarily, some abnormalities occur, if only because 
a single undivided cell cannot differentiate in two ways at once. But it seems that 
most developing cells can change their state without a requirement for cell divi- 
sion. Developmental control genes can switch the cell-division-cycle machinery 
on or off, and it is the dynamics of these genes, rather than the cell cycle, that sets 
the tempo of development. 


MicroRNAs Often Regulate Developmental Transitions 


Genetic screens are useful for tracking down the genes involved in almost any bio- 
logical process, and they have been used to search for mutations that alter devel- 
opmental timing. Such screens were performed in the nematode Caenorhabdi- 
tis elegans (Figure 21-41). This worm is small, relatively simple, and precisely 
structured. The anatomy of its development is highly predictable and has been 
described in extraordinary detail, so that one can map out the exact lineage of 
every cell in the body and see exactly how the developmental program is altered 
in a mutant. Genetic screens in C. elegans revealed mutations that disrupt devel- 
opmental timing in a particularly striking way: in these so-called heterochronic 
mutants, certain cells in a larva at one stage of development behave as though 
they were in a larva at a different stage of development, or cells in the adult carry 
on dividing as though they belonged to a larva (Figure 21-42). 

Genetic analyses showed that the products of the heterochronic genes act 
in series, forming regulatory cascades. Unexpectedly, two genes at the top of 
their respective cascades, called Lin4 and Let7, were found to code not for pro- 
tein but instead for microRNAs (miRNAs)—short, untranslated, regulatory RNA 
molecules, 21 or 22 nucleotides long. These act by binding to complementary 
sequences in the noncoding regions of mRNA molecules transcribed from other 
heterochronic genes, thereby repressing their translation and promoting their 
degradation, as discussed in Chapter 7. Increasing levels of Lin4 miRNA govern the 
progression from first-stage larva cell behaviors to third-stage larva cell behaviors. 
Increasing levels of Let7 miRNA govern the progression from late larva to adult.
In fact, Lin4 and Let7 were the first miRNAs to be described in any animal: it was 
through developmental genetic studies in C. elegans that the importance of this 
whole class of molecules for gene regulation in animals was discovered. 

More generally, in many animals, miRNAs help regulate the transitions 
between different stages of development. For example, in flies, fish, and frogs, the 
maternal mRNAs that are loaded into the egg in the mother are removed during 
early development when the genome of the embryo begins to be transcribed; at 
this stage, the embryo begins to express specific miRNAs that target many mater- 
nal mRNAs for translational repression and degradation. 

Thus, miRNAs can sharpen developmental transitions by blocking and remov- 
ing mRNAs that define an earlier developmental stage. But how is the timing of 
miRNA expression itself controlled? In the case of the miRNAs that disable mater- 
nal mRNAs in frogs and fish, expression is activated at the end of the series of 
rapid, synchronous divisions that cleave the fertilized egg into many smaller cells. 
As the division rate of these blastomeres slows, widespread transcription of the 
embryo’s genome begins (Figure 21-43). This event, where the embryo’s own 
genome largely takes over control of development from maternal macromole- 
cules, is called the maternal-zygotic transition (MZT), and it occurs with roughly 
similar timing in most animal species, with the exception of mammals. 

Figure 21-42 Heterochronic mutations in the Lin14 gene of C. elegans. Only the
effects on one of the many altered lineages are shown. A loss-of-function (recessive)
mutation in Lin14 causes premature occurrence of the pattern of cell division
and differentiation characteristic of a late larva, so that the animal reaches its final
state prematurely and with an abnormally small number of cells. The gain-of-function
(dominant) mutation has the opposite effect, causing cells to reiterate patterns
of cell divisions characteristic of the first larval stage, continuing through as many as
five or six molt cycles. The cross denotes a programmed cell death. Green lines
represent cells that contain Lin14 protein (which binds to DNA), red lines those that
do not. (Adapted from V. Ambros and H.R. Horvitz, Science 226:409-416, 1984.
With permission from the authors; and P. Arasu, B. Wightman and G. Ruvkun,
Genes Dev. 5:1825-1833, 1991. With permission from the authors.)

Figure 21-43 The maternal-zygotic transition in a zebrafish embryo.
Maternal mRNAs are deposited by the mother into the egg and drive early
development. These mRNAs are degraded during different stages of embryogenesis,
including blastula and gastrula stages, but a relatively abrupt change occurs at the
maternal-zygotic transition (MZT). Before this, the embryonic (zygotic) genome is
transcriptionally inactive; afterward, zygotic genes start to be transcribed. In zebrafish
embryos, the zygotic genome begins to be activated at the 512-cell stage.

One trigger for the MZT appears to be the nuclear-to-cytoplasmic ratio.
During cleavage, the total amount of cytoplasm in the embryo remains constant,
but the number of cell nuclei increases exponentially. As a critical threshold is
reached in the ratio of nuclear DNA to cytoplasm, the cell cycles lengthen and
transcription is initiated. Thus, haploid embryos undergo the MZT one cell cycle later than
diploid embryos, which contain twice as much DNA per cell. According to one
model, the nuclear-to-cytoplasmic ratio might be measured through the titration
of a transcription repressor against the increasing amount of nuclear DNA. The
total amount of repressor would stay constant during cleavage divisions, but the
amount of repressor per genome would decrease, falling by a half with each round
of DNA synthesis, until loss of repression allowed the zygotic genome to become
transcriptionally active. The newly synthesized transcripts include the miRNAs
that recognize many of the transcripts deposited in the egg by the mother,
directing their translational repression and rapid degradation.
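
The titration model lends itself to a small worked example. The sketch below
assumes a fixed pool of repressor that is shared equally among all genome copies
and a threshold below which repression is lost. The function name, the pool size,
and the threshold are hypothetical, chosen so that a diploid embryo switches on
transcription at the 512-cell stage, as in the zebrafish.

```python
def mzt_division(ploidy, total_repressor=1024.0, threshold=1.0):
    """Number of cleavage divisions completed when zygotic transcription switches on.

    ploidy: genome copies per nucleus (1 = haploid, 2 = diploid).
    The repressor pool is constant; the number of genomes doubles at each division.
    """
    divisions = 0
    while total_repressor / (ploidy * 2 ** divisions) > threshold:
        divisions += 1
    return divisions


diploid = mzt_division(ploidy=2)
haploid = mzt_division(ploidy=1)

print(diploid, 2 ** diploid)   # 9 divisions, 512 cells
print(haploid, 2 ** haploid)   # 10 divisions, 1024 cells
```

Because the haploid embryo has half as much DNA per nucleus, it needs one extra
doubling to dilute the repressor to the same level per genome, which reproduces
the one-cycle delay described in the text.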


Hormonal Signals Coordinate the Timing of Developmental 
Transitions 


We have so far emphasized timing mechanisms that operate locally and sepa- 
rately in the different parts of the embryo, or in specific subsystems of the molec- 
ular control machinery. Evolution has tuned each of these largely independent 
processes to run at an appropriate rate, matched to the needs of the organism as 
a whole. For some purposes, however, this is not enough: a global coordinating 
signal is required. This is especially true where changes have to occur throughout 
the body in response to a cue that depends on the environment. For example, 
when an insect or amphibian undergoes metamorphosis—the transition from 
larva to adult—almost every part of the body is transformed. The timing of meta- 
morphosis depends on external factors such as the supply of food, which deter- 
mines when the animal reaches an appropriate size. All the bodily changes have 
to be triggered together at the right time, even though they are occurring in widely 
separated sites. The coordination in such cases is provided by hormones—signal 
molecules that spread throughout the body. 

The metamorphosis of amphibians provides a spectacular example. During 
this developmental transition, amphibians switch from an aquatic to a terres- 
trial life. Larva-specific organs such as gills and tail disappear, and adult-specific 
organs such as legs form. This dramatic transformation is triggered by thyroid 
hormone, produced in the thyroid gland. If the gland is removed or if thyroid hor- 
mone action is blocked, metamorphosis does not occur, although growth contin- 
ues, producing a giant tadpole. Conversely, a dose of thyroid hormone given to a 
tadpole by an experimenter can trigger metamorphosis prematurely. 

The thyroid hormone is distributed through the vascular system and induces 
changes throughout the animal by binding to intracellular nuclear hormone 
receptors, which regulate hundreds of genes. This does not mean, however, that 
target tissues all respond in the same way to the hormone: organs differ not only 
in their levels of thyroid hormone receptors and levels of extracellular proteins 
that locally regulate the amount of active hormone, but also in the sets of genes 
that respond. Thyroid hormone induces muscle in the limbs to grow and muscle 
in the tail to die. The timing of the responses also differs: for example, the legs 
form early in response to a very low concentration of circulating hormone, but it 
requires a high level of the hormone to induce resorption of the tail. 

A surge of thyroid hormone triggers metamorphosis, but how is the timing of 
the surge controlled? One mechanism depends on coupling hormone synthesis 
to the size of the thyroid gland, which reflects the size of the tadpole. Only when 
the gland attains a certain size does it produce enough thyroid hormone to initi- 
ate metamorphosis. However, environmental cues other than nutrition also play a 
part: conditions such as temperature and light are sensed by the nervous system, 
which regulates the secretion of another tier of hormones (neurohormones) that 
stimulate the secretion of thyroid hormone. Thus, tadpole-intrinsic factors such 
as size combine with environmental factors to determine when metamorphosis 
begins. 


Environmental Cues Determine the Time of Flowering 


Another striking example of environmentally controlled developmental timing is 
the flowering of plants. Flowering involves a transformation of the behavior of the 
cells at the growing apex of the plant shoot, the apical meristem. During ordinary
vegetative growth, these cells behave as stem cells, generating a steady succession 
of new leaves and new segments of stalk. In flowering, the meristem cells switch 
to making the components of a flower, with its sepals and petals, its stamens car- 
rying pollen, and its ovary containing the female gametes. 

To time the switch correctly, the plant has to take account of both past and 
present conditions. One important cue, for many plants, is day length. To sense 
this, the plant uses its circadian clock—an endogenous 24-hour rhythm of gene 
expression—to generate a signal for flowering only when there is light for the 
appropriate part of the day. The clock itself is influenced by light, and the plant 
in effect uses the clock to compare past to present lighting conditions. Important 
parts of the genetic circuitry underlying these phenomena have been identified, 
including the phytochromes and cryptochromes that act as light receptors (dis- 
cussed in Chapter 15). The flowering signal that is carried from the leaves to the 
stem cells via the vasculature depends on the product of Flowering locus T (Ft). 

But this signal will trigger flowering only if the plant is in a receptive condi- 
tion from prior long-term cold exposure. Many plants need winter before they 
will flower—a process called vernalization. Cold over a period of weeks or months 
progressively reduces the level of expression of a remarkable gene called
Flowering locus C (Flc). Flc encodes a transcriptional repressor that suppresses
expression of the Ft flowering promoter.

How does vernalization shut down Flc so as to lift the block to flowering? The
effect involves a noncoding RNA called Coolair that overlaps with the Flc gene and
is produced when the temperature is low (Figure 21-44). Together with
cold-induced chromatin modifiers, including Polycomb-group proteins, Coolair
coordinates the switching of Flc chromatin to a silent state (discussed in Chapters 4 and
7). The degree of silencing depends on the length of cold exposure, enabling the
plants to distinguish the odd chilly night from the whole of winter.

The effect on the chromatin is long lasting, persisting through many rounds 
of cell division even as the weather grows warmer. Thus vernalization creates a 
persistent block in production of Flc, enabling the Ft signal to be generated when 
day length is sufficiently long. 
Mutations affecting the regulation of Flc expression alter the time of flowering
and thus the ability of a plant to flourish in a given climate. The whole control
system governing the switch to flowering is thus of vital importance for agriculture,
especially in an era of rapid climate change.

The example of vernalization suggests a general point about the role of chro- 
matin modification in developmental timing. The plant uses changes in chromatin 
to record its experience of prolonged cold. It may be that in other organisms—ani- 
mals as well as plants—slow, progressive changes in chromatin structure pro- 
vide long-term timers for those mysterious developmental processes that unfold 
slowly, over a period of days, weeks, months, or years. Such chromatin timers may 
be among the most important clocks in the embryo, but as yet we understand very 
little about them. 


Summary 


Developmental timing is controlled at many levels. It takes time to switch a gene on 
or off, and this time delay depends on the lifetimes of the molecules involved, which 
can vary widely. Cascades of gene regulation involve cascades of delays. Feedback 
loops can give rise to temporal oscillations in gene expression, and these may serve 
to generate spatially periodic structures. During vertebrate segmentation, for exam- 
ple, expression of the Hes genes oscillates, and one new pair of somites is formed 
during each oscillation cycle. Hes genes encode transcription repressor proteins that 
can act back on expression of the Hes genes themselves. This negative feedback gen- 
erates oscillations with a period that reflects the delay in the autoregulatory gene 
switching loop. The period of oscillation of this “segmentation clock” controls the 
sizes of the somites. Notch signaling between neighboring cells synchronizes their 
oscillations: when Notch signaling fails, the cells drift out of synchrony because of 
genetic noise in their individual clocks, and the segmental organization of the ver- 
tebral column is disrupted. 

Timing does not always depend on cell-cell interactions; many developing ani- 
mal cells have intrinsic developmental programs that play out even in isolated cells 
in culture. Neuroblasts in Drosophila embryos, for example, go through set pro- 
grams of asymmetric divisions, generating different neural cell types at each divi- 
sion with a predictable sequence and timing, through a cascade of gene switching 
events. Studies in both vertebrates and invertebrates show that such programs are 
rarely governed by the timing of cell division and can unfold even when cell division 
is blocked. MicroRNAs produced at critical moments sharpen developmental tran- 
sitions by blocking the translation and promoting the degradation of specific sets 
of mRNAs. Global coordination of developmental timing is achieved by hormones: 
as a tadpole grows, for example, thyroid hormone levels surge and trigger its meta- 
morphosis into a frog. Environmental control of developmental timing is especially 
striking in plants and reveals the presence of molecular timers that act over the long 
term. In vernalization, for example, prolonged cold induces changes in chromatin 
that chart the passage through winter so as to allow flowering only in the spring. 
Slow, progressive changes in chromatin structure are likely to be important timers 
in the long-term programming of development in animals too. 


MORPHOGENESIS 


The specialization of cells into distinct types at specific times is important, but it 
is only one aspect of animal development. Equally important are the movements 
and deformations that cells go through to assemble into tissues and organs with 
specific shapes and sizes. Like developmental timing, this process of morphogen- 
esis (“form generation”) is less well understood than the processes of differential 
gene expression and inductive signaling that lead to cell-type specialization. The 
cell movements can be readily described, but the underlying molecular mecha- 
nisms that coordinate the movements are much harder to decipher. 

In Chapter 19, we saw how cells cohere to form epithelial sheets or surround 
themselves with extracellular matrix to create connective tissues. We also dis- 
cussed how the basic features of tissues, such as the polarity of epithelia, arise 
from the properties of individual cells. In this section, we consider how the
rearrangements of cells during animal development give shape to the embryo and
to all the individual organs and appendages of the body. 

A small number of cell processes are basic to morphogenesis. Individual cells 
can migrate through the embryo along defined tracks. They can crawl over one 
another in a coordinated way to elongate, constrict, or thicken a tissue. They can 
segregate from their neighbors and form physically separate groups. They can 
change their shape so as to deform an epithelial sheet into a tube or a vesicle. 
By stretching out while holding on to their companions, specialized sets of cells 
can form growing tubular networks such as the system of blood or lymph vessels. 
Mass migrations, as occur in gastrulation, can transform the entire topology of 
the embryo. Underlying all these processes are changes in cell shape and changes 
in cell contacts—either with other cells or with extracellular matrix. We begin by 
considering the migration of individual cells. 


Cell Migration Is Guided by Cues in the Cell’s Environment 


The birthplace of cells is often far from their ultimate location in the body. Our 
skeletal muscles, for example, derive from muscle cell precursors, or myoblasts, in 
somites, from which they migrate into the limbs and other regions. The routes that 
the migrant cells follow and the selection of sites that they colonize determine the 
eventual pattern of muscles in the body. The embryonic connective tissues form 
the framework through which the myoblasts travel, and these tissues provide the 
cues that guide myoblast distribution. No matter which somite they come from, 
the myoblasts that migrate into a forelimb bud will form the pattern of muscles 
appropriate to a forelimb, and those that migrate into a hindlimb bud will form 
the pattern appropriate to a hindlimb. It is the connective tissue that provides the 
patterning information. 

As a migrant cell travels through the embryonic tissues, it repeatedly extends 
surface projections that probe its immediate surroundings, testing for cues to 
which it is particularly sensitive by virtue of its specific assortment of cell-surface 
receptor proteins. Inside the cell, these receptors are connected to the cortical 
actin and myosin cytoskeleton, which moves the cell along. Some extracellular 
matrix molecules, such as the protein fibronectin, provide adhesive sites that help 
the cell advance; others, such as chondroitin sulfate proteoglycan, inhibit loco- 
motion and repel immigration. The nonmigrant cells along the migration pathway 
may likewise have inviting or repellent macromolecules on their surface; some 
may even extend filopodia to make their presence known. 

Among the many guiding influences, a few stand out as especially important.
In particular, many types of migrating cells are guided by chemotaxis that
depends on a G-protein-coupled receptor (called CXCR4), which is activated by
an extracellular ligand called CXCL12. Cells expressing this receptor can snuffle
their way along tracks marked out by CXCL12 (Figure 21-45). Chemotaxis toward
sources of CXCL12 plays a major part in guiding the migrations of lymphocytes
and various other white blood cells; of neurons in the developing brain; of
myoblasts entering limb buds; of primordial germ cells as they travel toward the
developing gonads; and of cancer cells when they metastasize.

Figure 21-45 CXCL12 guides migrating germ cells. Zebrafish germ cells migrate to
domains that express CXCL12. As the sites of CXCL12 expression change, cells follow
the CXCL12 track and are guided to the region where the gonad develops at a later
developmental stage. (A) At the 4-somite stage, germ cells move from a position
that is close to the midline to more lateral regions where CXCL12 is expressed. (B) As
the CXCL12 expression retracts, germ cells are guided to more posterior positions.

Detailed studies of primordial-germ-cell migration have shown that CXCL12 
signaling does not induce cell migration per se but rather serves to control its 
direction. In the absence of CXCL12 signaling, germ cells still display the mem- 
brane blebbing associated with cell migration, but the position of the cell front 
where blebs form is randomly chosen (Figure 21-46); if CXCL12 signaling is 
intact, blebbing is more frequent on the side of the cell that faces the source of 
CXCL12, resulting in directional migration. 
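
The distinction between inducing movement and directing it can be made concrete
with a one-dimensional random-walk sketch. It assumes, as a deliberate
simplification, that every bleb moves the cell one step and that the only effect
of CXCL12 is to raise the probability that a bleb forms on the side facing the
source. The function name, the step count, and the bias values are hypothetical.

```python
import random


def net_displacement(bias, steps=1000, seed=0):
    """Net movement toward the CXCL12 source after a series of bleb-driven steps.

    bias: probability that a bleb forms on the source-facing side
          (0.5 = no directional signal; above 0.5 = signaling intact).
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    position = 0
    for _ in range(steps):
        # The cell blebs at every step either way; only the side is biased.
        position += 1 if rng.random() < bias else -1
    return position


with_signal = net_displacement(bias=0.8)
without_signal = net_displacement(bias=0.5)
print(with_signal, without_signal)
```

In both cases the cell takes the same number of steps. Only with the bias does
this activity add up to steady progress toward the source, which is the sense in
which the signal controls direction rather than motility.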


The Distribution of Migrant Cells Depends on Survival Factors 


The final distribution of migrant cells depends not only on the routes they take, 
but also on whether they survive the journey and thrive in the environment they 
find at the journey’s end. Specific sites provide survival factors needed for specific 
types of migrant cells to survive. 

Among the most important sets of migrant cells in the vertebrate embryo are 
those of the neural crest. They arise from the border region between the part of the 
ectoderm that will form epidermis and the part that will form the central nervous 
system. As the neural ectoderm rolls up to form the neural tube, the neural crest 
cells break loose from the epithelial sheet along this border region and set out on 
their long migrations (see Figure 19-8 and Movie 21.5). They settle ultimately in 
many sites and give rise to a surprising diversity of cell types. Some lodge in the 
skin and specialize as pigment cells; others form skeletal tissue in the face.
Still others will differentiate into the neurons and glial cells of the peripheral ner- 
vous system—not only in the sensory ganglia that lie close to the spinal cord, but 
also, following a much longer migration, in the wall of the gut. 

The neural crest cells that give rise to the pigment cells of the skin and those 
that develop into the nerve cells of the gut depend on a secreted peptide called 
endothelin-3, which is produced by tissues along the migration pathways and acts 
as a survival factor for the migrating crest cells. In mutants with a defect in the 
gene for endothelin-3 or its receptor, many of these migrating crest cells die. As 
a result, the mutant individuals have nonpigmented (albino) patches of skin and 
a deficit of nerve cells in the intestine, especially its lower end, the large bowel, 
which becomes abnormally distended for lack of proper neural control—a poten- 
tially lethal condition called megacolon. 


Figure 21-46 Directional migration by local blebbing. Germ cells migrate via 
protrusions that define the leading edge of the cell. The persistence and site 
of the protrusions are biased toward higher levels of CXCL12. Thus, germ cells 
migrate up the CXCL12 gradient. 

Figure 21-47 Effect of mutations in the Kit gene. Both the baby and the mouse 
are heterozygous for a loss-of-function mutation that leaves them with only 
half the normal quantity of Kit gene product. In both cases, pigmentation is 
defective because pigment cells depend on the gene product as a receptor for a 
survival factor. (Courtesy of R.A. Fleischman, from R.A. Fleischman et al., 
Proc. Natl Acad. Sci. USA 88:10885-10889, 1991.) 





Another important survival signal for many types of migratory cells, including 
primordial germ cells, blood cell precursors, and neural-crest-derived pigment 
cells, depends on a receptor tyrosine kinase called Kit. This is expressed on the 
surface of the migrant cells, and a protein ligand, called Steel factor, is produced 
by the cells of the tissue through which the cells migrate and/or in which they 
come to settle. Individuals with mutations in the genes for either of these proteins 
have deficits in pigmentation, blood cells, and germ cells (Figure 21-47). 


Changing Patterns of Cell Adhesion Molecules Force Cells Into 
New Arrangements 


Patterns of gene expression govern embryonic cell movements in many ways. 
They regulate cell motility, cell shape, and the production of proteins that guide 
migration. Importantly, they also determine the sets of adhesion molecules that 
the cells display on their surface. Through changes in its surface molecules, a cell 
can break old attachments and make new ones. Cells in one region may develop 
surface properties that make them cohere with one another and become segre- 
gated from a neighboring group of cells with different surface chemistry. 

Experiments done half a century ago on early amphibian embryos showed 
that the effects of selective cell-cell adhesion can be so powerful that they can 
bring about an approximate reconstruction of the normal structure of an early 
postgastrulation embryo after the cells have been artificially dissociated and 
mixed up. When these cells are reaggregated into a random mixture, the cells 
spontaneously sort themselves out according to their original germ-layer origins 
(Figure 21-48). As discussed in Chapter 19, cadherin proteins have a central role 
in the sorting process (see Figure 19-9). Cadherins belong to a large and varied 
family of Ca2+-dependent cell-cell adhesion proteins, and they and other cell-cell 
adhesion proteins are differentially expressed in the various tissues of the early 
embryo. Antibodies against these proteins interfere with the normal selective 
adhesion between cells of a similar type. 

Changes in the patterns of expression of the various cadherins correlate closely 
with the changing patterns of association among cells during various 
developmental processes, including gastrulation, neural tube formation, and 
somite formation. These cell rearrangements are likely to be regulated and 
driven in part by the cadherin pattern. In particular, cadherins appear to have 
a major role in controlling the formation and dissolution of epithelial sheets 
and clusters of cells (see Movie 19.1). They not only glue one cell to another 
but also provide anchorage for intracellular actin filaments at the sites of 
cell-cell adhesion. In this way, the pattern of stresses and movements in the 
developing tissue is regulated according to the pattern of cell adhesions. 

Figure 21-48 Sorting out by adhesion. Cells from different parts of an early 
amphibian embryo will sort out according to their origins. In the classical 
experiment shown here, mesoderm cells (green), neural plate cells (blue), and 
epidermal cells (red) have been disaggregated and then reaggregated in a 
random mixture. They sort out into an arrangement reminiscent of a normal 
embryo, with a “neural tube” internally, epidermis externally, and mesoderm in 
between. (Modified from P.L. Townes and J. Holtfreter, J. Exp. Zool. 
128:53-120, 1955. With permission from Wiley-Liss.) 


Repulsive Interactions Help Maintain Tissue Boundaries 


The different types of cadherins enable different types of cells to cohere selec- 
tively: cells expressing one type of cadherin will maximize their contact with cells 
expressing the same cadherin and thereby segregate from other cells, creating 
specific tissue boundaries. Cell mixing can be inhibited and boundaries created 
and maintained in another way as well: cells of different types can sometimes 
actively repel one another. The bidirectional activation of Eph receptors and eph- 
rins discussed in Chapter 15 often mediates such repulsion, acting at interfaces 
between different groups of cells to keep the groups from mixing, and repelling 
invasion by inappropriate visitors. Ephrin-Eph signaling operates, for example, 
at the boundaries of the rhombomeres discussed earlier. Neighboring rhombo- 
meres express complementary combinations of ephrins and Eph receptors, and 
this keeps the cells in adjacent rhombomeres strictly segregated, with a boundary 
between them that is sharply defined (Figure 21-49). 


Groups of Similar Cells Can Perform Dramatic Collective 
Rearrangements 


Cadherin-mediated cell sorting and ephrin-Eph-mediated repulsion exemplify 
how differences in cell-surface properties can drive tissue arrangements, causing 
cells that express different sets of genes to separate from one another. However, 
groups of cells that are all similar can also undergo dramatic rearrangements. 
During frog gastrulation, for example, cells in one region of the surface epithe- 
lium invaginate and migrate as a sheet into the interior of the embryo and con- 
verge toward the embryonic midline. The movement is driven mainly by an active 
rearrangement of the migrating cells, called convergent extension. Here the cells 
crawl over one another in a coordinated way, displacing their neighbors as they 
migrate, causing the cell sheet to narrow along one axis (converge) and elongate 
along another (extend). Strikingly, small, square fragments of tissue from the 
appropriate region of the embryo, isolated in culture, will spontaneously narrow 
and elongate, just as they would in the embryo (Figure 21-50). The alignment 
of the cell movements depends on the same signaling pathway that is involved 
in generating planar cell polarity within developing epithelia, as we discuss next. 

Figure 21-49 Sorting out by repulsion. Ephrin-Eph signaling in hindbrain 
segmentation in a chick embryo. Each pair of rhombomeres (segments in the 
hindbrain) is associated with a branchial arch (a modified gill rudiment) to 
which it sends innervation. Rhombomeres are distinguished from one another by 
expression of different Hox genes (see Figure 21-32). Mutual repulsion (red 
bars) between cells that express EphrinB2 in rhombomere 4 and EphA4 in 
rhombomere 5 creates a sharp boundary. 


Planar Cell Polarity Helps Orient Cell Structure and Movement in 
Developing Epithelia 


Cells within an epithelium always have an apical-basal polarity (discussed in 
Chapter 19), but the cells of many epithelia show an additional polarity at right 
angles to this axis: the cells are all arranged as if they had an arrow written on 
them, pointing in a specific direction in the plane of the epithelium. This type of 
polarity is called planar cell polarity. In the wing of a fly, for example, each epi- 
thelial cell has a tiny asymmetrical projection, called a wing hair, on its surface, 
and the hairs all point toward the tip of the wing. Similarly, in the inner ear of a 
vertebrate, each mechanosensory hair cell has a precisely oriented asymmetric 
bundle of actin-filled, rodlike protrusions called stereocilia sticking up from its 
apical plasma membrane as a detector of sound and of forces such as gravity. Tilt- 
ing the bundle in one direction causes ion channels in the membrane to open, 
electrically activating the cell; tilting in the opposite direction has the opposite 
effect. For the ear to function properly, the hair cells must be oriented correctly. 
Planar cell polarity is also important in the respiratory tract, where every ciliated 
cell must orient the beating of its cilia so as to sweep mucus upward, away from 
the lungs. 

Screens for mutants with misoriented wing hairs in Drosophila have identified 
a set of genes that is critical for planar cell polarity. Some of these genes 
code for components of the Wnt signaling pathway, others code for specialized 
members of the cadherin superfamily, while the functions of others are 
uncertain. These components of planar-cell-polarity signaling are assembled at 
cell-cell junctions in the epithelium in such a way as to exert a polarizing 
influence that can propagate from cell to cell. Essentially the same system of 
proteins controls planar cell polarity in vertebrates; mice deficient in 
homologs of the Drosophila planar polarity genes have a variety of defects, 
including incorrectly oriented hair cells in the inner ear, making them deaf 
(Figure 21-51). 

Figure 21-50 Convergent extension and its cellular basis. (A) Schematic 
diagram of cell behaviors that underlie convergent extension. The cells form 
lamellipodia, with which they attempt to crawl over one another. Alignment of 
the lamellipodial movements along a common axis leads to convergent extension. 
The process depends on the Wnt-Frizzled/planar-cell-polarity signaling pathway 
and is cooperative, presumably because cells that are already aligned exert 
forces that tend to align their neighbors in the same way. (B-G) The pattern of 
convergent extension of dorsal mesoderm during zebrafish gastrulation at 8.8 
(B, E), 9.3 (C, F), and 11.3 (D, G) hours after fertilization. Cells that will 
give rise to the notochord are labeled in green, and cells that will give rise 
to somites and muscle are labeled in blue. The notochord and somite domains are 
spatially separate from the start of the recording (B, E), but their boundaries 
are at first barely visible and only a little later become obvious. Convergence 
narrows the notochord domain to a width of about two cells at the last time 
point (D, G). (A, after J. Shih and R. Keller, Development 116:901-914, 1992; 
B-G, after N.S. Glickman et al., Development 130:873-887, 2003. With permission 
from The Company of Biologists.) 

Figure 21-51 Planar cell polarity. (A) Wing hairs on the wing of a fly. Each 
cell in the wing epithelium forms a small, spiky protrusion or “hair” at its 
apex, and all the hairs point the same way, toward the tip of the wing. This 
reflects a planar polarity in the structure of each cell. (B) Sensory hair 
cells in the inner ear of a mouse similarly have a well-defined planar 
polarity, manifest in the oriented pattern of stereocilia (actin-filled 
protrusions) on their surface. The detection of sound depends on the correct, 
coordinated orientation of the hair cells. (C) A mutation in the gene Flamingo 
in the fly, coding for a nonclassical cadherin, disrupts the pattern of planar 
cell polarity in the wing. (D) A mutation in a homologous Flamingo gene in the 
mouse randomizes the orientation of the planar cell polarity vector of the hair 
cells in the ear. The mutant mice are deaf. (A and C, from J. Chae et al., 
Development 126:5421-5429, 1999. With permission from The Company of 
Biologists; B and D, from J.A. Curtin et al., Curr. Biol. 13:1129-1138, 2003. 
With permission from Elsevier.) 


Interactions Between an Epithelium and Mesenchyme Generate 
Branching Tubular Structures 


Animals require specialized types of epithelial surfaces for many functions, 
including excretion, absorption of nutrients, and gas exchange. Where large sur- 
faces are required, they are often organized as branching tubular structures. The 
lung is an example. It originates from epithelial buds that grow out from the floor 
of the foregut and invade neighboring mesenchyme to form the bronchial tree, a 
system of tubes that branch repeatedly as they extend. Endothelial cells that form 
the lining of blood vessels invade the same mesenchyme, thereby creating a sys- 
tem of closely apposed airways and blood vessels, as required for gas exchange in 
the lung (Figure 21-52). This whole process of branching morphogenesis depends 
on signals that pass in both directions between the growing epithelial buds and 
the mesenchyme. Genetic studies in mice indicate that FGF proteins and their 
receptor tyrosine kinases play a central part in these signaling processes. FGF sig- 
naling has various roles in development, but it is especially important in the many 
interactions that occur between a developing epithelium and mesenchyme. 

In the case of lung development, FGF10 is expressed in clusters of mesenchyme 
cells that lie near the tips of the growing epithelial tubes, and its receptor 
is expressed in the invading epithelial cells. In FGF10-deficient mutant mice, 
a primary bud of lung epithelium is formed but fails to grow out into the 
mesenchyme to create a branching bronchial tree. Conversely, a microscopic bead 
soaked in FGF10 and placed near embryonic lung epithelium in culture will 
induce a bud to form and grow out from the epithelium toward the bead. 
Evidently, the epithelium invades the mesenchyme only by invitation, in 
response to FGF10. 

But what makes the growing epithelial tubes of the lung branch repeatedly 
as they invade the mesenchyme? This depends on a Sonic hedgehog signal that 
is sent in the opposite direction, from the epithelial cells at the tips of the 
buds back to the mesenchyme, as shown in Figure 21-53. In mice lacking Sonic 
hedgehog, the lung epithelium grows and differentiates, but it forms a sac 
instead of a branching tree of tubules. 

Figure 21-52 The airways of the lung, shown in a cast of the adult human 
bronchial tree. Resins of different colors have been injected into different 
branches of the tree of airways. (From R. Warwick and P.L. Williams, Gray’s 
Anatomy, 35th ed. Edinburgh: Longman, 1973.) 

Figure 21-53 Branching morphogenesis of the lung. How FGF10 and Sonic hedgehog 
are thought to induce the growth and branching of the buds of the bronchial 
tree. Many other signal molecules, such as BMP4, are also expressed in this 
system, and the suggested branching mechanism is only one of several 
possibilities. As indicated, FGF10 protein is expressed in clusters of 
mesenchyme cells near the tips of the growing epithelial tubes, and its 
receptor is expressed in the invading epithelial cells. A Sonic hedgehog signal 
is sent in the opposite direction, from the epithelial cells at the tips of the 
buds back to the mesenchyme. The patterns of gene expression and their timing 
suggest that the Sonic hedgehog signal may serve to shut off FGF10 expression 
in the mesenchyme cells closest to the growing tip of a bud, splitting the 
FGF10-secreting cluster into two separate clusters, which in turn cause the bud 
to branch into two. 

FGF signaling acts in a remarkably similar way in the formation of the air-ex- 
change system of insects, which consists of a pattern of fine, air-filled channels 
called tracheae and tracheoles. These originate from the epidermis covering the 
surface of the body and extend inward to invade the underlying tissues, branch- 
ing and narrowing as they go (Figure 21-54). The FGF acts on cells at the tips of 
the advancing tracheae, causing them to extend filopodia and migrate toward the 
source of the FGF signal. Because the tip cells remain connected to the remain- 
der of the tracheal epithelium, the pulling force that they generate elongates the 
tracheal tube. 

Initially, the pattern of FGF production in fly embryos is defined by the D-V 
and A-P patterning systems discussed earlier. In later stages of development, how- 
ever, FGF expression is induced by transcription regulators called hypoxia-induc- 
ible factors (HIFs) that are activated by hypoxia (low oxygen levels). In this way, 
hypoxia stimulates the formation of finer and finer and more extensively branched 
trachea, until the oxygen supply is sufficient to stop the process. Hypoxia and HIFs 
have similar roles in vertebrates, especially in the development of blood vessels, 
as we shall see in the next chapter. 

Figure 21-54 Branching morphogenesis of airways in a fly. (A) Drosophila 
embryonic tracheal system. (B) FGF (produced in Drosophila by the Branchless 
gene) signals from surrounding cells to the tracheal epithelium and activates 
its FGF receptors, leading to filopodia formation and tube elongation. [A, from 
G. Manning and M.A. Krasnow, in The Development of Drosophila (A. Martinez-Arias 
and M. Bate, eds), Vol. 1, pp. 609-685. New York: Cold Spring Harbor Laboratory 
Press, 1993.] 


An Epithelium Can Bend During Development to Form a Tube or 
Vesicle 


The creation of systems of tubes such as blood vessels and airways is a complex 
process, and it can involve various additional forms of cell behavior, as 
sketched in Figure 21-55. 

As explained in Chapter 19, the process that converts an epithelial sheet into 
a tube depends on contraction of specific bundles of actin filaments. With the 
help of myosin motor proteins, actin filament bundles can shorten, causing the 
epithelial cells to narrow at their apex. These actin bundles are connected 
from cell to cell by adherens junctions, and if their contraction is 
coordinated along a specific axis, the result will be that the sheet bends and 
rolls up into a tube (Figure 21-56). The vertebrate neural tube, which we 
discuss in the last section of this chapter, originates in this way. 

Figure 21-55 The forms of cell behavior involved in tube formation. Folding 
generates the neural tube, budding underlies the formation of lungs and 
trachea, cord hollowing occurs during the formation of mammalian salivary 
glands, cell hollowing is involved in the formation of tracheal terminal cell 
tubes, and cell assembly generates the heart tube that forms at the earliest 
stage of heart development. 

Figure 21-56 Bending of an epithelial sheet to form a tube. Contraction of 
apical bundles of actin filaments linked from cell to cell via adherens 
junctions causes the epithelial cells to narrow at their apex. Depending on 
whether the contraction is oriented along one axis of the sheet or is equal in 
all directions, the epithelium will either roll up into a tube or invaginate to 
form a vesicle. (A) Diagram showing how an apical contraction along one axis of 
an epithelial sheet can cause the sheet to form a tube. (B) Scanning electron 
micrograph of a cross section through the trunk of a two-day chick embryo, 
showing the formation of the neural tube by the process diagrammed in (A). 
(B, courtesy of Jean-Paul Revel.) 

Summary 


Animal development involves dramatic cell movements, including the guided 
migration of individual cells, the adhesion and repulsion of groups of cells, and the 
complex extension, branching, or rolling up of epithelial tissues. Migrant cells, such 
as those of the neural crest, break loose from their original neighbors and travel 
through the embryo to colonize new sites. Many migrant cells, including primor- 
dial germ cells, are guided by chemotaxis dependent on the receptor CXCR4 and 
its ligand CXCL12. In general, cells that have similar adhesion molecules on their 
surfaces cohere and tend to segregate from other cell groups with different surface 
properties. Selective cell-cell adhesion is often mediated by cadherins; repulsion is 
often driven by ephrin-Eph signaling. Within an epithelial sheet, cells can rearrange 
themselves to drive epithelial convergence and extension, as in gastrulation. Many 
movements are coordinated through a Wnt-dependent planar-polarity signaling 
pathway that is also responsible for orienting cells correctly in various types of epi- 
thelium. Elaborate branched tubular structures, such as the airways of the lung, 
are generated through bidirectional signaling between an epithelial bud and the 
mesenchyme that it invades, in a process called branching morphogenesis. Epithe- 
lial tubes and vesicles can originate in various ways, most simply by the rolling up 
and pinching off of a segment of epithelium, as in the formation of the neural tube. 

One of the most fundamental aspects of animal development is one we know 
surprisingly little about: how the size of an animal or an organ is determined. 
Why, for example, do we grow to be so much larger than a mouse? Even within a 
species, size can vary greatly; a Great Dane, for instance, can weigh over 40 times 
more than a Chihuahua (Figure 21-57). 

Three variables define the size of an organ or organism: the number of cells, 
the size of the cells, and the quantity of extracellular material per cell. Size differ- 
ences can arise from changes in any of these factors (Figure 21-58). If we compare 
a mouse with a human, for example, we find that the difference lies chiefly in the 
number of cells, there being roughly 3000 times more cells in a human, corre- 
sponding to a body that is roughly 3000 times more massive. Wild and cultivated 
species of food plants, on the other hand, often differ in body size chiefly because 
of differences of cell size. 
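
The three variables can be combined into a rough back-of-the-envelope relation. The sketch below is illustrative only: the function name `organ_mass` and the simplification that total mass equals cell number times (mass per cell plus extracellular material per cell) are assumptions made for this example, and all numbers except the 3000-fold ratio are arbitrary placeholders rather than measurements.

```python
def organ_mass(cell_number, cell_mass, matrix_per_cell):
    """Toy model: total mass = cells x (mass per cell + extracellular material per cell)."""
    return cell_number * (cell_mass + matrix_per_cell)

# Arbitrary reference values in hypothetical units.
mouse = organ_mass(cell_number=1.0, cell_mass=1.0, matrix_per_cell=0.5)

# Same cell size and matrix per cell, but roughly 3000 times more cells (from the text).
human = organ_mass(cell_number=3000.0, cell_mass=1.0, matrix_per_cell=0.5)
mass_ratio = human / mouse

# A different route to larger size: same cell number, larger cells.
big_celled = organ_mass(cell_number=1.0, cell_mass=2.0, matrix_per_cell=0.5)
```

With cell size and matrix held constant, the mass ratio simply equals the ratio of cell numbers, which is the mouse-versus-human case; changing `cell_mass` instead corresponds to the wild-versus-cultivated plant case.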

The challenge, therefore, is to understand how cell numbers, cell size, and 
extracellular matrix production are regulated. First of all, we need to identify the 
signals that drive or inhibit growth. Then we need to discover how the signals 
themselves are regulated. In many cases, the size of an organ or of the body as a 
whole seems to be controlled homeostatically, so that the correct size is reached 
and maintained even in the face of drastic disturbances. This suggests that the 
developing structure somehow senses its own size and uses this information to 
regulate the signals for its own growth or shrinkage. In most cases, the nature of 
this feedback control remains a profound mystery. 

In other cases, the duration of growth and the final size seem to be dictated 
by intracellular programs that take no cognizance of the size the structure has 
attained. These intracellular programs, too, present many mysteries, as we saw in 
our discussion of developmental timing. Very often, it seems, the sizes and pro- 
portions of body parts must depend on combinations of size-measuring feedback 
controls and intracellular programs, as well as on environmental influences such 
as nutrition. 

The variation in control strategies is nicely illustrated by some classic trans- 
plantation experiments. If several fetal thymus glands are transplanted into a 
developing mouse, each grows to its characteristic adult size. In contrast, if multi- 
ple fetal spleens are transplanted, each ends up smaller than normal, but collec- 
tively they grow to the size of one adult spleen. Thus, thymus growth is regulated 
by local mechanisms intrinsic to the individual organ, whereas spleen growth is 
controlled by a feedback mechanism that senses the quantity of spleen tissue in 
the body as a whole. In neither case is the mechanism known. 
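
The two outcomes can be contrasted in a few lines of code. This is a minimal sketch of the logic only, not of any known mechanism: the function names are hypothetical, sizes are in arbitrary units, and the assumption that transplanted spleens share the total equally goes beyond what the text states (it says only that each is smaller than normal and that together they reach the size of one adult spleen).

```python
ADULT_ORGAN_SIZE = 1.0  # arbitrary units

def intrinsic_control(n_grafts):
    """Thymus-like: each graft grows to full adult size, regardless of the others."""
    return [ADULT_ORGAN_SIZE] * n_grafts

def systemic_feedback(n_grafts):
    """Spleen-like: grafts collectively reach the size of one adult organ.

    Equal sharing between grafts is an assumption made for illustration.
    """
    return [ADULT_ORGAN_SIZE / n_grafts] * n_grafts

thymus_grafts = intrinsic_control(4)
spleen_grafts = systemic_feedback(4)
total_thymus = sum(thymus_grafts)
total_spleen = sum(spleen_grafts)
```

Under intrinsic control the total tissue mass scales with the number of grafts, whereas under systemic feedback the total stays fixed and the individual grafts shrink.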

Figure 21-57 Members of the same species can have dramatically different 
sizes. The Chihuahua weighs 2-5 kilograms, whereas a Great Dane weighs 
45-90 kilograms. (Courtesy of Deanne Fitzmaurice.) 


The Proliferation, Death, and Size of Cells Determine 
Organism Size 


The nematode worm C. elegans illustrates the different ways in which size differ- 
ences can arise. This creature follows an astonishingly precise and predictable 
developmental program. Each individual of a given sex is generated by almost 
exactly the same sequences of cell divisions and cell deaths, and consequently 
has precisely the same number of somatic cells—959 in the adult hermaphrodite 
(the sex of the majority of these animals)—although the number of germ cells is 
more variable from worm to worm. The stereotyped development makes it pos- 
sible to trace somatic cell lineages in exhaustive detail. More than 1000 cell divi- 
sions generate 1090 somatic cells during hermaphrodite development, but 131 of 
these cells undergo apoptotic cell death. Thus, precise regulation of both cell divi- 
sion and cell death determines the final numbers of somatic cells in the worm. In 
fact, genetic screens in C. elegans identified the first genes responsible for apop- 
tosis and its regulation—thereby revolutionizing our molecular understanding of 
this form of programmed cell death (discussed in Chapter 18). 
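
The cell numbers quoted above are mutually consistent, as a short calculation shows. The two constants come from the text; the variable and function names are hypothetical, and the division count assumes, as a simplification, that all somatic cells descend from a single founder cell and that each division adds exactly one cell.

```python
CELLS_GENERATED = 1090  # somatic cells produced during hermaphrodite development (from the text)
CELLS_DYING = 131       # cells eliminated by apoptosis (from the text)

def surviving_cells(generated, dying):
    """Cells remaining after programmed cell death."""
    return generated - dying

adult_somatic_cells = surviving_cells(CELLS_GENERATED, CELLS_DYING)
fraction_dying = CELLS_DYING / CELLS_GENERATED

# Simplifying assumption: one founder cell, and each division adds one cell.
divisions_needed = CELLS_GENERATED - 1
```

The difference reproduces the 959 somatic cells of the adult hermaphrodite, about 12% of the cells generated are eliminated, and the implied number of divisions agrees with the statement that more than 1000 divisions are involved.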

The final number of somatic cells in the adult worm is already present at sex- 
ual maturity (around three days after fertilization), after which no more somatic 
cells are generated. Yet the worm continues to grow, doubling in size between 
sexual maturity and death 2-3 weeks later. This doubling results from somatic cell 
growth: although the cells no longer divide, they continue to go through rounds of 
DNA synthesis; this endoreplication of the genome makes the cells polyploid. As in 
all organisms, the size of a cell is proportional to its ploidy—that is, the number of 
genome copies that it contains: a doubling of ploidy roughly doubles cell volume. 
By artificial manipulation of somatic cell ploidy, and thereby somatic cell size, the 
size of the worm as a whole can be increased or decreased. Thus the worm’s final 
size is set by a combination of programmed cell divisions and cell deaths, along 
with regulation of the sizes of individual cells through changes in ploidy. 

In plants, as in animals, cell size increases as ploidy increases (Figure 21-59). 
This effect has been exploited in the agricultural breeding of plants for large size: 
most of the major fruits and vegetables that we consume are polyploid. 


Animals and Organs Can Assess and Regulate Total Cell Mass 


The size of an animal or organ depends on both cell number and cell size—that 
is, on total cell mass. Remarkably, many animals and organs can somehow assess 
their total cell mass and regulate it, providing evidence for feedback controls of 
the sort highlighted earlier in our introductory account of general principles of 
growth control. In contrast with C. elegans, if cell size is artificially increased or 
decreased in these cases, cell numbers adjust to maintain a normal total cell mass. 
This has been beautifully illustrated by experiments done long ago in salamanders, 
where cell size can be manipulated by altering the animal’s ploidy. As shown in 
Figure 21-59E, salamanders of different ploidies end up being the same size with 
very different numbers of cells. The individual cells in a pentaploid salamander, 
for example, are about five times the size of those in a haploid salamander, but 
there are only one-fifth as many cells. This scaling operates not only in the body as 
a whole, but in its individual organs. 

Figure 21-58 Determinants of organ size. 

Figure 21-59 Effects of ploidy on cell size and organ size. In all organisms, 
from bacteria to humans, cell size is proportional to ploidy (the number of 
copies of the genome per cell). This is illustrated for (A-D) Arabidopsis 
flowers and (E) for salamanders. In each case, the upper panels show cells in a 
specific tissue [a petal for Arabidopsis, a pronephric (kidney) tubule for the 
salamander]; the lower panels show the gross anatomy: flowers for Arabidopsis, 
the whole body for the salamander. In the case of Arabidopsis flowers, increase 
in cell size increases organ size. By contrast, the salamander and its 
individual organs attain their normal standard size regardless of ploidy, 
because large cell size is compensated for by fewer cells. This indicates that 
the size of an organism or organ in this species is not controlled simply by 
counting cell divisions or cell numbers; size must somehow be regulated at the 
level of total cell mass. [A-D, from C. Breuer et al., Plant Cell 19:3655-3668, 
2007. With permission from the American Society of Plant Biologists; E, adapted 
from G. Fankhauser, in Analysis of Development (B.H. Willier, P.A. Weiss and 
V. Hamburger, eds), pp. 126-150. Philadelphia: Saunders, 1955.] 
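
The compensation seen in salamanders follows directly if total cell mass is held fixed while cell size scales with ploidy. The sketch below assumes a strictly linear relation between ploidy and cell mass and uses an arbitrary target mass; the function name and all parameter values are hypothetical.

```python
def cells_for_constant_mass(target_mass, ploidy, haploid_cell_mass=1.0):
    """Cell number needed to reach target_mass if cell mass scales linearly with ploidy."""
    cell_mass = haploid_cell_mass * ploidy
    return target_mass / cell_mass

TARGET_MASS = 1_000_000.0  # hypothetical total cell mass, arbitrary units

haploid_cells = cells_for_constant_mass(TARGET_MASS, ploidy=1)
pentaploid_cells = cells_for_constant_mass(TARGET_MASS, ploidy=5)
cell_number_ratio = pentaploid_cells / haploid_cells
```

Under these assumptions the pentaploid animal needs one-fifth as many cells as the haploid one to reach the same total mass, matching the observation described in the text.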

The imaginal discs of Drosophila provide another striking example of homeo- 
static size control. These are epithelial pouches that grow by cell proliferation 
during the larval period and, during the pupal stage, form the organs and extrem- 
ities of the adult fly (Figure 21-60). Experiments have been chiefly done on the 
wing imaginal disc. Mutations in components of the cell-cycle control machinery 
can be used to speed up or slow down the rate of cell division in the disc. 
Remarkably, such mutations can result in an excessive number of abnormally small 
cells or a reduced number of abnormally large cells, respectively, leaving the 
size (area) and patterning of the adult wing practically unchanged. Thus, the 
size of the disc is not regulated so as to contain a set number of cells. 
Instead, there must be a regulatory mechanism that halts growth when the disc’s 
total cell mass reaches the appropriate value, so that the size and pattern of 
the adult wing that develops from the disc are normal. Remarkably, developing 
discs (or even disc fragments) taken out of their normal context and 
transplanted into the abdomen of an adult female will grow until they reach 
their normal size. Clearly, the mechanisms that regulate disc size are 
intrinsic to the disc. 

Figure 21-60 The imaginal discs in the Drosophila larva (below) and the 
structures in the adult (above) that they give rise to. Discs labeled in the 
figure: labial, clypeolabrum, dorsal prothorax, eye + antenna, leg, wing + 
dorsal thorax, haltere, genital. [After J.W. Fristrom et al., in Problems in 
Biology: RNA in Development (E.W. Hanley, ed.), p. 382. Salt Lake City: 
University of Utah Press, 1969.] 

Figure 21-61 Pituitary dwarf and pituitary giant. The “giant” on the right is 
Robert Wadlow (1918-1940), the tallest recorded man at 8 feet 11 inches 
(2.72 m), together with his father, who was almost 6 feet tall (1.82 m). The 
dwarf on the left is General Tom Thumb, which was the stage name of Charles 
Sherwood Stratton (1838-1883). On his 18th birthday, he was measured at 2 feet 
8.5 inches (82.6 cm) tall, and at his death, he was 3 feet 4 inches (102 cm). 
(Images from http://en.wikipedia.org/wiki/File:Robert_Wadlow.jpg. 
© Bettmann/CORBIS.) 

We still have very little idea how organisms or organs assess their total cell 
mass or monitor their own growth. Nevertheless, we are beginning to understand 
some of the signal molecules that drive or halt growth in response to the mysteri- 
ous cues that convey information about the size attained. 


Extracellular Signals Stimulate or Inhibit Growth 


We have already seen how some signals act systemically as hormones to regulate 
the development of the animal as a whole. Some of these serve to regulate growth. 
In mammals, for instance, growth hormone (GH) is secreted by the pituitary 
gland into the bloodstream and stimulates growth throughout the body: excessive 
production of growth hormone leads to gigantism, and too little leads to dwarfism 
(Figure 21-61). Pituitary dwarfs have bodies and organs that are proportionately 
small, unlike achondroplastic dwarfs, for example, whose limbs are dispropor- 
tionately short, usually because of a mutation in a gene encoding an FGF receptor 
that disrupts normal cartilage development (Figure 21-62). 

Growth hormone stimulates growth largely by inducing the liver and other 
organs to produce insulin-like growth factor 1 (IGF1), which acts mainly as a local 
signal within many tissues to increase cell survival, cell growth, cell proliferation, 
or some combination of these, depending on the cell type. Large breeds of dogs 
such as Great Danes owe their great size to high levels of IGF1, while miniature 
breeds such as Chihuahuas have low levels (see Figure 21-57). 

Not all growth-regulating extracellular signals stimulate growth; some inhibit 
it, by promoting cell death or inhibiting cell growth, cell division, or both. Myostatin
is a TGFβ family member that specifically inhibits the growth and proliferation
of myoblasts—the precursor cells that fuse to form the huge, multinucleated cells 
of skeletal muscle. When the Myostatin gene is deleted in mice, muscles grow to 
be several times larger than normal. Remarkably, two breeds of cattle that were 
bred for large muscles have both turned out to have mutations in the Myostatin 
gene; whippet dogs mutant for Myostatin develop similarly (Figure 21-63). 


Figure 21-62 Achondroplasia. This type of dwarfism occurs in one of 
10,000-100,000 births; in more than 99% of cases it results from a mutation 
at an identical site in the genome, corresponding to amino acid 380 in the 
FGF receptor FGFR3 (a glycine in the transmembrane domain). The mutation
is dominant, and almost all cases are due to new, independently occurring 
mutations, implying an extraordinarily high mutation rate at this particular site 
in the genome. The defect in FGF signaling causes dwarfism by interfering 
with the growth of cartilage in developing long bones. (From Velasquez’s 
painting of Sebastian de Morra. © Museo del Prado, Madrid.)

Figure 21-63 Myostatin limits muscle growth. A wild-type whippet dog and a bully whippet
that lacks myostatin. (A, from http://www.merlinanimalrescue.co.uk/dogs/?m=201211; B, from
http://animalslook.com/schwarzenegger-dog/.)

Figure 21-64 Hippo pathway. Hippo, a protein kinase, limits growth by
phosphorylation and activation of the kinase Warts, which in turn phosphorylates and
inactivates the transcriptional coactivator Yorkie (called Yap in vertebrates). When
unphosphorylated, Yorkie/Yap drives tissue growth: it activates the transcription of
the growth-promoting gene Myc, the cell-cycle progression gene Cyclin E, the
anti-apoptotic gene Diap, and the microRNA Bantam. Hippo-induced phosphorylation of
Yorkie/Yap blocks this effect.


Like TGFβ itself, myostatin acts through the Smad intracellular signaling pathway
(see Figure 15-57) to inhibit muscle growth specifically. Another intracellular
signaling pathway, called the Hippo pathway, inhibits organ and organism growth
more generally. It was discovered in Drosophila, but it operates in vertebrates as
well. It inhibits growth both by promoting cell death (by blocking an apoptosis
inhibitor) and by inhibiting cell-cycle progression (by inhibiting the expression
of the cell-cycle gene Cyclin E). Some components of the pathway in Drosophila
are shown in Figure 21-64. The organs of animals that are abnormally resistant to
Hippo repression can grow to a monstrous size (Figure 21-65).
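
The chain of inhibitory steps in the Hippo pathway is easy to misread, so it may help to write it out as simple Boolean logic. The sketch below is a deliberate oversimplification: it treats each component as fully on or off, the function and variable names are hypothetical, and the target list is taken from the caption of Figure 21-64.

```python
# Targets of Yorkie/Yap as listed in the caption of Figure 21-64.
YORKIE_TARGETS = ["Myc", "Cyclin E", "Diap", "Bantam"]


def hippo_pathway(hippo_active):
    """Return which Yorkie/Yap target genes are transcribed (True) or silent (False)."""
    warts_active = hippo_active                # Hippo phosphorylates and activates Warts
    yorkie_phosphorylated = warts_active       # Warts phosphorylates Yorkie/Yap
    yorkie_active = not yorkie_phosphorylated  # phosphorylation inactivates Yorkie/Yap
    return {gene: yorkie_active for gene in YORKIE_TARGETS}


if __name__ == "__main__":
    print("Hippo on: ", hippo_pathway(True))
    print("Hippo off:", hippo_pathway(False))
```

With Hippo active, all four growth-promoting targets are off; with Hippo inactive (or with Yorkie/Yap resistant to it), they are all on, which corresponds to the overgrowth phenotype.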

It is important to note that in all species nutritional conditions also play a fun- 
damental part in regulating the pace and extent of growth, and in animals they do 
so through hormonal signal networks that are highly conserved between verte- 
brates and invertebrates. Although we do not have space for details here, genetic 
experiments, especially in Drosophila, have begun to unravel the logic of these 
controls, and to indicate how they may operate alongside other machinery, such 
as the Hippo pathway, to determine final size. 


Summary 


The sizes of animals and their organs vary widely and largely depend on total cell 
mass. This in turn depends on the size and number of cells, which are increased 
through cell growth and cell division, respectively. Cell numbers are reduced by 
programmed cell death. Each of these processes depends on both intracellular and
extracellular signals. The mystery is how these processes are regulated and coordinated
to produce and maintain the characteristic final size of the adult organ or
animal.

[Figure panels: (A) mouse liver, wild type and Yap overactivity; (B) fly head, wild type and Yap overactivity]

Figure 21-65 Overcoming Hippo repression increases organ size. (A) Livers from control and
Yap-overexpressing mice. In these mice, Hippo signaling is insufficient to block Yap. (B) Adult heads
from control and Yap-overexpressing flies. In the mutant flies, Hippo signaling is unable to block
Yap. (From J. Dong et al., Cell 130:1120-1133, 2007. With permission from Elsevier.)

Some signals such as survival factors, growth factors, and mitogens stimulate 
growth by promoting cell survival, cell growth, and cell division, respectively, while 
other signal molecules do the opposite. Although most of these signals operate 
locally to help sculpt the size and shape of the animal, its organs, and appendages, 
others act as hormones to regulate the growth of the animal as a whole. Nutrients 
can regulate growth through hormonal signals in the entire body. 

Many animals and organs can, by unknown mechanisms, assess their total cell 
mass and regulate it. If, for example, cell size is artificially increased or decreased in 
these cases, cell numbers adjust to maintain a normal total cell mass. Conversely, if 
cell numbers are artificially increased or decreased, cell size adjusts to compensate. 


NEURAL DEVELOPMENT 


The development of the nervous system poses problems that have little parallel 
in other tissues. A typical nerve cell, or neuron, has a structure unlike that of any 
other class of cells, with a long axon and branching dendrites, both of which make 
many synaptic connections to other cells (Figure 21-66). The central challenge of 
neural development is to explain how the axons and dendrites grow out, find their 
right partners, and synapse with them selectively to create a neural network—an 
electrical signaling system—that functions correctly to guide behavior (Figure 
21-67). The problem is formidable: the human brain contains more than 10¹¹
neurons, each of which, on average, has to make connections with a thousand 
others, according to a regular and predictable wiring plan. The precision required 
is not so great as in a man-made computer, because the brain performs its com- 
putations in a different way and is more tolerant of vagaries in individual compo- 
nents. But the human brain nevertheless outstrips all other biological structures 
in its organized complexity. 

The components of a typical nervous system—the various classes of neurons, 
glial cells, sensory cells, and muscles—originate in a number of widely separate 
locations in the embryo. Thus, in the first phase of neural development, the dif- 
ferent parts of the nervous system develop according to their own local programs: 
neurons are born and assigned specific characters according to the place and 
time of their birth, under the control of inductive signals and transcription regu- 
lators, by mechanisms of the types we have already discussed. In the next phase, 
newborn neurons extend axons and dendrites along specific routes toward their 
target cells, guided by extracellular signals that attract or repel them. In the third 
phase, neurons form synapses with other neurons or muscle cells, setting up a 
provisional but orderly network of connections. In the final phase, which continues
into adult life, the synaptic connections are adjusted and refined through
mechanisms that usually depend on synaptic signaling between the cells involved
(Figure 21-68). At all stages, neurons are in intimate contact with various types of
non-neuronal supporting cells—the glial cells.

Figure 21-66 A typical neuron of a vertebrate. The arrows indicate the
direction in which signals are conveyed. The neuron shown is a basket cell, a type
of neuron in the cerebellum. (Adapted from S. Ramón y Cajal, Histologie du Système
Nerveux de l’Homme et des Vertébrés, 1909-1911. Paris: Maloine; reprinted,
Madrid: C.S.I.C., 1972.)

Figure 21-67 The complex organization of nerve cell connections. This
drawing depicts a section through a small part of a mammalian brain—the
olfactory bulb of a dog—stained by the Golgi technique. The black objects
are neurons; the thin lines are axons and dendrites, through which the various
sets of neurons are interconnected according to precise rules. (From
C. Golgi, Riv. sper. freniat. Reggio-Emilia 1:405-425, 1875.)


Neurons Are Assigned Different Characters According to the Time 
and Place of Their Birth 


We start our account here with the first phase of neural development: the gen- 
eration of neural progenitors and their differentiation into hundreds of different 
neuronal subtypes, along with a much smaller number of glial types. Although the 
nervous system is exceptional in the extent of cell diversity, the process depends 
on the same principles that generate different cell types in other organs. We have 
already discussed some of the underlying machinery in the developing Drosoph- 
ila nervous system. We turn now to vertebrates. 

The vertebrate spinal cord, the brain, and the retina of the eye together con- 
stitute the central nervous system (CNS). They all originate as parts of the neural 
tube, whose formation was described earlier (see Figure 21-56). The brain and 
eyes develop from the anterior neural tube and the spinal cord from the posterior. 

The developmental anatomy is seen at its simplest in the spinal cord. As it 
develops, the epithelium forming the walls of the posterior neural tube becomes 
enormously thickened as the cells proliferate and differentiate, creating a highly 
organized structure of neurons and glial cells, surrounding a small central chan- 
nel. Bands of neurons with different future functions—and expressing different 
genes—are laid out along the dorsoventral axis of the tube. Motor neurons (those 
that control the muscles) are located ventrally, whereas neurons that process sen- 
sory information are found dorsally. This pattern is established by opposing gradi- 
ents of morphogens. These are secreted by specialized groups of cells that run the 
length of the ventral and dorsal midlines of the neural tube (Figure 21-69). The 
two morphogen gradients—consisting of Sonic hedgehog protein from the ventral 
source and BMP and Wnt from the dorsal source—help induce different groups 
of proliferating neural progenitor cells and differentiating neurons to express dif- 
ferent combinations of transcription regulators. These regulators in turn drive the 
production of different combinations of neurotransmitters, receptors, cell-cell 
adhesion proteins, and other molecules, creating terminally differentiated neu- 
rons that will form synaptic connections selectively with the right partners and 
exchange appropriate signals with them. 
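
To illustrate how two opposing gradients can subdivide the dorsoventral axis, the following Python sketch assigns a coarse identity to a progenitor from its position alone. It is a toy model under loud assumptions: the exponential gradient shape, the decay length, the single threshold, and the three domain names are all hypothetical choices made for illustration, whereas real progenitors read several thresholds and express many combinations of transcription regulators.

```python
import math

DECAY_LENGTH = 0.3  # hypothetical gradient length scale (fraction of tube height)
THRESHOLD = 0.4     # hypothetical signal level needed to switch on a response


def morphogen_levels(position):
    """position runs from 0.0 (floor plate, ventral) to 1.0 (roof plate, dorsal)."""
    shh = math.exp(-position / DECAY_LENGTH)              # Sonic hedgehog, ventral source
    bmp_wnt = math.exp(-(1.0 - position) / DECAY_LENGTH)  # BMP and Wnt, dorsal source
    return shh, bmp_wnt


def progenitor_domain(position):
    """Classify a progenitor by which morphogen, if either, exceeds the threshold."""
    shh, bmp_wnt = morphogen_levels(position)
    if shh >= THRESHOLD:
        return "ventral"   # region where motor neurons arise
    if bmp_wnt >= THRESHOLD:
        return "dorsal"    # region of neurons that process sensory information
    return "intermediate"


if __name__ == "__main__":
    for pos in (0.1, 0.5, 0.9):
        print(pos, progenitor_domain(pos))
```

Adding more thresholds to the same two gradients would split the axis into more bands, which is the essence of how a small number of signals can specify many neuronal classes.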

Figure 21-68 The four phases of neural development.

Extracellular morphogen gradients, however, are not the only way to generate
cell diversity. As we saw earlier in our discussion of Drosophila neuroblasts (see 
Figure 21-36), different cell types can also be generated by temporal patterning, 
in which an intracellular program changes the character of a progenitor cell over 
time, giving rise to different cell types as development progresses. This mecha- 
nism also seems to operate in vertebrate neurogenesis. The most striking illus- 
tration comes from study of another part of the CNS—the mammalian cerebral 
cortex. 

Although the cerebral cortex is the most complex structure in the human 
body, it has a simple beginning—from the anterior neural tube. As in the spinal 
cord, the cells that form the walls of the tube proliferate, and the neuroepithelium 
thickens and expands as they divide. On a predictable schedule, the divisions 
of the neuroepithelial cells begin to produce a succession of cells committed to 
terminal differentiation as neurons. These future neurons are born close to the 
lumen (the central cavity) of the tube. From here, they migrate outward, losing 
attachment to the lumenal surface and crawling outward along neighboring cells 
that continue to span the full thickness of the neuroepithelium. These latter neu- 
roepithelial cells do double duty, functioning as progenitors of neurons and glia, 
and as supporters of the epithelial architecture. They become stretched out as 
radial glial cells, forming a scaffold that continues to span the neuroepithelium 
even as this grows to an enormous thickness (Figure 21-70). At the same time, 
the radial glial cells continue to divide as neural precursors, giving rise to both 
neurons and glial cells—new radial glial cells as well as glial cells of other types. 
The newborn neurons, migrating along the radial glial cells, find their appropriate 
resting places in the developing cortex, where they mature, and from these sites 
they send out their axons and dendrites. The first-born neurons settle closest to 
their birthplace near the lumen, while neurons born later crawl past them to set- 
tle farther out (Figure 21-71). The successive generations of neurons thus build 
up as a series of cortical layers, ordered by birthdate and endowed with different 
intrinsic characters. 

Strikingly, single cortical progenitor cells isolated in culture generate distinct 
types of cortical neurons and glial cells, with the timing and characteristics appro- 
priate to specific cortical layers. These observations suggest that the neural pro- 
genitors in the developing mammalian cortex, much like the Drosophila neuro- 
blasts, step through an intracellular developmental program that generates the 
ordered succession of different nerve cell types. 
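
The two ideas in this passage, an intracellular program that runs without outside instruction and the settling of later-born neurons farther from the lumen, can be sketched together in a few lines of Python. The function names and the neuron type labels are hypothetical, and the model ignores cell numbers, timing, and glial production.

```python
def progenitor_program(neuron_types):
    """Yield neuron types in a fixed order; no external signal is consulted."""
    for birth_order, neuron_type in enumerate(neuron_types, start=1):
        yield birth_order, neuron_type


def build_cortex(neuron_types):
    """Return cortical layers ordered from the lumen outward."""
    layers = []  # index 0 is the layer closest to the lumen
    for birth_order, neuron_type in progenitor_program(neuron_types):
        # A newborn neuron crawls past every earlier-born neuron and settles farther out.
        layers.append({"birth_order": birth_order, "type": neuron_type})
    return layers


if __name__ == "__main__":
    for depth, layer in enumerate(build_cortex(["type 1", "type 2", "type 3"])):
        print(depth, layer)
```

Because the sequence is generated inside `progenitor_program`, the same ordered output is obtained whether the function is called within `build_cortex` or on its own, which is the analog of an isolated progenitor in culture.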


Figure 21-69 A schematic cross section of the spinal cord of a chick embryo,
showing how cells at different levels along the dorsoventral axis acquire
different characters. (A) Signals that direct the dorsoventral pattern. Sonic hedgehog
protein from the notochord and the floor plate (the ventral midline of the neural
tube) and BMP and Wnt proteins from the roof plate (the dorsal midline) act as
morphogens to control gene expression. (B) The resulting patterns of cell fates
in the developing spinal cord. Different groups of proliferating neural progenitor
cells (in the ventricular zone, close to the lumen of the neural tube) and of
differentiating neurons (in the mantle zone, further out) express different combinations
of transcription regulators. Neurons expressing different transcription regulators
will form connections with different partners and may make different combinations
of neurotransmitters and receptors. Colors represent different cell types and
combinations of regulatory proteins.


The Growth Cone Pilots Axons Along Specific Routes Toward
Their Targets


According to the character assigned to it during its early development, a neuron 
will proceed to make connections with specific partners. This phase of neural 
development involves a type of morphogenesis unique to the nervous system, in 
which axons and dendrites extend along specific routes toward their target cells. 
A typical neuron sends out one long axon and many dendrites, which are usually 
shorter. The axon projects to distant target cells to which the neuron will eventu- 
ally send signals. The dendrites will receive incoming signals from axon terminals 
of other neurons. Axons and dendrites extend by growth at their tip, where one 
sees an irregular, spiky enlargement called a growth cone (Figure 21-72 and 
Movie 21.6). The growth cone is both the engine that produces the crawling move- 
ment and the steering apparatus that directs the tip along the proper path. Cyto- 
skeletal machinery in the growth cone creates active protrusions, in the form of 
filopodia and lamellipodia (see Chapter 16 for details): when such a protrusion 


[Figure labels: last-born neurons; first-born neurons; layers of neurons; dividing progenitor cell; radial glial cell]


Figure 21-70 Migration of immature neurons. Before sending out axons
and dendrites, newborn neurons often migrate from their birthplace and settle in
another location. The diagrams are based on reconstructions from sections of the
cerebral cortex (part of the neural tube) of a monkey and rely on a staining technique
that picks out at random a small subset of the whole dense mass of neuroepithelial
cells. The neurons go through their final cell division close to the inner, lumenal
face of the neural tube (in the ventricular proliferative zone) and then migrate
outward by crawling along radial glial cells that form a scaffold. Each of these latter
cells extends from the inner to the outer surface of the tube, a distance that may be
as long as 2 cm in the cerebral cortex of the developing brain of a primate.

The radial glial cells can be considered as persisting cells of the original columnar
epithelium of the neural tube that become extraordinarily stretched as the wall of
the tube thickens. They also serve as neural stem cells: depending on stage
and region, the newborn neurons can be generated from radial glial cells that
undergo mitosis while their nuclei are close to the inner surface of the tube, or they
can be generated from a nearby class of specialized progenitors in the ventricular
proliferative zone. (After P. Rakic, J. Comp. Neurol. 145:61-84, 1972. With permission
from John Wiley & Sons, Inc.)


Figure 21-71 Programmed production of different types of neurons at different
times from dividing progenitors in the cerebral cortex of the brain of a
mammal. Close to one face of the cortical neuroepithelium, progenitor cells divide, in
stem-cell fashion, to produce successive generations of neurons (colored here
blue, green, red, orange, and black). The neurons migrate out toward the opposite
face of the epithelium by crawling along the surfaces of radial glial cells, as shown in
Figure 21-70. The first-born neurons settle closest to their birthplace, while neurons
born later crawl past them to settle farther out. Successive generations of neurons
thus occupy different layers in the cortex and have different intrinsic characters
according to their birth dates.

contacts an unfavorable surface, it withdraws; when it contacts a more favorable
surface, it persists longer, steering the growth cone in that direction. In this way,
the growth cone is guided by subtle variations in the properties of the surfaces
over which it moves. At the same time, it is sensitive to specific signaling molecules,
which—as we discuss next—can either encourage or hinder its advance.

Figure 21-72 Internal architecture of a neuronal growth cone, as seen […] growth cone
forms as an expansion of the tip of the growing axon. (A) Image by
interference-contrast microscopy. (B) Immunostaining to show microtubules
(green). (C) Immunostaining to show actin filaments (red). (D) Diagram of the […]
and push forward by assembly of actin filaments at the leading edge of the
growth cone. Microtubules stabilize the directional decisions made by the actin-rich
protrusions. Filopodia adhering to the flat substratum contract and pull the growth
cone forward. ([…] Paul Forscher Laboratory, Yale University, New Haven, CT.)


A Variety of Extracellular Cues Guide Axons to their Targets


Growth cones generally travel toward their targets along predictable routes,
according to programs stored in the memory of the particular neuron to which
they belong (Movie 21.7). In the simplest case, a growth cone can take a route that
has been pioneered by other neurites, which they follow by contact guidance. As a
result, nerve fibers in a mature animal are usually found grouped together in tight
parallel bundles (called fascicles or fiber tracts). Such crawling of growth cones
along axons is partly mediated by homophilic cell-cell adhesion molecules—
membrane glycoproteins that help a cell displaying them to stick to any other cell
that displays the same molecules. As discussed in Chapter 19, many homophilic
adhesion molecules fall into one of two main classes: they are members of either
the immunoglobulin superfamily, such as N-CAM, or the Ca2+-dependent cadherin
family, such as N-cadherin. Members of both families are generally present
on the surfaces of growth cones, of axons, and of various other cell types that
growth cones crawl over, including glial cells in the central nervous system and
muscle cells in the periphery of the body. Growth cones also migrate over components
of the extracellular matrix. When tested with neurons growing in a culture
dish, some of the matrix molecules, such as laminin, favor axon outgrowth, while
others, such as chondroitin sulfate proteoglycans, discourage it. But exactly how
the matrix functions to guide axons in intact animals remains to be discovered.

Growth cones are generally guided by a succession of different cues at different
stages of their journey, as summarized in Figure 21-73. Many of these cues
involve specific signaling molecules. Some of these are encountered in the extracellular
matrix, while others are attached to the plasma membrane of cells that
the growth cones touch. Another important part is played by chemotactic factors;
these are proteins secreted from cells that act as beacons at strategic points along
the path—some attracting, others repelling. The trajectory of commissural axons—
axons that cross from one side of the body to the other—provides a well-studied
example.

Commissural axons are a general feature of bilaterally symmetrical animals,
such as us, because they are required to coordinate behavior of the two sides of
the body. In the developing spinal cord of a vertebrate, for example, a large number
of neurons send their axonal growth cones ventrally toward the floor plate
(the same structure that we encountered earlier as a source of the morphogen
Sonic hedgehog—see Figure 21-69). The growth cones cross the floor plate and
then turn abruptly through a right angle to follow a longitudinal path up toward
the brain, parallel to the floor plate but never again crossing it (Figure 21-74). The
first stage of the journey depends on a concentration gradient of the signal protein
Netrin, secreted by the cells of the floor plate: the commissural growth cones sniff
their way toward its source.

Figure 21-73 Mechanisms of growth-cone guidance. Growth cones use a
variety of extracellular cues to navigate to distant targets. They can adhere to
the extracellular matrix or to the surfaces of other cells, or they can be repelled by
them; they can crawl, for example, by homophilic adhesion along the axons of
pioneer neurons; and they can be attracted or repelled by soluble guidance signals.
(After E. Kandel et al., Principles of Neural Science, 5th ed., New York: McGraw Hill
Medical, 2012.)

Figure 21-74 The guidance of commissural axons. (A) The pathway taken by commissural axons in the embryonic spinal cord of a vertebrate.
(B) Attraction to the midline. The growth cone is first attracted to the floor plate by Netrin, which is secreted by the floor-plate cells and acts on the
receptor DCC in the axonal membrane. (C) Repulsion from the midline after crossing it. As the growth cone crosses the floor plate, Slit comes into
play: it binds to its receptors Robo1 and Robo2 and acts as a repellent to keep the growth cone from re-entering the floor plate. In addition, it blocks
responsiveness to the attractant Netrin. Before crossing the midline, the commissural neurons express Robo3.1, an alternative splice form of Robo3
that is related to Robo proteins but blocks Slit signaling. As neurites cross the midline, Robo3.1 is lost and growth cones become responsive to Slit
and are repelled from the midline.

If commissural growth cones are attracted to the floor plate, why do they cross
it and emerge on the other side, instead of staying in the attractive territory? And
having crossed it, why do they never cross back again? The answers lie in a change
in the responsiveness of the growth cones during their journey. As the growth
cones cross the midline, they lose sensitivity to Netrin and become sensitive
instead to a signal protein called Slit (see Figure 21-74). Slit is also produced by
the floor plate, but it has the opposite effect to that of Netrin: it repels the growth
cones, preventing them from re-entering the midline territory. The responses of
the growth cone depend on the receptors that it expresses: as commissural neurons
approach the floor plate, the Slit receptors are kept inactive by an inhibitory
protein (Robo3.1) in the same membrane, allowing the commissural axons to
grow to the midline without being repelled. Robo3.1 is lost as the growth cones
cross the midline; now the growth cones become sensitive to repulsion by Slit and
are thereby prevented from crossing back to the other side. At the same time, signals
from the Slit receptors interfere with those from the Netrin receptors, making
the growth cones deaf to the signal that attracted them to the floor plate initially.
A similar mechanism, using similar proteins, seems to govern midline crossing of
commissural axons in other animals, including flies and worms.
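
The switch in responsiveness described above behaves like a small state machine, and a one-dimensional toy simulation shows why the switch is needed. In this sketch the position is the signed distance from the floor plate (negative on the side of origin, 0 at the midline, positive on the far side). The function names, the unit step size, and the `lose_robo31` flag are hypothetical, and the model ignores the longitudinal turn that real axons make after crossing.

```python
def guidance_response(robo31_present):
    """Return (netrin_attracts, slit_repels) for a commissural growth cone."""
    slit_repels = not robo31_present   # Robo3.1 keeps the Slit receptors inactive
    netrin_attracts = not slit_repels  # Slit-receptor signaling silences the Netrin response
    return netrin_attracts, slit_repels


def simulate_axon(start=-3, steps=6, lose_robo31=True):
    """Track the growth cone's signed distance from the floor plate (at 0)."""
    position, robo31_present = start, True
    path = [position]
    for _ in range(steps):
        netrin_attracts, slit_repels = guidance_response(robo31_present)
        if netrin_attracts:
            position += 1 if position < 0 else -1   # step toward the floor plate
        elif slit_repels:
            position += 1 if position >= 0 else -1  # step away from the floor plate
        if position == 0 and lose_robo31:
            robo31_present = False                  # Robo3.1 is lost during crossing
        path.append(position)
    return path


if __name__ == "__main__":
    print("normal:          ", simulate_axon())
    print("Robo3.1 retained:", simulate_axon(lose_robo31=False))
```

With the switch, the simulated growth cone reaches the midline and then moves steadily away on the far side. If Robo3.1 is never lost (a hypothetical condition in this sketch), the growth cone stays attracted to Netrin and lingers at the floor plate instead of leaving it.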

The guidance of commissural axons illustrates how axons rarely navigate 
directly to their targets. Instead, they use intermediate targets, or guideposts, and 
switch their sensitivities as they move from one local guidepost to the next, steer- 
ing their way through a complex environment to a far-away destination. 


The Formation of Orderly Neural Maps Depends on Neuronal 
Specificity 


In many cases, neurons of a similar type are laid out in a broad array of different 
positions, but send out axons that come together for their journey and arrive at 
the target region in a tight bundle. There the axons disperse again, to terminate 
at different sites in the target territory. This they do in an orderly way, creating a 
regular mapping from one territory to another—a neural map. 

The axon projection from the eye to the brain provides an important exam- 
ple. The neurons in the retina that convey visual information back to the brain 
are called retinal ganglion cells (RGCs). There are more than a million of them in 
humans, each one reporting on a different part of the visual field. Their axons con- 
verge on the optic nerve head at the back of the eye and travel together along the 
developing optic nerve toward the brain. Their main site of termination, in most 
vertebrates other than mammals, is the optic tectum—a broad expanse of cells in 
the midbrain. In connecting with tectal neurons, the RGC axons distribute them- 
selves in a predictable pattern according to the arrangement of their cell bodies in 
the retina: RGCs that are neighbors in the retina connect with target cells that are 
neighbors in the tectum. The orderly projection creates a retinotopic map of visual 
space on the tectum (Figure 21-75). 

Orderly maps of this sort are found in many brain regions. In the auditory sys- 
tem, for example, the neurons that project from the ear to the brain form a tono- 
topic map in which brain cells receiving information about sounds of different 
pitch are ordered along a line, like the keys of a piano. And in the somatosensory
system, neurons conveying information about touch map onto the cerebral cortex
so as to mark out a “homunculus”—a small, distorted, two-dimensional image of
the body surface (Figure 21-76).

Figure 21-75 The neural map from eye to brain in a young zebrafish.
(A) Diagrammatic view, looking down on the top of the head. (B) Fluorescence
micrograph. Fluorescent tracer dyes have been injected into each eye—red into the
anterior part, green into the posterior part. The tracer molecules have been taken up
by the neurons in the retina and carried along their axons, revealing the paths they
take to the optic tectum in the brain and the map that they form there. (Courtesy
of Chi-Bin Chien, from D.H. Sanes, T.A. Reh and W.A. Harris, Development
of the Nervous System. San Diego, CA: Academic Press, 2000.)

The retinotopic map of visual space in the optic tectum is the best characterized 
of all these maps. How does it arise? A famous experiment in the 1940s on frogs 
provided an important clue. If the optic nerve of a frog is cut, it will regenerate. The 
retinal axons grow back to the optic tectum, restoring normal vision. If, however, 
the eye is in addition rotated in its socket at the time of cutting of the nerve, so as 
to put originally ventral retinal cells in the position of dorsal retinal cells, vision is 
still restored, but with an awkward flaw: the animal behaves as though it sees the 
world upside down and left-right inverted (Figure 21-77). If food is dangled in 
front of it, for example, it will lunge perversely backward. This is because the mis- 
placed retinal cells make the connections appropriate to their original, not their 
actual, positions. It seems that the retinal ganglion cells (RGCs) have positional 
values—position-specific biochemical properties representing records of their 
original location in the retina, assigned perhaps by earlier morphogen gradients, 
and making RGCs on opposite sides of the retina intrinsically different. 
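
The reasoning behind the eye-rotation experiment can be followed step by step in a small Python sketch. It reduces the retina to a single row of cells along one axis, treats a positional value as the cell's original index, and assumes that the brain interprets activity at each tectal site as coming from the visual-field position that normally drives it. The function name and the cell count are hypothetical, and the real projection (which inverts the anterior-posterior axis) is abstracted to a simple one-to-one matching.

```python
def perceived_position(stimulus_pos, n_cells=5, eye_rotated=False):
    """Return where the animal perceives a stimulus that falls on retinal position stimulus_pos."""
    # Positional value of the RGC that now sits at the stimulated retinal position.
    # After a 180-degree rotation, the cell that originated at the opposite end is there.
    value = (n_cells - 1 - stimulus_pos) if eye_rotated else stimulus_pos
    # Regenerating axons connect to the tectal site matching their positional value.
    tectal_site = value
    # The brain reads each tectal site as its usual visual-field position.
    return tectal_site


if __name__ == "__main__":
    print("normal eye: ", [perceived_position(p) for p in range(5)])
    print("rotated eye:", [perceived_position(p, eye_rotated=True) for p in range(5)])
```

Because the cells keep their positional values when the eye is rotated, the perceived positions run in the reverse order, which corresponds to the inverted behavior of the operated frog.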

Such nonequivalence among neurons is referred to as neuronal specificity. 
It is this intrinsic characteristic that guides the retinal axons to their appropriate 
target sites in the tectum. Those target sites themselves are distinguishable by the

Figure 21-76 A map of the body surface in the human brain. The surface of the
body is mapped onto the somatosensory region of the cerebral cortex by using an
orderly system of nerve cell connections to pair body sites with the brain sites that
receive their sensory information. This means that the map in the brain is largely
faithful to the topology of the body surface, even though different body regions are
represented at different magnifications according to their density of innervation.
The homunculus (the “little man” in the brain) has big lips, for example, because
the lips are a particularly large and important source of sensory information.
The map was determined by stimulating different points in the cortex of conscious
patients during brain surgery and recording what they said they felt. (After W. Penfield
and T. Rasmussen, The Cerebral Cortex of Man. New York: Macmillan, 1950.)


Figure 21-77 Neurons in different
regions of the retina project axons
to different regions in the tectum.
(A) Neurons (RGCs) in the anterior retina
project axons to the posterior tectum (as
shown in Figure 21-75 for zebrafish).
(B) Regeneration experiments show
that retinal neurons have an intrinsic
preference for the part of the tectum they
normally connect to. If the eye is surgically
rotated when the optic nerve is cut, the
regenerating retinal axons connect to their
original targets, creating an inverted map.
(After E. Kandel et al., Principles of Neural
Science, 5th ed., New York: McGraw Hill
Medical, 2012.)

Figure 21-78 Selectivity of retinal axons growing over tectal membranes. (A) Diagram of an experiment performed with
cells from a chick embryo. The culture substratum is coated with alternating stripes of membrane prepared either from posterior 
tectum or from anterior tectum. Axons from posterior retina grow on anterior tectal membrane but are repelled by posterior 
tectal membrane. Axons from anterior retina show different (less selective) behavior. (B) Photograph of results. The retinal axons, 
growing out from the left, are made visible by staining them with a fluorescent marker. The selective pattern of outgrowth shows 
that anterior tectum differs from posterior tectum, and anterior retina correspondingly differs from posterior retina. In the intact 
organism, this serves to orient a retinotopic map; the map is refined by subsequent competitive interactions among the anterior 
and posterior retinal axons, which push the anterior retinal cells off anterior tectal territory. (From J. Walter et al., Development 
101:685-696, 1987. With permission from the Company of Biologists.)


retinal axons because the tectal cells also carry positional labels. Thus, the neural 
map depends on a correspondence between two systems of positional markers, 
one in the retina and the other in the tectum. 

How are these markers used to make the map? When posterior axons are 
allowed to grow out over a carpet of anterior or posterior tectal membranes in 
a culture dish, they show selectivity. Posterior axons strongly prefer the anterior 
tectal membranes, as in vivo, whereas anterior axons show no preference or pre- 
fer posterior tectal membranes (Figure 21-78). The key difference between ante- 
rior and posterior tectum is not an attractive factor on the anterior tectum but a 
repulsive factor on the posterior tectum, to which posterior retinal axons are sen- 
sitive but anterior retinal axons are not. If a posterior retinal growth cone touches 
posterior tectal membrane, it collapses its filopodia and withdraws. 

In this system, as in others that we have mentioned, the repulsive interactions 
are mediated by ephrin-Eph signaling—specifically, EphrinA-EphA signaling 
for the anteroposterior axis (Figure 21-79). An analogous mechanism based on 
EphB-EphrinB signaling orients the dorsoventral axis of the retinotopic map. 
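
The logic of a map oriented by repulsion alone can be illustrated with a minimal Python sketch. It is a toy model, not a description of the measured gradients: it assumes that EphA levels rise smoothly from anterior to posterior retina, that EphrinA levels rise smoothly from anterior to posterior tectum, and that an axon entering at the anterior tectum advances until the product of the two levels reaches a fixed threshold. The exponential gradient shapes, the threshold value, and the function name `termination_site` are all hypothetical choices made for illustration.

```python
import math


def termination_site(retinal_pos, threshold=math.e, n_steps=1000):
    """Return the tectal position (0 = anterior, 1 = posterior) where an axon stops.

    retinal_pos: position of the RGC in the retina (0 = anterior, 1 = posterior).
    Assumed gradients (illustrative only): EphA = exp(retinal_pos) on the axon,
    EphrinA = exp(tectal_pos) on the tectum.
    """
    epha = math.exp(retinal_pos)
    for i in range(n_steps + 1):
        tectal_pos = i / n_steps
        ephrin_a = math.exp(tectal_pos)
        # The axon halts at the first site where repulsion reaches the threshold.
        if epha * ephrin_a >= threshold:
            return tectal_pos
    # Never repelled strongly enough: the axon reaches the posterior end.
    return 1.0


for r in (0.0, 0.5, 1.0):
    print(f"retina {r:.1f} -> tectum {termination_site(r):.3f}")
```

Under these assumptions, axons from posterior retina (high EphA) stop in anterior tectum, while axons from anterior retina (low EphA) continue to posterior tectum, reproducing the orientation of the map but not its fine point-to-point detail.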

These mechanisms serve to orient the map along both axes, but they are not 
enough by themselves to ensure accurate point-to-point detail. This is brought 
about through a long process of adjustment that fills in and refines the map 
through interactions among the RGC axon terminals as they compete for territory 
on the tectum. This refinement of the pattern of connections involves electrical 
signaling in the system of developing synapses—a topic that we return to shortly. 


Both Dendrites and Axonal Branches From the Same Neuron 
Avoid One Another 


Axons and dendrites from different neurons can repel one another, or they can 
cohere; they can collaborate to form synapses, or they can compete. Remarkably,
axons or dendrites can also repel each other when they arise from a single neuron.
Such self-avoidance prevents the neuron from making purposeless synapses
with itself; it also helps the cell spread out its processes widely so as to innervate 
a broad territory. 

Self-avoidance poses a problem. If the same self-recognition molecule were 
used in every neuron, all neurons in the brain would repel each other. Some 
classes of neurons do show this sort of mutual repulsion, creating solitary terri- 
tories—a phenomenon called tiling; but in most cases, axons and dendrites from 
different neurons can overlap with one another. How then can the processes put 
out by a single neuron distinguish between self and non-self? This conundrum has
been partially resolved by the discovery of a remarkable set of proteins that endow 
each neuron with a label unlike that of its neighbors. These are the DSCAM pro- 
teins in Drosophila and the protocadherins in vertebrates. As described in Chap- 
ter 7, DSCAM proteins are extraordinary for the number of isoforms that can be 
generated by alternative RNA splicing—more than 30,000 variants for DSCAM1 
(see Figure 7-57). Diversity arises from alternative exons that code for three highly 
variable extracellular immunoglobulin domains. Each DSCAM1 isoform engages 
in homophilic binding (see Figure 19-5), but remarkably, all the variable domains 
need to be identical for this to occur. Thus, one cell surface will bind to another via 
DSCAM only when the two cell surfaces express identical isoforms. The result of 
binding is repulsion, although the detailed mechanisms are poorly understood. 

If alternative splicing occurs in a random fashion in each cell, neighboring processes
from different neurons are unlikely to express the same DSCAM1 variant,
so only the processes of the same cell will repel one another. Neurons that lack
all DSCAM1 variants have severe defects in neuronal self-avoidance. Engineering
Drosophila so that all of its neurons produce a single isoform restores self-avoid- 
ance; but now the processes of neighboring neurons express the same isoform 
and repel each other, resulting in the phenomenon of tiling (Figure 21-80). 
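
A short calculation shows why random isoform choice makes accidental matches between neighbors rare. The Python sketch below makes simplifying assumptions that are not stated in the text: each neuron is treated as expressing a single isoform chosen uniformly at random, and 30,000 is used as a round figure for the number of DSCAM1 variants. The function names are hypothetical.

```python
import math


def p_pair_match(n_isoforms):
    """Probability that two neurons independently pick the same isoform."""
    return 1.0 / n_isoforms


def p_any_match(n_isoforms, n_neurons):
    """Probability that at least two of n_neurons share an isoform.

    This is the classic "birthday problem", computed in log space for stability.
    """
    if n_neurons > n_isoforms:
        return 1.0  # more neurons than labels: a shared label is unavoidable
    log_p_all_distinct = sum(
        math.log1p(-i / n_isoforms) for i in range(n_neurons)
    )
    return 1.0 - math.exp(log_p_all_distinct)


print(p_pair_match(30_000))     # two random neighbors rarely match
print(p_any_match(30_000, 10))  # ten overlapping neighbors still rarely clash
print(p_pair_match(1))          # single engineered isoform: every pair matches
```

With one engineered isoform the match probability is 1, which corresponds to the tiling seen when all neurons produce the same variant.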

Vertebrate neurons use a similar self-avoidance strategy to pattern their axons 
and dendrites, but instead of DSCAMs, they use protocadherins for self/non-self 
discrimination. The Protocadherin locus encodes 58 related cadherin-like trans- 
membrane proteins that are expressed in different combinations in single neu- 
rons. Homophilic recognition results in self-avoidance of dendrites emanating 
from the same neuron; neighboring dendrites of different neurons express differ- 
ent protocadherins and thus evade repulsion. Thus, although insect DSCAM and 
vertebrate protocadherin proteins share no sequence homology, they mediate 
similar self-avoidance strategies.

Figure 21-79 Ephrin signaling orients
the retinotopic map. (A) Neurons in the
posterior retina express EphA. As their
axons reach the tectum, they are repelled
by high levels of EphrinA protein in the
posterior tectum and project preferentially
to the anterior tectum. (B) In EphA-mutant
mice, posterior retinal axons feel no
such repulsion and project more widely
within the tectum. (After E. Kandel et al.,
Principles of Neural Science, 5th ed., New
York: McGraw Hill Medical, 2012.)


Target Tissues Release Neurotrophic Factors That Control Nerve
Cell Growth and Survival 


Eventually, axonal growth cones reach the target region where they must halt and 
make synapses. These synapses, as a rule, are destined to transmit neural signals 
in one direction, from axon to target cell. The development of synapses, however, 
depends on signaling in both directions: signals from the target tissue not only 
help control which growth cones synapse where (as we discuss shortly), but can 
also regulate how many of the innervating neurons survive. 

Many types of vertebrate neurons are produced in excess; up to 50% or more of 
some of them die soon after they reach their target, even though they appear per- 
fectly normal and healthy up to the time of their death. About half of all the motor 
neurons that send axons to skeletal muscle, for example, die within a few days 
after making contact with their target muscle cells. A similar proportion of the 
sensory neurons that innervate the skin die after their growth cones have arrived 
there. 

This large-scale normal neuronal death often seems to reflect the outcome of 
a competition, in which the target tissue releases a limited amount of a specific 
neurotrophic factor that the neurons innervating the tissue require to survive; 
those that do not get enough die by programmed cell death. If the amount of tar- 
get tissue is increased—for example, by grafting an extra limb bud onto the side 
of the embryo—more limb-innervating neurons survive; conversely, if the limb 
bud is cut off, the same neurons all die (Figure 21-81). In this way, although indi- 
viduals may vary in their bodily proportions, they always retain the right number 
of motor neurons to innervate all their muscles and the right number of sensory 
neurons to innervate their body surface. The strategy of overproduction followed 
by death of surplus cells may seem wasteful, but it provides a simple and effective 
means to adjust the number of innervating neurons according to the amount of 
tissue requiring innervation. 
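
The matching of neuron number to target size can be sketched as simple bookkeeping. The Python example below is a deliberately crude model: it assumes that each unit of target tissue releases a fixed amount of neurotrophic factor and that each neuron needs a fixed amount to survive, so competition is reduced to winner-take-all. The parameter values and the function name `surviving_neurons` are hypothetical and chosen only so that about half of the neurons die in the normal case.

```python
def surviving_neurons(n_neurons, target_units,
                      factor_per_unit=1.0, need_per_neuron=2.0):
    """Number of innervating neurons that obtain enough factor to survive.

    n_neurons: neurons initially produced (in excess).
    target_units: amount of target tissue, in arbitrary units.
    """
    total_factor = target_units * factor_per_unit
    supported = int(total_factor // need_per_neuron)
    # Survival cannot exceed the number of neurons that were generated.
    return min(n_neurons, supported)


print(surviving_neurons(100, 100))  # normal limb: 50 survive
print(surviving_neurons(100, 200))  # extra limb bud grafted: 100 survive
print(surviving_neurons(100, 0))    # limb bud removed: 0 survive
```

In this sketch the number of survivors tracks the amount of target tissue without any change to the number of neurons initially generated, which is the essence of the overproduction strategy.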

The first neurotrophic factor to be identified, and still the best characterized, 
is called nerve growth factor (NGF), the founding member of the neurotrophin
family of signal proteins. It promotes the survival and growth of specific classes of 
sensory neurons and of sympathetic neurons (a subclass of peripheral neurons 
that control contractions of smooth muscle and secretion from exocrine glands). 


Figure 21-80 DSCAM mediates self- 
avoidance of dendrites. (A) Sensory 
neurons in the Drosophila peripheral 
nervous system extend dendrites along 
the larval body wall. The image shows the 
dendrites of a regular array of photosensing 
neurons (red), which allow the larva
to detect and avoid harmful light. The
posterior epidermal cells of each segment 
are labeled in blue. There are many 
neurons, and those shown here spread 
out their dendrites into overlapping fields. 
(B) Mutations at the Dscam locus upset 
the way the various dendrites interact, 
changing the rules of self-avoidance and 
the distribution of innervation. (A, courtesy 
of Chun Han; B, after D. Hattori et al., 
Annu. Rev. Cell Dev. Biol. 24:597-620, 
2008. With permission from Annual 
Reviews.)

NGF is produced by the tissues that these neurons innervate. When extra NGF is
provided, extra sensory and sympathetic neurons survive, just as if extra target 
tissue were present. Conversely, in a mouse with a mutation that inactivates the 
gene for NGF or for its receptor (a receptor tyrosine kinase called TrkA), almost 
all sympathetic neurons and the NGF-dependent sensory neurons are lost. There 
are many neurotrophic factors, only a few of which belong to the neurotrophin 
family, and they act in different combinations to promote the survival and growth 
of different classes of neurons. 


Formation of Synapses Depends on Two-Way Communication 
Between Neurons and Their Target Cells 


At journey’s end, the task of a growth cone is to halt its travels and make synapses 
with specific target cells. Synapses were introduced in Chapter 11, where we dis- 
cussed channels and the electrical properties of membranes. Two main classes 
of synapses are found in vertebrates: those made with muscle cells and those
made with other neurons. Synapse formation is best understood in the case of 
the highly specialized connections between motor neurons and skeletal muscle 
cells—so-called neuromuscular junctions (see Figure 11-38). During synapse 
formation, the axonal growth cone differentiates into a nerve terminal that con- 
tains synaptic vesicles filled with the neurotransmitter acetylcholine, while ace- 
tylcholine receptors become clustered in the muscle cell plasma membrane at 
the site of synapse formation. A synaptic cleft separates the pre- and postsynaptic 
plasma membranes, and a thin sheet of basal lamina lies in this space between 
them (Figure 21-82).

Figure 21-81 The survival of motor
neurons depends on signals provided
by the target muscles. (A) Removal of
the limb bud shortly after arrival of motor
axons results in the death of motor neurons
in the spinal cord on the amputated side.
(B) Transplantation of an extra limb bud
increases the survival of motor neurons.
(After E. Kandel et al., Principles of Neural
Science, 5th ed., New York: McGraw Hill
Medical, 2012.)

Formation of the synapse involves two-way communication between the
muscle cell and axonal growth cone: each of them, under the influence of the 
other, must reorganize the molecules on its side of the junction. The growth cone 
releases the signal protein Agrin, while the muscle expresses the Agrin receptor 
LRP4. Agrin binding to LRP4 stimulates association of LRP4 with MuSK, a recep- 
tor tyrosine kinase. LRP4 also serves as a signal in the reverse direction, from the 
muscle to the axon (Figure 21-83). During synapse formation, MuSK and LRP4 
cluster in the muscle cell plasma membrane in the general neighborhood of the 
future synapse. As the growth cone approaches, it recognizes LRP4, which stim- 
ulates the differentiation of presynaptic structures in the nerve cell. At the same 
time, Agrin released from the growth cone binds to LRP4 in the muscle cell; this 
activates MuSK, and promotes a more focused clustering of acetylcholine recep- 
tors in the muscle cell membrane. Through these mechanisms, the reciprocal sig- 
naling of LRP4 from muscle to growth cone—and of Agrin from growth cone to 
muscle—induces the coordinated, localized differentiation of pre- and postsyn- 
aptic structures. 

Synapse formation between neurons in the CNS is far more challenging, both 
for the neurons and for the scientists trying to understand the molecular basis of 
its specificity, and it remains poorly understood.

Figure 21-82 Formation of the
neuromuscular junction. (A) The growth
cone of a motor axon approaches the
muscle fiber. (B) Initial synapse formation
is characterized by the accumulation of
synaptic vesicles at the axon terminal
and the formation of a specialized basal
lamina in the synaptic cleft. (C) As the
neuromuscular junction matures, the
synaptic cleft accumulates basal lamina
and extracellular matrix proteins, synaptic
vesicles cluster at presynaptic release sites,
and neurotransmitter receptors cluster at
postsynaptic sites. Schwann (glial) cells
accompany the motor axon and wrap
around its terminus outside the region of
synaptic contact. (D) Transmission electron
micrograph of the region of synaptic contact.
[D, courtesy of John Heuser, from J. Electron
Microsc. 60 (Suppl 1), 2011. With permission
from Oxford University Press.]







Figure 21-83 Reciprocal signaling during neuromuscular synapse differentiation. (A) The Agrin receptor LRP4 and its co-receptor MuSK
cluster in the muscle cell membrane in the general neighborhood of the future synapse. (B) As the growth cone approaches, it recognizes LRP4,
which stimulates differentiation of presynaptic structures. Reciprocally, Agrin is released from the nerve terminal, binds to a complex of LRP4 and
MuSK in the muscle, and (C) promotes the further and more focused clustering of the LRP4 and acetylcholine receptors in the muscle cell. Although
the Agrin/MuSK/LRP4 machinery organizes the synapse, the process also depends on electrical signaling via the acetylcholine receptors. It is not
yet known how LRP4 signals to the motor axon.


Synaptic Pruning Depends on Electrical Activity and Synaptic
Signaling 


The two-way exchange of signals between axon growth cones and muscle cells 
controls the initial formation of neuromuscular junctions, but it is only the first 
step in the establishment of the final pattern of the synaptic connections. Each 
muscle cell at first receives synapses from several motor neurons, but in the end 
it is left innervated by only one. This process of synapse elimination depends 
on active synaptic communication and electrical activity. If synaptic transmis- 
sion is blocked by a toxin that binds to the acetylcholine receptors in the muscle 
cell membrane, or if axonal electrical activity is blocked by a toxin that binds to 
sodium channels in the axon plasma membrane, the muscle cell retains its multi- 
ple innervation beyond the normal time of elimination. 

The phenomenon of activity-dependent synapse elimination is encountered in 
almost every part of the developing vertebrate nervous system (Figure 21-84). It 
has a key role, for example, in the refinement of the retinotopic map discussed 
earlier. Synapses are first formed in abundance and distributed over a broad target 
field; then the system of connections is pruned back and remodeled by competi- 
tive processes that depend on electrical activity and synaptic signaling. The elim- 
ination of synapses in this way is distinct from the elimination of surplus neurons 
by cell death, and it occurs after the period of normal neuronal death is over. Syn- 
apse remodeling during neural development, however, involves more than just 
synapse elimination; it also involves synapse reinforcement, as we discuss next. 


Neurons That Fire Together Wire Together 


Throughout the nervous system, and throughout life, activity-dependent elimi- 
nation and reinforcement of synapses plays a fundamental part in adjusting the 
detailed anatomy of the neural network according to functional requirements. The 
importance of these processes, and their underlying rules, emerged half a century 
ago from a groundbreaking series of experiments on the developing visual system 
of young mammals. 

In the brain of most mammals, axons relaying visual inputs from the two eyes 
are brought together in a specific neuronal layer in the visual region of the cerebral 
cortex. Here, they form two overlapping maps of the external visual field, one as 
perceived through the right eye, the other as perceived through the left. Although 
there may be a tendency for right- and left-eye inputs to be segregated even 
before synaptic communication begins, a large proportion of the axons carrying 
information from the two eyes at early stages form synapses together on shared 
target neurons in the visual cortex. A period of early electrical signaling activity,

Figure 21-84 Synapse modification
and its dependence on electrical
activity. Experiments in several systems
indicate that synapses are strengthened or
weakened by electrical activity according
to the rule shown in the diagram. The
underlying principle appears to be that
each excitation of a target cell tends
to weaken any synapse where the
presynaptic axon terminal has been quiet,
but to strengthen any synapse where
the presynaptic axon terminal has just
been active. As a result, any synapse
that is repeatedly weakened and rarely
strengthened is eventually eliminated
altogether.

Figure 21-85 Ocular dominance columns in the visual cortex of a monkey’s brain, and their
sensitivity to visual experience. (A) Normally, stripes of cortical cells driven by the right eye 
alternate with stripes, of equal width, driven by the left eye. The stripes, set up before birth, are 
revealed here by injecting a radioactive tracer molecule into one eye, allowing time for this tracer to 
be transported to the visual cortex, and detecting radioactivity there by autoradiography, in sections 
cut parallel to the cortical surface. (B) If one eye is kept covered after birth, during the sensitive 
period of development, and thus deprived of visual experience, its stripes shrink and those of the 
active eye expand. In this way, the deprived eye may lose the power of vision almost entirely. (From 
D.H. Hubel, T.N. Wiesel and S. LeVay, Philos. Trans. R. Soc. Lond. B Biol. Sci. 278:377-409, 1977. 
With permission from The Royal Society.) 


however, occurring spontaneously and independently in each retina before birth, 
leads to a remarkable pattern of ocular dominance columns in the visual cortex: 
stripes of cells driven by inputs from the right eye alternating with stripes driven 
by inputs from the left eye (Figure 21-85). 

The basis for these phenomena became clear from ingenious experiments 
interfering artificially with visual experience and altering the coordination of 
electrical signaling in the two eyes. These studies, and many others subsequently, 
have highlighted a simple but profoundly important principle that seems to 
govern synapse reinforcement and elimination throughout the nervous system. 
When two (or more) neurons synapsing on the same target cell fire at the same 
time, they reinforce their connections to that cell; when they fire at different times, 
they compete, so that all but one of them tend to be eliminated. This firing rule is 
expressed in the catchphrase “neurons that fire together wire together.” 
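
The firing rule can be written as a small update procedure. The Python sketch below is a minimal illustration under stated assumptions: synaptic strength is a number between 0 and 1, every excitation of the target cell changes each synapse by a fixed step, and a synapse whose strength reaches 0 is eliminated for good. The step size, the starting strengths, and the function name `update_synapses` are hypothetical.

```python
def update_synapses(weights, presynaptic_active, step=0.125):
    """Apply one excitation of the target cell to all of its input synapses.

    weights: current strength of each synapse (0.0 means eliminated).
    presynaptic_active: for each synapse, whether its axon terminal was
        active just before the target cell fired.
    """
    updated = []
    for weight, active in zip(weights, presynaptic_active):
        if weight <= 0.0:
            updated.append(0.0)  # an eliminated synapse does not return
            continue
        # Strengthen inputs that fired with the target; weaken quiet ones.
        weight = weight + step if active else weight - step
        updated.append(min(1.0, max(0.0, weight)))
    return updated


# Inputs 0 and 1 fire together and drive the target; input 2 is out of step.
weights = [0.5, 0.5, 0.5]
for _ in range(4):
    weights = update_synapses(weights, [True, True, False])
print(weights)  # [1.0, 1.0, 0.0]
```

The two synchronous inputs are consolidated while the input that fires at other times is progressively weakened and finally lost, which is the behavior the catchphrase summarizes.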

The firing rule provides a simple interpretation of the developmental phe- 
nomenon we have just described in the mammalian visual system. A pair of axons 
bringing information from neighboring sites in the left eye will frequently fire 
together, and therefore wire together, as will a pair of axons from neighboring sites 
in the right eye; but a right-eye axon and a left-eye axon will rarely fire together, 
and will instead compete. Indeed, if activity from both eyes is silenced using tox- 
ins that block axonal electrical activity or synaptic signaling, as described above, 
the inputs fail to segregate correctly. 

The segregation of inputs from the two eyes is only the first of a series of activ- 
ity-dependent adjustments of visual connections, whose maintenance is extraor- 
dinarily sensitive to experience early in life. If, during a certain sensitive period 
(ending at about 5 years of age in humans), one eye is kept covered for a time so 
as to deprive it of visual stimulation, while the other eye is allowed normal stimu- 
lation, the deprived eye loses its synaptic connections to the cortex and becomes 
almost entirely, and irreversibly, blind. In accordance with what the firing rule 
would predict, a competition has occurred in which synapses in the visual cortex 
made by inactive axons are eliminated while synapses made by active axons are 
consolidated. In this way, cortical territory is allocated to axons that carry infor- 
mation and is not wasted on those that are silent. 

Activity-dependent synaptic changes are not confined to early life. They 
also occur in the adult brain, where many synapses show both functional and 
morphological alterations with use. This synaptic plasticity is thought to have a
fundamental role in learning and memory. Clearly, for the nervous system as for
other parts of the body, developmental processes do not end at birth, as we dis- 
cuss in the next chapter. 


Summary 


The development of the nervous system proceeds in four phases. First, neurons and 
glial cells are generated from dividing neural progenitor cells. Then, the newborn 
neurons send out axons and dendrites toward their targets. Next, they make syn- 
aptic connections with appropriate target cells so that communication can begin. 
Finally, surplus neurons are eliminated by normal neuronal cell death, after
which the system of synaptic connections is refined and remodeled according to the 
pattern of electrical and synaptic activity in the neural network. 

Neurons born at different times and places are specialized to express different 
sets of genes, and they have a cell memory that plays a major role in determin- 
ing the connections they will form. Their specialization depends not only on spa- 
tial patterning by morphogens but also on intrinsic developmental programs that 
unfold as the neural progenitors proliferate. Axons and dendrites grow out from 
the neurons by means of growth cones, which follow specific pathways delineated 
by attractive and repellant signals along the way, including cell-surface and extra- 
cellular matrix molecules and soluble signal proteins to which growth cones from 
different classes of neurons respond differently. In many parts of the nervous system, 
neural maps are set up—orderly projections of one array of neurons onto another. 
In the retinotopic system, the map is based on the matching of complementary sys- 
tems of position-specific cell-surface markers—ephrins and Eph receptors—pos- 
sessed by the two sets of cells. Other cell-surface molecules such as DSCAM proteins 
in Drosophila and protocadherins in vertebrates mediate self-avoidance between 
the branches arising from a single neuron, helping the cell spread out its processes. 

The formation of synapses involves back-and-forth signaling between target 
cells and the growth cone. After the growth cones have reached their targets and 
initial connections have formed, individual synapses are eliminated in some places 
and reinforced in others by mechanisms that depend on synaptic and electrical 
activity. These mechanisms adjust the architecture of the neural network according 
to the way in which it is used. 


WHAT WE DON'T KNOW


• What regulates the pace of
development? Why does a mouse
embryo develop faster than a human
embryo, for example?


• What are the mechanisms that
allow cell memory to be stored during
development, explaining how each
cell’s history determines its future
behavior?


• How do signals move through
tissues? What are the roles of the
extracellular matrix and of elongated
cell projections?


• How does a cell know exactly where
it is in a multicellular organism? How
does it know that its neighbors are the
correct ones and that, if not, it should
move or kill itself?


• How do cells respond to tiny
gradients of molecules in their
environment, as required for knowing
their positions? How are morphogen
gradients reliably interpreted?


• What are the genetic changes that
allow the repurposing of existing body
parts during evolution? For example,
how did bat wings evolve from arms?


• How do cells use genetic
instructions to form the shape of
something as complex as the human
nose?


PROBLEMS


Which statements are true? Explain why or why not.


21-1 In the early cleavage stages, when the embryo
cannot yet feed, the developmental program is driven and
controlled entirely by the material deposited in the egg by
the mother.


21-2 Because of the many later developmental transformations
that produce the elaborately structured organs,
the body plan set up during gastrulation bears little resemblance
to the body plan in the adult.


21-3 As development progresses, individual cells
become more and more restricted in the range of cell types
they can give rise to.


21-4 At different stages of embryonic development,
the same signals are used over and over again by different
cells, but with different biological outcomes.


21-5 Changes in the coding regions of genes involved in
development are primarily responsible for the differences
between species.


21-6 The cell cycle is the ticking clock that sets the
tempo of developmental processes, with maturational
changes in gene expression being dependent on cell-cycle
progression.


Discuss the following problems. 


21-7 Name the four processes that are fundamental to
animal development, and describe each of them in a single 
sentence. 


21-8 What are the three germ layers formed during gas- 
trulation, and what are the principal structures each gives 
rise to in the adult? 


21-9 In the early Drosophila embryo, there seems to be
no requirement for the usual forms of cell-cell signaling; 
instead, transcriptional regulators and mRNA molecules 
move freely between nuclei. How can that be? 


21-10 Morphogens play a key role in development, cre- 
ating concentration gradients that inform cells of where 
they are and how to behave. Examine the simple patterns 
represented by the flags in Figure Q21-1. Which do you 
suppose could be created by a gradient of a single mor- 
phogen? Which would require gradients of two morpho- 
gens? Assuming that such patterns were present in a sheet 
of cells, explain how they could be created by morphogens. 





[Figure: flag images; recoverable labels: France, Norway]

Figure Q21-1 National flags from three countries (Problem 21-10).


21-11 Two adjacent cells in the nematode worm nor- 
mally differentiate into an anchor cell (AC) and a ventral 
uterine precursor (VU) cell, but which of the two becomes 
the AC and which becomes the VU cell is completely ran- 
dom: the cells have an equal chance of adopting either 
fate, but they always adopt different fates. Mutations of 
Lin12 alter these fates. In hyperactive Lin12 mutants, both
cells become VU cells, while in inactive Lin12 mutants, 
both cells become ACs. Thus, Lin12 is central to the deci- 
sion-making process. In genetic mosaics in which one 
precursor cell has the hyperactive Lin12 and the other pre- 
cursor has the inactive Lin12, the cell with the hyperactive 
Lin12 always becomes the VU cell and the cell with inac- 
tive Lin12 always becomes the AC. Assuming that one cell 
sends a signal and the other cell receives it, explain how 
these results suggest that Lin12 encodes a protein required 
to receive the signal. Offer a suggestion for how the fates 
of these two precursor cells are normally decided in wild- 
type worms. 


21-12 It was clear from the early days of studying devel- 
opment that certain “morphogenetic” substances were 
present in the egg and segregated asymmetrically into 
cells of the developing embryo. One such investigation 
in ascidian (sea squirt) embryos examined endodermal 
alkaline phosphatase, which could be visualized by a his- 
tochemical stain. Treatment of embryos with cytochalasin 
B stopped cell division, but did not block expression of 
alkaline phosphatase at the appropriate time. Treatment 
with actinomycin D, which blocks transcription, did not 
interfere with expression of alkaline phosphatase. Treat- 
ment with puromycin, which blocks translation, elim- 
inated expression of alkaline phosphatase. What is the 
likely nature of the morphogenetic substance that gives 
rise to alkaline phosphatase? 


21-13 The mouse HoxA3 and HoxD3 genes are paralogs 
that occupy equivalent positions in their respective Hox 
gene clusters and share roughly 50% identity in their pro- 
tein-coding sequences. Mice with defects in HoxA3 have 
deficiencies in pharyngeal tissues, whereas mice with 
defects in HoxD3 have deficiencies in the axial skeleton, 
suggesting quite different functions for the paralogs. Thus, 
it came as a surprise when it was found that replacing a 
defective HoxD3 gene with the normal HoxA3 gene cor- 
rected the deficiency, as did the reciprocal experiment 
of replacing a mutant HoxA3 gene with a normal HoxD3 
gene. Neither transplaced gene, however, could supply 
its normal function; that is, a normal HoxA3 gene at the 
HoxD3 locus could not correct the deficiency caused by 
a mutant HoxA3 gene at the HoxA3 locus. The same was 
true for the HoxD3 gene. If the HoxA3 and HoxD3 genes 
are equivalent, how do you suppose they can play such 
distinct roles in development? Why do you suppose they 
cannot perform their normal function in a new location? 


21-14 The segmentation of somites in vertebrate embryos 
is thought to depend on oscillations in the expression of the 
Hes7 gene. Mathematical modeling explains these oscilla- 
tions in terms of the delays in production of the unstable 
Hes7 protein, which acts as a transcription regulator to 
shut off its own expression. Once Hes7 decays, with a half- 
life of about 20 minutes, its transcription resumes. To test 
this model, you decide to reduce the total delay by remov- 
ing one, two, or all three of the introns from the Hes7 gene 
in mice. Why do you expect that intron removal would 
reduce the delay? What would you predict would happen 
to the oscillation time, and somite formation, if the model 
were correct? 


21-15 The oscillatory clock that drives somite formation 
in vertebrates involves three essential components: 
Her7 (an unstable repressor of its own synthesis), Delta (a 
transmembrane signaling molecule), and Notch (a trans- 
membrane receptor for Delta). Notch is bound by Delta on 
neighboring cells, activating the Notch signaling pathway, 
which then activates Her7 transcription. Normally, this 
system works flawlessly to create sharply defined somites 
(Figure Q21-2A). In the absence of Delta, however, only 
the first five somites form normally, and the rest are poorly 
defined (Figure Q21-2B). If a pulse of Delta is supplied 
later, somite formation returns to normal in the regions 
where Delta was present (Figure Q21-2C). A diagram of 
the connections between the components of the clock 
and how they interact in adjacent cells is shown in Figure 
Q21-2D. In the absence of Delta, why do the cells become 
unsynchronized? What is it about the presence of Delta 
that keeps adjacent cells oscillating in synchrony? 


21-16 The extracellular protein factor Decapentaplegic 
(Dpp) is critical for proper wing development in Drosoph- 
ila (Figure Q21-3A). It is normally expressed in a narrow 
stripe in the middle of the wing, along the anterior-pos- 
terior boundary. Flies that are defective for Dpp form 
stunted “wings” (Figure Q21-3B). If an additional copy 
of the gene is placed under control of a promoter that is 
active in the anterior part of the wing, or in the posterior 
Figure Q21-2 Somite formation in zebrafish embryos (Problem 21-15). 
(A) Wild-type embryos with normal somites. (B) Somite formation in 
embryos lacking Delta. The bracket indicates normal-looking somites 
where they initially form. (C) Somite formation in embryos lacking Delta, 
but receiving a pulse of Delta expression at the time indicated by the 
right-hand bracket. (D) Interactions among components of the oscillatory 
clock in adjacent cells. (Adapted from C. Soza-Ried et al., Development 
141:1780-1788, 2014. With permission from The Company of 
Biologists.) 





Figure Q21-3 Effects of Dpp expression on wing development in 
Drosophila (Problem 21-16). (A) Normal Dpp expression. (B) Absence 
of Dpp expression. (C) Additional anterior Dpp expression. 
(D) Additional posterior Dpp expression. (From M. Zecca, K. Basler 
and G. Struhl, Development 121:2265-2278, 1995. With permission 
from The Company of Biologists.) 
sion, cell growth, or both? How can you tell? 


21-17 The highly branched structures of neurons would 
seem to make it almost inevitable that they should make 
unproductive synapses with themselves, yet they manage 
to avoid this outcome very effectively. How is this accom- 

Stem Cells and 
Tissue Renewal 


Cells evolved originally as free-living individuals, and such cells still dominate the 
Earth and its oceans. But the cells that matter most to us, as human beings, are 
specialized members of a multicellular community. These cells have lost features 
needed for independent survival and acquired peculiarities that serve the needs 
of the body as a whole. Although they share the same genome, they are spectacu- 
larly diverse in structure, chemistry, and behavior. There are more than 200 differ- 
ent named cell types in the human body that collaborate with one another to form 
many different tissues, arranged into organs performing widely varied functions. 
To understand them, it is not enough to analyze cells in a culture dish: we need 
also to know how they live, work, and die in their natural habitat, the intact body. 

In Chapters 7 and 21, we saw how the various cell types become different in 
the embryo and how cell memory and signals from their neighbors enable them 
to remain different thereafter. In Chapter 19, we discussed the technology used 
to build multicellular tissues—the devices that bind cells together and the extra- 
cellular materials that give them support. But the adult body is not static: it is a 
structure in dynamic equilibrium, where new cells are continually being born, 
differentiating, and dying. Homeostatic mechanisms maintain a proper balance, 
so that the tissue architecture is preserved despite the constant replacement of 
old cells by new. In this chapter, we focus on these developmental processes that 
continue throughout life. In doing so, we shall illustrate some of the diversity of 
specialized cell types and see how they work together to perform their tasks. 

We shall examine in particular the role played in many tissues by stem cells— 
cells that are specialized to provide a fresh supply of differentiated cells where 
these need to be continually replaced, or when they are required in great num- 
ber for purposes of repair and regeneration. We shall see that while many tissues 
renew and repair themselves, some others do not; there, lost cells are lost forever, 
causing deafness, blindness, dementia, and other ills. 

In the final section of the chapter, we discuss how stem cells can be gener- 
ated and manipulated artificially, and we confront the practical question that 
underlies the current storm of interest in stem-cell technology: How can we use 
our understanding of the processes of cell differentiation and tissue renewal to 
improve upon nature, and make good those injuries and failings of the human 
body that have hitherto seemed to be beyond repair? 


STEM CELLS AND RENEWAL IN EPITHELIAL TISSUES 


Among all the self-renewing tissues in a mammal, the champion—for speed at 
least—is the lining of the small intestine: the long, convoluted portion of the gut 
tube that is chiefly responsible for absorption of nutrients from the gut lumen. To 
introduce stem cells, we take the small intestine as our starting point—not only 
because it renews itself at a greater rate than any other tissue in the body, but also 
because the molecular mechanisms that control its organization are particularly 
well understood. It thereby provides a beautiful illustration of the principles of 
stem-cell systems that have broad applicability. 




IN THIS CHAPTER 

STEM CELLS AND RENEWAL IN EPITHELIAL TISSUES 

FIBROBLASTS AND THEIR TRANSFORMATIONS: THE CONNECTIVE-TISSUE CELL FAMILY 

GENESIS AND REGENERATION OF SKELETAL MUSCLE 

BLOOD VESSELS, LYMPHATICS, AND ENDOTHELIAL CELLS 

A HIERARCHICAL STEM-CELL SYSTEM: BLOOD CELL FORMATION 

REGENERATION AND REPAIR 

CELL REPROGRAMMING AND PLURIPOTENT STEM CELLS 


1218 Chapter 22: Stem Cells and Tissue Renewal 


[Figure 22-1A labels: lumen of gut; villus (no cell division); cross section of villus; 
crypt; cross section of crypt; connective tissue; epithelial cell migration from “birth” 
at the bottom of the crypt to loss at the top of the villus (transit time is 3-5 days); 
absorptive cells; mucus-secreting goblet cells; nondividing differentiated cells; 
direction of movement; rapidly dividing cells (cycle time 12 hours); stem cells 
(cycle time ~24 hours); nondividing differentiated Paneth cells.] 


The Lining of the Small Intestine Is Continually Renewed Through 
Cell Proliferation in the Crypts 


The lining of the small intestine (and of most other regions of the gut) is a sin- 
gle-layered epithelium, only one cell thick. This epithelium covers the surfaces 
of the villi that project into the lumen, and it lines the crypts that descend into 
the underlying connective tissue (Figure 22-1). Dividing cells are restricted to 
the crypts, and differentiated cells, no longer dividing, pour out of the crypts in a 
steady stream onto the villi. There are four main types of nondividing differenti- 
ated cells—one absorptive and three secretory (Figure 22-2): 


1. Absorptive cells (also called brush-border cells or enterocytes) have densely 
packed microvilli on their exposed surfaces. Their job is to take up nutri- 
ents from the gut lumen. To this end, they also produce hydrolytic enzymes 
that perform some of the final steps of extracellular digestion. They are the 
majority cell type in the epithelium. 


2. Goblet cells secrete mucus into the gut lumen that covers the epithelium 
with a protective coat. 


3. Paneth cells form part of the innate immune defense system (discussed in 
Chapter 24) and secrete proteins that kill bacteria. 


4. Enteroendocrine cells, of more than 15 different subtypes, secrete serotonin 
and peptide hormones that act on neurons and other cell types in the gut 
wall and regulate the growth, proliferation, and digestive activities of cells 
of the gut and other tissues. 

As if on a conveyor belt, the absorptive, goblet, and enteroendocrine cells 
travel mainly upward from their site of birth in the crypt, by a sliding movement in 
the plane of the epithelial sheet, to cover the surfaces of the villi. Within 3-5 days 
(in the mouse) after emerging from the crypts, the cells reach the tips of the villi, 
where they undergo apoptosis and are finally discarded into the gut lumen (see 
Movie 20.6). The Paneth cells in the crypts are produced in much smaller numbers 
and have a different migration pattern. They live at the bottom of the crypts, 
where they too are continually replaced, although not so rapidly, persisting for 
several weeks (in the mouse) before undergoing apoptosis and being phagocytosed 
by their neighbors. 

Figure 22-1 Renewal of the gut lining. 
(A) The pattern of cell turnover and proliferation in the epithelium that forms the 
lining of the small intestine. Stem cells (red) lie at the crypt base, interspersed among 
nondividing differentiated cells (Paneth cells). Progeny of the stem cells move 
mainly upward from the crypts onto the villi; after a few quick divisions, they cease 
dividing and differentiate, some of them while still in the crypt, most of them as they 
emerge from the crypt. The Paneth cells, like the other nondividing differentiated 
cells, are continually replaced by progeny of the stem cells, but they migrate 
downward to the crypt base and survive there for many weeks. (B) Photograph 
of a section of part of the lining of the small intestine, showing the crypts and 
villi. Note the mixture of differentiated cell types, all generated from the stem cells; 
these are primarily absorptive cells, with mucus-secreting goblet cells (stained red) 
interspersed among them. Enteroendocrine cells (not labeled) are less numerous and 
less easy to identify without special stains. 

The central problem is to understand the processes in the crypt that generate 
a continual supply of all these nondividing, terminally differentiated cell types. 


Stem Cells of the Small Intestine Lie at or Near the Base of Each 
Crypt 


The general pattern of cell proliferation and migration in the gut lining is revealed 
by a simple labeling method that uses injected pulses of tritiated (radioactive) 
thymidine or of a thymidine analog that can be detected in tissue sections. Cells 
that are in S phase of the division cycle incorporate the marker molecule into their 
DNA, and their fate can then be followed over subsequent hours and days. If a 
cell divides after incorporation of the label, the label becomes diluted, halving 
with each cell cycle. This can be quantified. Experiments based on this labeling 
method confirm, first of all, that dividing cells are confined to the crypts and that 
the differentiated cell types listed above do not divide. Second, the most rapidly 
dividing cells, with a cycle time of about 12 hours in the mouse, are shown to lie in 
the middle and upper parts of the crypt, and these cells are all fated to differentiate 
and stop dividing (see Figure 22-1A). Just above the base of the crypt, interspersed 
among the Paneth cells, lie cells that divide more slowly. These are the stem cells, 
which feed some of their progeny into the higher levels of the crypt destined for 
differentiation, while other progeny remain at the crypt base to continue the 
whole process. The rapidly dividing cells above these stem cells are derived from 
them, but already committed to differentiation. These cells are called committed 
precursors or transit amplifying cells, since their divisions serve to amplify the 
number of differentiated cells that ultimately result from each stem-cell division. 
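
The arithmetic behind label dilution and transit amplification can be made concrete with a short sketch. It is an idealization rather than a description of any particular experiment: it assumes that every cell divides on a fixed schedule, that the label is halved exactly at each division, and that a committed daughter undergoes a fixed number of transit amplifying divisions. The function names and the 48-hour example are ours, chosen only for illustration.

```python
def label_fraction(divisions: int) -> float:
    """Label remaining per cell, relative to the amount incorporated in S phase."""
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    # Each division shares the labeled DNA between two daughters.
    return 0.5 ** divisions


def completed_divisions(hours_since_pulse: float, cycle_time_hours: float) -> int:
    """Whole cell cycles completed since the pulse (idealized, fixed cycle time)."""
    return int(hours_since_pulse // cycle_time_hours)


def amplified_progeny(transit_divisions: int) -> int:
    """Cells produced from one committed daughter after its transit amplifying divisions."""
    return 2 ** transit_divisions


# Cycle times quoted in the text for the mouse: about 12 h and about 24 h.
for name, cycle_time in [("transit amplifying cell", 12), ("stem cell", 24)]:
    n = completed_divisions(48, cycle_time)
    print(f"{name}: {n} divisions in 48 h, label fraction {label_fraction(n)}")
```

Under these assumptions, 48 hours after the pulse a cell cycling every 12 hours retains 1/16 of its label, whereas a cell cycling every 24 hours retains 1/4. This difference is what allows slowly and rapidly dividing populations to be told apart. In the same spirit, `amplified_progeny(4)` returns 16: four transit amplifying divisions would turn one committed daughter into 16 differentiated cells.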


The Two Daughters of a Stem Cell Face a Choice 
Stem cells have a critical role in a variety of tissues, and it is useful to list their 
defining properties: 
1. A stem cell is not itself terminally differentiated: that is, it is not at the end 
of a pathway of differentiation. 
2. It can divide without limit (or at least for the lifetime of the animal). 


3. When it divides, each daughter has a choice: it can either remain a stem 
cell, or it can embark on a course that commits it to terminal differentiation 
(Figure 22-3). 

Figure 22-2 The four main differentiated cell types found in the epithelial lining of 
the small intestine. All cells are oriented with the gut lumen at top. Broad orange 
arrows indicate direction of secretion or uptake of materials for each type of 
cell. All of these cells are generated from undifferentiated multipotent stem cells 
living near the bottoms of the crypts (see Figure 22-1). Absorptive (brush-border) 
cells outnumber the other cell types in the epithelium by about 10:1 or more. The 
microvilli on their apical surface provide a 30-fold increase of surface area, not 
only for the import of nutrients but also for the anchorage of enzymes that perform 
the final stages of extracellular digestion, breaking down small peptides and 
disaccharides into monomers that can be transported across the cell membrane. 
Goblet cells secrete mucus; these are the commonest of the secretory cell 
types. Paneth cells secrete (along with some growth factors) cryptdins, proteins 
of the defensin family that kill bacteria. Different subtypes of enteroendocrine cells 
secrete serotonin and peptide hormones into the gut wall (and thence the blood). 
Cholecystokinin is a hormone released from enteroendocrine cells in response 
to the presence of nutrients in the gut. It binds to receptors on nearby sensory nerve 
endings, which relay a signal to the brain to stop the feeling of hunger once one has 
eaten enough. (After T.L. Lentz, Cell Fine Structure. Philadelphia: Saunders, 1971; 
R. Krstic, Illustrated Encyclopedia of Human Histology. Berlin: 
Springer-Verlag, 1984.) 

Stem cells are required wherever there is a recurring need to replace differentiated 
cells that cannot themselves divide. Although a stem cell must be able 
to divide, it does not necessarily have to divide rapidly; in fact, many stem cells 
divide at a relatively slow rate. 

Stem cells are of many types, specialized for the genesis of different classes 
of terminally differentiated cells—intestinal stem cells for intestinal epithelium, 
epidermal stem cells for epidermis, hematopoietic stem cells for blood, and so on. 
Each stem-cell system nevertheless raises similar fundamental questions. What 
are the distinguishing features of the stem cell in molecular terms? What condi- 
tions serve to keep the stem cell in its proper place and to maintain its stem-cell 
character? What decides whether a given daughter cell commits to differentiation 
or remains a stem cell? In a tissue where several distinct types of differentiated 
cells must be produced, are they all derived from a single type of stem cell, or is 
there a distinct type of stem cell for each one? 


Wnt Signaling Maintains the Gut Stem-Cell Compartment 


For the gut, the beginnings of an answer to these questions came from studies of 
cancer of the colon and rectum (the lower end of the gut, also known as the large 
intestine). Some people have a hereditary predisposition to colorectal cancer and, 
in advance of the invasive disease, develop large numbers of small precancerous 
tumors (adenomas) in the lining of this part of the gut (Figure 22-4). The appear- 
ance of these tumors suggests that they have arisen from intestinal crypt cells that 
have failed to halt their proliferation in the normal way. As discussed in Chapter 
20, the cause has been traced to mutations in the Apc (adenomatous polyposis 
coli) gene: the tumors arise from cells that have lost both gene copies. Because 
Apc codes for a protein that prevents inappropriate activation of the Wnt signaling 
pathway (see Figure 15-60), this loss of Apc is presumed to mimic the effect of 
continual exposure to a Wnt signal. The suggestion, therefore, is that Wnt signal- 
ing normally keeps crypt cells in a proliferative state, and that a cessation of expo- 
sure to Wnt signaling normally makes them stop dividing as they leave the crypt. 
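
The logic of this argument can be written out as a small truth table. The sketch below is a deliberately crude Boolean caricature, not a model of the pathway: it assumes that a single functional copy of Apc is enough to restrain the pathway in the absence of a Wnt signal (as the two-hit origin of the tumors implies) and that pathway activity alone decides whether a crypt cell keeps proliferating. The function name is ours.

```python
def wnt_pathway_active(wnt_signal_present: bool, functional_apc_copies: int) -> bool:
    """Boolean caricature of Wnt pathway activity in a crypt cell."""
    if not 0 <= functional_apc_copies <= 2:
        raise ValueError("a diploid cell has 0, 1, or 2 functional Apc copies")
    # Assumption: one functional copy of Apc suffices to restrain the pathway.
    apc_restrains_pathway = functional_apc_copies > 0
    # The pathway is active if a Wnt signal is received or if the Apc brake is lost.
    return wnt_signal_present or not apc_restrains_pathway


for wnt in (True, False):
    for copies in (2, 1, 0):
        state = "proliferates" if wnt_pathway_active(wnt, copies) else "stops dividing"
        print(f"Wnt signal={wnt!s:5}  functional Apc copies={copies}  ->  cell {state}")
```

The only row in which a cell that has left the Wnt-rich part of the crypt continues to proliferate is the one with no functional Apc copies, which corresponds to the cells that found an adenoma.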


Stem Cells at the Crypt Base Are Multipotent, Giving Rise to the 
Full Range of Differentiated Intestinal Cell Types 


It has long been suspected that all the differentiated cell types in the lining of the 
intestine derive from a single type of stem cell. But firm proof was lacking, and the 
precise nature and location of the stem cells were disputed. 

To solve the problem, and indeed to understand the organization of any stem-cell 
system, we need to discover how its cells are related to one another: who 
is descended from whom, or, equivalently, what progeny will be produced from 
any given cell. This can best be done using a heritable marker that can be activated 
in an individual cell, thus allowing the identification of the clone of progeny 
descended from that cell. A modern method uses transgenic animals to create 
a visible genetic mark in just a few widely spaced cells, which, in the course of 
time, give rise to widely separated and easily distinguished clones of progeny, as 
explained in Figure 22-5. 

Figure 22-3 The definition of a stem cell. 
Each daughter produced when a stem cell divides can either remain a stem cell or 
go on to become terminally differentiated. In many cases, the daughter that opts for 
terminal differentiation undergoes additional cell divisions before terminal differentiation 
is completed; such cells are called transit amplifying cells. 

Figure 22-4 An adenoma in the human colon, compared with normal tissue 
from an adjacent region of the same person’s colon. The specimen is from a 
patient with an inherited mutation in one of his two copies of the Apc gene. A mutation 
in the other Apc gene copy, occurring in a colon epithelial cell during adult life, has 
given rise to a clone of cells that behave as though the Wnt signaling pathway is 
permanently activated. As a result, the cells of this clone form an adenoma: an 
enormous, steadily expanding mass of giant cryptlike structures. 

Figure 22-5 Clonal analysis using a genetic marker. A modern method for tracking cell lineage 
uses transgenic animals containing two transgenes, which together drive expression of a readily 
detected and heritable marker protein in a small subset of stem cells. The first transgene (top) 
carries two adjacent protein-coding sequences, GFP and CreERT2, both expressed under the 
control of the Lgr5 promoter that is active only in stem cells and not in their differentiated progeny. 
GFP encodes green fluorescent protein (see Chapter 9), which is used here simply to confirm 
expression in the entire stem-cell population. The CreERT2 gene encodes a chimeric form of the 
Cre recombinase called CreERT, which consists of Cre recombinase linked to the estrogen receptor 
protein; this enzyme becomes active as a recombinase only when it binds the artificial estrogen 
analog tamoxifen. 

The second transgene (bottom) carries a marker gene, LacZ, under the control of a promoter 
that is active in all cells. The LacZ gene encodes β-galactosidase, an enzyme that can be detected 
histochemically in tissues (see Figure 7-28). However, LacZ expression in the transgene shown 
here is prevented by a blocking sequence (red) that is flanked by LoxP sites (pink; see Figure 
5-66). When tamoxifen is provided, CreERT becomes active, leading to a recombination event 
that removes the blocking DNA sequence (and leaves one LoxP site behind). As a result, the LacZ 
marker is expressed. Because this change is heritable, the marker continues to be expressed in 
all cells descended from those in which a recombination event has occurred. With a low dose of 
the inducer molecule tamoxifen, it is possible to activate the marker at random in just a few widely 
spaced cells, which, in the course of time, give rise to widely separated and easily distinguished 
clones of progeny (see Figure 22-6). 

A search among genes that are strongly upregulated in response to Wnt sig- 
naling revealed one, called Lgr5, that is expressed in gut stem cells specifically. 
The technique described in Figure 22-5 can be used to create a genetic mark in a 
random subset of Lgr5-expressing cells—a mark that is inherited by the progeny 
of each cell. These Lgr5 cells divide with a cycle time of about 24 hours, and within 
a few days marked clones are seen extending from the crypt bases up along the 
sides of the villi. After as long as 60 days or more, many of these clones still persist, 
retaining one or more members at the crypt base and extending all the way up to 
the tips of the villi (Figure 22-6). Moreover, each single clone typically contains 
all the major differentiated gut cell types—absorptive, goblet, Paneth, and entero- 
endocrine—in their normal proportions. The Lgr5-expressing cells, therefore, are 
true stem cells that are multipotent—that is, able to generate a diverse set of dif- 
ferentiated cell types. 
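
The bookkeeping behind this kind of lineage tracing can be sketched in a few lines. The model below is a toy with assumptions of our own: a cell is reduced to two properties (whether it expresses Lgr5, and whether its marker has been switched on), the effect of a low tamoxifen dose is represented by a single recombination probability, and a division by default yields one stem-cell daughter and one committed daughter. The class and function names are hypothetical and are not taken from the experiments described.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    expresses_lgr5: bool   # stem cell: the Lgr5 promoter drives CreERT2
    marked: bool = False   # True once the LoxP-flanked blocking sequence is excised


def give_tamoxifen(cell: Cell, recombination_probability: float,
                   rng: random.Random) -> Cell:
    """Recombination needs both CreERT2 (Lgr5-expressing cells only) and tamoxifen."""
    if cell.expresses_lgr5 and not cell.marked and rng.random() < recombination_probability:
        return Cell(expresses_lgr5=True, marked=True)
    return cell


def divide(cell: Cell, daughters_stay_stem=(True, False)) -> tuple:
    """Both daughters inherit the mark; only a stem cell can give a stem-cell daughter."""
    return tuple(
        Cell(expresses_lgr5=cell.expresses_lgr5 and stay, marked=cell.marked)
        for stay in daughters_stay_stem
    )


rng = random.Random(0)
# A low dose (here an assumed 5% chance per stem cell) marks only occasional cells.
crypt_stem_cells = [give_tamoxifen(Cell(True), 0.05, rng) for _ in range(15)]
print(sum(c.marked for c in crypt_stem_cells), "of 15 stem cells marked")
```

Two features of the real experiment are captured here. The mark can arise only in Lgr5-expressing cells, because only they make the recombinase. Once made, the mark persists in every descendant, including differentiated cells that no longer express Lgr5, which is why a whole clone can be recognized weeks later.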

The Two Daughters of a Stem Cell Do Not Always Have to 
Become Different 


If the number of stem cells in a crypt is to remain stable, each stem-cell division 
must on average generate one daughter that remains a stem cell and one that 
becomes committed to differentiation. In principle, this could be achieved in at 
least two ways (Figure 22-7). 

One mechanism—the simplest at first sight—would be through asymmetric 
division: processes internal to the dividing stem cell could distribute regulatory 
factors asymmetrically to its two daughters, as occurs in Drosophila neuroblast 
divisions (see Figure 21-36). The factors inherited by one daughter would cause it 
to remain a stem cell, while those inherited by the other would drive it toward dif- 
ferentiation. This strategy would guarantee that the original stem cell would give 
rise to precisely one stem cell in every subsequent cell generation. 

An alternative strategy would be based on a choice that each daughter makes 
independently of its sister: in normal circumstances, each would have a 50% probability 
of remaining as a stem cell and a 50% probability of commitment to differentiation. 
Sometimes the two daughters of a stem cell would thus have opposite 
fates, sometimes the same. The choice that each cell makes might either be sto- 
chastic, like the flip of a coin, or governed by the environment in which the cell 
finds itself. A strategy of independent choices is more flexible than that of strict 
asymmetric division. In particular, environmental factors can control the balance 
of probabilities, adjusting them in favor of the stem-cell option where more stem 
cells are needed, as they often are, either for growth or for damage repair. 

Clonal analysis gives a way to distinguish between the two strategies, since 
they give quite different predictions as to the expected number of clones of differ- 
ent sizes produced from individual stem cells (see Figure 22-7). For the gut, the 
findings seem clear: the independent-choice theory fits the observations, and the 
asymmetric-division theory does not. 
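
The difference between the two predictions can be illustrated with a simple branching-process calculation. The sketch below is not the analysis used in the published studies. It assumes that all stem cells in a clone divide in synchronous rounds and that each daughter independently remains a stem cell with a fixed probability (50% by default), and it tracks only the number of stem cells in the clone. The function names are ours.

```python
from math import comb


def stem_cell_distribution(rounds: int, p_stem: float = 0.5) -> dict[int, float]:
    """Exact distribution of stem-cell number in a clone founded by one stem cell,
    after the given number of division rounds, under independent choice."""
    dist = {1: 1.0}
    for _ in range(rounds):
        next_dist: dict[int, float] = {}
        for n_stem, prob in dist.items():
            daughters = 2 * n_stem
            # The number of daughters that stay stem cells is binomial(daughters, p_stem).
            for k in range(daughters + 1):
                weight = comb(daughters, k) * p_stem**k * (1 - p_stem) ** (daughters - k)
                next_dist[k] = next_dist.get(k, 0.0) + prob * weight
        dist = next_dist
    return dist


def asymmetric_distribution(rounds: int) -> dict[int, float]:
    """Strict asymmetric division: always exactly one stem cell per clone."""
    return {1: 1.0}


def mean_stem_cells(dist: dict[int, float]) -> float:
    return sum(n * p for n, p in dist.items())


for rounds in (1, 2, 5):
    dist = stem_cell_distribution(rounds)
    print(f"rounds={rounds}: P(no stem cells left)={dist[0]:.4f}, "
          f"mean stem cells={mean_stem_cells(dist):.4f}")
```

With a 50% probability, the average number of stem cells per clone stays at one in every round, just as in strict asymmetric division, so the two strategies cannot be told apart by averages. They differ in the spread. Under independent choice a quarter of clones lose their stem cell at the first division and the fraction keeps rising, while the surviving clones grow larger. Under asymmetric division every clone retains exactly one stem cell. It is this difference in the distribution of clone sizes that clonal analysis can detect.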


Paneth Cells Create the Stem-Cell Niche 


There are about 15 Lgr5-expressing stem cells in each crypt. They are slim and 
columnar, and they sit at the crypt base interspersed among the Paneth cells (see 
Figure 22-6). This is the intestinal stem-cell niche: the Paneth cells generate sig- 
nals, including a strong Wnt signal, that act over a short range to maintain the 
stem-cell state. Signal proteins from the connective tissue surrounding the crypt 
base help to reinforce the localizing signal from the Paneth cells; Lgr5 itself is a 
receptor for one of these proteins, called R-spondin. 

In the intestine, it seems that the niche created by the Paneth cells has space 
for only a limited number of stem cells, and when these divide, it is a random 
matter which of them are pushed out of the nest and condemned to differentiation 
and which stay in place as stem cells for the future. In most other stem-cell 
systems where the question has been examined, it appears that the fates of the 
daughters of a stem cell are assigned in a similar way, independently and subject 
to influence from the cells’ environment. 

Figure 22-6 Lgr5-expressing stem cells and their progeny in the small intestine. 
The method shown in Figure 22-5 was used here to mark single intestinal stem 
cells and trace the fates of their progeny. The Lgr5 gene encodes a member of the 
family of G-protein-linked transmembrane receptors, and it is expressed specifically 
in stem cells near the crypt base. Because the Lgr5 promoter was used to drive 
expression of CreERT2, treatment with a low dose of tamoxifen resulted in 
occasional stem cells expressing LacZ. These cells and all of their progeny could 
subsequently be detected with a blue histochemical stain. All of the blue cells in 
these images derive from a single Lgr5-expressing stem cell. After 60 days, the 
blue progeny of this cell are seen to extend all the way up a villus. These progeny 
can be shown to include all types of differentiated cells, as well as persistent 
Lgr5-expressing cells at the crypt base. This proves that Lgr5-expressing cells are 
multipotent stem cells. (From N. Barker et al., Nature 449:1003-1007, 2007. With 
permission from Macmillan Publishers Ltd.) 

Figure 22-7 Two ways for a stem cell to produce daughters with different fates: asymmetric division and independent 
choice. (A) The asymmetric-division strategy gives a clone consisting of precisely one stem cell plus a steadily increasing 
number of differentiating cells, in proportion to the number of cell divisions. (B) The independent-choice strategy is more variable 
in its outcome. With a choice made at random by each daughter and with a 50% probability for each one to remain a stem cell 
or differentiate, there is, for example, a 25% chance at the first division that both daughters will differentiate, so that the clone 
eventually goes extinct. Or, at this division or later, a preponderance of daughters may chance to retain stem-cell character, 
creating a clone that persists and increases in size. With the help of some mathematics, the probability distribution of clone 
sizes generated from a single stem cell at any given time can be predicted on this stochastic assumption. The observations in 
the gut and elsewhere fit the stochastic independent-choice strategy, but not the asymmetric-division strategy. 


A Single Lgr5-expressing Cell in Culture Can Generate an Entire 
Organized Crypt-Villus System 


The Paneth cells themselves are progeny of the stem cells, suggesting that the 
intestinal stem-cell system is in some way self-maintaining and self-organizing. 
This is demonstrated in a striking way by taking single dissociated Lgr5-expressing 
cells and allowing them to proliferate in culture, embedded in a cell-free matrix 
rich in the basal-lamina component laminin (mimicking basal lamina). The cells 
proliferate, forming at first small, round epithelial vesicles. Within a few days, 
however, one or another of the cells in the vesicle, at random, begins to differen- 
tiate as a Paneth cell. This induces its neighbors to behave as stem cells and initi- 
ates transformation of the simple vesicle into an organized structure, or organoid 
(Figure 22-8A,B). Protrusions resembling crypts grow out into the surrounding 
matrix and contain Paneth cells, Lgr5-expressing stem cells, and the transit ampli- 
fying cells derived from them; these cell types are confined to the cryptlike struc- 
tures. Terminally differentiated, nondividing absorptive cells line the other parts 
of the organoid epithelium, with their microvilli facing the lumen. Goblet and 
enteroendocrine cells are also present, scattered through the epithelium, and the 
whole “minigut” structure, with all its cell types, grows and renews itself in much 
the same way as the lining of the normal intestine. 

Figure 22-8 Genesis of a minigut from a single Lgr5-expressing cell cultured in a cell-free 
matrix. (A,B) The founder cell first divides to form a small vesicle. At random, one or more of the 
cells in this vesicle differentiates as a Paneth cell (blue). This cell maintains Lgr5 expression (yellow) 
in its immediate neighbors, which persist as stem cells that generate the full range of intestinal cell 
types. (C) Schematic diagram of the key organizing signals. The Paneth cells organize crypts by 
producing a Wnt signal that acts on neighboring cells and keeps them proliferating in the stem-cell 
state. A repulsive interaction based on ephrin-Eph binding causes the crypt cell types (which 
express EphB, induced by Wnt) to segregate from the nondividing differentiated villus cell types 
(which express EphrinB). Both ephrin and Eph are cell-surface proteins attached to the plasma 
membrane; in many tissues, two cells that contain a different member of this pair repel each other 
when they touch (see Figure 21-49). (Adapted from T. Sato and H. Clevers, Science 340:1190-1194, 
2013. With permission from AAAS.) 


Ephrin-Eph Signaling Drives Segregation of the Different Gut Cell 
Types 


The remarkable self-organizing behavior of the cultured organoids suggests that 
some interaction among the different epithelial cells drives them to segregate 
from one another. The ephrin-Eph signaling pathway (discussed in Chapter 15) 
appears to be responsible. The cells that live in the crypts express EphB receptor 
proteins, while absorptive, goblet, and enteroendocrine cells, as they begin to dif- 
ferentiate, switch off expression of this receptor and instead switch on expression 
of its ligands, cell-surface proteins of the EphrinB family (Figure 22-8C). In vari- 
ous other tissues, cells expressing Eph proteins are repelled by contacts with cells 
expressing ephrins on their surface (see Figures 21-49 and 21-79). It seems that 
the same is true in the gut lining, and that this mechanism serves to keep the cells 
segregated and in their proper places. In EphB knockout mutants, the populations 
become mixed, so that, for example, Paneth cells wander out onto the villi. 


Notch Signaling Controls Gut Cell Diversification and Helps 
Maintain the Stem-Cell State 


If a single type of stem cell generates all the differentiated cell types in the gut 
lining, what causes the progeny of this stem cell to diversify? Notch signaling has 
this role in many other systems, where it mediates lateral inhibition—a compet- 
itive interaction that drives neighboring cells toward different fates (see Figure 
15-58 and Figure 21-35). All the essential components of the Notch pathway 
are expressed in the crypts; it seems that Wnt signaling maintains them there. If 
Notch signaling is abruptly blocked, within a few days all the cells in the crypts 
differentiate as goblet cells, and absorptive cells cease to be produced; conversely, 
if Notch signaling is artificially activated in all the cells, absorptive cells continue 
to be generated but no goblet cells are produced. This reflects the lateral inhibition 
mechanism operating in normal animals: the nascent goblet (and other 
secretory) cells express the Notch ligand Delta and thereby activate Notch in their 
neighbors, inhibiting them from differentiating as secretory (Figure 22-9). 
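
The logic of lateral inhibition can be illustrated with a two-cell toy model. The sketch below is only an illustration under stated assumptions: Notch activity in each cell rises with the Delta level of its neighbor, and Delta in each cell is repressed by that cell's own Notch activity. The functional forms, the parameters `a` and `b`, and the starting values are arbitrary illustrative choices, not measurements from the gut.

```python
def simulate_pair(d1=0.51, d2=0.50, steps=10_000, dt=0.01,
                  a=0.01, b=100.0):
    """Euler-integrate a two-cell Delta-Notch toy model.

    Returns ((notch1, delta1), (notch2, delta2)) at the end of the run.
    All parameter values are illustrative assumptions.
    """
    def activation(delta_neighbor):
        # Notch activation rises with the neighbor's Delta level.
        return delta_neighbor ** 2 / (a + delta_neighbor ** 2)

    def repression(notch_own):
        # Delta production falls as the cell's own Notch activity rises.
        return 1.0 / (1.0 + b * notch_own ** 2)

    n1 = n2 = 0.0  # both cells start with no Notch activity
    for _ in range(steps):
        dn1 = activation(d2) - n1
        dn2 = activation(d1) - n2
        dd1 = repression(n1) - d1
        dd2 = repression(n2) - d2
        n1 += dt * dn1
        n2 += dt * dn2
        d1 += dt * dd1
        d2 += dt * dd2
    return (n1, d1), (n2, d2)

if __name__ == "__main__":
    (n1, d1), (n2, d2) = simulate_pair()
    print(f"cell 1: Notch={n1:.2f}, Delta={d1:.2f}")
    print(f"cell 2: Notch={n2:.2f}, Delta={d2:.2f}")
```

Starting from a small difference in Delta, the model amplifies that difference until one cell has high Delta and low Notch activity (the nascent secretory cell in the text's terms) and its neighbor has high Notch activity and low Delta. Two cells that start exactly identical stay identical in this deterministic sketch, which is why some initial asymmetry or noise is needed.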

Delta-Notch signaling is crucial not only in the transit amplifying population, 
but also at the crypt base: the Paneth cells express Delta and this activates Notch 
in the stem cells, inhibiting differentiation. Without this influence, the stem cells 
lose their special character and differentiate as secretory cells. Thus maintenance 
of the intestinal stem-cell state requires a combination of signals, with both Wnt 
and Notch acting as central players. 


The Epidermal Stem-Cell System Maintains a Self-Renewing 
Waterproof Barrier 


Stem-cell systems are organized in many different ways, but they share some 
underlying principles. Consider the epidermis, for example—the outer, epithelial 
covering of the body. The epidermis undergoes continual renewal, but, unlike the 
lining of the gut, it is multilayered or stratified. Stem cells are located in the basal 
layer, and their progeny move outward toward the exposed surface, differentiating 
as they go. They end up as lifeless scales or squames, which are eventually shed 
from the surface of the skin (Figure 22-10). Even though the architecture of this 
tissue is very different from that of the intestine, many of the same basic principles 
apply. The stem cells depend for their existence on signals from a specific niche, 
in this case the basal lamina and underlying connective tissue. The daughters of 
stem cells that are committed to differentiation undergo several divisions as tran- 
sit amplifying cells (while still in the basal layer) before differentiating. Finally, a 
stochastic independent-choice mechanism dictates the fates of the daughters of a 
stem-cell division, allowing for increase in the number of stem cells when needed 
for growth or wound healing. Most of the same signaling pathways that organize 
the intestinal stem-cell system are also involved in regulating the epidermal 
stem-cell system, although with different individual roles. 

Figure 22-9 How Notch signaling, in combination with Wnt, maintains stem cells and drives cell 
diversification in the intestine. Wnt signaling leads to expression of Notch and Delta in the cells 
of the crypt, and Delta-Notch signaling in the crypt mediates lateral inhibition between adjacent 
cells. Cells expressing higher levels of Delta eventually activate Notch in their neighbors, adopt a 
secretory fate, and stop dividing; their neighbors, with activated Notch, are prevented from 
differentiating and keep on dividing. Essentially the same process operates at the crypt base, 
where the Paneth cells express higher levels of Delta to prevent stem cells from differentiating, 
and in the transit amplifying population, where nascent secretory cells express higher levels of 
Delta. Division continues in the Notch-activated cells as they move up the crypt, until they escape 
from the influence of Wnt and emerge onto the villi to become absorptive cells. 

Figure 22-10 The multilayered structure of the epidermis, as seen in thin skin of a mouse. (A) The epidermis forms 
the outer covering of the skin, creating a waterproof barrier that is self-repairing and continually renewed. Beneath this lies 
a relatively thick layer of connective tissue, which includes the tough, collagen-rich dermis (from which leather is made) and 
the underlying fatty subcutaneous layer or hypodermis. The cells of the epidermis are called keratinocytes, because their 
characteristic differentiated activity is the synthesis of keratin intermediate filament proteins, which give the epidermis its 
toughness. These cells change their appearance and properties from one layer to the next, progressing through a regular 
program of differentiation. Those in the innermost layer, attached to an underlying basal lamina, are termed basal cells, and it is 
usually only these that divide: the basal cell population includes relatively small numbers of stem cells along with larger numbers 
of transit amplifying cells derived from them. Above the basal cells are several layers of larger prickle cells, shown in top view 
in (B), whose numerous desmosomes (each a site of anchorage for thick tufts of keratin filaments) are just visible in the light 
microscope as tiny prickles around the cell surface. Beyond the prickle cells lies the thin, darkly staining granular cell layer, 
where the cells are sealed together to form a waterproof barrier; this marks the boundary between the inner, metabolically active 
strata and the outermost layer of the epidermis, consisting of dead cells whose intracellular organelles have disappeared. These 
outermost cells are reduced to flattened scales, or squames, filled with densely packed keratin, which are eventually shed from 
the surface of the skin. The time from exit of a cell from the basal layer to its loss by shedding at the surface is a week or two, 
depending on body region and species. 

In addition to the cells destined for keratinization, the deep layers of the epidermis include small numbers of cells (not shown) 
that invade this tissue and have quite different origins and functions. These immigrants include dendritic cells, called Langerhans 
cells, derived from bone marrow and belonging to the immune system; melanocytes (pigment cells) derived from the neural 
crest; and Merkel cells, which are associated with nerve endings in the epidermis. (B, from R.V. Krstic, Ultrastructure of the 
Mammalian Cell: an Atlas. Berlin: Springer-Verlag, 1979.) 


Tissue Renewal That Does Not Depend on Stem Cells: Insulin- 
Secreting Cells in the Pancreas and Hepatocytes in the Liver 


Some types of cells can divide even though fully differentiated, allowing for 
renewal and regeneration without the use of stem cells. The insulin-secreting cells 
(β cells) of the pancreas are one example. Their mode of renewal has a special 
importance, because it is the loss of these cells (through autoimmune attack) that 
is responsible for type 1 (juvenile-onset) diabetes; they are also a significant factor 
in the type 2 (adult-onset) form of the disease. The β cells are normally sequestered 
in cell clusters called islets of Langerhans. These islets contain no obvious 
subset of cells specialized to act as stem cells, yet fresh β cells are continually generated 
within them. Lineage tracing studies, similar to those described above for 
the gut, show that the renewal of this population normally occurs by simple dupli- 
cation of the existing insulin-expressing cells, and not by means of stem cells. 
Another tissue that can renew by simple duplication of fully differentiated cells 
is the liver. The main cell type in the liver is the hepatocyte, a large cell that per- 
forms the liver’s metabolic functions. Hepatocytes normally live for a year or more 
and renew themselves through cell division at a very slow rate. Powerful homeo- 
static mechanisms operate to adjust the rate of cell proliferation or the rate of cell 
death, or both, so as to keep the organ at its normal size or restore it to that size 
in case of damage. A dramatic effect is seen if large numbers of hepatocytes are 
removed surgically or are killed by poisoning with carbon tetrachloride. Within 
a day or so after either sort of damage, a surge of cell division occurs among the 
surviving hepatocytes, quickly replacing the lost tissue. If two-thirds of a rat’s liver 
is removed, for example, a liver of nearly normal size can regenerate from the 
remainder by hepatocyte proliferation within about two weeks. 
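
As a rough check on these numbers, the regrowth can be expressed as a number of population doublings. The calculation below is back-of-the-envelope arithmetic only: it assumes that all surviving hepatocytes divide equally and that cell number is proportional to liver mass, neither of which is stated in the text. The function name `doublings_needed` is illustrative.

```python
import math

def doublings_needed(fraction_remaining):
    """Population doublings required to restore the original cell number,
    assuming every surviving cell divides equally often."""
    return math.log2(1.0 / fraction_remaining)

if __name__ == "__main__":
    remaining = 1.0 / 3.0  # two-thirds of the liver removed
    print(f"doublings needed: {doublings_needed(remaining):.2f}")
```

Under these assumptions the surviving third of the liver has to triple, which takes fewer than two rounds of division per hepatocyte on average. This is consistent with regeneration being complete within about two weeks.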

Both the pancreas and the liver contain small populations of stem cells that 
can be called into play as a backup mechanism for production of the differen- 
tiated cell types in more extreme circumstances. This imparts resilience to the 
mechanisms of renewal and repair. 


Some Tissues Lack Stem Cells and Are Not Renewable 


The variety among tissues in the capacity for self-renewal is illustrated in a striking 
way by comparing the olfactory epithelium in the nose, the auditory epithelium 
of the inner ear, and the photoreceptive epithelium of the retina. These three sen- 
sory structures, which like the epidermis develop from the ectodermal layer of 
the early embryo, differ radically in their self-renewal capabilities. The olfactory 
epithelium contains a population of stem cells that give rise to differentiated cells 
that have a limited life-span and are continually replaced. But unlike the epider- 
mis, these differentiated cells (the olfactory receptor cells) are neurons, with cell 
bodies lying in the olfactory epithelium and axons that extend back to the olfac- 
tory lobes in the brain. The continual renewal of this epithelium therefore involves 
continual production of fresh axons, which have to navigate back to the appropri- 
ate sites in the brain. 

In contrast, in mammals at least, the auditory epithelium and the retinal epithelium 
lack stem cells, and their sensory receptor cells (the sensory hair cells in 
the ear, the photoreceptors in the retina) are irreplaceable. If they are destroyed, 
whether by too much exposure to loud noise, by looking into the beam of a laser, 
or through degenerative processes in old age, the loss is permanent. 


Summary 


Many tissues in the adult mammalian body are continually renewed by stem cells. 
Stem cells, by definition, are not terminally differentiated and have the ability to 
divide throughout the organism’s lifetime, yielding some progeny that differentiate 
and others that remain stem cells. The lining of the gut renews itself more rapidly 
than any other tissue in the mammalian body and provides a paradigm for the 
workings of stem-cell systems. In the small intestine, there is a continual upward 
flow from crypts, where new cells are generated by cell division, onto villi that are 
composed of nondividing differentiated cells. Wnt signaling maintains cell prolif- 
eration in the crypts, and overactivation of the Wnt pathway gives rise to tumors. 
Stem cells lie at each crypt base and are distinguished by expression of Lgr5 and 
certain other genes. The Lgr5+ stem cells are multipotent, each capable of generating 
several different types of differentiated cells as well as new stem cells. The balance 
of fate choices is adjusted according to need, allowing increase in the number of 
stem cells where more are needed for growth or repair. In a suitable cell-free culture 
medium, a single Lgr5+ stem cell can generate a self-organizing “minigut,” containing 
all the standard intestinal epithelial cell types. 

Other self-renewing epithelia, such as the epidermis with its multilayered (strat- 
ified) architecture, have stem cells and their differentiating progeny arranged in 
different ways but are governed by similar basic principles. However, tissue renewal 
and repair does not always have to depend on stem cells. Thus, the population of 
insulin-producing cells in the pancreas is enlarged and renewed by simple duplica- 
tion of existing insulin-producing cells. Similarly, in the liver, differentiated hepato- 
cytes remain able to divide throughout life and can dramatically increase their 
division rate when the need arises. At an opposite extreme, some tissues, such as the 
sensory epithelia of the ear and the eye, do not undergo any turnover and are not 
renewable: their cells, once lost, are lost forever. 


FIBROBLASTS AND THEIR TRANSFORMATIONS: 
THE CONNECTIVE-TISSUE CELL FAMILY 


From epithelia, with their varied patterns of renewal and their enormous variety 
of protective, absorptive, secretory, sensory, and biosynthetic functions, we turn 
now to connective tissues. Connective tissues typically consist of cells dispersed 
in extracellular matrix that they themselves secrete, as discussed in Chapter 19. 
They originate from the mesodermal (middle) layer of the early embryo, sand- 
wiched between ectoderm and endoderm (see Chapter 21, Figure 21-3). 

In the adult body, virtually all epithelia are supported by a connective-tissue 
bed, or stroma; and specialized types of connective tissue, such as bone, carti- 
lage, and tendon, form the supporting framework of the body as a whole. No less 
important than its mechanical role, connective tissue also contains the blood ves- 
sels that bring the oxygen and nourishment on which all cells depend. Cells of 
the immune system roam through connective tissue, passing in and out of blood 
vessels and lymphatics, and providing defense against infection; and through the 
meshes of connective tissue run peripheral nerves. Also embedded in connec- 
tive tissue are the muscles that enable us to move. In these many ways, the cells 
that form connective tissue and synthesize its various types of extracellular matrix 
contribute to the support and repair of almost every tissue and organ. 

Connective-tissue cells belong to a family of cell types that are related by ori- 
gin, and they are often remarkably interconvertible. The family includes fibro- 
blasts, cartilage cells, and bone cells, all of which are specialized for the secretion 
of collagenous extracellular matrix and are jointly responsible for the architec- 
tural framework of the body. The connective-tissue family also includes fat cells 
(adipocytes) and smooth muscle cells. Figure 22-11 illustrates these cell types and 
the interconversions that are thought to occur between them. The adaptability of 
the differentiated character of connective-tissue cells is an important feature of 
responses to many types of damage. 


Fibroblasts Change Their Character in Response to Chemical and 
Physical Signals 


Fibroblasts seem to be the least specialized cells in the connective-tissue family. 
They are dispersed in connective tissue throughout the body, where they secrete 
a nonrigid extracellular matrix that is rich in type I or type III collagen, or both, as 
discussed in Chapter 19. When a tissue is injured, the fibroblasts nearby prolifer- 
ate, migrate into the wound (Movie 22.1), and produce large amounts of collag- 
enous matrix that helps to isolate and repair the damaged tissue. Their ability to 
thrive in the face of injury, together with their solitary lifestyle, may explain why 
fibroblasts are the easiest of cells to grow in culture—a feature that has made them 
a favorite subject for cell biological studies. 

Figure 22-11 The family of connective-tissue cells. Arrows show the interconversions that are 
thought to occur within the family. For simplicity, the fibroblast is shown as a single cell type, 
but it is uncertain how many types of fibroblasts exist and whether the differentiation potential 
of different types is restricted in different ways. 

A class of connective-tissue cells in the bone marrow, called bone marrow stromal 
cells, provides an example of radical connective-tissue versatility. These cells, 
which can be regarded as a kind of fibroblast, can be isolated from the bone mar- 
row and propagated in culture. Large clones of progeny can be generated in this 
way from single ancestral stromal cells. Depending on the culture conditions, the 
members of such a clone either can continue proliferating to produce more cells 
of the same type, or can differentiate as fat cells, cartilage cells, or bone cells. The 
fate of the cells depends on physical as well as chemical signals: embedded in a 
stiff, unyielding matrix, they tend to turn into bone cells, whereas in a softer, more 
elastic matrix, they tend to turn into fat cells. This effect is mediated by an intra- 
cellular pathway that responds to tension in actin-myosin bundles and relays a 
signal to specific transcription regulators in the nucleus (Figure 22-12). Because 
of their self-renewing, multipotent character, the bone marrow stromal cells, and 
other cells with similar properties, are referred to as mesenchymal stem cells. 


Osteoblasts Make Bone Matrix 


Cartilage and bone are tissues of very different character; but they are closely 
related in origin, and the formation of the skeleton depends on an intimate part- 
nership between them. 

Cartilage tissue is structurally simple, consisting of cells of a single type— 
chondrocytes—embedded in a more or less uniform, highly hydrated matrix con- 
sisting of proteoglycans and type II collagen (discussed in Chapter 19). The carti- 
lage matrix is deformable, and the tissue grows by expanding as the chondrocytes 
divide and secrete more matrix (Figure 22-13). Bone, by contrast, is dense and 
rigid; it grows by apposition—that is, by deposition of additional matrix on free 
surfaces. Like reinforced concrete, the bone matrix is predominantly a mixture of 
tough fibers (type I collagen fibrils), which resist pulling forces, and solid particles 
(calcium phosphate as hydroxylapatite crystals), which resist compression. The 
bone matrix is secreted by osteoblasts that lie at the surface of the existing matrix 
and deposit fresh layers of bone onto it. Some of the osteoblasts remain free at the 
surface, while others gradually become embedded in their own secretion. This 
freshly formed material (consisting chiefly of type I collagen) is rapidly converted 
into hard bone matrix by the deposition of calcium phosphate crystals in it. 

Once imprisoned in hard matrix, the original bone-forming cell, now called 
an osteocyte, has no opportunity to divide, although it continues to secrete additional 
matrix in small quantities around itself. The osteocyte, like the chondrocyte, 
occupies a small cavity, or lacuna, in the matrix, but unlike the chondrocyte 
it is not isolated from its fellows. Tiny channels, or canaliculi, radiate from each 
lacuna and contain cell processes from the resident osteocyte, enabling it to form 
gap junctions with adjacent osteocytes (Figure 22-14). Blood vessels and nerves 
run through the tissue, keeping the bone cells alive and reacting when the bone 
is damaged. 

Figure 22-12 Control of fibroblast differentiation by the physical properties of the extracellular 
matrix. On a stiff matrix, the cells form strong adhesions, spread out, and tend to turn into bone 
cells. On a soft matrix, where the cells are unable to form strong anchorages, they fail to spread 
and tend to differentiate as fat cells. These effects depend on transcription regulators (YAP and 
TAZ proteins) that move into the cell nucleus in response to tension developed in actin-myosin 
bundles in the cytoplasm. (Based on S. Dupont et al., Nature 474:179-188, 2011.) 

Figure 22-13 The growth of cartilage. The tissue expands as the chondrocytes divide and make 
more matrix. The freshly synthesized matrix with which each cell surrounds itself is shaded dark 
green. Cartilage may also grow by recruiting fibroblasts from the surrounding tissue and 
converting them into chondrocytes. 

A mature bone has a complex and beautiful architecture, in which dense plates 
of compact bone tissue enclose spaces spanned by light frameworks of trabecular 
bone—a filigree of delicate shafts and flying buttresses of bone tissue, with soft 
marrow in the interstices (Figure 22-15). The creation, maintenance, and repair 
of this structure depend not only on the cells of the connective-tissue family 
that synthesize matrix, but also on a separate class of cells called osteoclasts that 
degrade it, as we explain below. 


Bone Is Continually Remodeled by the Cells Within It 


For all its rigidity, bone is by no means a permanent and immutable tissue. 
Running through the hard extracellular matrix are channels and cavities occu- 
pied by living cells, which account for about 15% of the weight of compact bone. 
These cells are engaged in an unceasing process of remodeling: while osteoblasts 
deposit new bone matrix, osteoclasts demolish old bone matrix. This mechanism 
provides for continuous turnover and replacement of the matrix in the interior of 
the bone. 
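
The scale of this turnover can be estimated from the replacement rate quoted in the caption to Figure 22-17 (about 5-10% of the bone per year). The sketch below is a rough calculation under one explicit assumption that is not in the text: the portion replaced each year is drawn at random, regardless of how old the matrix is. The function names are illustrative.

```python
import math

def fraction_original_left(annual_fraction, years):
    """Fraction of today's matrix still present after `years`,
    assuming a fixed fraction is replaced at random each year."""
    return (1.0 - annual_fraction) ** years

def half_life_years(annual_fraction):
    """Years until half of today's matrix has been replaced."""
    return math.log(0.5) / math.log(1.0 - annual_fraction)

if __name__ == "__main__":
    for f in (0.05, 0.10):  # range quoted in the Figure 22-17 caption
        print(f"{f:.0%} per year: half replaced after "
              f"{half_life_years(f):.1f} years, "
              f"{fraction_original_left(f, 10):.0%} left after 10 years")
```

Under this assumption, half of the existing matrix is replaced within roughly 7 to 14 years, so turnover is slow but substantial over an adult lifetime.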

Osteoclasts (Figure 22-16) are large, multinucleated cells that originate, like 
macrophages, from hematopoietic stem cells in the bone marrow (discussed later 
in this chapter). The precursor cells are released into the bloodstream and collect 
at sites of bone resorption, where they fuse to form the multinucleated osteoclasts, 
which cling to surfaces of the bone matrix and eat it away. Osteoclasts are 
capable of tunneling deep into the substance of compact bone, forming cavities 
that are then invaded by other cells. A blood capillary grows down the center of 
such a tunnel, and the walls of the tunnel become lined with a layer of osteoblasts 
(Figure 22-17). These osteoblasts lay down concentric layers of new matrix, 
which gradually fill the cavity, leaving only a narrow canal surrounding the new 
blood vessel. At the same time as some tunnels are filling up with bone, others are 
being bored by osteoclasts, cutting through older concentric systems. 

Figure 22-14 Deposition of bone matrix by osteoblasts. Osteoblasts lining the surface of bone 
secrete the organic matrix of bone (osteoid) and are converted into osteocytes as they become 
embedded in this matrix. The matrix calcifies soon after it has been deposited. The osteoblasts 
themselves are thought to derive from osteogenic stem cells that are closely related to fibroblasts. 

Figure 22-15 Trabecular and compact bone. (A) Low-magnification scanning electron micrograph 
of trabecular bone in a vertebra of an adult man. The soft marrow tissue has been dissolved away. 
(B) A slice through the head of the femur, with bone marrow and other soft tissue likewise 
dissolved away, reveals the compact bone of the shaft and the trabecular bone in the interior. 
Because of the way in which bone tissue remodels itself in response to mechanical load, the 
trabeculae become oriented along the principal axes of stress within the bone. (A, courtesy of 
Alan Boyde; B, from J.B. Kerr, Atlas of Functional Histology. Mosby, 1999.) 

Figure 22-16 Osteoclasts. (A) Drawing of an osteoclast in cross section. This giant, multinucleated cell erodes bone matrix. The 
“ruffled border” is a site of secretion of acids (to dissolve the bone minerals) and hydrolases (to digest the organic components of 
the matrix). Osteoclasts vary in shape, are motile, and often send out processes to resorb bone at multiple sites. They develop 
from monocytes and can be viewed as specialized macrophages. (B) An osteoclast on bone matrix, seen by scanning electron 
microscopy. The osteoclast has been crawling over the matrix, eating it away, and leaving a trail of pits where it has done so. 
(A, from R.V. Krstic, Ultrastructure of the Mammalian Cell: An Atlas. Berlin: Springer-Verlag, 1979; B, courtesy of Alan Boyde.) 

Figure 22-17 The remodeling of compact bone. Osteoclasts acting together in a small group 
excavate a tunnel through the old bone, advancing at a rate of about 50 µm per day. Osteoblasts 
enter the tunnel behind them, line its walls, and begin to form new bone, depositing layers of 
matrix at a rate of 1-2 µm per day. At the same time, a capillary sprouts down the center of the 
tunnel. The tunnel eventually becomes filled with concentric layers of new bone, with only a 
narrow central canal remaining. Each such canal, besides providing a route of access for 
osteoclasts and osteoblasts, contains one or more blood vessels that transport the nutrients the 
bone cells require for survival. Typically, about 5-10% of the bone in a healthy adult mammal is 
replaced in this way each year. (After Z.F.G. Jaworski, B. Duck and G. Sekaly, J. Anat. 
133:397-405, 1981. With permission from Blackwell Publishing.) 


Osteoclasts Are Controlled by Signals From Osteoblasts 


The osteoblasts that make the matrix also produce the signals that recruit and 
activate the osteoclasts to degrade it. Disturbance of the balance can lead to osteo- 
porosis, where there is excessive erosion of the bone matrix and weakening of the 
bone, or to the opposite condition, osteopetrosis, where the bone becomes exces- 
sively thick and dense. Hormonal signals have powerful effects on this balance. 
Chronic use of corticosteroid drugs, for example, can cause osteoporosis as a side 
effect; but this can be treated by other drugs that redress the balance, including 
agents that block the factors that osteoblasts secrete to recruit osteoclasts. 

Local controls allow bone to be deposited in one place while it is resorbed 
in another. Through such controls over the process of remodeling, bones are 
endowed with a remarkable ability to adjust their structure in response to long- 
term variations in the load imposed on them. It is this that makes orthodontics 
possible, for example: a steady force applied to a tooth with a brace will cause it to 
move gradually, over many months, through the bone of the jaw, by remodeling of 
the bone tissue ahead of it and behind it. 

Bone can also undergo much more rapid and dramatic reconstruction when 
the need arises. Some cells capable of forming new cartilage persist in the connec- 
tive tissue that surrounds a bone. If the bone is broken, the cells in the neighbor- 
hood of the fracture repair it by a process that resembles the way bones develop 
in the embryo: cartilage is first laid down to bridge the gap and is then replaced 
by bone. The capacity for self-repair, so strikingly illustrated by the tissues of the 
skeleton, is a property of living structures that has no parallel among present-day 
man-made objects. 


Summary 


The family of connective-tissue cells includes fibroblasts, cartilage cells, bone cells, 
fat cells, and smooth muscle cells. Some classes of fibroblasts, such as the mesenchy- 
mal stem cells of bone marrow, seem to be able to transform into any of the other 
members of the family. These transformations of connective-tissue cell type are reg- 
ulated by the composition of the surrounding extracellular matrix, by cell shape, 
and by hormones and growth factors. Cartilage and bone both consist of cells and 
solid matrix that the cells secrete around themselves—chondrocytes in cartilage, 
osteoblasts in bone (osteocytes being osteoblasts that have become trapped within 
the bone matrix). The matrix of cartilage is deformable so that the tissue can grow 
by swelling, whereas bone is rigid and can grow only by apposition. While osteo- 
blasts secrete bone matrix, they also produce signals that recruit monocytes from 
the circulation to become osteoclasts, which degrade bone matrix. Through the 
activities of these antagonistic classes of cells, bone undergoes a perpetual remodel- 
ing through which it can adapt to the load it bears and alter its density in response 
to hormonal signals. Moreover, adult bone retains an ability to repair itself if frac- 
tured, by reactivation of the mechanisms that governed its embryonic development: 
cells in the neighborhood of the break convert into cartilage, which is later replaced 
by bone. 


GENESIS AND REGENERATION OF SKELETAL 
MUSCLE 


The term “muscle” includes many cell types, all specialized for contraction but 
in other respects dissimilar. As noted in Chapter 16, all eukaryotic cells possess a 
contractile system involving actin and myosin, but muscle cells have developed 
this apparatus to a high degree. Mammals possess four main categories of cells 
specialized for contraction: skeletal muscle cells, heart (cardiac) muscle cells, 
smooth muscle cells, and myoepithelial cells (Figure 22-18). These differ in func- 
tion, structure, and development. Although all of them generate contractile forces 
by using organized filament systems based on actin and myosin II, the actin and 
myosin molecules employed have somewhat different amino acid sequences, are 
differently arranged in the cell, and are associated with different sets of proteins 
that control contraction. 

We focus in this section on skeletal muscle cells, which are responsible for 
practically all movements that are under voluntary control. These cells can be 
very large (2-3 cm long and 100 µm in diameter in an adult human) and are often 
called muscle fibers because of their highly elongated shape. Each one is a syn- 
cytium, containing many nuclei within a common cytoplasm. In an intact mus- 
cle, they are bundled tightly together, with fibroblasts (and some fat cells) in the 
interstices between them and blood vessels and nerve fibers running through the 
tissue. The mechanisms of muscle contraction were discussed in Chapter 16. Here 
we consider the unusual strategy by which the multinucleate skeletal muscle cells 
are generated and maintained. 


Myoblasts Fuse to Form New Skeletal Muscle Fibers 


During development, certain cells, originating from the somites of a vertebrate 
embryo at a very early stage, become determined as myoblasts, the precursors 
of skeletal muscle fibers. After a period of proliferation, the myoblasts undergo a 
dramatic change of state: they stop dividing, switch on the expression of a whole 
battery of muscle-specific genes required for terminal differentiation, and fuse 
with one another to form multinucleate skeletal muscle fibers (Figure 22-19). 
Once differentiation and cell fusion have occurred, the cells do not divide and the 
nuclei never again replicate their DNA. 

Figure 22-18 The four classes of muscle cells of a mammal. (A) Schematic 
drawings (to scale). (B-E) Scanning electron micrographs. Skeletal muscle 
fibers (B, from a hamster) are giant cells with many nuclei and are formed by 
cell fusion. The other types of muscle cells are more conventional, generally 
having only a single nucleus. Heart muscle cells (C, from a rat) resemble 
skeletal muscle fibers in that their actin and myosin filaments are aligned in 
very orderly arrays to form a series of contractile units called sarcomeres, so 
that the cells have a striated (striped) appearance. The arrows in (C) point to 
intercalated discs (end-to-end junctions between the heart muscle cells); 
skeletal muscle cells in long muscles are joined end-to-end in a similar way. 
Smooth muscle cells (D, from the urinary bladder of a guinea-pig) are so named 
because they do not appear striated; they belong to the connective-tissue 
family and are closely related to fibroblasts. Note that the smooth muscle is 
shown here at a lower magnification than the other muscle types. The functions 
of smooth muscle vary greatly, from propelling food along the digestive tract 
to erecting hairs in response to cold or fear. Myoepithelial cells (E, from a 
secretory alveolus of a lactating rat mammary gland) also have no striations, 
but unlike all other muscle cells they lie in epithelia and are derived from 
the ectoderm. They form the dilator muscle of the eye’s iris and serve to expel 
saliva, sweat, and milk from the corresponding glands. (B, courtesy of Junzo 
Desaki; C, from T. Fujiwara, in Cardiac Muscle in Handbook of Microscopic 
Anatomy [E.D. Canal, ed.]. Berlin: Springer-Verlag, 1986; D, courtesy of 
Satoshi Nakasiro; E, from T. Nagato et al., Cell Tissue Res. 209:1-10, 1980. 
With permission from Springer-Verlag.) 

Figure 22-19 Myoblast fusion in culture. The culture is stained with a 
fluorescent antibody (green) against skeletal muscle myosin, which marks 
differentiated muscle cells, and with a DNA-specific dye (blue) to show cell 
nuclei. (A) A short time after a change to a culture medium that favors 
differentiation, just two of the many myoblasts in the field of view have 
switched on myosin production and have fused to form a muscle cell with two 
nuclei (upper right). (B) Somewhat later, almost all the cells have 
differentiated and fused. (C) High-magnification view, showing characteristic 
striations (fine transverse stripes) in two of the multinucleate muscle cells. 
(Courtesy of Jacqueline Gross and Terence Partridge.) 


Some Myoblasts Persist as Quiescent Stem Cells in the Adult 


Even though humans do not normally generate new skeletal muscle fibers in adult 
life, they still have the capacity to do so, and existing muscle fibers can resume 
growth when the need arises. Cells capable of serving as myoblasts are retained as 
small, flattened, and inactive cells lying in close contact with the mature muscle 
cell and contained within its sheath of basal lamina (Figure 22-20). If the mus- 
cle is damaged or stimulated to grow, these satellite cells are activated to prolifer- 
ate, and their progeny can fuse to repair the damaged muscle or to allow muscle 
growth. Satellite cells, or some subset of the satellite cells, are thus the stem cells 
of adult skeletal muscle, normally held in reserve in a quiescent state but available 
when needed as a self-renewing source of terminally differentiated cells. 

The process of muscle repair by means of satellite cells is, however, limited 
in what it can achieve. In one form of muscular dystrophy, for example, a genetic 
defect in the cytoskeletal protein dystrophin damages differentiated skeletal mus- 
cle cells. As a result, satellite cells proliferate to repair the damaged muscle fibers. 
This regenerative response is, however, unable to keep pace with the damage, and 
connective tissue eventually replaces the muscle cells, blocking any further pos- 
sibility of regeneration. A decline of capacity for repair likewise contributes to the 
weakening of muscle in the elderly. 

Figure 22-20 Satellite cells repair skeletal muscle fibers. (A) The specimen is 
stained with an antibody (red) against a muscle cadherin, M-cadherin, which is 
present on both the satellite cell and the muscle fiber and is concentrated at 
the site where their membranes are in contact. The nuclei of the muscle fiber 
are stained green, and the nucleus of the satellite cell is stained blue. 
(B) Schematic of the repair of a damaged muscle fiber by proliferation and 
fusion of satellite cells. (A, courtesy of Terence Partridge.) 


Summary 


Skeletal muscle fibers are one of four main categories of vertebrate cells specialized 
for contraction, and they are responsible for all voluntary movement. Each skeletal 
muscle fiber is a syncytium and develops by the fusion of many myoblasts. Myoblasts 
proliferate extensively, but once they have fused, they can no longer divide. Fusion 
generally follows the onset of myoblast differentiation, in which many genes encoding 
muscle-specific proteins are switched on coordinately. Some myoblasts persist 
in a quiescent state as satellite cells in adult muscle; when a muscle is damaged, 
these cells are reactivated to proliferate and to fuse in order to replace the muscle 
cells that have been lost. They are the stem cells of skeletal muscle, and exhaustion of 
their regenerative capacity is responsible for some forms of muscular dystrophy as 
well as for the decline of muscle mass in old age. 


BLOOD VESSELS, LYMPHATICS, AND ENDOTHELIAL 
CELLS 


Almost all tissues depend on a blood supply, and the blood supply depends on 
endothelial cells, which form the linings of the blood vessels. Endothelial cells 
have a remarkable capacity to adjust their number and arrangement to suit local 
requirements. They create an adaptable life-support system, extending by cell 
migration into almost every region of the body. If it were not for endothelial cells 
extending and remodeling the network of blood vessels, tissue growth and repair 
would be impossible. Cancerous tissue is as dependent on a blood supply as is 
normal tissue, and this has led to a surge of interest in endothelial cell biology, 
in the hope that it may be possible to block the growth of tumors by attacking the 
endothelial cells that bring them nourishment. 


Endothelial Cells Line All Blood Vessels and Lymphatics 


The largest blood vessels are arteries and veins, which have a thick, tough wall 
of connective tissue and many layers of smooth muscle cells (Figure 22-21). The 
inner wall is lined by an exceedingly thin single sheet of endothelial cells, the 
endothelium, separated from the surrounding outer layers by a basal lamina. The 
amounts of connective tissue and smooth muscle in the vessel wall vary accord- 
ing to the vessel’s diameter and function, but the endothelial lining is always pres- 
ent. In the finest branches of the vascular tree—the capillaries and sinusoids—the 
walls consist of nothing but endothelial cells and a basal lamina (Figure 22-22), 
together with a few scattered pericytes. Related to vascular smooth muscle cells, 
pericytes wrap themselves around the small vessels and strengthen them (Figure 
22-23). 

Figure 22-21 Diagram of a small artery 
in cross section. The endothelial cells 
form the endothelial lining, which although 
inconspicuous, is the fundamental 
component. Compare with the capillary in 
Figure 22-22. 


Figure 22-22 Capillaries. Electron 
micrograph (left) of a cross section of a 
small capillary in the pancreas. The wall 
is formed by a single endothelial cell 
surrounded by a basal lamina, as seen 
most clearly in the drawing to the right. 
(From R.P. Bolender, J. Cell Biol. 61:269- 
287, 1974. With permission from The 
Rockefeller University Press.) 

Less obvious than the blood vessels are the lymphatic vessels. These carry no 
blood and have much thinner and more permeable walls than the blood vessels. 
They provide a drainage system for the fluid (lymph) that seeps out of the blood 
vessels, as well as an exit route for white blood cells that have migrated from blood 
vessels into the tissues. Less happily, they can also provide the path by which 
cancer cells escape from a primary tumor to invade other tissues. The lymphat- 
ics form a branching system of tributaries, all ultimately discharging into a single 
large lymphatic vessel, the thoracic duct, which opens into a large vein close to the 
heart. Like blood vessels, lymphatics are lined with endothelial cells. 

Thus, endothelial cells line the entire blood and lymphatic vascular system, 
from the heart to the smallest capillary, and they control the passage of materi- 
als—and the transit of white blood cells—into and out of the bloodstream. Arter- 
ies, veins, capillaries, and lymphatics all develop from small vessels constructed 
primarily of endothelial cells and a basal lamina: connective tissue and smooth 
muscle are added later where required, under the influence of signals from the 
endothelial cells. 


Endothelial Tip Cells Pioneer Angiogenesis 


To understand how the vascular system comes into being and how it adapts to the 
changing needs of tissues, we have to understand endothelial cells. How do they 
become so widely distributed, and how do they form channels that connect in just 
the right way for blood to circulate through the tissues and for lymph to drain back 
to the bloodstream? 

Endothelial cells originate at specific sites in the early embryo from precursors 
that also give rise to blood cells. From these sites, the early embryonic endothelial 
cells migrate, proliferate, and differentiate to form the first rudiments of blood 
vessels—a process called vasculogenesis. Subsequent growth and branching of 
the vessels throughout the body occurs mainly by proliferation and movement of 
the endothelial cells of these first vessels, in a process called angiogenesis. 

Angiogenesis occurs in a broadly similar way in the young organism as it grows 
and in the adult during tissue repair and remodeling. We can watch the behavior 
of the cells in naturally transparent structures, such as the cornea of the eye or the 
fin of a tadpole, or in tissue culture, or in the embryo. The embryonic retina, which 
blood vessels invade according to a predictable timetable, provides a convenient 
example for experimental study. Each new vessel originates as a capillary sprout 
from the side of an existing capillary or small venule (Figure 22-24). At the tip 
of the sprout, leading the way, is an endothelial cell with a distinctive character. 
This tip cell has a pattern of gene expression somewhat different from that of the 
endothelial stalk cells following behind it, and while they divide, it does not. The 
tip cell’s most striking feature is that it puts out many long filopodia, resembling 
those of a neuronal growth cone. The column of stalk cells behind it, meanwhile, 
becomes hollowed out to form a lumen. 

Figure 22-23 Pericytes. The scanning electron micrograph shows pericytes 
wrapping their processes around a small blood vessel (a post-capillary venule) 
in the mammary gland of a cat. Pericytes are also present around capillaries, 
but are much more sparsely distributed there. (From T. Fujiwara and Y. Uehara, 
Am. J. Anat. 170:39-54, 1984. With permission from Wiley-Liss.) 

Figure 22-24 Angiogenesis. (A) A new blood capillary forms by the sprouting of 
an endothelial cell from the wall of an existing small vessel. An endothelial 
tip cell, with many filopodia, leads the advance of each capillary sprout. The 
endothelial stalk cells trailing behind the tip cell become hollowed out to 
form a lumen. (B) Blood capillaries sprouting in the retina of an embryonic 
mouse that had a red dye injected into the bloodstream, revealing the capillary 
lumen opening up behind the tip cell (Movie 22.2). (B, from H. Gerhardt et al., 
J. Cell Biol. 161:1163-1177, 2003. With permission from the author.) 

The endothelial tip cells that pioneer the growth of normal capillaries not only 
look like neuronal growth cones, but also respond similarly to signals in the envi- 
ronment. In fact, many of the same guidance molecules are involved, including 
the netrins, slits, and ephrins mentioned in our account of neural development 
in the previous chapter. The corresponding receptors are expressed in the tip cells 
and guide the vascular sprouts along specific pathways in the embryo, often in 
parallel with nerves. Perhaps the most important of the guidance molecules for 
endothelial cells, however, is one that is chiefly dedicated to the control of vascular 
development: vascular endothelial growth factor, or VEGF. 


Tissues Requiring a Blood Supply Release VEGF 


Almost every cell, in almost every tissue of a vertebrate, is located within 
50-100 µm of a blood capillary. What mechanism ensures that the system of blood 
vessels branches into every nook and cranny? How is it adjusted so perfectly to 
the local needs of the tissues, not only during normal development but also in 
pathological circumstances? Wounding, for example, induces a burst of cap- 
illary growth in the neighborhood of the damage, to satisfy the high metabolic 
requirements of the repair process (Figure 22-25). Local irritants and infections 
also cause a proliferation of new capillaries, most of which regress and disappear 
when the inflammation subsides. Less benignly, a small sample of tumor tissue 
implanted in the cornea, which normally lacks blood vessels, causes blood vessels 
to grow quickly toward the implant from the vascular margin of the cornea; the 
growth rate of the tumor increases abruptly as soon as the vessels reach it. 

In all these cases, the invading endothelial cells respond to signals produced 
by the tissue that they invade. The signals are complex, but a key part is played 
by vascular endothelial growth factor (VEGF). The regulation of blood vessel 
growth to match the needs of the tissue depends on the control of VEGF produc- 
tion, through changes in the stability of its mRNA and in its rate of transcription. 
The latter control is relatively well understood. A shortage of oxygen, in practically 
any type of cell, causes an increase in the intracellular level of a transcription factor 
called hypoxia-inducible factor 1α (HIF1α). HIF1α stimulates transcription 
of Vegf (and of other genes whose products are needed when oxygen is in short 
supply). The VEGF protein is secreted, diffuses through the tissue, and acts on 
nearby endothelial cells, stimulating them to proliferate, to produce proteases 
to help them digest their way through the basal lamina of the parent capillary or 
venule, and to form sprouts. The tip cells of the sprouts detect the VEGF gradient 
and move toward its source. As the new vessels form, bringing blood to the tissue, 
the oxygen concentration rises. The HIF1α activity then declines, VEGF production 
is shut off, and angiogenesis comes to a halt (Figure 22-26). 
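
The feedback loop just described can be made concrete with a small numerical sketch. The model below is purely illustrative and is not taken from the text: it assumes that oxygen supply is proportional to vessel density, that the HIF1α level tracks the oxygen shortfall, that VEGF tracks HIF1α, and that sprouting is proportional to VEGF. The function name, the time constants, and the growth rate are all hypothetical choices in arbitrary units.

```python
def simulate_vegf_feedback(steps=2000, dt=0.05, vessels0=0.2,
                           tau_hif=1.0, tau_vegf=1.0, growth_rate=0.5):
    """Toy negative-feedback loop: hypoxia -> HIF1a -> VEGF -> vessels -> oxygen.

    All quantities are dimensionless; the tissue's oxygen demand is fixed at 1.0.
    Returns a list of (oxygen, hif, vegf, vessels) tuples, one per time step.
    """
    vessels, hif, vegf = vessels0, 0.0, 0.0
    history = []
    for _ in range(steps):
        oxygen = vessels                           # assumption: supply scales with vessel density
        hypoxia = max(0.0, 1.0 - oxygen)           # shortfall relative to demand
        hif += dt * (hypoxia - hif) / tau_hif      # HIF1a level follows the shortfall
        vegf += dt * (hif - vegf) / tau_vegf       # VEGF production follows HIF1a
        vessels += dt * growth_rate * vegf         # sprouting proportional to VEGF
        history.append((oxygen, hif, vegf, vessels))
    return history


if __name__ == "__main__":
    history = simulate_vegf_feedback()
    peak_vegf = max(step[2] for step in history)
    _, final_hif, final_vegf, final_vessels = history[-1]
    print(f"peak VEGF      : {peak_vegf:.3f}")
    print(f"final HIF1a    : {final_hif:.5f}")
    print(f"final VEGF     : {final_vegf:.5f}")
    print(f"final vessels  : {final_vessels:.3f}")
```

In this toy model VEGF rises while the tissue is hypoxic and decays toward zero once vessel density meets demand, at which point growth stops. Vessel regression is deliberately left out, so any overshoot in vessel density is an artifact of the simplification rather than a biological claim.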

Figure 22-25 New capillary formation in response to wounding. Scanning electron 
micrographs of casts of the system of blood vessels surrounding the margin of 
the cornea show the reaction to wounding. The casts are made by injecting a 
resin into the vessels and letting the resin set; this reveals the shape of the 
lumen, as opposed to the shape of the cells. Sixty hours after wounding, many 
new capillaries have begun to sprout toward the site of injury, which is just 
above the top of the picture. Their oriented outgrowth reflects a chemotactic 
response of the endothelial cells to an angiogenic factor released at the 
wound. (Courtesy of Peter C. Burger.) 

Figure 22-26 The regulatory mechanism controlling blood vessel growth according 
to a tissue’s need for oxygen. 


Signals from Endothelial Cells Control Recruitment of Pericytes 
and Smooth Muscle Cells to Form the Vessel Wall 


The vascular network is continually remodeled as it grows and adapts. A newly 
formed vessel may enlarge; or it may sprout side branches; or it may regress. 
Smooth muscle and other connective-tissue cells that pack themselves around 
the endothelium (see Figure 22-23) help to stabilize vessels as they enlarge. This 
process of vessel wall formation begins with recruitment of pericytes. Small num- 
bers of these cells travel outward in company with the stalk cells of each endo- 
thelial sprout. The recruitment and proliferation of pericytes and smooth muscle 
cells to form a vessel wall depend on platelet-derived growth factor-B (PDGF-B) 
secreted by the endothelial cells and on PDGF receptors in the pericytes and 
smooth muscle cells. In mutants lacking this signal protein or its receptor, these 
vessel wall cells are missing in many regions. As a result, the embryonic blood 
vessels develop microaneurysms—microscopic pathological dilatations—that 
eventually rupture, as well as other abnormalities, reflecting the importance of 
signals exchanged in both directions between the exterior cells of the wall and the 
endothelial cells. 


Summary 


Endothelial cells are the fundamental elements of the vascular system. They form a 
single cell layer that lines all blood vessels and lymphatics and regulates exchanges 
between the bloodstream and the surrounding tissues. New vessels originate as 
endothelial sprouts from the walls of existing small vessels. A specialized motile 
endothelial tip cell at the leading edge of each sprout puts out filopodia that respond 
to gradients of guidance molecules in the environment, leading the growth of the 
sprout in much the same way as the growth cone of a neuron is led. The endothelial 
stalk cells following behind become hollowed out to form a capillary tube. Signals 
from endothelial cells organize the growth and development of the connective- 
tissue cells that form the surrounding layers of the vessel wall. 

A homeostatic mechanism ensures that blood vessels permeate every region of 
the body. Cells that are short of oxygen increase their concentration of hypoxia-inducible 
factor 1α (HIF1α), which stimulates the production of vascular endothelial 
growth factor (VEGF). VEGF acts on endothelial cells, causing them to proliferate 
and invade the hypoxic tissue to supply it with new blood vessels. As new vessels 
enlarge, they recruit increasing numbers of pericytes—cells that cling to the outside 
of the endothelial tube and mature into the smooth muscle coat that is needed to 
give the vessel strength. 


A HIERARCHICAL STEM-CELL SYSTEM: BLOOD CELL 
FORMATION 


The function of blood vessels is to carry blood, and it is to blood itself that we now 
turn. Blood contains many types of cells, with functions that range from the trans- 
port of oxygen to the production of antibodies. Some of these cells stay within the 
vascular system, while others use the vascular system only as a means of trans- 
port and perform their function elsewhere. All blood cells, however, have certain 
similarities in their life history. They all have limited life-spans and are produced 
throughout the life of the animal. Most remarkably, they are all generated ulti- 
mately from a common stem cell, located (in adult humans) in the bone marrow. 
This hematopoietic (blood-making) stem cell is thus multipotent, giving rise to all 
the types of terminally differentiated blood cells as well as some other types of 
cells, such as the osteoclasts in bone, as mentioned earlier. The hematopoietic 
system is the most complex of the stem-cell systems in the mammalian body, and 
it is exceptionally important in medical practice. 


Red Blood Cells Are All Alike; White Blood Cells Can Be Grouped 
in Three Main Classes 


Blood cells can be classified as red or white. The red blood cells, or erythrocytes, 
remain within the blood vessels and transport O₂ and CO₂ bound to hemoglobin. 
The white blood cells, or leukocytes, combat infection and in some cases phago- 
cytose and digest debris. Leukocytes, unlike erythrocytes, must make their way 
across the walls of small blood vessels and migrate into tissues to perform their 
tasks. In addition, the blood contains large numbers of platelets, which are not 
entire cells but small, detached cell fragments or “minicells” derived from the cor- 
tical cytoplasm of large cells called megakaryocytes. Platelets adhere specifically 
to the endothelial cell lining of damaged blood vessels, where they help to repair 
breaches and aid in blood clotting. 

All red blood cells belong in a single class, following the same developmental 
trajectory as they mature, and the same is true of platelets; but there are many 
distinct types of white blood cells. White blood cells are traditionally grouped into 
three major categories—granulocytes, monocytes, and lymphocytes—based on 
their appearance in the light microscope. 

Granulocytes contain numerous lysosomes and secretory vesicles (or gran- 
ules) and are subdivided into three classes according to the morphology and 
staining properties of these organelles (Figure 22-27). The differences in stain- 
ing reflect major differences of chemistry and function. Neutrophils (also called 
polymorphonuclear leukocytes because of their multilobed nucleus) are the most 
common type of granulocyte; they phagocytose and destroy microorganisms, 
especially bacteria, and thus have a key role in innate immunity to bacterial infec- 
tion, as discussed in Chapter 24 (see Movie 16.1). Basophils secrete histamine 
(and, in some species, serotonin) to help mediate inflammatory reactions; they 
are closely related to mast cells, which reside in connective tissues but are also 
generated from the hematopoietic stem cells. Eosinophils help to destroy para- 
sites and modulate allergic inflammatory responses. 

Once they leave the bloodstream, monocytes (see Figure 22-27D) mature 
into macrophages, which, together with neutrophils, are the main “professional 
phagocytes” in the body. As discussed in Chapter 13, both types of phagocytic cells 
contain specialized lysosomes that fuse with newly formed phagocytic vesicles 
(phagosomes), exposing phagocytosed microorganisms to a barrage of enzymatically 
produced, highly reactive molecules of superoxide (O₂⁻) and hypochlorite 
(ClO⁻, the active ingredient in bleach), as well as to attack by a concentrated mixture 
of lysosomal hydrolase enzymes that become activated in the phagosome. 
Macrophages, however, are much larger and longer-lived than neutrophils. They 
recognize and remove senescent, dead, and damaged cells in many tissues, and 
they are unique in being able to ingest large microorganisms such as protozoa. 

Figure 22-27 White blood cells. (A-D) These electron micrographs show 
(A) a neutrophil, (B) a basophil, (C) an eosinophil, and (D) a monocyte. 
Electron micrographs of lymphocytes are shown in Figure 24-14. Each 
of the cell types shown here has a different function, which is reflected 
in the distinctive types of secretory granules and lysosomes it contains. 
There is only one nucleus per cell, but it has an irregular lobed shape, 
and in (A), (B), and (C) the connections between the lobes are out of the 
plane of section. (E) A light micrograph of a blood smear stained with the 
Romanowsky stain, which colors the white blood cells strongly. (Courtesy 
of Dorothy Bainton.) 





Monocytes also give rise to dendritic cells. Like macrophages, dendritic cells 
are migratory cells that can ingest foreign substances and organisms, but they do 
not have as active an appetite for phagocytosis and instead have a crucial role 
as presenters of foreign antigens to lymphocytes to trigger an immune response. 
Dendritic cells in the epidermis (called Langerhans cells), for example, ingest for- 
eign antigens and carry these trophies back from the skin to present to lympho- 
cytes in lymph nodes. 

There are two main classes of lymphocytes, both involved in immune 
responses: B lymphocytes make antibodies, while T lymphocytes kill virus- 
infected cells and regulate the activities of other white blood cells. In addition, 
there are lymphocyte-like cells called natural killer (NK) cells, which kill some 
types of tumor cells and virus-infected cells. The production of lymphocytes is a 
specialized topic discussed in detail in Chapter 24. Here we concentrate mainly 
on the development of the other blood cells, often referred to collectively as 
myeloid cells. 

Table 22-1 summarizes the various types of blood cells and their functions. 


The Production of Each Type of Blood Cell in the Bone Marrow Is 
Individually Controlled 


Most white blood cells function in tissues other than the blood; blood simply 
transports them to where they are needed. A local infection or injury in any tissue 
rapidly attracts white blood cells into the affected region as part of the inflamma- 
tory response, which helps fight the infection or heal the wound (Movie 22.3). 
The inflammatory response is complex and is governed by many different signal 
molecules produced locally by mast cells, nerve endings, platelets, and white 
blood cells, as well as by the activation of complement (discussed in Chapter 24). 
Some of these signal molecules act on the endothelial lining of nearby capillaries, 
helping white blood cells to first stick and then make an exit from the bloodstream 
into the tissue where they are needed, as described in Chapter 19 (see Figure 
19-28 and Movie 19.2). Damaged or inflamed tissues and local endothelial cells 
secrete other molecules called chemokines, which act as chemoattractants for 
specific types of white blood cells, causing them to become polarized and crawl 
toward the source of the attractant. As a result, large numbers of white blood cells 
enter the affected tissue (Figure 22-28). 

Other signal molecules produced during an inflammatory response escape 
into the blood and stimulate the bone marrow to produce more leukocytes and 
release them into the bloodstream. The regulation tends to be cell-type specific: 
some bacterial infections, for example, cause a selective increase in neutrophils, 
while infections with some protozoa and other parasites cause a selective increase 
in eosinophils. (For this reason, physicians routinely use differential white blood 
cell counts to aid in the diagnosis of infectious and other inflammatory diseases.) 

In other circumstances, erythrocyte production is selectively increased: for 
example, in response to anemia (lack of hemoglobin) due to blood loss, and in the 
process of acclimatization when one goes to live at high altitude, where oxygen is 
scarce. Thus, blood cell formation, or hematopoiesis, necessarily involves complex 
controls, which regulate the production of each type of blood cell individually to 
meet changing needs. 

Figure 22-28 Chemotaxis of white blood cells to damaged tissue. 
A chemoattractive signal released from a site of damage, which is toward 
the bottom of the page, causes white blood cells to exit from the capillary by 
crawling between adjacent endothelial cells, as shown. 

TABLE 22-1 

Type of cell                            Main functions 

White blood cells (leukocytes) 
  Granulocytes 
    Neutrophils (polymorphonuclear      Phagocytose and destroy invading bacteria 
    leukocytes) 
    Eosinophils                         Destroy larger parasites and modulate allergic 
                                        inflammatory responses 
    Basophils                           Release histamine (and in some species serotonin) in 
                                        certain immune reactions 
  Monocytes                             Become tissue macrophages, which phagocytose and 
                                        digest invading microorganisms and foreign bodies as 
                                        well as damaged senescent cells 
  Lymphocytes 
    B cells                             Make antibodies 
    T cells                             Kill virus-infected cells and regulate activities of other 
                                        leukocytes 
  Natural killer (NK) cells             Kill virus-infected cells and some tumor cells 
Platelets (cell fragments arising from  Initiate blood clotting 
megakaryocytes in bone marrow) 

Humans contain about 5 liters of blood, accounting for 7% of body weight. Red blood cells constitute about 45% of this volume and white blood 
cells about 1%, the rest being the liquid blood plasma. 
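
The proportions quoted in the footnote to Table 22-1 translate directly into volumes. The short sketch below does the arithmetic; the function and argument names are hypothetical, and the only inputs are the figures given in the footnote (about 5 liters of blood, about 45% red blood cells, about 1% white blood cells).

```python
def blood_compartments(total_liters=5.0, red_fraction=0.45, white_fraction=0.01):
    """Split a total blood volume into red cells, white cells, and plasma (liters)."""
    red = total_liters * red_fraction
    white = total_liters * white_fraction
    plasma = total_liters - red - white      # plasma is whatever volume remains
    return {"red_cells_l": red, "white_cells_l": white, "plasma_l": plasma}


if __name__ == "__main__":
    for name, liters in blood_compartments().items():
        print(f"{name:>14}: {liters:.2f} L")
```

With the default values this gives about 2.25 liters of red blood cells, 0.05 liters of white blood cells, and 2.7 liters of plasma.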


Bone Marrow Contains Multipotent Hematopoietic Stem Cells, 
Able to Give Rise to All Classes of Blood Cells 


In the bone marrow, the developing blood cells and their precursors, including 
the stem cells, are intermingled with one another, as well as with fat cells and 
other stromal cells (connective-tissue cells), which produce a delicate support- 
ing meshwork of collagen fibers and other extracellular matrix components. In 
addition, the whole tissue is richly supplied with thin-walled blood vessels, called 
blood sinuses, into which the new blood cells are discharged. Megakaryocytes are 
also present; these, unlike other blood cells, remain in the bone marrow when 
mature and are one of its most striking features, being extraordinarily large (diameter 
up to 60 µm) with a highly polyploid nucleus. They normally lie close beside 
blood sinuses, and they extend processes through holes in the endothelial lining 
of these vessels; platelets pinch off from the processes and are swept away into the 
blood (Figure 22-29 and Movie 22.4). 

Because of the complex arrangement of the cells in bone marrow, it is difficult 
to identify in ordinary tissue sections any but the immediate precursors of the 
mature blood cells. There is no obvious visible characteristic by which we can rec- 
ognize the ultimate stem cells. In the case of hematopoiesis, the stem cells were 
first identified by a functional assay that exploited the wandering lifestyle of blood 
cells and their precursors. 

When an animal is exposed to a large dose of x-rays, most of the hematopoietic 
cells are destroyed and the animal dies within a few days as a result of its inability 
to manufacture new blood cells. The animal can be saved, however, by a trans- 
fusion of cells taken from the bone marrow of a healthy, immunologically com- 
patible donor. Among these cells there are some that can colonize the irradiated 
host and permanently reequip it with hematopoietic tissue (Figure 22-30). Such 
experiments prove that the marrow contains hematopoietic stem cells. They also 
show how we can assay for the presence of hematopoietic stem cells and hence 
discover the molecular features that distinguish them from other cells. 

For this purpose, cells taken from bone marrow are sorted (using a fluores- 
cence-activated cell sorter) according to the surface antigens that they display, 
and the different fractions are transfused back into irradiated mice. If a fraction 
rescues an irradiated host mouse, it must contain hematopoietic stem cells. In this 
way, it has been possible to show that the hematopoietic stem cells are character- 
ized by a specific combination of cell-surface proteins, and by appropriate sorting 
we can obtain virtually pure stem-cell preparations. The stem cells turn out to be a
tiny fraction of the bone marrow population—about 1 cell in 50,000-100,000; but 
this is enough. A single such cell injected into a host mouse with defective hema- 
topoiesis is sufficient to reconstitute its entire hematopoietic system, generating a 
complete set of blood cell types, as well as fresh stem cells. This and other exper- 
iments (using artificial lineage markers) show that the individual hematopoietic 
stem cell is multipotent and can give rise to the complete range of blood cell types, 
both myeloid and lymphoid, as well as to new stem cells like itself (Figure 22-31). 


Figure 22-29 A megakaryocyte among 
other developing blood cells in the bone 
marrow. The megakaryocyte’s enormous 
size results from its having a highly 
polyploid nucleus. One megakaryocyte 
produces about 10,000 platelets, which 
split off from long processes that extend 
through holes in the walls of an adjacent 
blood sinus. 
Figure 22-30 Rescue of an irradiated
mouse by a transfusion of bone marrow
cells. An essentially similar procedure is
used in the treatment of leukemia in human
patients by bone marrow transplantation.
[Figure label: mouse survives; the injected stem cells colonize its hematopoietic
tissues and generate a steady supply of new blood cells.]

Commitment Is a Stepwise Process


Hematopoietic stem cells do not jump directly from a multipotent state into a 
commitment to just one pathway of differentiation; instead, they go through a 
series of progressive restrictions. The first step, usually, is commitment to either a 
myeloid or a lymphoid fate. This is thought to give rise to two kinds of progenitor 
cells, one capable of generating large numbers of all the different types of myeloid 
cells, and the other giving rise to large numbers of all the different types of lym- 
phoid cells. Further steps give rise to progenitors committed to the production of 
just one cell type. The steps of commitment correlate with changes in the expres- 
sion of specific transcription regulators, needed for the production of different 
subsets of blood cells. 


Divisions of Committed Progenitor Cells Amplify the Number of 
Specialized Blood Cells 


Hematopoietic progenitor cells generally become committed to a particular path- 
way of differentiation long before they cease proliferating and terminally differ- 
entiate. The committed progenitors go through many rounds of cell division to 
amplify the ultimate number of cells of the given specialized type. In this way, a 
single stem-cell division can lead to the production of thousands of differentiated 
progeny, which explains why the number of stem cells is such a small fraction 
of the total population of hematopoietic cells. For the same reason, a high rate 
of blood cell production can be maintained even though the stem-cell division 
rate is low. The smaller the number of division cycles that the stem cells them- 
selves have to undergo in the course of a lifetime, the lower the risk of generating 
stem-cell mutations, which would give rise to persistent mutant clones of cells in 
the body—a particular danger in the hematopoietic system where, as discussed 
in Chapter 20, a relatively small accumulation of mutations can be sufficient to 
cause cancer. A low rate of stem-cell division also slows the process of replicative 
cell senescence (discussed in Chapter 17). 
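
The arithmetic behind this amplification can be made concrete with a minimal sketch. It assumes an idealized lineage in which a committed progenitor and all of its descendants divide symmetrically the same number of times and no cells die, which real hematopoiesis does not strictly obey; the function names are illustrative only.

```python
def progeny_per_progenitor(divisions: int) -> int:
    """Cells produced when one committed progenitor and all of its
    descendants divide `divisions` times, with no cell death (idealized)."""
    return 2 ** divisions


def divisions_needed(target_progeny: int) -> int:
    """Smallest number of rounds of division that yields at least
    `target_progeny` cells from a single progenitor (integer arithmetic)."""
    return (target_progeny - 1).bit_length()


if __name__ == "__main__":
    for n in (5, 10, 12):
        print(f"{n} rounds of division -> {progeny_per_progenitor(n)} cells")
    print("rounds needed for at least 1000 cells:", divisions_needed(1000))
```

Under these assumptions, about 10 rounds of division are enough to turn one committed progenitor into roughly a thousand cells, which is the order of magnitude implied by "thousands of differentiated progeny" from a single stem-cell division.
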
Figure 22-31 A tentative scheme
of hematopoiesis. The multipotent
stem cell normally divides infrequently
to generate either more multipotent
stem cells, which are self-renewing, or
committed progenitor cells, which are
limited in the number of times that they
can divide before differentiating to form
mature blood cells. As they go through
their divisions, the progenitors become
progressively more specialized in the
range of cell types that they can give rise
to, as indicated by the branching of this
cell-lineage diagram. In adult mammals,
all of the cells shown develop mainly in the
bone marrow, except for T lymphocytes,
which as indicated develop in the thymus,
and macrophages and osteoclasts,
which develop from blood monocytes.
Some dendritic cells may also derive from
monocytes.

The stepwise nature of commitment means that the hematopoietic system can
be viewed as a hierarchical family tree of cells. Multipotent stem cells give rise 
to committed progenitor cells, which are specified to give rise to only one or a 
few blood cell types. The committed progenitors divide rapidly, but only a limited 
number of times, before they terminally differentiate into cells that divide no fur- 
ther and die after several days or weeks. Figure 22-31 depicts the hematopoietic 
family tree. It should be noted, however, that variations are thought to occur: not 
all stem cells generate the identical patterns of progeny via precisely the same 
sequence of steps. 


Stem Cells Depend on Contact Signals From Stromal Cells 


Like the stem cells of other tissues, hematopoietic stem cells depend on signals 
from their niche, in this case created by the specialized connective tissue of the 
bone marrow. (This is the site in adult humans; during development, and in non- 
human mammals such as the mouse, hematopoietic stem cells can also make 
their home in other tissues—notably liver and spleen.) When they lose contact 
with their niche, the hematopoietic stem cells tend to lose their stem-cell poten- 
tial (Figure 22-32). Evidently the loss of potency is not absolute or instantaneous, 
however, since the stem cells can still survive journeys via the bloodstream to col- 
onize other sites in the body. 


Factors That Regulate Hematopoiesis Can Be Analyzed in Culture 


While the stem cells depend on contact with bone marrow stromal cells for long- 
term maintenance, their committed progeny do not, or at least not to the same 
degree. These cells can thus be dispersed and cultured in a semisolid matrix of 
dilute agar or methylcellulose, and factors derived from other cells can be added 
artificially to the medium. The semisolid matrix inhibits migration, so that the 
progeny of each isolated precursor cell remain together as an easily distinguish- 
able colony. A single committed neutrophil progenitor, for example, may give rise 
to a clone of thousands of neutrophils. Such culture systems have provided a way
to assay for the factors that support hematopoiesis and hence to purify them and 
explore their actions. These substances are glycoproteins and are usually called
colony-stimulating factors (CSFs). Some of these factors circulate in the blood
and act as hormones, while others act in the bone marrow as secreted local mediators;
still others take the form of membrane-bound signals that act through cell-cell
contact.

Figure 22-32 Dependence of
hematopoietic stem cells on contact

An important example of the latter is a protein called Steel or Stem Cell Fac- 
tor (SCF). This is expressed both in the bone marrow stroma (where it helps to 
define the stem-cell niche) and along pathways of migration, and it occurs both 
in a membrane-bound and a soluble form. It binds to a receptor tyrosine kinase 
called Kit, and it is required during development for guidance and survival not 
only of hematopoietic cells but also of other migratory cell types—specifically, 
germ cells and pigment cells. 


Erythropoiesis Depends on the Hormone Erythropoietin 


The best understood of the CSFs that act as hormones is the glycoprotein erythro- 
poietin, which is produced in the kidneys and regulates erythropoiesis, the forma- 
tion of red blood cells, to which we now turn. 

The erythrocyte is by far the most common type of cell in the blood (see Table 
22-1). When mature, it is packed full of hemoglobin and contains hardly any of 
the usual cell organelles. In an erythrocyte of an adult mammal, even the nucleus, 
endoplasmic reticulum, mitochondria, and ribosomes are absent, having been 
extruded from the cell in the course of its development (Figure 22-33). The eryth- 
rocyte therefore cannot grow or divide, and it has a limited life-span—about 120 
days in humans or 55 days in mice. Worn-out erythrocytes are phagocytosed and 
digested by macrophages in the liver and spleen, which remove more than 10¹¹
senescent erythrocytes in each of us each day. Young erythrocytes actively protect
themselves from this fate: they have a protein on their surface that binds to an
inhibitory receptor on macrophages and thereby prevents their phagocytosis.

Figure 22-33 A developing red blood cell (erythroblast). The cell is shown
extruding its nucleus to become an immature erythrocyte (a reticulocyte),
which then leaves the bone marrow and passes into the bloodstream. The
reticulocyte will lose its mitochondria and ribosomes within a day or two
to become a mature erythrocyte. Erythrocyte clones develop in the bone
marrow on the surface of a macrophage, which phagocytoses and digests
the nuclei discarded by the erythroblasts.
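
The erythrocyte life-span and daily clearance figures quoted in the text can be tied together with a simple steady-state estimate. The sketch below assumes that production balances destruction and that every cell survives for exactly the stated life-span; both are simplifications, and the function names are hypothetical.

```python
def steady_state_pool(cleared_per_day: float, lifespan_days: float) -> float:
    """Cells in circulation if removal balances production and each cell
    lives for exactly `lifespan_days` (an idealization)."""
    return cleared_per_day * lifespan_days


def daily_turnover_fraction(lifespan_days: float) -> float:
    """Fraction of the pool replaced each day under the same assumptions."""
    return 1.0 / lifespan_days


if __name__ == "__main__":
    # Text values: more than 1e11 cells removed per day; life-span of
    # about 120 days in humans and 55 days in mice.
    print(f"human pool (lower bound): {steady_state_pool(1e11, 120):.1e} cells")
    print(f"human daily turnover: {daily_turnover_fraction(120):.2%}")
    print(f"mouse daily turnover: {daily_turnover_fraction(55):.2%}")
```

With the human figures, this estimate gives a circulating pool of at least about 1.2 × 10¹³ erythrocytes, of which somewhat under 1% is replaced each day.
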

A lack of oxygen or a shortage of erythrocytes stimulates specialized cells in the 
kidney to synthesize and secrete increased amounts of erythropoietin into the 
bloodstream. The erythropoietin, in turn, boosts the production of erythrocytes. 
The effect is rapid: the rate of release of new erythrocytes into the bloodstream 
rises steeply 1-2 days after an increase in erythropoietin levels in the bloodstream. 
Clearly, the hormone must act on cells that are close precursors of the mature 
erythrocytes. 

The cells that respond to erythropoietin can be identified by culturing bone 
marrow cells in a semisolid matrix in the presence of erythropoietin. In a few 
days, colonies of about 60 erythrocytes appear, each founded by a single com- 
mitted erythroid progenitor cell. This progenitor depends on erythropoietin for 
its survival as well as its proliferation. It does not yet contain hemoglobin, and it 
is derived from an earlier type of committed erythroid progenitor whose survival 
and proliferation are governed by other factors. 


Multiple CSFs Influence Neutrophil and Macrophage Production 


The two classes of cells dedicated to phagocytosis, neutrophils and macrophages, 
develop from a common progenitor cell called a granulocyte/macrophage (GM) 
progenitor cell. Like the other granulocytes (eosinophils and basophils), neutro- 
phils circulate in the blood for only a few hours before migrating out of capillaries 
into the connective tissues or other specific sites, where they survive for only a few 
days. They then die by apoptosis and are phagocytosed by macrophages. Mac- 
rophages, in contrast, can persist for months or perhaps even years outside the 
bloodstream, where they can be activated by local signals to resume proliferation. 

At least seven distinct CSFs that stimulate neutrophil and macrophage colony 
formation in culture have been defined, and some or all of these are thought to 
act in different combinations to regulate the selective production of these cells 
in vivo. These CSFs are synthesized by various cell types—including endothelial 
cells, fibroblasts, macrophages, and lymphocytes—and their concentration in 
the blood typically increases rapidly in response to bacterial infection in a tissue, 
thereby increasing the number of phagocytic cells released from the bone marrow 
into the bloodstream. 

The CSFs not only operate on the precursor cells to promote the production 
of differentiated progeny, they also activate the specialized functions (such as 
phagocytosis and target-cell killing) of the terminally differentiated cells. CSFs can 
be synthesized artificially and are now widely used in human patients to stimulate 
the regeneration of hematopoietic tissue and to boost resistance to infection. 


The Behavior of a Hematopoietic Cell Depends Partly on Chance 


CSFs are defined as factors that promote the production of colonies of differenti- 
ated blood cells. But precisely what effect does a CSF have on an individual hema- 
topoietic cell? The factor might control the rate of cell division or the number of 
division cycles that the progenitor cell undergoes before differentiating; it might 
act late in the hematopoietic lineage to facilitate differentiation; it might act early 
to influence commitment; or it might simply increase the probability of cell sur- 
vival (Figure 22-34). By monitoring the fate of isolated individual hematopoietic 
cells in culture, it has been possible to show that a single CSF, such as granulocyte/macrophage
CSF, can exert all these effects, although it is still not clear which are
most important in vivo. 

Studies in vitro indicate, moreover, that there is a large element of chance in 
the way a hematopoietic cell behaves—a reflection, presumably, of “noise” in the 
genetic control system, as discussed in Chapters 7 and 8. If two sister cells are 
taken immediately after a cell division and cultured apart under identical con- 
ditions, they frequently give rise to colonies that contain different types of blood 
cells or the same types of blood cells in different numbers. Thus, both the pro- 
gramming of cell division and the process of commitment to a particular path of 
differentiation seem to involve random events at the level of the individual cell, 
even though the behavior of the multicellular system as a whole is regulated in a 
reliable way. The sequence of cell fate restrictions shown earlier, in Figure 22-31, 
conveys the impression of a program executed with computer-like logic and pre- 
cision. Individual cells may be more varied, quirky, and erratic, and may some- 
times progress by other decision pathways from the stem-cell state toward termi- 
nal differentiation. 
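
One way to see how random decisions by individual cells can coexist with reliable behavior of the population as a whole is a toy branching-process simulation. In the sketch below, each proliferating cell either divides or terminally differentiates at every round with a fixed probability; the probability, the cap on the number of rounds, and the function names are all illustrative assumptions, not measured values.

```python
import random


def colony_size(rng: random.Random, p_divide: float = 0.8, max_rounds: int = 10) -> int:
    """Simulate one clone founded by a single progenitor.

    At each round every proliferating cell divides with probability
    `p_divide`; otherwise it differentiates and stops dividing. Cells still
    proliferating after `max_rounds` are counted as differentiating then.
    """
    proliferating = 1
    differentiated = 0
    for _ in range(max_rounds):
        next_generation = 0
        for _ in range(proliferating):
            if rng.random() < p_divide:
                next_generation += 2   # both daughters keep cycling
            else:
                differentiated += 1    # this cell leaves the cycle
        proliferating = next_generation
    return differentiated + proliferating


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that a run is repeatable
    # Two "sister" cells: same rules, same parameters, independent outcomes.
    print("sister colonies:", colony_size(rng), colony_size(rng))
    # Compare single clones with averages over many clones.
    for batch in range(3):
        sizes = [colony_size(rng) for _ in range(2000)]
        print(f"batch {batch}: mean colony size = {sum(sizes) / len(sizes):.1f}")
```

In such a model, two sister cells obeying identical rules will often found colonies of very different sizes, whereas the average over thousands of clones varies much less from batch to batch. The model addresses only colony size, not the choice between cell types.
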


Regulation of Cell Survival Is as Important as Regulation of Cell 
Proliferation 


The default behavior of hematopoietic cells in the absence of CSFs is death by 
apoptosis (discussed in Chapter 18), and the control of cell survival plays a cen- 
tral part in regulating the numbers of blood cells. The amount of apoptosis in the 
vertebrate hematopoietic system is enormous: billions of neutrophils die in this 
way each day in an adult human, for example. In fact, most neutrophils produced 
in the bone marrow die there without ever functioning. This futile cycle of pro- 
duction and destruction presumably serves to maintain a reserve supply of cells 
that can be promptly mobilized to fight infection whenever it flares up, or phago- 
cytosed and digested for recycling when all is quiet. Compared with the life of the 
organism, the lives of cells are cheap. 

Too little cell death can be as dangerous to the health of a multicellular organism
as too much proliferation. As noted in Chapter 18, mutations that inhibit cell
death by causing excessive production of the intracellular apoptosis inhibitor
Bcl2 promote the development of cancer in B lymphocytes. Indeed, the capacity
for unlimited self-renewal is a dangerous property for any cell to possess. Many
cases of leukemia arise through mutations that confer this capacity on committed
hematopoietic precursor cells that would normally be fated to differentiate and
die after a limited number of division cycles.

Figure 22-34 Some of the parameters
through which the production of
blood cells of a specific type might
be regulated. Studies in culture suggest
that various colony-stimulating factors
(CSFs) can affect all of these aspects of
hematopoiesis.


Summary 


The many types of blood cells, including erythrocytes, lymphocytes, granulocytes, 
and macrophages, all derive from a common multipotent stem cell. In the adult, 
hematopoietic stem cells are found mainly in bone marrow, and they depend on sig- 
nals from the marrow stromal (connective-tissue) cells to maintain their stem-cell 
character. The stem cells are few and far between, and they normally divide infre- 
quently to produce more stem cells (self-renewal) and various committed progen- 
itor cells (transit amplifying cells), each able to give rise to only one or a few types 
of blood cells. The committed progenitor cells divide extensively under the influence 
of various protein signal molecules (colony-stimulating factors, or CSFs) and then 
terminally differentiate into mature blood cells, which usually die after several days 
or weeks. 

Studies of hematopoiesis have been greatly aided by in vitro assays in which 
stem cells or committed progenitor cells form clonal colonies when cultured in a 
semisolid matrix. The progeny of stem cells seem to make their choices between 
alternative developmental pathways in a partly random manner. Cell death by 
apoptosis, controlled by the availability of CSFs, also plays a central part in regulat- 
ing the numbers of mature differentiated blood cells. 


REGENERATION AND REPAIR 


As we have seen, many of the tissues of the body are not only self-renewing but 
also self-repairing, and this is largely thanks to stem cells and the feedback con- 
trols that regulate their behavior and maintain homeostasis. There are, however, 
limits to what these natural repair mechanisms can achieve. In most parts of the 
human brain, for example, nerve cells that die, as in Alzheimer’s disease, are not 
replaced. Likewise, when heart muscle dies for lack of oxygen, as in a heart attack, 
it is replaced by scar tissue rather than new heart muscle. 

Some animals do far better than humans and can regenerate entire organs, 
such as whole limbs, after amputation. Among the invertebrates, there are some 
species that can even regenerate all the tissues of the body from a single somatic 
cell. These phenomena encourage the hope that human cells might be coaxed by 
artificial measures into similar feats of repair and regeneration, so as to replace 
the skeletal muscle fibers that degenerate in victims of muscular dystrophy, the 
nerve cells that die in patients with Parkinson’s disease, the insulin-secreting cells 
that are lacking in type 1 diabetics, the heart muscle cells that die in a heart attack, 
and so on. As we learn more about the basic cell biology, these goals, once only a 
dream, are beginning to seem attainable. 

In this section, we start with some examples of the remarkable regenerative 
abilities of some animal species, as an indication of what is possible in principle. 
We shall then discuss how we can improve upon the natural repair processes of 
the human body and treat disease by exploiting the properties of the various types 
of stem cells found in human tissues. In the final section of the chapter, we shall 
see how a deeper understanding of the molecular biology of cell differentiation 
and of stem cells has revealed ways to convert one type of cell into another, open- 
ing up radically new possibilities. 


Planarian Worms Contain Stem Cells That Can Regenerate a 
Whole New Body 


Schmidtea mediterranea is a small freshwater flatworm, or planarian, just under a
centimeter long when grown to full size (Figure 22-35). It has an epidermis, a gut,
a brain, a pair of primitive eyes, a peripheral nervous system, musculature, and
excretory and reproductive organs: most of the basic body parts familiar in other
animals, although all relatively simple by vertebrate standards and built from
about 20-25 distinct differentiated cell types. For more than a century, planarians
such as Schmidtea have intrigued biologists because of their extraordinary capacity
for regeneration: a small tissue fragment taken from almost any part of the
body will reorganize itself and grow to form a complete new animal. This property
goes with another: when the animal is starved, it gets smaller and smaller, by
reducing its cell numbers while maintaining essentially normal body proportions.
This behavior is called degrowth, and it can continue until the animal is as little
as one-twentieth or even a smaller fraction of its full size. Supplied with food, it
will grow back to full size again. Cycles of degrowth and growth can be repeated
indefinitely, without impairing survival or fertility.

Figure 22-35 The planarian worm,

Underlying this behavior is a process of continual cell turnover. Along with the 
differentiated cells, which do not divide, there is a population of small, apparently 
undifferentiated dividing cells called neoblasts. The neoblasts constitute about 
20% of the cells in the body and are widely distributed within it; by cell division, 
they serve as stem cells for the production of new differentiated cells. Differenti- 
ated cells, meanwhile, are continually dying by apoptosis, allowing their corpses 
to be phagocytosed and digested by neighboring cells. Through this cell canni- 
balism, the constituents of the dying cells can be efficiently recycled. Cell birth 
continues in a dynamic balance with cell death and cell cannibalism, no matter 
whether the animal is fed or starved. In conditions of starvation, the balance is 
evidently tilted toward cell cannibalism, and in conditions of plenty, toward cell 
birth. 

A high dose of x-rays halts all cell division, puts a stop to cell turnover, and 
destroys the capacity for regeneration. The result is death after a delay of several 
weeks. The animal can be rescued, however, by injecting into it a single neoblast 
isolated from an unirradiated donor (Figure 22-36). In a certain proportion of 
cases, the injected cell divides to form a clone of progeny that eventually repopu- 
late the entire body, creating a healthy regenerative individual with an apparently 
complete set of differentiated cell types as well as dividing neoblasts. Genetic 
markers prove that these are all derived from the single neoblast that was injected. 
It follows that at least some neoblasts are totipotent (or at least highly pluripotent) 
stem cells; that is, cells able to give rise to all (or at least almost all) of the cell types 
that make up the body of a flatworm, including more neoblasts like themselves. 


Some Vertebrates Can Regenerate Entire Organs 


One might think that such powers of regeneration would be a prerogative of small, 
simple, primitive animals. But some vertebrates, too, especially fish and amphib- 
ians, show remarkable regenerative abilities. A newt, for example, can regenerate 
a whole amputated limb. In this process, differentiated cells seem to revert to an
embryonic character by first forming on the amputation stump a blastema—a 
small bud resembling an embryonic limb bud. The blastema then grows and its 
cells differentiate to form a correctly patterned replacement for the limb that has 
been lost, in what looks like a recapitulation of embryonic limb development (Fig- 
ure 22-37). A large contribution to the blastema comes from the skeletal muscle 
cells in the limb stump. These multinucleate cells re-enter the cell cycle, dediffer- 
entiate, and break up into mononucleated cells, which then proliferate within the 
blastema, before eventually redifferentiating. But do they redifferentiate only into 
muscle, or do they behave like neoblasts in the planarian and give rise to the full 
range of cell types needed to reconstruct the missing part of the limb? Careful lin- 
eage tracing, using genetic markers, shows (contrary to previous belief) that the 
cells are restricted according to their origins: muscle-derived cells give rise only to 
muscle, connective-tissue cells only to connective tissues, epidermal cells only to 
epidermal cells. The cells in the adult vertebrate body are, after all, less adaptable 
than the cells of the flatworm: by working in concert, they can replace the lost 
structure, but each cell type is far from totipotent. 

Why a newt can regenerate a whole limb—as well as many other body parts— 
but a mammal cannot remains a profound mystery. 


Stem Cells Can Be Used Artificially to Replace Cells That Are 
Diseased or Lost: Therapy for Blood and Epidermis 


Earlier in this chapter, we saw how mice can be irradiated to kill off their hemato- 
poietic cells, and then rescued by a transfusion of new stem cells, which repopu- 
late the bone marrow and restore blood cell production (see Figure 22-30). In the 
same way, patients with some forms of leukemia or lymphoma can be irradiated 
or chemically treated to destroy their cancerous cells along with the rest of their 
hematopoietic tissue, and then can be rescued by a transfusion of healthy, non- 
cancerous hematopoietic stem cells. In favorable cases, these can be sorted out 
from samples of the patient’s own hematopoietic tissue before it is ablated. They 
are then transfused back afterward, avoiding problems of immune rejection. 


Figure 22-36 Regeneration of a
planarian from a single somatic cell.
(A) The distribution of dividing cells
(neoblasts, blue) in the adult body.
Irradiation blocks all cell division and
prevents regeneration, but (B) a single
unirradiated neoblast cell injected into the
irradiated animal is able to reconstitute
all tissues. This eventually produces a
complete animal that consists entirely
of the progeny of this one cell and can
regenerate. (Adapted from E.M. Tanaka
and P.W. Reddien, Dev. Cell 21:172-185,
2011.)

Figure 22-37 Newt limb regeneration.
The time-lapse sequence shows the stages
of regeneration after amputation at the level
of the humerus. The sequence spans the
events of wound healing, dedifferentiation
of stump tissues, blastema formation, and
redifferentiation. (Courtesy of Susan Bryant
and David Gardiner.)
[Figure labels: amputation; 0 day; 25 days.]

Figure 22-38 The continuing production of neurons in an adult mouse
brain. The brain is viewed from above, in a cut-away section, to show the
region lining the ventricles of the forebrain where neural stem cells are found.
These cells continually produce progeny that migrate to the olfactory bulb,
where they differentiate as neurons. The constant turnover of neurons in
the olfactory bulb is presumably linked in some way to the turnover of the
olfactory receptor neurons that project to it from the olfactory epithelium, as
mentioned earlier. In adult humans, there is a continuing turnover of neurons
in the hippocampus, a region specially concerned with learning and memory.
(Adapted from B. Barres, Cell 97:667-670, 1999. With permission from
Elsevier.)


Another example of the use of stem cells is in the repair of the skin after exten- 
sive burns. By culturing cells from undamaged regions of the burned patient’s 
skin, it is possible to obtain epidermal stem cells quite rapidly in large numbers. 
These can then be used (through rather long and complicated procedures) to 
repopulate the damaged body surface. 


Neural Stem Cells Can Be Manipulated in Culture and Used to 
Repopulate the Central Nervous System 


The central nervous system (the CNS) is the most complex tissue in the body, at an 
opposite extreme from the epidermis. And yet fish and amphibians can regener- 
ate large parts of the brain, spinal cord, and eyes after they have been cut away. In 
adult mammals, however, these tissues have very little capacity for self-repair, and 
stem cells capable of generating new neurons are hard to find—so hard to find, 
indeed, that for many years they were thought to be absent. 

We now know, however, that neural stem cells that generate both neurons and 
glial cells do persist in certain parts of the adult mammalian brain (Figure 22-38). 
Neuronal turnover occurs on a dramatic scale in certain songbirds’ brains, where 
large numbers of neurons die each year and are replaced by newborn neurons 
as part of a process by which the birds refine their song for each new breeding 
season. In the adult human brain, there is a continuing turnover of neurons in 
the hippocampus, a region specially concerned with learning and memory. Here, 
plasticity of adult function is associated with turnover of a specific subset of neu- 
rons. About 1400 fresh neurons in this class are generated every day, giving a turn- 
over of 1.75% of the population per year. 
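
The two numbers in this passage also fix the size of the neuronal population that is being renewed. The short calculation below assumes a 365-day year and treats the quoted daily rate and annual turnover as exact; the names are illustrative.

```python
NEURONS_PER_DAY = 1400      # new neurons generated per day (from the text)
ANNUAL_TURNOVER = 0.0175    # 1.75% of the population per year (from the text)
DAYS_PER_YEAR = 365         # assumption


def neurons_per_year(per_day: int = NEURONS_PER_DAY, days: int = DAYS_PER_YEAR) -> int:
    """New neurons generated in one year at a constant daily rate."""
    return per_day * days


def implied_population(per_year: float, turnover: float = ANNUAL_TURNOVER) -> float:
    """Population size for which `per_year` new cells equal `turnover` of the total."""
    return per_year / turnover


if __name__ == "__main__":
    yearly = neurons_per_year()
    print(f"new neurons per year: {yearly}")
    print(f"implied population: {implied_population(yearly):.3g} neurons")
```

This gives about 511,000 new neurons per year and an implied renewing population of roughly 29 million neurons.
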

Fragments taken from self-renewing regions of the adult brain, or from the 
brain of a fetus, can be dissociated and used to establish cell cultures, where they 
give rise to floating “neurospheres”—clusters consisting of a mixture of neural 
stem cells with neurons and glial cells derived from the stem cells. These neuro- 
spheres can be propagated through many cell generations, or their cells can be 
taken at any time and implanted back into the brain of an intact animal. Here they 
will produce differentiated progeny, in the form of neurons and glial cells. 

Using slightly different culture conditions, with the right combination of 
growth factors in the medium, the neural stem cells can be grown as a monolayer 
and induced to proliferate as an almost pure stem-cell population without atten- 
dant differentiated progeny. By a further change in the culture conditions, these 
cells can be induced at any time to differentiate to give either a mixture of neurons 
and glial cells (Figure 22-39), or just one of these two cell types, according to the 
composition of the culture medium. 

Neural stem cells, whether derived as above or from pluripotent stem cells as 
described in the next section, can be grafted into an adult brain. Once there, they 
show a remarkable ability to adjust their behavior to match their new location. 
Stem cells from the mouse hippocampus, for example, when implanted in the 
mouse olfactory-bulb-precursor pathway (see Figure 22-38), give rise to neurons 
that become correctly incorporated into the olfactory bulb. This capacity of neu- 
ral stem cells and their progeny to adapt to a new environment in animals sug- 
gests applications in the treatment for diseases where neurons degenerate, and 
for injuries of the central nervous system. For example, might it be possible to use 
injected neural stem cells to replace the neurons that die in Parkinson’s disease or 
to repair accidents that sever the spinal cord? 

Figure 22-39 Neural stem cells. Shown are the steps leading from fetal brain tissue, 
via neurospheres (A), to a pure culture of neural stem cells (B). These stem cells can 
be kept proliferating as such indefinitely, or, through a change of medium, can be 
caused to differentiate (C) into neurons (red) and glial cells (green). Neural stem 
cells with the same properties can also be derived from embryonic stem (ES) or 
induced pluripotent stem (iPS) cells (discussed later in this chapter). (Micrographs 
from L. Conti et al., PLoS Biol. 3:1594-1606, 2005.) 


Summary 

Animals vary in their capacity for regeneration. At one extreme, planarian worms 
contain stem cells (neoblasts) that support continual turnover of all cell types, and 
an entire worm can be regenerated from practically any small body fragment or 
even from a single neoblast cell. Newts can regenerate limbs and other large body 
parts after amputation, but the cells remain restricted according to their origins: 
muscle cells in the regenerate derive from muscle, epidermis from epidermis, and so 
on. In mammals, regeneration is more limited. Nevertheless, it is becoming possible 
to go beyond the natural limits of wound healing by exploiting stem-cell biology. 
Thus, certain regions of the nervous system contain stem cells that support produc- 
tion of neurons in these sites throughout life. Neural stem cells can be obtained from 
these sites or from fetal brains, grown in culture, and then grafted back into other 
sites in the brain, where they are able to generate neurons appropriate to the new 
location. 


CELL REPROGRAMMING AND PLURIPOTENT 
STEM CELLS 


When cells are transplanted from one site in the mammalian body to another or 
are removed from the body and maintained in culture, they remain largely faithful 
to their origins. Each type of specialized cell has a memory of its developmental 
history and seems fixed in its specialized fate. Some limited transformations can 
certainly occur, as we saw in our account of the connective-tissue cell family, and 
some stem cells can generate a variety of differentiated cell types, but the possibil- 
ities are restricted. Each type of stem cell serves for the renewal of one particular 
type of tissue, and the whole pattern of self-renewing and differentiated cells in 
the adult body is amazingly stable. What, at a fundamental molecular level, is the 
nature of these stable differences between cell types? Is there any way to override 
the cell memory mechanisms and force a switch from one state to another that is 
radically different? 

We have already discussed these fundamental questions from a general stand- 
point in Chapter 7. Here we consider them more closely in the context of stem-cell 
biology, where there has been a recent revolution in our understanding and in 
our ability to manipulate states of cell differentiation. With further research, these 
advances would seem to have important practical consequences. 


Nuclei Can Be Reprogrammed by Transplantation into Foreign 
Cytoplasm 


If we cannot switch the basic character of a specialized cell by changing its envi- 
ronment, can we do so by interfering with its inner workings in a more direct and 
drastic way? An extreme treatment of this sort is to take the nucleus of the cell and 
transplant it into the cytoplasm of a large cell of a different type. If the special- 
ized character is defined and maintained by cytoplasmic factors, the transplanted 
nucleus should switch its pattern of gene expression to conform with that of the 
host cell. In Chapter 7, we described a famous experiment of this sort, using the 
frog Xenopus. In this experiment, the nucleus of a differentiated cell (a cell from 
the lining of a tadpole’s gut) was used to replace the nucleus of an oocyte (an egg- 
cell precursor arrested in prophase of the first meiotic division, in readiness for 
fertilization). The resulting hybrid cell went on, in a certain fraction of cases, to 
develop into a complete normal frog (see Figure 7-2A). This was crucial evidence 
for what is now a central principle of developmental biology: the cell nucleus, even 
that of a differentiated cell, contains a complete genome, capable of supporting 
development of all normal cell types. At the same time, the experiment showed 
that cytoplasmic factors can indeed reprogram a nucleus: the oocyte cytoplasm 
can drive the gut cell nucleus back to an early embryonic state, from which it can 
then step through the changing patterns of gene expression that lead all the way 
to a complete adult organism. 

The full story, however, is not quite so simple. First, the reprogramming in such 
experiments is not perfect. When the transplanted nucleus is taken from a gut cell, 
for example, a gene that is normally specific to the gut is found to be expressed per- 
sistently, even in the muscle cells of the final animal. Second, the experiment suc- 
ceeds in only a limited proportion of cases, and this success rate becomes lower 
and lower, the more mature the animal from which the transplanted nucleus is 
taken: very large numbers of transplantations must be done to score a single suc- 
cess if the nucleus comes from a differentiated cell of an adult frog. 

Nuclear transplantation can be done in mammals too, with basically similar 
results. Thus, a nucleus taken from a differentiated cell in the mammary gland 
of an adult sheep and transplanted into an enucleated sheep’s egg was able to 
support development of an apparently normal sheep—the famous Dolly. Again, 
the success rate is low: many transplantations have to be done to obtain one such 
individual. 


Reprogramming of a Transplanted Nucleus Involves Drastic 
Epigenetic Changes 


In a typical fully differentiated cell, there seem to be mechanisms maintaining 
the pattern of gene expression that cytoplasmic factors cannot easily override. An 
obvious possibility is that the stability of the pattern of gene expression in an adult 
cell may depend, in part at least, on self-perpetuating modifications of chromatin, 
as discussed in Chapter 4. As explained in Chapter 7, the phenomenon of X-inac- 
tivation in mammals provides a clear example of such epigenetic control. Two X 
chromosomes exist side by side in each female cell, exposed to the same chem- 
ical environment, but while one remains active, the other persists from one cell 
generation to the next in a condensed inactive state; cytoplasmic factors cannot 
be responsible for the difference, which must instead reflect mechanisms intrin- 
sic to the individual chromosome. Elsewhere in the genome also, controls at the 
level of chromatin act in combination with other forms of regulation to govern 
the expression of each gene. Genes can be shut down completely, or switched on 
constitutively, or maintained in a labile state where they can be readily switched 
on or off according to changing circumstances. 

The reprogramming of a nucleus transplanted into an oocyte involves dra- 
matic changes in chromatin. The nucleus swells, increasing its volume 50-fold 
as the chromosomes decondense; there is a wholesale alteration in patterns of 
methylation of DNA and histones; the standard histone H1 (the histone that links 
adjacent nucleosomes) is replaced by a variant form that is peculiar to the oocyte 
and early embryo; and the preexisting type of histone H3 is also replaced at many 
sites by a distinct isoform. Evidently, the egg contains factors that reset the state of 
the chromatin in the nucleus, wiping out old histone modifications on chromatin 
and imposing new ones. Reprogrammed in this way, the genome becomes com- 
petent once again to initiate embryonic development and to give rise to the full 
range of differentiated cell types. 


Embryonic Stem (ES) Cells Can Generate Any Part of the Body 


A fertilized egg, or an equivalent cell produced by nuclear transplantation, is a 
remarkable thing: it can generate a whole new multicellular individual, and that 
means that it can give rise to every normal type of specialized cell, including even 
egg or sperm cells for production of the next generation. A cell in such a state is 
said to be totipotent, and it is said to be pluripotent if it can give rise to most cell 
types but not absolutely all. Nevertheless, such a progenitor is not a stem cell: it 
is not self-renewing, but is instead dedicated to a program of progressive differ- 
entiation. If it were the only available starting point for study and exploitation of 
pluripotent cells, the enterprise would require a continual supply of fresh fertil- 
ized eggs or fresh nuclear transplantation procedures—an awkward requirement 
for studies in experimental animals, and unacceptable for practical applications 
in humans. 

Here, however, nature has been unexpectedly kind to scientists. It is possible 
to take an early mouse embryo, at the blastocyst stage, and through cell culture to 
derive from it a class of stem cells called embryonic stem cells, or ES cells. These 
originate from the inner cell mass of the early embryo (the cluster of cells that give 
rise to the body of the embryo proper, as opposed to extraembryonic structures), 
and they have an extraordinary property: given suitable culture conditions, they 
will continue proliferating indefinitely and yet retain an unrestricted develop- 
mental potential. Their only limitation is that they do not give rise to extraembry- 
onic tissues such as those of the placenta. Thus they are classified as pluripotent, 
rather than totipotent. But this is a minor restriction. If ES cells are put back into 
a blastocyst, they become incorporated into the embryo and can give rise to all 
the tissues and cell types in the body, integrating perfectly into whatever site they 
may come to occupy, and adopting the character and behavior that normal cells 
would show at that site. They can even give rise to germ cells, from which a new 
generation of animals can be derived (Figure 22-40). 

ES cells let us move between cell culture, where we can use powerful tech- 
niques for genetic transformation and selection, and the intact organism, where 
we can discover how such genetic manipulations affect development and phys- 
iology. Thus, ES cells have opened the way to efficient genetic engineering in 
mammals, leading to a revolution in our understanding of mammalian molecular 
biology. 

Cells with properties similar to those of mouse ES cells can now be derived 
from early human embryos and from human fetal germ cells, and even, as we 
shall explain below, from differentiated cells taken from adult mammalian tis- 
sues. In this way, one can obtain a potentially inexhaustible supply of pluripotent 
cells. Grown in culture, these can be manipulated, by suitable choice of culture 
conditions, to give rise to large quantities of almost any type of differentiated cell, 
opening the way to important practical applications. Before discussing them, 
however, we consider the underlying biology. 

Figure 22-40 Production and pluripotency of ES cells. ES cells are derived from 
the inner cell mass (ICM) of the early embryo. The ICM cells are transferred to a 
culture dish containing an appropriate medium, where they become converted to 
ES cells and can be kept proliferating indefinitely without differentiating. The ES 
cells can be taken at any time (after genetic manipulation, if desired) and injected 
back into the inner cell mass of another early embryo. There they take part in 
formation of a well-formed chimeric animal that is a mixture of ordinary and 
ES-derived cells. The ES-derived cells can differentiate into any of the cell types in 
the body, including germ cells from which a new generation of mice can be produced. 
These next-generation progeny are no longer chimeric, but consist of cells that all 
inherit half their genes from the cultured ES cell line. (Diagram label: the blastocyst 
develops in a foster mother into a healthy chimeric mouse; the ES cells may 
contribute to any tissue.) 


A Core Set of Transcription Regulators Defines and Maintains the 
ES Cell State 


What is it that gives ES cells and related types of pluripotent stem cells their 
extraordinary capabilities? What can they tell us about the fundamental mecha- 
nisms underlying stemness, cell differentiation, and the stability of the differen- 
tiated state? 

For some attributes, the answer is simple. For example, an essential feature 
of ES cells is that they must avoid senescence. As discussed in Chapter 17, this is 
the fate of fibroblasts and many other types of somatic cells: they are limited in 
the number of times they will divide, in part at least because they lack telomerase 
activity, with the result that their telomeres become progressively eroded in each 
division cycle, leading eventually to cell-cycle arrest. ES cells, by contrast, express 
high levels of active telomerase, allowing them to escape senescence and con- 
tinue dividing indefinitely. This is a property shared with other, more specialized 
types of stem cells, such as those of the adult intestine, which similarly can carry 
on dividing for hundreds or thousands of cycles. 

The deeper problem is to explain how the whole complex pattern of gene 
expression in an ES cell is organized and maintained. As a first step, one can look 
for genes expressed specifically in ES cells or in the corresponding pluripotent 
cells of the early embryo. This approach identifies a relatively small number of 
candidate ES-critical genes; that is, genes that seem to be essential in one way 
or another for the peculiar character of ES cells. A gene called Oct4, for exam- 
ple, is exclusively expressed in ES cells and in related classes of cells in the intact 
organism—specifically, in the germ-cell lineage and in the inner cell mass and its 
precursors. Oct4 codes for a transcription regulator. When it is lost from ES cells, 
they lose their ES cell character; and when it is missing in an embryo, the cells that 
should specialize as inner cell mass are diverted into an extraembryonic pathway 
of differentiation and their development is aborted. 


Fibroblasts Can Be Reprogrammed to Create Induced Pluripotent 
Stem Cells (iPS Cells) 


In Chapter 7, we saw that fibroblasts and some other cell types can be driven to 
switch their character and differentiate as muscle cells if the master muscle-spe- 
cific transcription regulator MyoD is artificially expressed in them. Could the same 
technique be used to convert adult cell types into ES cells, through forced expres- 
sion of factors such as Oct4? This question was tackled by transfecting fibroblasts 
with retroviral vectors carrying genes that one might hope to have such an effect. 
A total of 24 candidate ES-critical genes were tested in this way. None of them was 
able by itself to cause the conversion; but in certain combinations they could do 
so. In 2006, the first breakthrough experiments whittled down the requirement to 
a core set of four factors, all of them transcription regulators: Oct4, Sox2, Klf4, and 
Myc, known as the OSKM factors for short. When coexpressed, these could repro- 
gram mouse fibroblasts, permanently converting them into cells closely similar 
to ES cells (Figure 22-41). ES-like cells created in this way are called induced 
pluripotent stem cells, or iPS cells. Like ES cells, iPS cells can continue dividing 
indefinitely in culture, and when incorporated into a mouse blastocyst they can 
participate in creation of a perfectly formed chimeric animal. In this animal, they 
can contribute to the development of any tissue and can turn into any differen- 
tiated cell type, including functional germ cells from which a new generation of 
mice can be raised (see Figure 22-40). 

iPS cells can now be derived from adult human cells and from various other dif- 
ferentiated cell types besides fibroblasts. Numerous methods can be used to drive 
expression of the transforming OSKM factors, including methods that leave no 
trace of foreign DNA in the reprogrammed cell. Variations of the original cocktail 
of transcription regulators can drive the conversion, with different specialized cell 
types having somewhat different requirements. Myc overexpression, for example, 
turns out not to be absolutely necessary, although it enhances the efficiency of the 
process. And differentiated cell types may express some of the required factors as 
part of their normal phenotype. For example, cells of the dermal papilla of hair 
follicles already express Sox2, Klf4, and Myc; to convert them into iPS cells, it is 
enough to force them artificially to express Oct4. Oct4, indeed, seems to have a 
central role and to be generally indispensable for the creation of iPS cells. 

Figure 22-41 Reprogramming fibroblasts to iPS cells with the OSKM factors. As 
indicated, the master gene regulator proteins Oct4, Sox2, and Klf4 (OSK) induce both 
their own and each other's synthesis (gray shading). This generates a self-sustaining 
feedback loop that helps to maintain cells in an embryonic stem cell-like state, 
even after all of the experimentally added OSKM initiators have been removed. Myc 
overexpression speeds up early stages of the reprogramming process through the 
mechanisms shown (see Figure 17-61). Stable reprogramming also involves the 
permanently induced expression of the Nanog gene, which produces an additional 
master transcription regulator. (Adapted from J. Kim et al., Cell 132:1049-1061, 2008.) 


Reprogramming Involves a Massive Upheaval of the Gene Control 
System 


Converting a differentiated cell into an iPS cell is not like flicking a switch on some 
predictable, precisely engineered piece of machinery. Only a few of the cells that 
receive the OSKM factors will actually become iPS cells—one in several thou- 
sand in the original experiments, and still only a small minority with more recent, 
improved techniques. In fact, the success of the original experiments depended 
on clever selection to pick out those few cells where the conversion had occurred 
(Figure 22-42). 
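
A rough calculation shows why selection mattered at such low efficiencies. The sketch below assumes that each cell converts independently with the same probability, and it uses 1 in 5000 as an illustrative stand-in for "one in several thousand"; both the independence assumption and the specific numbers are hypothetical, not measurements from the original experiments.

```python
# Assumption for illustration only: a per-cell conversion probability of
# 1 in 5000, standing in for "one in several thousand" in the text.
ASSUMED_EFFICIENCY = 1 / 5000


def expected_colonies(n_cells, efficiency=ASSUMED_EFFICIENCY):
    """Expected number of converted cells if each cell converts independently."""
    return n_cells * efficiency


def prob_at_least_one(n_cells, efficiency=ASSUMED_EFFICIENCY):
    """Probability that at least one of n_cells converts (independent trials)."""
    return 1.0 - (1.0 - efficiency) ** n_cells


for n in (1_000, 100_000):
    print(
        f"{n:>7,} cells: expected converted = {expected_colonies(n):.1f}, "
        f"P(at least one) = {prob_at_least_one(n):.3f}"
    )
```

On these assumptions a dish of 100,000 cells would be expected to contain only about 20 converted cells among a vast unconverted majority, which is why a drug selection that kills the unconverted cells makes the rare events recoverable.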

Conversion to an iPS character by the OSKM factors is not only inefficient but 
also slow: fibroblasts take ten days or more from introduction of the conversion 
factors before they begin to express markers of the iPS state. This suggests that 
the transformation involves a long cascade of changes. These changes are being 
extensively studied, and they affect both the expression of individual genes and 
the state of the chromatin. The results of one such study are outlined in Figure 
22-43. The process begins with a Myc-induced cell proliferation and loosening 
of chromatin structure that promotes the binding of the other three master reg- 
ulators to many hundreds of different sites in the genome. At a large proportion 
of these sites, Oct4, Sox2, and Klf4 all bind in concert. The binding sites include 
the endogenous Oct4, Sox2, and Klf4 genes themselves, which eventually creates 
the types of positive feedback loops just described that make expression of 
these genes self-sustaining (see Figure 22-41). But self-induction of Oct4, Sox2, 
and Klf4 is only a small part of the transformation that occurs. The three core 
factors activate some target genes and repress others, producing a cascade of effects 
that reorganize the gene control system globally and at every level, changing the 
patterns of histone modification, DNA methylation, and chromatin compaction, 
as well as the expression of innumerable proteins and noncoding RNAs. By the 
end of this complex process, the resulting iPS cell is no longer dependent on the 
artificially generated factors that triggered the change: it has settled into a stable, 
self-sustaining state of coordinated gene expression, making its own Oct4, Sox2, 
Klf4, and Myc (and all the other essential ingredients of a pluripotent stem cell) 
from its own endogenous copies of the genes. 
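
The idea that a transient input can leave a cell in a permanently self-sustaining state can be illustrated with a toy model of positive feedback. The sketch below is not the regulatory network of a real cell: it collapses the core factors into a single variable `x` that activates its own production cooperatively, and every parameter value (basal rate, Hill coefficient, threshold, decay rate, pulse strength and duration) is an assumption chosen only to make the behavior visible.

```python
def hill(x, k=0.5, n=4):
    """Cooperative self-activation term (Hill function); k and n are assumed values."""
    return x**n / (k**n + x**n)


def simulate(pulse_duration, pulse_strength=1.0, t_end=30.0, dt=0.01,
             basal=0.01, vmax=1.0, decay=1.0):
    """Euler integration of dx/dt = basal + exogenous + vmax*hill(x) - decay*x.

    'exogenous' models the artificially supplied factor, present only while
    t < pulse_duration. Returns the level of x at t_end.
    """
    x = basal / decay  # start in the low ("differentiated") state
    steps = int(round(t_end / dt))
    for i in range(steps):
        t = i * dt
        exogenous = pulse_strength if t < pulse_duration else 0.0
        dxdt = basal + exogenous + vmax * hill(x) - decay * x
        x += dt * dxdt
    return x


no_pulse = simulate(pulse_duration=0.0)
short_pulse = simulate(pulse_duration=0.2)
long_pulse = simulate(pulse_duration=5.0)

print(f"no exogenous factor:   final x = {no_pulse:.3f}")
print(f"brief exogenous pulse: final x = {short_pulse:.3f}")
print(f"long exogenous pulse:  final x = {long_pulse:.3f}")
```

In this toy system a sufficiently long pulse pushes `x` past a threshold, after which the feedback loop keeps it high even though the exogenous input has gone; a brief pulse decays back to the low state. Real reprogramming involves many more components and, as the text describes, a large element of chance that a deterministic model like this does not capture.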

Figure 22-42 A strategy used to select cells that have converted to an iPS 
character. The experiment makes use of a gene (Fbx15) that is present in all 
cells but is normally expressed only in ES and early embryonic cells (although not 
required for their survival). A fibroblast cell line is genetically engineered to contain 
a gene that produces an enzyme that degrades G418 under the control of the 
Fbx15 regulatory sequence. G418 is an aminoglycoside antibiotic that blocks 
protein synthesis in both bacteria and eukaryotic cells. When the OSKM factors 
are artificially expressed in this cell line, a small proportion of the cells undergo a 
change of state and activate the Fbx15 regulatory sequence, driving expression 
of the G418-resistance gene. When G418 is added to the culture medium, these are 
the only cells that survive and proliferate. When tested, they turn out to have an 
iPS character. 

Figure 22-43 A summary of some of [...] are repressed. In the second transcription 
wave, genes required for embryonic [...] are induced. (Adapted from J.M. Polo et 
al., Cell 151:1617-1632, 2012.) 

The low efficiency and slow rate of conversion suggest that there is some barrier 
blocking the switch from the differentiated state to the iPS state in these 
experiments, and that overcoming this barrier is a difficult process that involves a large 
element of chance. Likewise, the outcome is variable, with significant differences 
between the individual lines of transformed cells that are generated, even when 
the initial differentiated cells are genetically and phenotypically identical. Only 
some of the candidate iPS lines pass all the tests of pluripotency. At a molecular 
level, there are differences even among the fully validated iPS cells: although they 
share many features, they vary in details of their gene expression patterns and, for 
example, in their patterns of DNA methylation. 

Overcoming these difficulties will be critical for improving our understanding 
of how cell specialization is controlled and organized in multicellular organisms; 
it should also facilitate many medical advances. Thus, intensive research is being 
carried out on the reprogramming process. One approach aims at obtaining a 
much clearer picture of the role that chromatin structures play in gene regulation 
in eukaryotes. 

From our discussion of nuclear transplantation, one might expect that any 
reprogramming of a differentiated cell would require a radical and widespread 
change in the chromatin structure of selected genes. Not only are such changes 
observed, but a large number of different experiments reveal that the efficiency of 
the reprogramming process can be substantially increased by altering the activ- 
ity of proteins that affect chromatin structure. Figure 22-44 categorizes some of 
the factors that have been shown to enhance the transformation of fibroblasts to 
iPS cells; those in the top three rows—chromatin remodelers, histone modify- 
ing enzymes, and histone variants—are especially well known to have profound 
effects on the organization of nucleosomes in chromatin (discussed in Chapter 4). 

We can only touch briefly here on the massive amounts of data that have been 
accumulating in this exciting research area. The major challenge that remains is to 
obtain a systems-level model for the complex set of biochemical changes that are 
involved in reprogramming. For example, which chromatin changes come first, 
and which then follow? How can these be triggered by the master transcription 
regulators through their binding to specific DNA sequences, and why do many 
cells in a population appear resistant to these effects? 


ES and iPS Cells Can Be Guided to Generate Specific Adult Cell 
Types and Even Whole Organs 


We can think of embryonic development in terms of a series of choices pre- 
sented to cells as they follow a road that leads from the fertilized egg to terminal 
differentiation. After their long sojourn in culture, the ES cells or iPS cells and 
their progeny can still read the signs at each branch in the highway and respond 
as normal embryonic cells would. If ES or iPS cells are implanted directly into an 
embryo at a later stage or into an adult tissue, however, they fail to receive the 
appropriate sequence of cues; their differentiation then is not properly controlled, 
and they will often give rise to a tumor of the type known as a teratoma, containing 
a mixture of cell types inappropriate to the site in the body. 

In culture, by exposing the ES or iPS cell to an appropriate sequence of signal 
proteins and growth factors, delivered with the right timing, it is possible to guide 
the cell along a pathway that approximates a normal developmental pathway, so 
as to convert it into one of the standard specialized adult cell types (Figure 22-45 
and Movie 22.5). Success requires trial and error, but has now been achieved for 
many different final specialized states, including neuronal, muscular, and intesti- 
nal cell types. In a few cases, it has even been possible, by careful manipulation of 
the culture conditions, to get ES or iPS cells to interact with one another so as to 
construct an entire organ, albeit on a small scale (Figure 22-46). 

Figure 22-44 Factors that have been observed to enhance reprogramming 
efficiency. Emphasized here are those factors that can alter chromatin states, 
with those in the top three rows having the most direct effects. An up arrow 
indicates that reprogramming is increased when the activity of the indicated factor 
is increased; a down arrow indicates that reprogramming is increased when the 
activity of the indicated factor is decreased. Thus, for example, increased activity of 
histone acetyl transferases and increased activity of histone deacetylases have 
opposite effects, as expected from their biochemical activities (see p. 196). 

Figure 22-45 Production of differentiated cells from ES or iPS cells in culture. 
These cells can be cultured indefinitely as pluripotent cells when attached as a 
monolayer to a dish. Alternatively they can be detached and allowed to form 
aggregates called embryoid bodies, which causes the cells to begin to specialize. 
Cells from embryoid bodies, cultured in media with different factors added, can 
then be driven to differentiate in various ways. (Based on E. Fuchs and J.A. Segre, 
Cell 100:143-155, 2000. With permission from Elsevier.) 

Figure 22-46 Cultured ES cells can give rise to a three-dimensional organ. (A) Remarkably, under appropriate conditions, 
mouse ES cells in culture can proliferate, differentiate, and interact to form a three-dimensional, eye-like structure, which 
includes a multilayered retina similar in organization to the one that forms in vivo. (B) Fluorescent micrograph of an optic cup 
formed by ES cells in culture. The structure includes a developing retina, containing multiple layers of neural cells, which 
produce a protein (pink) that serves as a marker for retinal tissue. (B, from M. Eiraku, N. Takata, H. Ishibashi et al., Nature 
472:51-56, 2011. With permission from Macmillan Publishers Ltd.) 


Cells of One Specialized Type Can Be Forced to Transdifferentiate 
Directly Into Another 


The route we have just described, from one mode of differentiation to another via 
conversion to an iPS cell, seems needlessly roundabout. Could we not convert 
cell type A into cell type B directly, without backtracking to the embryonic-like 
iPS state? For many years, it has been known that such transdifferentiation can be 
achieved in a few special cases, such as the conversion of fibroblasts into skeletal 
muscle cells by forced expression of MyoD (see p. 396). But now, with the insights 
that have come from the study of ES and iPS cells, ways are being found to bring 
about such interconversions in a much wider range of cases. 

An elegant example comes from studies of the heart. By forcing expression of 
an appropriate combination of factors—not Oct4, Sox2, Klf4, and Myc, but Gata4, 
Mef2c, and Tbx5—it is possible to convert heart fibroblasts directly into heart 
muscle cells. This has been done in the living mouse, using retroviral vectors, 
and the transformation occurs with high efficiency when the vectors carrying the 
transgenes are injected directly into the heart muscle tissue itself. Although they 
occupy only a small fraction of the tissue volume, the fibroblasts in the heart out- 
number the heart muscle cells, and they survive in large numbers even where the 
heart muscle cells have died. Thus, in a typical nonfatal heart attack, where heart 
muscle cells have died for lack of oxygen, the fibroblasts proliferate and make 
collagenous matrix so as to replace the lost muscle with a fibrous scar. This is a 
poor sort of repair. By forcing expression of the appropriate factors in the heart, 
as described above, it has proved possible, in the mouse at least, to do better than 
nature and regenerate lost heart muscle by transdifferentiation of heart fibro- 
blasts. 

We are still a long way from putting this technique into practice as a treatment 
for heart attacks in humans, but it shows what the future may hold—not only for 
this medical problem, but for many others. 


ES and iPS Cells Are Useful for Drug Discovery and Analysis of 
Disease 


A large part of the excitement surrounding ES and iPS cells and the technology 
of transdifferentiation comes from the prospect of using the artificially generated 
cells for tissue repair. It begins to seem that virtually any type of tissue might be 
replaceable, allowing treatment of degenerative diseases that have previously had 
no cure. Research in this area is moving rapidly, but there are many difficulties to 
be overcome. 

With the advent of iPS cells and direct transdifferentiation, at least one major 
hurdle has been surmounted, in principle at least: the problem of immune rejec- 
tion. ES cells, because they are created from early embryos that generally come 
from unrelated donors, will never be genetically identical to the cells of the patient 
receiving the transplant. The transplanted cells and their progeny are therefore 
liable to rejection by the immune system. Both iPS and transdifferentiated cells, in 
contrast, can be generated from a small sample of the patient’s own tissue and so 
should escape immune attack when transplanted back into the same individual. 

Tissue repair by transplantation, however, is not the only application for which 
ES, iPS, and transdifferentiated cells can be used: there are other ways in which 
they promise to be more immediately valuable. In particular, they can be used to 
generate large, homogeneous populations of specialized cells of any chosen type 
in culture; and these can serve for investigation of disease mechanisms and in the 
search for new drugs acting on a specific cell type (Figure 22-47). 

Where a disease has a genetic cause, we can derive iPS cells from sufferers and 
use these cells to produce the specific cell types that malfunction, to investigate 
how the malfunction occurs, and to screen for drugs that might help to put it right. 
Timothy syndrome provides an example. In this rare genetic condition, there is a 
severe, life-threatening disorder in the rhythm of the heart beat (as well as several 
other abnormalities), as a result of a mutation in a specific type of Ca²⁺ channel. 
To study the underlying pathology, researchers took skin fibroblasts from patients 
with the disorder, generated iPS cells from the fibroblasts, and drove the iPS cells 
to differentiate into heart muscle cells. These cells, when compared with heart 
muscle cells prepared similarly from normal control individuals, showed irregular 
contractions and abnormal patterns of Ca²⁺ influx and electrical activity that 
could be characterized in detail. From this finding, it is a small step to develop- 
ment of an in vitro assay for drugs that might correct the misbehavior of the heart 
muscle cells. 

This approach to drug discovery—where iPS cells are prepared from the indi- 
vidual patient, differentiated into the relevant cell type, and used to test candidate 
drugs in vitro—would seem to represent a huge advance on the slow, costly tradi- 
tional methods that involve administration of test compounds to large numbers 
of people. 

Figure 22-47 Use of iPS cells for drug discovery and for analysis and treatment 
of genetic disease. The left side of the diagram shows how differentiated cells 
that are generated from iPS cells derived from a patient with a genetic disease 
can be used for analysis of the disease mechanism and for discovery of 
therapeutic drugs. The right side of the diagram shows how the genetic defect 
might be repaired in the iPS cells, which could then be induced to differentiate 
in an appropriate way and grafted back into the patient without danger of immune 
rejection. (Based on D.A. Robinton and G.Q. Daley, Nature 481:295-305, 2012). 


Summary 


In the adult mammalian body, the various types of stem cells are highly specialized, 
each giving rise to a limited range of differentiated cell types. Cells become restricted 
to specific pathways of differentiation during embryonic development. One way 
to force a return to a pluripotent or totipotent state is by nuclear transplantation: 
the nucleus of a differentiated cell can be injected into an enucleated oocyte, whose 
cytoplasm reprograms the genome back to an approximation of an early embry- 
onic state. This allows production of an entire new individual. The reversion of the 
genome to this state involves radical, genome-wide changes in chromatin structure 
and DNA methylation. 

Remarkably, cells taken from the inner cell mass of an early mammalian embryo 
can be propagated in culture indefinitely in a pluripotent state. When transplanted 
back into a host early embryo, these embryonic stem (ES) cells can contribute cells 
to any tissue, including the germ line. ES cells have been invaluable for genetic engi- 
neering in mice. Cells with similar properties, called induced pluripotent stem cells 
(iPS cells), can be generated from adult differentiated cells such as fibroblasts by 
forced expression of a cocktail of key transcription regulators. A similar method 
can be used to reprogram adult cells directly from one specialized state to another. 
In principle, iPS cells generated from cells biopsied from an adult human patient 
could be used for tissue repair in that same individual, avoiding the problem of 
immune rejection. More immediately, they provide a source of specialized cells that 
can be used to analyze in vitro the effects of mutations affecting human cells and for 
screening for drugs for treatment of genetic diseases. 


WHAT WE DON’T KNOW 


• What determines tissue and organ size? How do the cells in each tissue know 
when to terminate their growth and division, so as to limit the size of an organ 
or tissue appropriately? 

• What is the fundamental molecular difference that distinguishes a stem cell? 

• How is the correct balance between stem cells, progenitor cells, and 
differentiated cells maintained in a tissue or organ? 

• What role does chromatin structure play in cell memory and in cell 
reprogramming? 

• How are molecules inherited asymmetrically during cell division? 

• How do germ cells avoid aging? 


PROBLEMS 


Which statements are true? Explain why or why not. 


22-1 In the small intestine, stem cells in the crypts 
divide asymmetrically to maintain the population of cells 
that make up the villi; after each division, one daughter 
remains a stem cell and the other begins to divide rapidly 
to produce differentiated progeny. 


22-2 Stem cells, being stem cells, are by definition the 
same in all tissues. 


22-3 Every tissue that can be renewed is renewed from 
a tissue-specific population of stem cells. 


22-4 Disturbance of the balance in the activities of 
osteoblasts and osteoclasts in favor of osteoclasts can 
give rise to the condition known as osteoporosis, the brit- 
tle-bone syndrome of the elderly. 


Discuss the following problems. 


22-5 In the 1950s, scientists fed ³H-thymidine to rats to 
label cells that were synthesizing DNA, and then followed 
the fates of labeled cells for periods of up to a year. They 
found three patterns of cell labeling in different tissues. 
Cells in some tissues such as neurons in the central ner- 
vous system and the retina did not get labeled. Muscle, 
kidney, and liver, by contrast, each showed a small number 
of labeled cells that retained their label, apparently with- 
out further division or loss. Finally, cells such as those in 
the squamous epithelia of the tongue and esophagus were 
labeled in fairly large numbers, with radioactive pairs of 
nuclei visible in 12 hours; however, the labeled cells disap- 
peared over time. Which of these three patterns of labeling 
would you expect to see if the labeled cells were generated 
by stem cells? Explain your answer. 


22-6 At any given time, intestinal crypts of mice com- 
prise about 15 stem cells and 10 Paneth cells. After cell 
division, which occurs about once a day, the daughter 
cells remain stem cells only if they maintain contact with a 
Paneth cell. This constant competition for Paneth-cell con- 
tact raises the possibility that crypts might become mono- 
clonal over time; that is, the crypt cells at one point in time 
might derive from only 1 of the 15 stem cells that existed 
at some earlier time. To test this possibility, you use the 
so-called confetti marker that upon activation expresses 
any one of three fluorescent proteins in the stem cells of 
the crypt. You then examine crypts at various times to 
determine whether they contain cells with multiple colors 
or only one color (Figure Q22-1). Do the crypts become 
monoclonal over time or not? How can you tell? 


Figure Q22-1 Fluorescent cells in crypts in mouse intestines at various times 
after activation of expression of fluorescent proteins (Problem 22-6). The images 
are taken in the xz plane, which cuts through multiple crypts, as indicated in 
the schematic drawing. Roughly 50 crypts are visible in each section. Dotted 
white circles identify some individual crypts. Scale bars are 100 µm. (Adapted 
from H.J. Snippert et al., Cell 143:134-144, 2010. With permission from 
Elsevier.) [Image panels are labeled 4 weeks and 30 weeks; the schematic is 
labeled microvilli.] 


22-7 The origin of new β cells of the pancreas—from 
stem cells or from preexisting β cells—was not resolved 
until a decade ago, when the technique of lineage tracing 
was used to decide the issue. Using transgenic mice that 
expressed a tamoxifen-activated form of Cre recombinase 
under the control of the insulin promoter, which is active 
only in β cells, investigators could remove an inhibitory 
segment of DNA and thereby allow expression of human 
placental alkaline phosphatase (HPAP), which can be 
detected by histochemical staining. After a pulse of tamoxifen 
that converted about 30% of β cells in young mice to 
cells that express HPAP, the investigators followed the percentage 
of labeled β cells for a year, during which time the 
total number of β cells in the pancreas increased by 6.5-fold. 
How do you suppose the percentage of labeled β cells would 
change over time if new β cells were derived from stem 
cells? What if new β cells were derived from preexisting 
β cells? Which hypothesis do the results in Figure Q22-2 
support? 


[Plot: percent of β cells expressing HPAP versus months after tamoxifen injection.] 

Figure Q22-2 Percentage of labeled β cells in pancreatic islets of mice 
at different ages (Problem 22-7). All mice were injected with a pulse of 
tamoxifen at 6 to 8 weeks of age and then stained for human placental 
alkaline phosphatase (HPAP) at various times afterward. Error bars 
represent standard deviations. 


22-8 One of the earliest assays for hematopoietic stem 
cells made use of their ability to form colonies in the 
spleens of heavily irradiated mice. By varying the amounts 
of transplanted bone marrow cells, investigators showed 
that the number of spleen colonies varied linearly with 
dose and that the curve passed through the origin, sug- 
gesting that single cells were capable of forming individ- 
ual colonies. However, because colony formation was rare 
relative to the numbers of transplanted cells, it was possi- 
ble that undispersed clumps of two or more cells were the 
actual initiators. 

A classic paper resolved this issue by exploiting 
rare, cytologically visible genome rearrangements gen- 
erated by irradiation. Recipient mice were first irradiated 
to deplete bone marrow cells, and then they were irradi- 
ated a second time after transplantation to generate rare 
genome rearrangements in the transplanted cell popula- 
tion. Spleen colonies were then screened to find ones that 
carried genome rearrangements. How do you suppose this 
experiment distinguishes between colonization by single 
cells versus cellular aggregates? 


22-9 It is possible to purify hematopoietic stem cells 
using a combination of antibodies directed against cell- 
surface targets. By removing cells that expressed surface 
markers characteristic of specific lineages such as B cells, 
granulocytes, myelomonocytic cells, and T cells, investi- 
gators generated a population of cells enriched for stem 
cells. They further enriched this population for putative 
stem cells by positively selecting for cells that expressed 
suspected stem-cell surface markers. Spleen colony for- 
mation in irradiated mice by these putative stem cells and 
the unfractionated bone marrow cells is shown in Figure 
Q22-3. Given that only about 1 in 10 cells lodges in the 
spleen, do these results support the idea that the enriched 
population consists mostly of hematopoietic stem cells? 
What additional information would you need to have to 
feel confident that the enriched cells are true stem cells? 
What proportion of bone marrow cells are hematopoietic 
stem cells? 


22-10 Generation of induced pluripotent stem (iPS) 
cells was first accomplished using retroviral vectors to 
carry the OSKM (Oct4, Sox2, Klf4, and Myc) set of transcription 
regulators into cells. The efficiency of fibroblast 
reprogramming was typically low (0.01%), in part because 
large numbers of retroviruses must integrate to bring 
about reprogramming and each integration event carries 
with it the risk of inappropriately disrupting or activating 
a critical gene. In what other ways, or other forms, do you 
suppose you might deliver the OSKM transcription regula- 
tors so as to avoid these problems? 


Figure Q22-3 Spleen colony formation by cells enriched for stem cells and by 
unfractionated bone marrow cells (Problem 22-9). [Plot legend: enriched cells; 
unfractionated cells.] 


Pathogens and Infection 


Infectious diseases currently cause about one-quarter of all human deaths world- 
wide, more than all forms of cancer combined and second only to cardiovascular 
diseases. In addition to the continuing heavy burden of ancient diseases such as 
tuberculosis and malaria, newer infectious diseases continually emerge. The current 
pandemic (worldwide epidemic) of AIDS (acquired immune deficiency syndrome) 
was first clinically observed in 1981 and has since caused more than 35 
million deaths worldwide. Moreover, some diseases long thought to result from 
other causes are now recognized to be associated with infections. Most gastric 
ulcers, for example, are caused not by stress or spicy food, but by infection of the 
stomach lining by the bacterium Helicobacter pylori. 

The burden of infectious diseases is not spread equally across the planet. 
Poorer countries and communities suffer disproportionately, often due to poor 
public sanitation and health systems. Some infectious diseases, however, occur 
primarily or exclusively in industrialized communities: Legionnaires’ disease, for 
example, a bacterial infection of the lungs, commonly spreads through air-condi- 
tioning systems. 

Since the mid-1800s, physicians and scientists have struggled to identify the 
agents—collectively called pathogens—that are capable of causing infectious 
diseases. More recently, the advent of microbial genetics and molecular cell biol- 
ogy has greatly enhanced our understanding of the causes and mechanisms of 
infectious diseases. We now know that pathogens frequently exploit the attributes 
of their host’s cells in order to infect them. This understanding can give us new 
insights into normal cell biology, as well as strategies for treating and preventing 
infectious diseases. 

Although pathogens are understandably a focus of attention, only a relatively 
small fraction of the microbial species we encounter are pathogens. Much of the 
biomass of the Earth is made up of microbes, and they produce everything from 
the oxygen we breathe to the soil nutrients we use to grow food. Even those spe- 
cies of microbes that colonize the human body do not generally cause disease. 
The collective of microorganisms that reside in or on an organism is called the 
microbiota. Many of these microbes have a beneficial effect on the health of the 
organism, assisting its normal development and physiology. 

In this chapter, we give an overview of the different kinds of pathogens, as well 
as those microorganisms that colonize our body without causing trouble. We then 
discuss the cell biology of infection—the molecular interactions between patho- 
gens and their host. In Chapter 24, we consider how our innate and adaptive 
immune systems collaborate to defend us against pathogens. 


INTRODUCTION TO PATHOGENS AND THE HUMAN 
MICROBIOTA 


We normally think of pathogens as hostile invaders, but a pathogen, like any other 
organism, is simply exploiting an available niche in which to live and procreate. 
Living on or in a host organism is a very effective strategy, and it is possible that 
every organism on Earth is subject to some type of infection (Figure 23-1). A 
human host is a nutrient-rich, warm, and moist environment, which remains at a 
uniform temperature and constantly renews itself. It is not surprising that many 
microorganisms have evolved the ability to survive and reproduce in this desir- 
able niche. In this section, we discuss some of the common features that microor- 
ganisms must have in order to colonize the human body or cause disease, and we 
explore the wide variety of organisms that are known to cause disease. 


The Human Microbiota Is a Complex Ecological System That Is 
Important for Our Development and Health 


The human body contains about 10¹³ human cells, as well as a microbiota consisting 
of approximately 10¹⁴ bacterial, fungal, and protozoan cells, which represent 
thousands of microbial species—the so-called normal flora. The combined 
genomes of the various species of the human microbiota, called the microbiome, 
contain more than 5 × 10⁶ genes—more than 100 times greater than the number 
of genes in the human genome itself. A consequence of this genomic diversity 
is that the microbiota expands the range of biochemical and metabolic activities 
available to humans. 
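
As a rough check on these orders of magnitude, the short Python sketch below works 
through the arithmetic. The constant and function names are illustrative 
assumptions, and the inputs are taken to be the rounded estimates from the text 
(about 10^13 human cells, about 10^14 microbial cells, and more than 5 x 10^6 
microbiome genes); they are not measurements. 

```python
# Rounded textbook estimates (assumed inputs, not measured values).
HUMAN_CELLS = 10**13          # human cells in the body
MICROBIAL_CELLS = 10**14      # bacterial, fungal, and protozoan cells
MICROBIOME_GENES = 5 * 10**6  # lower estimate for genes in the microbiome
FOLD_OVER_HUMAN_GENES = 100   # "more than 100 times" the human gene count


def microbe_to_human_cell_ratio(microbial=MICROBIAL_CELLS, human=HUMAN_CELLS):
    """Return how many microbial cells there are per human cell."""
    return microbial / human


def implied_human_gene_ceiling(genes=MICROBIOME_GENES, fold=FOLD_OVER_HUMAN_GENES):
    """Return the human gene count at which the microbiome would hold exactly
    `fold` times as many genes; a rough ceiling if both figures are taken at
    face value."""
    return genes / fold


if __name__ == "__main__":
    print(microbe_to_human_cell_ratio())   # microbial cells per human cell
    print(implied_human_gene_ceiling())    # rough ceiling on human gene count
```

On these figures, microbial cells outnumber human cells about tenfold, and the 
"100 times" comparison is consistent with a human gene count in the tens of 
thousands. 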

The microbiota is usually confined to the skin, mouth, digestive tract, and 
vagina. With the exception of microbes colonizing the skin, it consists primarily 
of anaerobic bacteria, with distinct communities of species inhabiting each body 
part. These communities vary considerably between individual humans, even 
between close relatives or identical twins. Although the microbiota of an individ- 
ual is generally consistent over time, it is influenced by a variety of factors, includ- 
ing age, diet, health status, and antibiotic use. 

There are various ecological relationships that these microbes have with their 
host. In mutualism, both the microbe and host benefit. The anaerobic bacteria 
that inhabit our intestines, for example, gain shelter and a nutrient supply but also 
contribute to the digestion of our food, produce important nutrients for us, and 
are essential for the normal development of our gastrointestinal tract and innate 
and adaptive immune systems. In commensalism, the microbe benefits but offers 
no benefit and causes no harm: for example, we are infected with many viruses 
that have no noticeable effect on our health. In parasitism, the microbe benefits 
to the detriment of the host, as is often the case for pathogens. 

Many infectious diseases are caused by a single pathogen. There is increasing 
evidence, however, that an imbalance in the community of microbes that consti- 
tute the microbiota can contribute to some diseases, including autoimmune and 
allergic diseases, obesity, inflammatory bowel disease, and diabetes. Remarkably, 
in such cases of microbiota imbalance (referred to as dysbiosis), the transfer of the 
microbiota from a healthy individual to someone suffering from the disease can 
be beneficial and sometimes curative, as in the case of Clostridium difficile colitis 
caused by overgrowth of the bacterium. 


Pathogens Interact with Their Hosts in Different Ways 


If it is normal for us to live with a community of microbes, why are some of 
them capable of causing us illness or death? Although the ability of a particular 
microorganism to cause disease depends on many factors, it requires that the 
pathogen possess specialized pathogenic characteristics that allow it to live in 
humans. 

Figure 23-1 Parasitism at many levels. (A) Most animals harbor parasites, an 
example being the blacklegged tick or deer tick (Ixodes scapularis), shown here 
on a human finger. Although ticks of this species thrive on white-tailed deer and 
other wild mammals, they can also live on humans. (B) Ticks themselves harbor 
their own parasites including the bacterium Borrelia burgdorferi, stained here 
with a vital dye that labels living bacteria green and dead bacteria red. These 
spiral-shaped bacteria live in deer ticks and can be transmitted to humans during 
a tick’s blood meal. Borrelia burgdorferi causes Lyme disease, which is 
characterized by a bull’s-eye-shaped skin rash and fever; if the infection is 
left untreated, various complications can result, including arthritis and 
neurological abnormalities. The idea that parasites have their own parasites was 
noted by Jonathan Swift in 1733: 

“So, naturalists observe, a flea 
Has smaller fleas that on him prey; 
And these have smaller still to bite ’em; 
And so proceed ad infinitum.” 

(A, from Acorn, White-Footed Mice and Tick Cycle Augment Risks of Lyme Disease 
in 2012. March 14, 2012. Reprinted with permission of Anita Sil; B, courtesy of 
M. Embers.) 

Primary pathogens can cause overt disease in most healthy people. Some 
primary pathogens cause acute, life-threatening epidemic infections and spread 
rapidly from one sick or dying host to another; historically important examples 
include the bacterium Vibrio cholerae, which causes cholera, and the variola 
and influenza viruses, which cause smallpox and flu, respectively. Others may 
persistently infect a single individual for years without causing overt disease; 
examples include the bacterium Mycobacterium tuberculosis (which can cause 
the life-threatening lung infection tuberculosis) and the intestinal worm Ascaris. 
Although these potential primary pathogens can make some people critically ill, 
billions of people carry these foreign organisms in an asymptomatic way, often 
unaware that they are infected. It is sometimes difficult to draw a line between 
the asymptomatic presence of such pathogens and the normal microbiota. Some 
microbes of the normal flora can act as opportunistic pathogens, in that they 
cause disease only if our immune systems are weakened or if they gain access to a 
normally sterile part of the body. 

In order to survive and multiply, a successful pathogen must be able to: 
(1) enter the host (usually by breaking an epithelial barrier); (2) find a nutritionally 
compatible niche in the host’s body; (3) avoid, subvert, or circumvent the host’s 
innate and adaptive immune responses; (4) replicate, using host resources; and (5) 
exit one host and spread to another. Pathogens have evolved various mechanisms 
that maximally exploit the biology of their host organisms to help accomplish these 
tasks. For some pathogens, these mechanisms are adapted to a unique host spe- 
cies, whereas for others the mechanisms are sufficiently general to permit inva- 
sion, survival, and replication in a wide variety of hosts. Because pathogens have 
evolved the ability to interface directly with the molecular machinery of host cells, 
we have learned a great deal about cell biological principles by studying them. 

Our constant exposure to pathogens has strongly influenced human evolution. 
In modern times, humans have learned how to limit the ability of pathogens to 
infect us through improvements in public health measures and childhood nutri- 
tion, vaccines, antimicrobial drugs, and routine testing of blood used for transfu- 
sions. As we learn more about the mechanisms by which pathogens cause disease 
(called pathogenesis), our creativity and resourcefulness will continue to serve 
as an important addition to our immune systems in fighting infectious diseases. 


Pathogens Can Contribute to Cancer, Cardiovascular Disease, 
and Other Chronic Illnesses 


Some viral and bacterial pathogens can cause or contribute to chronic, life-threat- 
ening illnesses that are not normally classified as infectious diseases. An import- 
ant example is cancer. As discussed in Chapter 20, the oncogene concept—that 
certain altered genes can trigger cell transformation and tumor development— 
came initially from studies of the Rous sarcoma virus, which causes a form of can- 
cer (sarcomas) in chickens. One of the viral genes encodes an overactive homolog 
of the host tyrosine kinase Src (see Figure 3-63), which has been implicated in 
many kinds of cancer. Several human cancers are also known to have a viral ori- 
gin. Human papillomavirus, for example, which causes genital warts, is responsi- 
ble for more than 90% of cervical cancers (see Figure 20-40). The recent develop- 
ment of a vaccine against the most abundant cancer-associated strains of human 
papillomavirus promises to prevent many of these cancers in the future. In other 
cases, chronic tissue damage caused by infection can increase the likelihood of 
cancer. Inflammation caused by the stomach-dwelling bacterium H. pylori can be 
a major contributor to stomach cancer, as well as to gastric ulcers. 

The major causes of death in wealthy industrialized nations are cardiovascular 
diseases. They frequently result from atherosclerosis, the accumulation in blood 
vessel walls of fatty deposits that can block blood flow and cause heart attacks 
and strokes. A hallmark of early atherosclerosis is the appearance in blood vessel 
walls of clumps of macrophages called foam cells, which recruit other white blood 
cells into the forming atherosclerotic plaque. Foam cells in atherosclerotic plaques 
often contain the bacterial pathogen Chlamydia pneumoniae, which commonly 
causes pneumonia in humans and is a significant risk factor for atherosclerosis 
in humans and animal models. Other bacterial species are also implicated in 
atherosclerosis, including bacteria usually associated with teeth and gums, such 
as Porphyromonas gingivalis. As we learn more about the interactions between 
pathogens and the human body, it seems likely that more chronic conditions will 
be found to have a link to an infectious agent. 


Pathogens Can Be Viruses, Bacteria, or Eukaryotes 


Many types of pathogens cause disease in humans. The most familiar are viruses 
and bacteria. Viruses cause diseases ranging from AIDS and smallpox to the com- 
mon cold. Viruses are essentially fragments of nucleic acid (DNA or RNA) that 
generally encode a relatively small number of gene products, wrapped in a pro- 
tective shell of proteins (Figure 23-2A) and (in some cases) an outer membrane 
envelope (see Figure 5-62). Much larger and more complex than viruses, bacte- 
ria are prokaryotic cells, which perform most of their basic metabolic functions 
themselves, relying on the host primarily for nutrition (Figure 23-2B). 

Some other infectious agents are eukaryotic organisms. These range from 
single-celled fungi and protozoa (Figure 23-2C) to large, complex metazoa such 
as parasitic worms. One of the most common human parasites, shared by about 
a billion people at present, is the nematode worm Ascaris lumbricoides, which 
infects the gut (Figure 23-2D). It closely resembles its harmless nematode cousin 
Caenorhabditis elegans, which is used as a model organism for genetic and devel- 
opmental biological research (see Figure 1-39). C. elegans, however, is only about 
1 mm in length, whereas Ascaris can reach 30 cm. 

We now introduce the basic features of each of the major types of pathogens, 
before we examine the mechanisms that pathogens use to infect their hosts. 


Figure 23-2 Pathogens in many forms. (A) The structure of the protein coat, or 
capsid, of poliovirus. This virus was once a common cause of paralysis, but the 
disease (poliomyelitis) has been greatly reduced by widespread vaccination. 
(B) The bacterium Vibrio cholerae, the causative agent of the epidemic, diarrheal 
disease cholera. (C) The protozoan parasite Trypanosoma brucei (purple) in a field 
of erythrocytes (red blood cells; pink). This parasite causes African sleeping 
sickness, a potentially fatal disease of the central nervous system. (D) This 
clump of Ascaris nematodes was removed from the obstructed intestine of a 
two-year-old boy. (A, courtesy of Robert Grant, Stephan Crainic, and James M. 
Hogle; B, photograph courtesy of John Mekalanos; C, CDC, Department of Health 
and Human Services; D, from J.K. Baird et al., Am. J. Trop. Med. Hyg. 35:314-318, 
1986. Photograph by Daniel H. Connor.) 

Figure 23-3 Bacterial shapes and cell-surface structures. (A) Bacteria are 
traditionally classified by shape. (B and C) They are also classified as Gram 
positive or Gram negative. (B) Gram-positive bacteria such as Streptococcus and 
Staphylococcus have a single membrane and a thick cell wall made of cross-linked 
peptidoglycan. They are called Gram positive because they retain the violet dye 
used in the Gram-staining procedure. (C) Gram-negative bacteria such as 
Escherichia coli (E. coli) and Salmonella have two membranes, separated by the 
periplasm (see Figure 11-17). The peptidoglycan cell wall of these organisms is 
located in the periplasm and is thinner than in Gram-positive bacteria; they 
therefore fail to retain the dye in the Gram-staining procedure. The inner 
membrane of both Gram-positive and Gram-negative bacteria is a phospholipid 
bilayer. The inner leaflet of the outer membrane of Gram-negative bacteria is 
also made primarily of phospholipids, whereas the outer leaflet of the outer 
membrane is composed of a unique glycosylated lipid called lipopolysaccharide 
(LPS). (D) Cell-surface appendages are important for bacterial behavior. Many 
bacteria swim using the rotation of helical flagella. The bacterium illustrated 
has only a single flagellum at one pole; however, many have multiple flagella. 
Straight pili (also called fimbriae) are used to adhere to various surfaces in 
the host, as well as to facilitate genetic exchange between bacteria. Some kinds 
of pili can retract to generate force and thereby help bacteria move along 
surfaces. 


Bacteria Are Diverse and Occupy a Remarkable Variety of 
Ecological Niches 


Although bacteria generally lack internal membranes, they are highly sophisti- 
cated cells whose organization and behaviors have attracted the attention of many 
scientists. Bacteria are classified broadly by their shape—as rods, spheres (cocci), 
or spirals (Figure 23-3A)—as well as by their so-called Gram-staining properties, 
which reflect differences in the structure of the bacterial cell wall. Gram-positive 
bacteria have a thick layer of peptidoglycan cell wall outside their inner (plasma) 
membrane (Figure 23-3B), whereas Gram-negative bacteria have a thinner pep- 
tidoglycan cell wall. In both cases, the cell wall protects against lysis by osmotic 
swelling, and it is a target of host antibacterial proteins such as lysozyme and 
antibiotics such as penicillin. Gram-negative bacteria are also covered outside 
the cell wall by an outer membrane containing lipopolysaccharide (LPS) (Figure 
23-3C). Both peptidoglycan and LPS are unique to bacteria and are recognized 
as pathogen-associated molecular patterns (PAMPs) by the host innate immune 
system, as discussed in Chapter 24. The surface of bacterial cells can also display 
an array of appendages, including flagella and pili, which enable bacteria to swim 
or adhere to desirable surfaces, respectively (Figure 23-3D). Apart from cell shape 
and structure, differences in ribosomal RNA and genomic DNA sequence are also 
used for phylogenetic classification. Because bacterial genomes are small—typically 
between 1,000,000 and 5,000,000 nucleotide pairs (compared to more than 
3,000,000,000 for humans)—they are now simple to sequence, making this an 
important new classification tool. 
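
To put the genome sizes quoted above side by side, the following minimal Python 
sketch computes how many times smaller a typical bacterial genome is than the 
human genome. The names are illustrative assumptions, and the inputs are just the 
rounded figures from the text (1,000,000 to 5,000,000 nucleotide pairs for 
bacteria, and 3,000,000,000 as a lower figure for humans). 

```python
# Rounded genome sizes in nucleotide pairs, as quoted in the text.
BACTERIAL_GENOME_RANGE = (1_000_000, 5_000_000)  # typical small and large cases
HUMAN_GENOME = 3_000_000_000                     # "more than" this many pairs


def fold_smaller(bacterial_pairs, human_pairs=HUMAN_GENOME):
    """Return how many times smaller a bacterial genome is than the human one."""
    if bacterial_pairs <= 0:
        raise ValueError("genome size must be positive")
    return human_pairs / bacterial_pairs


if __name__ == "__main__":
    smallest, largest = BACTERIAL_GENOME_RANGE
    # The largest typical bacterial genome gives the smallest fold difference.
    print(fold_smaller(largest), fold_smaller(smallest))
```

On these figures a typical bacterial genome is roughly 600 to 3000 times smaller 
than the human genome, which helps explain why sequencing it is comparatively 
simple. 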

Bacteria also exhibit extraordinary molecular, metabolic, and ecological diver- 
sity. At the molecular level, bacteria are far more diverse than eukaryotes, and 
they can occupy ecological niches having extremes of temperature, salt concen- 
trations, and nutrient limitation. Some bacteria replicate in an environmental res- 
ervoir such as water or soil and only cause disease if they happen to encounter a 
susceptible host; these are called facultative pathogens. Others can only replicate 
inside the body of their host and are therefore called obligate pathogens. Bacte- 
ria also differ in the range of hosts they will infect. Shigella flexneri, for example, 
which causes epidemic dysentery (bloody diarrhea), will infect only humans and 
other primates. By contrast, the closely related bacterium Salmonella enterica, 
which is a common cause of food poisoning in humans, can also infect other ver- 
tebrates, including chickens and turtles. A champion generalist is the opportunis- 
tic pathogen Pseudomonas aeruginosa, which can cause disease in a wide variety 
of plants and animals. 


Bacterial Pathogens Carry Specialized Virulence Genes 


Pathogenic bacteria and their closest nonpathogenic relatives often differ in a rel- 
atively small number of genes. Genes that contribute to the ability of an organ- 
ism to cause disease are called virulence genes, and the proteins they encode are 
called virulence factors. Such virulence genes are often clustered together on the 
bacterial chromosome; large clusters are called pathogenicity islands. Virulence 
genes can also be carried on bacteriophages (bacterial viruses) or transposons 
(see Table 5-4), both of which integrate into the bacterial chromosome, or on ext- 
rachromosomal virulence plasmids (Figure 23-4A). 

Pathogenic bacteria are thought to emerge when groups of virulence genes 
are transferred together into a previously avirulent bacterium by a process called 
horizontal gene transfer (to distinguish it from vertical gene transfer from parent 
to offspring). Horizontal transfer can occur by one of three mechanisms: natural 
transformation by released naked DNA, transduction by bacteriophages, or sexual 
exchange by conjugation (Figure 23-4B and Movie 23.1). Sequencing the genomes 
of large numbers of pathogenic and nonpathogenic bacteria has indicated that 
horizontal gene transfer has made important contributions to bacterial evolution, 
enabling species to inhabit new ecological and nutritional niches, as well as to 
cause disease. Even within a single bacterial species, the amount of chromosomal 
variation is astonishing; the genomes of different strains of Escherichia coli can 
differ by as much as 25%. Such variation has led to the concept that a bacterial 
species has both a core genome common to all isolates within the species and a 
larger pan-genome consisting of all genes present in the full spectrum of isolates. 

Figure 23-4 Genetic differences between pathogenic and nonpathogenic bacteria. 
(A) Genetic differences between nonpathogenic E. coli and two closely related 
food-borne pathogens: Shigella flexneri, which causes dysentery, and Salmonella 
enterica, a common cause of food poisoning. Nonpathogenic E. coli has a single 
circular chromosome. The chromosome of S. flexneri differs from that of E. coli in 
a limited number of locations; most of the genes required for pathogenesis 
(virulence genes) are carried on an extrachromosomal virulence plasmid. The 
chromosome of S. enterica carries two large inserts (pathogenicity islands) not 
found in the E. coli chromosome; these inserts each contain many virulence genes. 
(B) Bacterial pathogens evolve by horizontal gene transfer. This can occur by three 
mechanisms: natural transformation, in which naked DNA is taken in by competent 
bacteria; transduction, in which bacterial viruses (bacteriophages) transfer DNA 
from one bacterium into another; and conjugation, during which plasmid DNA, and 
even chromosomal DNA, is transferred from a donor to a recipient bacterium. 
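
The core genome and pan-genome described above correspond to two familiar set operations: the core genome is the intersection of the gene sets of all isolates, and the pan-genome is their union. The sketch below illustrates this with toy data. The strain and gene labels are hypothetical placeholders, not real annotations, and real analyses must first decide which genes in different isolates count as "the same" gene, which this sketch simply assumes has been done.

```python
def core_genome(isolates: dict[str, set[str]]) -> set[str]:
    """Genes present in every isolate (intersection of all gene sets)."""
    if not isolates:
        return set()
    return set.intersection(*isolates.values())


def pan_genome(isolates: dict[str, set[str]]) -> set[str]:
    """Genes present in at least one isolate (union of all gene sets)."""
    return set().union(*isolates.values())


# Hypothetical isolates of one species, each reduced to a set of gene labels.
isolates = {
    "strain_1": {"geneA", "geneB", "geneC"},
    "strain_2": {"geneA", "geneB", "geneD"},
    "strain_3": {"geneA", "geneB", "geneE", "geneF"},
}

core = core_genome(isolates)
pan = pan_genome(isolates)
accessory = pan - core  # genes found in some isolates but not all

print(sorted(core))
print(sorted(pan))
print(sorted(accessory))
```

In this picture, horizontally acquired virulence genes would typically fall in the accessory set, since they are present in pathogenic isolates but absent from their nonpathogenic relatives.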

Acquisition of genes and gene clusters can drive the rapid evolution of patho- 
gens and turn nonpathogens into pathogens. Consider, for example, Vibrio chol- 
erae—the Gram-negative bacterium that causes the epidemic diarrheal disease 
cholera. Of the hundreds of strains of Vibrio cholerae, the only ones that cause 
pandemic human disease are those infected with a mobile bacteriophage (CTXφ) 
containing genes encoding the two subunits of the toxin that causes the diarrhea. 
As summarized in Figure 23-5, seven pandemics of V. cholerae have arisen since 
1817. The first six were caused by the periodic reemergence of so-called Classical 
strains. In addition to the toxin-encoding bacteriophage, these Classical strains 
shared a similar O1 surface antigen, part of the LPS in the outer membrane (see 
Figure 23-3C). In 1961, the seventh pandemic began, caused by a new strain 
named “El Tor,” which arose when an O1-expressing strain acquired two bacte- 
riophages and at least two new pathogenicity islands. El Tor eventually displaced 
the Classical strains. In 1992, a new strain emerged in which O1 was replaced with 
another O-antigen variant called O139, which was not recognized by antibodies 
present in the blood of survivors of previous cholera epidemics. The O139 strain 
also contains a transposon-like element that encodes antibiotic resistance. As this 
example makes clear, the rapid evolution of bacterial pathogens can be likened 
to an arms race which pits the survival of a bacterium against our immune sys- 
tems and the tools of modern medicine. Similar struggles for survival take place 
between all pathogens and humans, and understanding these conflicts provides 
key insights into the evolution of pathogens and greatly informs how we treat 
new outbreaks of infectious diseases. 


Bacterial Virulence Genes Encode Effector Proteins and Secretion 
Systems to Deliver Effector Proteins to Host Cells 


Figure 23-5 Comparative-genomics-based model for the evolution of pathogenic 
Vibrio cholerae strains. Progenitor strains in the wild first acquired the 
biosynthetic pathway necessary to make the O1 antigen type of carbohydrate chain 
on the outer-membrane lipopolysaccharide (see Figure 23-3C). Incorporation of the 
CTXφ bacteriophage created the Classical pathogenic strains responsible for the 
first six worldwide epidemics of cholera between 1817 and 1923. Sometime in the 
twentieth century, an O1 strain in the environment picked up the CTXφ 
bacteriophage again, along with an associated bacteriophage RS1φ and two 
pathogenicity islands (VSP1 and VSP2), creating the El Tor strain that emerged as 
the seventh worldwide pandemic in 1961. In 1992, an El Tor strain was isolated 
that had picked up a new DNA cassette, enabling it to produce the O139 antigen 
type of carbohydrate chain rather than the O1 type. This altered the bacterium’s 
interaction with the human immune system, without diminishing its virulence; this 
bacterium also picked up a new pathogenicity island (SXT). An electron micrograph 
of Vibrio cholerae (V. cholerae) is shown in Figure 23-2B. 

What are the gene products that enable a bacterium to cause disease in a healthy 
host? For pathogenic bacteria that live outside of host cells, called extracellular 
bacterial pathogens, virulence genes often encode secreted toxic proteins (tox- 
ins) that interact with host cell structural or signaling proteins to elicit a response 
that is beneficial to the pathogen. Several of these bacterial toxins are among the 
most potent of known human poisons. Bacterial toxins are often composed of two 
protein components—an A subunit with enzymatic activity, and a B subunit that 
binds to specific receptors on the host cell surface and directs the trafficking of 
the A subunit to the cytosol by various routes (Figure 23-6). The Vibrio cholerae 
phage, for example, encodes the two subunits of cholera toxin (Movie 23.2). The 
A subunit catalyzes the transfer of an ADP-ribose moiety from NAD+ to the tri- 
meric G protein Gs (see Figure 15-23), which activates adenylyl cyclase to make 
cyclic AMP (see Figure 15-25). ADP-ribosylation prevents inactivation of the G 
protein and results in the overaccumulation of intracellular cyclic AMP and the 
release of ions and water into the intestinal lumen, leading to the watery diarrhea 
associated with cholera. The infection then spreads to new hosts via released bac- 
teria, which can contaminate food and water. 

Some pathogenic bacteria secrete multiple toxins, each of which targets a dif- 
ferent signaling pathway in host cells. Anthrax, for example, is an acute infectious 
disease of sheep, cattle, and occasionally humans. It is caused by contact with 
spores of the Gram-positive bacterium Bacillus anthracis. Dormant spores can 
survive in soil for long periods. If inhaled, ingested, or rubbed into breaks in the 
skin, spores can germinate and the bacteria replicate. The bacteria secrete two 
toxins with identical B subunits but different A subunits. The B subunits bind to 
a host cell-surface receptor protein to transfer the two different A subunits into 
host cells (see Figure 23-6). The A subunits are called lethal factor and edema 
factor. The A subunit of edema toxin is an adenylyl cyclase that catalyzes the pro- 
duction of cyclic AMP (see Figure 15-25), leading to an ion imbalance that can 
cause an accumulation of extracellular fluid (edema) in the skin or lung. The A 
subunit of lethal toxin is a protease that cleaves several activated members of the 
mitogen-activated protein kinase kinase (MAP kinase kinase) family (see Figure 
15-49), disrupting intracellular signaling and leading to immune cell dysfunction 
and cell death. Injection of lethal toxin into the bloodstream of an animal causes 
shock (a large fall in blood pressure) and death. 

Figure 23-6 Bacterial toxin entry into host cells. Bacterial toxins are often 
composed of A and B protein subunits. The B (binding) subunit of the toxin 
interacts with host-cell toxin receptors, enabling endocytosis and intracellular 
trafficking of B subunit as well as its associated and enzymatically active 
A subunit(s). In the case of Bacillus anthracis, the B subunit changes conformation 
in the low pH environment of the endosome to form a pore through which two 
different A subunits, lethal factor and edema factor, are transported across the 
membrane of the endosome in an unfolded conformation. In the cases of Vibrio 
cholerae toxin and Bordetella pertussis toxin, the B and A subunits are transported 
to the Golgi apparatus and then to the endoplasmic reticulum (ER), where the 
A subunits are then translocated into the cytosol in an unfolded conformation 
through a protein-translocation channel. 

Apart from toxins, bacteria use specialized secretion systems to secrete 
many other effector proteins that interact with host cells. Gram-negative bacteria 
have a general secretion system and several classes of accessory secretion systems 
(types I-VI). A subset of these accessory secretion systems, called contact-depen- 
dent secretion systems, is present in many bacteria that contact or live inside host 
cells. The type III secretion system (Figure 23-7), for example, injects into the 
host-cell cytoplasm effector proteins that can elicit a variety of host cell responses 
that enable the bacterium to invade or survive. There is a remarkable degree of 
structural similarity between the type III syringe and the base of a bacterial fla- 
gellum. Because flagella are found in a wider range of bacteria than are type III 
secretion systems, and the secretion systems appear to be adaptations specific for 
pathogenesis, it seems likely that the type III secretion systems evolved from fla- 
gella. Other types of delivery systems used by bacterial pathogens appear to have 
evolved independently. For example, type IV secretion systems are closely related 
to the conjugation apparatus that many bacteria use to exchange genetic material. 


Fungal and Protozoan Parasites Have Complex Life Cycles 
Involving Multiple Forms 


Pathogenic fungi and protozoan parasites are eukaryotes, as are their hosts. Con- 
sequently, antifungal and antiparasitic drugs are often less effective and more 
toxic to the host than are antibiotics that target bacteria. A second characteristic 
of fungal and parasitic infections that makes them difficult to treat is the tendency 
of the pathogens to switch among several different forms during their life cycles. A 
drug that is effective at killing one form can be ineffective at killing another form; 
therefore the population can survive the treatment. 

Fungi include both unicellular yeasts (such as Saccharomyces cerevisiae and 
Schizosaccharomyces pombe, which are used to bake bread and brew beer, and as 
model organisms for cell biology research) and filamentous, multicellular molds 
(like those found on moldy fruit or bread). Most of the important pathogenic 
fungi exhibit dimorphism—the ability to grow in either yeast or mold form. The 
yeast-to-mold or mold-to-yeast transition is frequently associated with infection. 
Histoplasma capsulatum, for example, grows as a mold at low temperature in the 
soil, but it switches to a yeast form when inhaled into the lung, where it can cause 
the disease histoplasmosis (Figure 23-8). 

Figure 23-7 Type III secretion systems that can deliver effector proteins into 
the cytosol of a host cell. (A) Electron micrograph of purified type III secretion 
systems, each of which consists of over two dozen proteins. (B) The large lower 
ring is embedded in the bacterial inner membrane, and the smaller upper ring is 
embedded in the bacterial outer membrane. During infection, docking of the tip of 
the hollow needle at a host-cell plasma membrane results in the secretion of 
bacterial translocator proteins (green), which form a pore in the host membrane, 
through which bacterial effector proteins are then secreted into the host cell. 
(A, from O. Schraidt et al., PLoS Pathog. 6(4):e1000824, 2010.) 

Protozoan parasites are single-celled eukaryotes with more elaborate life 
cycles than fungi, and they frequently require more than one host. Malaria is the 
most devastating protozoal disease, infecting more than 200 million people every 
year and killing upward of 500,000. It is caused by four species of Plasmodium, 
which are transmitted to humans by the bite of the female Anopheles mosquito. 
Plasmodium falciparum causes the most serious form of malaria and is the most 
intensively studied of the malaria-causing parasites. It exists in many distinct 
forms, and it requires both the human and mosquito hosts to complete its sexual 
cycle (Figure 23-9). Several of these forms are highly specialized to invade and 
replicate in specific tissues: the lining of the insect gut, the human liver, and the 
human red blood cell. Even within a single host cell type, the red blood cell, the 
Plasmodium parasite undergoes a complex sequence of developmental events, 
reflected in striking morphological changes (Figure 23-9B-D). 

Figure 23-8 Dimorphism in the pathogenic fungus Histoplasma capsulatum. 
(A) At low temperature in the soil, H. capsulatum grows as a multicellular 
filamentous mold consisting of many individual cells connected together. (B) After 
it is inhaled into the lung of a mammal, the increase in temperature causes a switch 
to a yeast form consisting of small clumps of round cells. (C) A stained histologic 
section of a mouse lung infected with H. capsulatum, showing a macrophage 
containing yeast forms of the pathogen. (A and B, courtesy of Sinem Beyhan and 
Anita Sil; C, courtesy of Davina Hocking Murray and Anita Sil.) 

Figure 23-9 The complex life cycle of malaria parasites. (A) The sexual cycle of 
Plasmodium falciparum requires passage between a human host and an insect host 
(Movie 23.3). (B)-(D) Blood smears from people with malaria, showing three 
different forms of the parasite that appear in red blood cells: (B) ring stage; 
(C) schizont; and (D) gametocyte. (B-D, courtesy of the Centers for Disease 
Control, Division of Parasitic Diseases, DPDx.) 


All Aspects of Viral Propagation Depend on Host Cell Machinery 


Bacterial, fungal, and protozoan pathogens are living cells themselves. They use 
their own machinery for DNA replication, transcription, and translation, and, for 
the most part, they provide their own sources of metabolic energy. Viruses, by 
contrast, are the ultimate hitchhikers, carrying little more than information in the 
form of nucleic acid. Most clinically important human viruses have small genomes 
consisting of double-stranded DNA or single-stranded RNA (Table 23-1), and we 
now have complete genome sequences of almost all of them. 

Viral genomes typically encode three types of protein: proteins for replicat- 
ing the genome, proteins for packaging the genome and delivering it to more 
host cells, and proteins for modifying the structure or function of the host cell to 
enhance the replication of the virus (see Figure 7-62). In general, viral replication 
involves (1) entry into the host cell, (2) disassembly of the infectious virus particle, 
(3) replication of the viral genome, (4) transcription of viral genes and synthesis of 
viral proteins, (5) assembly of these viral components into progeny virus particles, 
and (6) release of progeny virions (Figure 23-10). A single virus particle (a virion) 
that infects a single host cell can produce thousands of progeny. 

TABLE 23-1 

Virus | Genome type | Disease 
Herpes simplex virus 1 | Double-stranded DNA | Recurrent cold sores 
Epstein-Barr virus (EBV) | Double-stranded DNA | Infectious mononucleosis 
Varicella-zoster virus | Double-stranded DNA | Chickenpox and shingles 
Smallpox virus (Variola) | Double-stranded DNA | Smallpox 
Human papillomavirus | Double-stranded DNA | Warts, cancer 
Adenovirus | Double-stranded DNA | Respiratory disease 
Hepatitis-B virus | Part single-, part double-stranded DNA | Hepatitis B 
Human immunodeficiency virus (HIV-1) | Single-stranded RNA [+] strand | Acquired immune deficiency syndrome (AIDS) 
Coronavirus | Single-stranded RNA [+] strand | 
Rabies virus | Single-stranded RNA [−] strand | 
Mumps virus | Single-stranded RNA [−] strand | 
Measles virus | Single-stranded RNA [−] strand | 
Influenza virus type A | Single-stranded RNA [−] strand | 

Figure 23-10 A simple viral life cycle. The hypothetical simple virus shown 
here consists of a small double-stranded DNA molecule that codes for 
only a single viral capsid protein. To reproduce, the viral genome must first 
enter a host cell, where it is replicated to produce multiple copies, which are 
transcribed and translated to produce the viral coat protein. The viral genomes 
can then assemble spontaneously with the coat protein to form a new virus 
particle, which escapes from the host cell. No known virus is this simple. 

Virions come in a wide variety of shapes and sizes (Figure 23-11), and 
although most have relatively small genomes, genome size can vary considerably. 
The recently discovered giant viruses of amoebae, called pandoraviruses, are the 
largest known viruses, with 700 nm particles and double-stranded DNA genomes 
of over 2,000,000 nucleotide pairs. The virions of poxvirus are also large: they are 
250-350 nm long and enclose a genome of double-stranded DNA of about 270,000 
nucleotide pairs. At the other end of the size scale are the virions of parvovirus, 
which are less than 30 nm in diameter and have a single-stranded DNA genome 
of fewer than 5000 nucleotides. 

Viral genomes are packaged in a protein coat, called a capsid, which in some 
viruses is further enclosed by a lipid bilayer membrane, or envelope. The capsid is 
made of one or several proteins, arranged in regular arrays that generally produce 
structures with either helical symmetry, which results in a cylindrical structure 
(for example, influenza, measles, and bunyavirus), or icosahedral symmetry (for 
example, poliovirus and herpesvirus; see Figure 23-11). Some viruses instead pro- 
duce capsids with more complicated structures (for example, poxviruses). When 
the capsid is packaged with the viral genome, the structure is called a nucleocap- 
sid. The nucleocapsids of nonenveloped viruses usually leave an infected cell by 
lysing it. For enveloped viruses, by contrast, the nucleocapsid is enclosed within a 
lipid bilayer membrane that the virus acquires in the process of budding from the 
host-cell plasma membrane, which it does without disrupting the membrane or 
killing the cell (Figure 23-12). Enveloped viruses can cause persistent infections 
that may last for years, often without noticeable deleterious effects on the host. 

Because the host cell performs most of the critical steps in viral replication, the 
identification of effective antiviral drugs that do not harm the host can be difficult. 
Probably the most effective strategy for containing viral diseases is the vaccination 
of potential hosts. Highly successful vaccination programs have effectively 
eliminated smallpox infection from the planet, and the eradication of poliomyelitis 
is approaching completion (Figure 23-13). 

Figure 23-11 Examples of viral morphology. As shown, both DNA and 
RNA viruses vary greatly in both size and shape. 


Summary 


Infectious diseases are caused by pathogens, which include viruses, bacteria, and 
fungi, as well as protozoan and metazoan parasites. All pathogens must have 
mechanisms for entering their host and for evading immediate destruction by the 
host. The great majority of bacteria are not pathogenic to humans. Those that are 
pathogenic produce specific virulence factors that mediate the bacteria’s interac- 
tions with the host; these proteins change the behavior of host cells in ways that 
promote the replication and spread of the bacteria. Eukaryotic pathogens such as 
fungi and protozoan parasites typically pass through several different forms during 
the course of infection; the ability to switch among these forms is usually required 
for these pathogens to survive in a host and cause disease. In some cases, such as 
malaria, parasites must pass sequentially through several host species to complete 
their life cycles. Unlike bacteria and eukaryotic parasites, viruses have no metabo- 
lism of their own and no intrinsic ability to produce the proteins encoded by their 
DNA or RNA genomes; they rely on subverting the machinery of the host cell. 


[Graph: reported cases of polio per 100,000 population (vertical axis) plotted 
against year, 1940 to 1990 (horizontal axis); arrows mark the introduction of the 
inactivated vaccine and of the oral vaccine.] 

Figure 23-13 Effective control of a viral disease through vaccination. The graph shows the 
number of cases of poliomyelitis reported per year in the United States. The arrows indicate the 
timing of the introduction of the Salk vaccine (inactivated virus given by injection) and the Sabin 
vaccine (live attenuated virus given orally). 

Figure 23-12 Acquisition of a viral envelope. (A) Electron micrograph of 
an animal cell from which six copies of an enveloped virus (Semliki forest virus) 
are budding. (B) Schematic drawing of the envelope assembly and budding 
processes. The lipid bilayer that surrounds the viral capsid is derived directly from 
the plasma membrane of the host cell. In contrast, the proteins in this lipid bilayer 
(shown in green) are encoded by the viral genome. (A, courtesy of M. Olsen and 
G. Griffith.) 


CELL BIOLOGY OF INFECTION 


The mechanisms through which pathogens cause disease are as diverse as the 
pathogens themselves. Nonetheless, all pathogens must carry out certain com- 
mon tasks: they must gain access to the host, reach an appropriate niche, avoid 
host defenses, replicate, and exit from the infected host to spread to an uninfected 
one. In this section, we examine the common strategies that many pathogens use 
to accomplish these tasks. 


Pathogens Overcome Epithelial Barriers to Infect the Host 


The first step in infection is for the pathogen to gain access to the host. A thick cov- 
ering of skin protects most parts of the human body from the environment. The 
protective boundaries of some other human tissues (eyes, nasal passages, respi- 
ratory tract, mouth, digestive tract, urinary tract, and female genital tract) are less 
robust. In the lungs and small intestine, for example, the barrier is just a single 
monolayer of epithelial cells. Nonetheless, all these epithelia serve as barriers to 
infection. 

Wounds in barrier epithelia allow pathogens direct access to unoccupied 
niches within otherwise sterile host tissues. This avenue of entry requires little 
in the way of pathogen specialization, and many members of the normal flora 
can cause serious illness if they enter through such wounds. Staphylococci from 
the skin and nose, or Streptococci from the throat and mouth, are two examples 
of opportunistic bacterial pathogens that are responsible for many serious infec- 
tions resulting from breaches in epithelial barriers. The recent emergence of 
bacterial strains of Staphylococcus that are resistant to the antibiotics commonly 
used for treatment (for example, methicillin-resistant Staphylococcus aureus, or 
MRSA, which infects up to 50,000,000 people worldwide) is of particular concern. 
Papillomaviruses, which cause warts and cervical cancer, also take advantage of 
breaches in epithelial barriers. 

Primary pathogens, however, need not wait for a wound to gain access to their 
host. One efficient way for such a pathogen to cross the skin is to catch a ride in the 
saliva of a biting arthropod. A diverse group of bacteria, viruses, and protozoa has 
developed the ability to survive in insects and then use them as vectors to spread 
from one mammalian host to another. As discussed earlier, the Plasmodium pro- 
tozoan that causes malaria develops through several forms in its life cycle, includ- 
ing some that are specialized for survival in a human and others that are special- 
ized for survival in a mosquito (see Figure 23-9). Viruses that are spread by insect 
bites cause yellow fever and Dengue fever, as well as many kinds of viral enceph- 
alitis (inflammation of the brain). These viruses replicate in both insect cells and 
mammalian cells, as required for their transmission by an insect vector. 

The efficient spread of a pathogen via an insect vector requires that an individ- 
ual insect consumes a blood meal from an infected host and transfers the patho- 
gen to a naive host. In a few striking cases, the pathogen alters the behavior of 
the insect so that its transmission to a new host is more likely. An example is the 
bacterium Yersinia pestis, which causes bubonic plague. It multiplies in the flea’s 
foregut to form aggregated masses that physically block the digestive tract; during 
each repeated, but futile, attempt at feeding, some of the bacteria in the foregut are 
flushed into the bite site, thus transmitting plague to a new host (Figure 23-14). 


Pathogens That Colonize an Epithelium Must Overcome Its 
Protective Mechanisms 


Whereas many epithelial barriers such as the skin and the lining of the mouth 
and large intestine are densely populated by normal flora, others, including the 
lining of the lower lung and the bladder, are normally kept nearly sterile. How 
do these epithelia avoid bacterial colonization? A layer of protective mucus cov- 
ers the respiratory epithelium, and the coordinated beating of cilia sweeps the 
mucus and trapped bacteria up and out of the lung. The epithelial lining of the 
bladder and the upper gastrointestinal tract also has a thick layer of mucus, and 


Figure 23-14 Plague bacteria within a flea. This light micrograph shows the 
digestive tract dissected from a flea that had dined about two weeks previously 
on the blood of an animal infected with the plague bacterium, Yersinia pestis. 
The bacteria multiplied in the flea gut to produce large cohesive aggregates (red 
arrows); the bacterial mass on the left is occluding the passage between the 
esophagus and the midgut. This type of blockage prevents a flea from digesting 
its blood meals, so that hunger causes it to bite repeatedly, disseminating the 
infection. (From B.J. Hinnebusch, E.R. Fischer and T.G. Schwan, 
J. Infect. Dis. 178:1406-1415, 1998.) 

that recognize and bind to cell-surface molecules on the epithelium. An important 
group of adhesins in E. coli strains that infect the kidney are components of 
the pili, surface projections that can be several micrometers long and thus able 
to span the thickness of the protective mucus layer; at the tip of each pilus is an 
adhesin protein that binds tightly to the D-galactose-D-galactose disaccharide 
on glycolipids on the surface of kidney cells (Figure 23-15). Strains of E. coli that 
infect the bladder rather than the kidney express a second kind of pilus with a 
different adhesin protein that binds to bladder epithelial cells. It is the specificity of 
the adhesin proteins on the tips of the two types of pili that is responsible for the 
bacteria’s colonizing of the different parts of the urinary tract. 

up view of one of the bacteria showing the pili on its surface. (C) An E. coli pilus 
has adaptor proteins on its tip that bind to glycolipids on the surface of kidney 
cells. (A, from G.E. Soto and S.J. Hultgren, J. Bacteriol. 181:1059-1077, 1999. With 
permission from the American Society for Microbiology; B, courtesy of D.G. Thanassi 
and S.J. Hultgren, Methods 20:111-126, 2000. With permission from Academic 
Press.) 

The stomach is an especially hostile environment for pathogens. Besides the 
thick layer of mucus and peristaltic washing, it is filled with acid (average pH ~2), 
which is lethal to almost all bacteria ingested in food. Yet, it is home to a micro- 
biota of hundreds of resident species, including the bacterium H. pylori, which, 
as we discussed earlier, is the major cause of stomach ulcers and some stomach 
cancers. The hypothesis that a persistent bacterial infection could cause stomach 
ulcers was initially met with skepticism. The young Australian doctor who made 
the initial discovery finally proved the point: he drank a pure culture of H. pylori 
and developed inflammation of the stomach, which often precedes the develop- 
ment of ulcers. A short course of antibiotics can now effectively cure a patient 
of recurrent stomach ulcers. Remarkably, H. pylori is able to persist for life as a 
commensal in most humans. One way in which it survives in the stomach is by 
producing the enzyme urease, which converts urea to ammonia that neutralizes 
the acid in its immediate vicinity. The bacterium also uses its flagellum for che- 
motactic motility, allowing it to seek out the more neutral pH near the surface 
of gastric epithelial cells. H. pylori virulence proteins that target both epithelial 
and immune cells help H. pylori persist in the stomach, but they can also induce 
chronic inflammation, alteration in host gene expression, changes in cell prolifer- 
ation and apoptosis, and disruption of cell-cell junctions, all of which are predis- 
posing factors for stomach cancer. 


Extracellular Pathogens Disturb Host Cells Without Entering Them 


Extracellular pathogens can cause serious disease without entering host cells. 
Bordetella pertussis, the bacterium that causes whooping cough, for example, col- 
onizes the respiratory epithelium and circumvents the normal mechanism that 
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clears the respiratory tract by expressing adhesins that bind ciliated epithelial 
cells. The adherent bacteria produce toxins that eventually kill the ciliated cells, 
compromising the host’s ability to clear the infection. The most familiar of these 
is pertussis toxin, which, like the cholera toxin discussed above, has an A subunit 
that ADP-ribosylates the α subunit of the G protein Gi, inhibiting the G protein 
from suppressing the activity of the host cell’s adenylyl cyclase, thereby increasing 
the production of cyclic AMP (see Figure 23-6). This toxin also interferes with the 
chemotactic pathway that neutrophils use to seek out and destroy invading bac- 
teria (see Figures 16-3 and 16-86). B. pertussis colonization of the respiratory tract 
causes severe coughing, which helps spread the infection. 

Not all extracellular pathogens that colonize an epithelium exert their effect 
through toxins. Enteropathogenic E. coli (EPEC), which causes diarrhea in young 
children, uses a type III secretion system (see Figure 23-7) to deliver its own spe- 
cial receptor protein (called Tir) into the plasma membrane of a host intestinal 
epithelial cell (Figure 23-16). The extracellular domain of Tir binds to the bacte- 
rial surface protein intimin, triggering actin polymerization in the host cell that 
results in the formation of a unique cell-surface protrusion called a pedestal; this 
pushes the tightly adherent bacteria up about 10 µm from the host-cell membrane,
thereby promoting bacterial movement along the cell surface. A similar 
strategy is used by vaccinia virus (the virus that was used as a vaccine to eradicate 
smallpox) to form mobile pedestals, which promote spread of the virus from cell 
to cell. The study of how EPEC and vaccinia virus promote actin polymerization 
has been of major importance in understanding how intracellular signaling path- 
ways regulate the cytoskeleton in normal, uninfected cells (discussed in Chapter 
16). Although pedestal formation promotes the spread of these pathogens, the 
symptoms of EPEC infection (severe diarrhea) are caused by the loss of absorptive 
microvilli and disruption of signaling pathways in epithelial cells, which are trig- 
gered by Tir and other effector proteins. 


Intracellular Pathogens Have Mechanisms for Both Entering and 
Leaving Host Cells 


Many pathogens have to enter host cells to cause disease. These intracellular 
pathogens include all viruses and many bacteria and protozoa. Each of these has 
a preferred niche for replication and survival within host cells. Bacteria and proto- 
zoa replicate either in the cytosol or within a membrane-enclosed compartment. 
While most RNA viruses replicate within the cytosol, most DNA viruses replicate 
in the nucleus. Life inside a host cell has several advantages. The pathogens are not 
accessible to antibodies, nor are they easy targets for phagocytic cells (discussed 
in Chapter 24); furthermore, intracellular bacteria and protozoa are bathed in a 
rich source of nutrients, and viruses have access to the host cell’s biosynthetic 
machinery for their reproduction. This lifestyle, however, requires that the pathogen 
have mechanisms for entering host cells, for finding a suitable subcellular 
niche where it can replicate, and for exiting from the infected cell to spread the 
infection. Below we consider some of the myriad ways that individual intracellular 
pathogens exploit and modify host cell biology to satisfy these requirements. 

Figure 23-16 Interaction of enteropathogenic E. coli (EPEC) with host intestinal 
epithelial cells. (A) When EPEC contacts an epithelial cell in the lining of the 
human gut, it delivers a bacterial protein called Tir into the host cell through a 
type III secretion system. Tir then inserts into the plasma membrane of the host 
cell, where it functions as a receptor for the bacterial adhesin protein intimin. 
Next, a host-cell protein tyrosine kinase phosphorylates the intracellular domain 
of Tir on tyrosines. Phosphorylated Tir recruits host-cell proteins (including an 
adaptor protein, a WASp protein, and the Arp 2/3 complex) that trigger actin 
polymerization (see Figure 16-16). Consequently, a branched network of actin 
filaments assembles underneath the bacterium, forming an actin pedestal 
(Movie 23.4). (B) EPEC on a pedestal. In this fluorescence micrograph, the DNA of 
the EPEC and host cell is labeled in blue, Tir protein is labeled in green, and 
host-cell actin filaments are labeled in red. The inset shows a close-up view of 
the two upper bacteria on pedestals. (B, from D. Goosney et al., Annu. Rev. Cell 
Dev. Biol. 16:173-189, 2000. With permission from Annual Reviews.) 


Viruses Bind to Virus Receptors at the Host Cell Surface 


The first step for any intracellular pathogen is to bind to the surface of the host 
target cell. Viruses accomplish this by the binding of viral surface proteins to virus 
receptors displayed on the host cell. The first virus receptor identified was an 
E. coli surface protein that is recognized by the bacteriophage lambda; the protein 
normally functions to transport the sugar maltose from outside the bacterium to 
the inside where it is used as an energy source. Receptors need not be proteins, 
however: an envelope protein of herpes simplex virus, for example, binds to hep- 
aran sulfate proteoglycans (discussed in Chapter 19) on the surface of certain ver- 
tebrate host cells, and simian virus 40 (SV40) binds to a glycolipid. The specificity 
of virus-receptor interactions often serves as a barrier preventing the spread of a 
virus from one species to another. Acquiring the ability to bind to a new receptor 
often requires multiple changes in a virus, but it can be crucial in allowing the 
cross-species transmission that can result in new disease outbreaks. 

Viruses that infect animal cells generally exploit cell-surface receptor mole- 
cules that are either ubiquitous (such as the sialic-acid-containing oligosaccha- 
rides used by the influenza virus) or found uniquely on those cell types in which 
the virus replicates (such as the neuron-specific proteins used by rabies virus). 
Although a virus usually uses a single type of host-cell receptor, some viruses use 
more than one type. An important example is HIV-1, which requires two types of 
receptors to enter a host cell. Its primary receptor is CD4, a cell-surface protein on 
helper T cells and macrophages that is involved in immune recognition (discussed 
in Chapter 24). It also requires a co-receptor, which is either CCR5 (a receptor 
for β-chemokines) or CXCR4 (a receptor for α-chemokines), depending on the 
particular variant of the virus; macrophages are susceptible only to HIV variants 
that use CCR5 for entry, whereas helper T cells are most efficiently infected by 
variants that use CXCR4 (Figure 23-17). The viruses that are found within the first 
few months after HIV infection almost invariably use CCR5, which explains why 
individuals that carry a defective CCR5 gene are less susceptible to HIV infection. 
In the later stages of infection, viruses often either switch to use CXCR4 or adapt 
to use both co-receptors through the accumulation of mutations; in this way, the 
virus can change the cell types it infects as the disease progresses. It may seem 
paradoxical that viruses would infect immune cells, as we might expect that virus 
binding would trigger an immune response; but invasion of an immune cell can 
be a useful way for a virus to weaken the immune response and travel around the 
body to infect other immune cells. 
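
The receptor requirements just described can be restated as a simple rule: entry needs CD4 plus at least one co-receptor that the viral variant is able to use. The short Python sketch below is only an illustration of that rule. The names `CELL_SURFACE`, `VARIANTS`, and `can_enter` are hypothetical, and the receptor sets are deliberate simplifications: each cell type is given only the co-receptor emphasized in the text, and a defective CCR5 gene is modeled as CCR5 simply being absent.

```python
# Simplified, illustrative receptor sets (assumptions, not expression data).
CELL_SURFACE = {
    "macrophage": {"CD4", "CCR5"},
    "helper_T_cell": {"CD4", "CXCR4"},
    "macrophage_defective_CCR5": {"CD4"},  # models a nonfunctional CCR5 gene
}


def can_enter(variant_coreceptors, cell_receptors):
    """Return True if the cell shows CD4 plus a co-receptor the variant can use."""
    has_primary = "CD4" in cell_receptors
    has_coreceptor = bool(set(variant_coreceptors) & set(cell_receptors))
    return has_primary and has_coreceptor


# Hypothetical labels for the variants described in the text.
VARIANTS = {
    "early (CCR5-using)": {"CCR5"},
    "late (CXCR4-using)": {"CXCR4"},
    "late (dual)": {"CCR5", "CXCR4"},
}

for variant_name, coreceptors in VARIANTS.items():
    for cell_name, receptors in CELL_SURFACE.items():
        result = can_enter(coreceptors, receptors)
        print(f"{variant_name:20s} -> {cell_name:26s}: {result}")
```

In this toy model a CCR5-using variant cannot enter a cell that lacks functional CCR5, which mirrors the reduced susceptibility of individuals with a defective CCR5 gene. It says nothing about the efficiency of infection, which the rule above does not capture.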
Figure 23-17 Receptor and co-receptors for HIV. All strains of HIV require the 
CD4 protein as a primary receptor. Early in an infection, most of the viruses use 
CCR5 as a co-receptor, allowing them to infect macrophages and their precursors, 
monocytes. As the infection progresses, mutant variants arise that now use CXCR4 
as a co-receptor, enabling them to infect helper T cells efficiently. The natural 
ligand for the chemokine receptors (Sdf1 for CXCR4; Rantes, Mip1α, or Mip1β for 
CCR5) blocks co-receptor function and prevents viral invasion. 


Viruses Enter Host Cells by Membrane Fusion, Pore Formation, or 
Membrane Disruption 


After recognition and attachment to the host cell surface, the virus must enter the 
cell to replicate. Some enveloped viruses enter the host cell by fusing their enve- 
lope membrane with the plasma membrane. Most viruses, whether enveloped or 
nonenveloped, activate signaling pathways in the cell that induce endocytosis, 
commonly via clathrin-coated pits (see Figure 13-7), leading to internalization 
into endosomes. Large viruses that do not fit into clathrin-coated vesicles, such as 
poxviruses, often enter cells by macropinocytosis, a process by which membrane 
ruffles fold over and entrap fluid into macropinosomes (see Figure 13-50). Once 
inside endosomes, fusion of the viral envelope occurs from the lumenal side of 
the endosome membrane. The mechanism of membrane fusion mediated by 
viral spike glycoproteins has similarities with SNARE-mediated membrane fusion 
during normal vesicular trafficking (discussed in Chapter 13). 

Enveloped viruses regulate fusion both to ensure that they fuse only with the 
appropriate host cell membrane and to prevent fusion with one another. For 
viruses such as HIV-1 that fuse at neutral pH with the plasma membrane (Figure 
23-18A), binding to receptors or co-receptors usually triggers a conformational 
change in a viral envelope protein that exposes a normally buried fusion peptide 
(see Figure 13-21). Other enveloped viruses, such as influenza A virus, only fuse 
with a host cell membrane after endocytosis (Figure 23-18B); in this case, it is 
frequently the acid environment in the late endosome that triggers the conformational 
change in a viral surface protein that exposes the fusion peptide. The H⁺ 
pumped into the early endosome also has another effect; it enters the influenza 
virion through an ion channel in the viral envelope and triggers changes in the 
viral capsid. These priming steps allow the capsids to disassemble once released 
into the cytosol after virus fusion with the late endosomal membrane. 

Figure 23-18 Four virus entry strategies. 
(A) Some enveloped viruses, such as HIV, fuse directly with the host-cell plasma 
membrane to release their RNA genome (blue) and capsid proteins (brown) into the 
cytosol. (B) Other enveloped viruses, such as influenza virus, first bind to 
cell-surface receptors, triggering receptor-mediated endocytosis; when the 
endosome acidifies, the virus envelope fuses with the endosomal membrane, 
releasing the viral RNA genome (blue) and capsid proteins (brown) into the 
cytosol. (C) Poliovirus, a nonenveloped virus, induces receptor-mediated 
endocytosis, and then forms a pore in the endosomal membrane to extrude its RNA 
genome (blue) into the cytosol. (D) Adenovirus, another nonenveloped virus, uses a 
more complicated strategy: it induces receptor-mediated endocytosis and then 
disrupts the endosomal membrane, releasing the capsid and its DNA genome into the 
cytosol; the trimmed-down virus eventually docks onto a nuclear pore and releases 
its DNA (red) directly into the nucleus (Movie 23.5). 

Nonenveloped viruses use different strategies to enter host cells—strategies 
that do not rely on membrane fusion. Poliovirus, which causes poliomyelitis, binds 
to a cell-surface receptor, triggering both receptor-mediated endocytosis (see Fig- 
ure 13-52) and a conformational change in the viral particle. The conformational 
change exposes a hydrophobic projection on one of the capsid proteins, which 
inserts into the endosomal membrane to form a pore. The viral RNA genome then 
enters the cytosol through the pore, leaving the capsid in the endosome (Figure 
23-18C). Other nonenveloped viruses such as adenovirus disrupt the endosomal 
membrane after they are taken up by receptor-mediated endocytosis. One of the 
proteins released from the capsid lyses the endosomal membrane, releasing the 
remainder of the virus into the cytosol. During endosomal trafficking and sub- 
sequent transport within the cytosol, adenoviruses undergo multiple uncoating 
steps, which sequentially remove structural proteins and ready the virus particles 
to release their DNA into the nucleus through nuclear pore complexes (Figure 
23-18D). 


Bacteria Enter Host Cells by Phagocytosis 


Bacteria are much larger than viruses—too large to be taken up either through 
pores or by receptor-mediated endocytosis. Instead, they enter host cells by 
phagocytosis, which is a normal function of phagocytes such as neutrophils, mac- 
rophages, and dendritic cells (discussed in Chapter 24). These phagocytes patrol 
the tissues of the body and ingest and destroy microbes; however, some intracel- 
lular bacterial pathogens such as M. tuberculosis use this to their advantage and 
have evolved to survive and multiply inside macrophages. 

Some bacterial pathogens can invade host cells that are normally nonphago- 
cytic. One way they do so is by expressing an invasion protein that binds with high 
affinity to a host-cell receptor, which is often a cell-cell or cell-matrix adhesion 
protein (discussed in Chapter 19). For example, Yersinia pseudotuberculosis (a 
bacterium that causes diarrhea and is a close relative of the plague bacterium 
Y. pestis) expresses a protein called invasin that has an RGD motif that is similar 
to fibronectin’s and likewise is recognized by host-cell β1 integrins (see Figure 
19-55). Listeria monocytogenes, which causes a rare but serious form of food 
poisoning, invades host cells by expressing a protein that binds to the cell-cell 
adhesion protein E-cadherin (see Figure 19-6). For both these bacterial species, 
binding of the bacterial invasion proteins to the host cell adhesion proteins stim- 
ulates signaling through members of the Rho family of small GTPases (discussed 
in Chapter 16). This in turn activates proteins in the WASp family and the Arp 2/3 
complex, leading to actin polymerization at the site of bacterial attachment. Actin 
polymerization, together with the assembly of a clathrin coat (see Figure 13-6), 
drives the advancement of the host cell’s plasma membrane over the adhesive 
surface of the microbe, resulting in the phagocytosis of the bacterium—a process 
known as the zipper mechanism of invasion (Figure 23-19A). 

A second pathway by which bacteria can invade nonphagocytic cells is known 
as the trigger mechanism (Figure 23-19B). It is used by various pathogens that 
cause food poisoning, including Salmonella enterica, and it is initiated when the 
bacterium injects a set of effector molecules into the host-cell cytosol through 
a type III secretion system (see Figure 23-7). Some of these effector molecules 
activate Rho family proteins, which in turn stimulate actin polymerization, as just 
discussed. Other bacterial effector proteins interact with host-cell cytoskeletal 
elements more directly, nucleating and stabilizing actin filaments and causing 
the rearrangement of actin cross-linking proteins. The overall effect is to cause the 
formation of localized ruffles on the surface of the host cell (Figure 23-19C and 
D), which fold over and engulf the bacteria by a process that resembles macropinocytosis. 
The appearance of cells being invaded by use of the trigger mechanism 
is similar to the ruffling induced by some extracellular growth factors, suggesting 
that the bacteria exploit normal intracellular signaling pathways. 


Intracellular Eukaryotic Parasites Actively Invade Host Cells 


The uptake of viruses and bacteria into host cells is carried out largely by the host, 
with the pathogen being a relatively passive participant. In contrast, intracellular 
eukaryotic parasites, which are typically much larger than other types of intra- 
cellular pathogens, invade host cells through a variety of complex pathways that 
usually require energy expenditure by the parasite. 

Toxoplasma gondii, a cat parasite that also causes occasional serious human 
infections, is an example. When this protozoan contacts a host cell, it protrudes 
an unusual microtubule-based structure called a conoid, which facilitates entry 
into the host cell (Figure 23-20). The energy for invasion seems to come from 
actin polymerization in the parasite rather than host cytoskeleton, and invasion 
also requires at least one unusual parasite myosin motor protein (Class XIV; see 
Figure 16-40). At the point of contact, the parasite discharges effector proteins 
from secretory organelles into the host cell, and these proteins target various host 
pathways to enable invasion, to block an innate immune response, and to promote 
survival. As the parasite moves into the host cell, a membrane derived from the 
host-cell plasma membrane surrounds it. Remarkably, the parasite removes host 
transmembrane proteins from the surrounding membrane as it forms, so that 
the parasite is protected in a membrane-enclosed compartment that does not 
fuse with lysosomes and does not participate in host-cell membrane trafficking 
processes (see Figure 23-20). The specialized membrane is selectively porous: it 
allows the parasite to take up small metabolic intermediates and nutrients from 
the host cell’s cytosol but excludes macromolecules. Malaria parasites invade 
human red blood cells using a similar mechanism. 


The protozoan Trypanosoma cruzi, which causes Chagas disease in Mexico 
and Central and South America, uses two alternative invasion strategies. In a 
lysosome-dependent pathway, the parasite attaches to host cell-surface receptors, 
inducing a local increase in Ca²⁺ in the host cell’s cytosol. The Ca²⁺ signal recruits 
lysosomes to the site of parasite attachment, and the lysosomes fuse with the host 
cell’s plasma membrane, allowing the parasites rapid access to the lysosomal 
compartment (Figure 23-21). In a lysosome-independent pathway, the parasite 
penetrates the host-cell plasma membrane by inducing the membrane to invaginate, 
without the recruitment of lysosomes. 

Figure 23-19 Mechanisms used by bacteria to induce phagocytosis by host cells 
that are normally nonphagocytic. (A) In the zipper mechanism, bacteria express an 
invasion protein that binds with high affinity to a host-cell receptor, which is 
often a cell-cell or cell-matrix adhesion protein. (B) In the trigger mechanism, 
bacteria inject a set of effector molecules into the host-cell cytosol through a 
type III secretion system called SPI1 (Salmonella pathogenicity island 1), 
inducing membrane ruffling. Both the zipper and trigger mechanisms cause the 
polymerization of actin at the site of bacterial attachment by activating Rho 
family small GTPases and the Arp 2/3 complex. (C) A scanning electron micrograph 
showing a very early stage of Salmonella enterica invasion by the trigger 
mechanism. Bacteria (pseudocolored yellow) are shown surrounded by a small 
membrane ruffle. (D) Fluorescence micrograph showing that the large ruffles that 
engulf the Salmonella bacteria are actin-rich. The bacteria are labeled in green 
and actin filaments in red; because of the color overlap, the bacteria appear 
yellow. (C, from Rocky Mountain Laboratories, NIAID, NIH; D, from J.E. Galán, 
Annu. Rev. Cell Dev. Biol. 17:53-86, 2001. With permission from Annual Reviews.) 

Figure 23-20 The life cycle of the intracellular parasite Toxoplasma gondii. 
(A) After attachment to a host cell, T. gondii uses its conoid to inject effector 
proteins that facilitate invasion. As the host cell’s plasma membrane invaginates 
to surround the parasite, it somehow removes the normal host-cell membrane 
proteins, so that the compartment (shown in red) does not fuse with lysosomes. 
After several rounds of replication, the parasite causes the compartment to break 
down and the host cell to lyse, releasing the progeny parasites to infect other 
host cells (Movie 23.6). (B) Light micrograph of T. gondii replicating within a 
membrane-enclosed compartment (a vacuole) in a cultured cell. (B, courtesy of 
Manuel Camps and John Boothroyd.) 

Figure 23-21 The two alternative strategies that Trypanosoma cruzi uses to invade 
host cells. In the lysosome-dependent pathway (left), T. cruzi recruits host-cell 
lysosomes to its site of attachment to the host cell. The lysosomes fuse with the 
invaginating plasma membrane to create an intracellular compartment constructed 
almost entirely of lysosomal membrane. After a brief stay in the compartment, the 
parasite secretes a pore-forming protein that disrupts the surrounding membrane, 
thereby allowing the parasite to escape into the host-cell cytosol and 
proliferate. In the lysosome-independent pathway (right), the parasite induces 
the host plasma membrane to invaginate and pinch off without recruiting lysosomes; 
then lysosomes fuse with the endosome prior to the parasite’s escape into the 
cytosol. 


Some Intracellular Pathogens Escape from the Phagosome into 
the Cytosol 


The intracellular parasites just discussed raise a general problem that faces all 
intracellular pathogens, including viruses, bacteria, and eukaryotic parasites: 
they must find a cell compartment in which they can replicate. After their endocy- 
tosis by a host cell, they usually find themselves in an endosomal compartment, 
which normally would fuse with lysosomes to form a phagolysosome—a danger- 
ous place for pathogens. To survive, pathogens use a variety of strategies. Some 
escape from the endosomal compartment before such fusion. Others remain in 
the endosomal compartment but modify it so that it no longer fuses with lysosomes. 
Still others have evolved to weather the harsh conditions in the phagolysosome 
(Figure 23-22). 

Trypanosoma cruzi uses the escape route by secreting a pore-forming toxin 
that lyses the lysosome membrane, releasing the parasite into the host cell’s cyto- 
sol (see Figure 23-21). The bacterium Listeria monocytogenes uses a similar strat- 
egy. Following phagocytosis by the zipper mechanism, it secretes a protein called 
listeriolysin O, which disrupts the phagosomal membrane, releasing the bacteria 
into the cytosol (Figure 23-23). 
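
The selectivity of listeriolysin O rests on a pH threshold: the caption of Figure 23-23 states that the protein is activated at pH <6, which is why it damages the phagosomal membrane but not the plasma membrane. The Python sketch below merely restates that threshold. The function name `listeriolysin_o_active` is hypothetical, and the specific pH values assigned to the two compartments are illustrative assumptions, not figures taken from the text.

```python
ACTIVATION_PH_LIMIT = 6.0  # from the figure caption: active at pH < 6

# Illustrative pH values only; these numbers are assumptions, not from the text.
ASSUMED_COMPARTMENT_PH = {"phagosome": 5.5, "cytosol": 7.2}


def listeriolysin_o_active(ph):
    """Return True when the toxin would be active at the given pH."""
    return ph < ACTIVATION_PH_LIMIT


for compartment, ph in ASSUMED_COMPARTMENT_PH.items():
    state = "active" if listeriolysin_o_active(ph) else "inactive"
    print(f"{compartment} (pH {ph}): listeriolysin O {state}")
```

The sketch covers only the pH gating; the rapid degradation of cytosolic listeriolysin O by proteasomes is a second, independent safeguard that it does not model.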


Many Pathogens Alter Membrane Traffic in the Host Cell to Survive 
and Replicate 


The survival and reproduction of many intracellular pathogens requires that they 
modify membrane (vesicular) traffic in the host cell. They may, for example, prevent 
the normal fusing of endosomes with lysosomes, or adapt themselves to 
resist the lysosome’s antimicrobial armaments. Intracellular pathogens must also 
provide a pathway for importing nutrients from the host cytosol into their compartment 
of choice. 

Figure 23-22 Choices that an intracellular pathogen faces. After entry into a 
host cell, generally through phagocytosis into a membrane-enclosed compartment, 
intracellular pathogens can use one of three strategies to survive and replicate. 
Pathogens that follow strategy (1) include all viruses, Trypanosoma cruzi, 
Listeria monocytogenes, and Shigella flexneri. Those that follow strategy (2) 
include Mycobacterium tuberculosis and Legionella pneumophila. Those that follow 
strategy (3) include Salmonella enterica, Coxiella burnetii, and Leishmania. 

Figure 23-23 Escape of Listeria monocytogenes by selective destruction of the 
phagosomal membrane. The bacterium attaches to E-cadherin on the surface of host 
epithelial cells and induces its own uptake by the zipper mechanism (see 
Figure 23-19A). Within the phagosome, the bacterium secretes the protein 
listeriolysin O, which is activated at pH <6 and forms oligomers in the phagosome 
membrane, thereby creating large pores and eventually disrupting the membrane. 
Once in the host-cell cytosol, the bacteria begin to replicate and continue to 
secrete listeriolysin O; because the pH in the cytosol is >6, however, the 
listeriolysin O there is inactive and is also rapidly degraded by proteasomes. 
Thus, the host cell’s plasma membrane remains intact. 

Different pathogens have distinct strategies for altering membrane traffic in 
the host cell (Figure 23-24). M. tuberculosis prevents the early endosome that 
contains the bacteria from maturing, so the endosome never acidifies or acquires 
the other characteristics of a late endosome or lysosome. This strategy requires 
the activity of its type VII secretion system, as well as mycobacterial lipid products 
that mimic host lipids and influence vesicular traffic. Phagosomes containing Sal- 
monella enterica, in contrast, acidify and acquire markers of late endosomes and 
lysosomes, but the bacteria slow the process of phagosomal maturation. They do 
so by injecting effector proteins through a second type III secretion system. These 
effectors activate host kinesin motor proteins to pull membrane tubules outward 
from the phagosome along cytoplasmic microtubules, forming a specialized com- 
partment called the Salmonella-containing vacuole (Figure 23-25). 

Other bacteria seem to find shelter in intracellular compartments that are dis- 
tinct from those of the usual endocytic system. One example is Legionella pneu- 
mophila, which was first recognized as a human pathogen in 1976, when it was 
found to be the cause of a type of pneumonia known as Legionnaires’ disease. 
L. pneumophila is normally a parasite of freshwater amoebae, but it is commonly 
spread to humans by central air-conditioning systems, which harbor infected 
amoebae and produce microdroplets of water that are easily inhaled. Once in 
the lung, the bacteria are engulfed by macrophages by an unusual process called 
coiling phagocytosis (Figure 23-26A). L. pneumophila uses a type IV secretion 
system to inject effector proteins into the phagocyte that modulate the activity 
of proteins that regulate vesicular traffic, including SNARE proteins and Rab and 
Arf family small GTPases (discussed in Chapter 13). The effector proteins thereby 
prevent the phagosome from fusing with endosomes and promote its fusion with 
vesicles derived from the endoplasmic reticulum, converting the phagosome into 
a compartment that resembles the rough endoplasmic reticulum (Figure 23-26B). 

Figure 23-24 Modifications of membrane traffic in host cells by bacterial 
pathogens. Intracellular bacterial pathogens, including Mycobacterium 
tuberculosis, Salmonella enterica, and Legionella pneumophila, all replicate in 
membrane-enclosed compartments, but the compartments differ. M. tuberculosis 
remains in a compartment that has early endosomal markers and continues to 
communicate with the plasma membrane via transport vesicles. S. enterica 
replicates in a compartment that has late endosomal markers and does not 
communicate with the plasma membrane. L. pneumophila replicates in an unusual 
compartment that is wrapped in rough endoplasmic reticulum (ER) membrane and 
communicates with the ER via transport vesicles. TGN, trans Golgi network. 

Figure 23-25 Salmonella enterica residing in a modified phagosomal compartment 
called the Salmonella-containing vacuole. These bacteria invade the host cell 
using an SPI1 type III secretion system to inject effector proteins that induce 
the trigger mechanism of microbe entry illustrated in Figure 23-19B. 
(A) Following its engulfment into a phagosome, the bacterium inactivates its 
SPI1 type III secretion system and activates its SPI2 type III secretion system 
to inject different effector proteins, which remodel the phagosome into the 
specialized Salmonella-containing vacuole. One of the injected effector proteins 
activates host kinesin motor proteins to pull membrane tubules outward toward the 
plus ends of the microtubules (see Figure 16-42). (B) Fluorescence micrograph 
showing S. enterica in a Salmonella-containing vacuole. The bacteria are stained 
green, the microtubules red, and the nucleus blue. (B, courtesy of Stephane 
Meresse.) 

Figure 23-26 Legionella pneumophila residing in a compartment with 
characteristics similar to those of the rough endoplasmic reticulum (ER). 
(A) Electron micrograph showing the unusual coiled structure that the Legionella 
pneumophila bacterium induces on the surface of a phagocyte during the invasion 
process. Some other pathogens, including the bacterium Borrelia burgdorferi, 
which causes Lyme disease, the eukaryotic pathogen Leishmania, and the yeast 
Candida albicans, can also invade cells using this type of coiling phagocytosis. 
(B) Following invasion, L. pneumophila uses its type IV secretion system to 
secrete effector proteins that block phagosome-endosome fusion and phagosome 
maturation. It also secretes effector proteins that promote the fusion of the 
phagosome with ER-derived vesicles, thereby creating a Legionella-containing 
vacuole with characteristics similar to the rough ER. (A, from M.A. Horwitz, 
Cell 36:27-33, 1984. With permission from Elsevier.) 

Viruses can also alter membrane traffic in the host cell. Enveloped viruses 
make use of host cell membranes to acquire their own envelope membrane. In 
the simplest cases, virally encoded glycoproteins are inserted into the endoplas- 
mic reticulum membrane and follow the secretory pathway through the Golgi 
apparatus to the plasma membrane; the viral capsid proteins and genome assem- 
ble into nucleocapsids, which acquire their envelope as they bud off from the 
plasma membrane (see Figure 23-12). This mechanism is used by many envel- 
oped viruses including HIV-1. Other enveloped viruses such as herpesviruses and 
vaccinia virus acquire their lipid envelopes in more complex ways (Figure 23-27). 


Viruses and Bacteria Use the Host-Cell Cytoskeleton for 
Intracellular Movement 


As mentioned earlier, many pathogens escape into the cytosol rather than 
remaining in a membrane-enclosed compartment. The cytosol of mammalian 
cells is extremely viscous, as it is crowded with protein complexes, organelles, and 
cytoskeletal filaments, all of which inhibit the diffusion of particles the size of a 
bacterium or a viral nucleocapsid. Thus, to reach a particular region of the host 
cell a pathogen must be actively moved there. As with transport of intracellular 
organelles, pathogens generally use the host cell’s cytoskeleton for their active 
movement. 

Several bacteria that replicate in the host cell’s cytosol have adopted a remarkable 
mechanism that depends on actin polymerization for movement. These 
bacteria include the human pathogens Listeria monocytogenes, Shigella flexneri, 
Rickettsia rickettsii (which causes Rocky Mountain spotted fever), and Burkholde- 
ria pseudomallei (which causes melioidosis, a disease characterized by severe 
respiratory symptoms). Baculovirus, an insect virus, also uses this mechanism for 
intracellular movement. All of these pathogens induce the nucleation and assem- 
bly of host-cell actin filaments at one pole of the bacterium or virus. The growing 
filaments generate force and push the pathogens through the cytosol at rates of 
up to 1 µm/sec (Figure 23-28). New filaments form at the rear of each pathogen
and are left behind like a rocket trail as the microbe advances; the filaments
depolymerize within a minute or so as they encounter depolymerizing factors in 
the cytosol. For L. monocytogenes and S. flexneri, the moving bacteria collide with 
the plasma membrane and move outward, inducing the formation of long, thin, 
host-cell protrusions with the bacteria at their tip. As shown in Figure 23-28, a 
neighboring cell often engulfs these projections, allowing the bacteria to enter the 
neighbor’s cytoplasm without exposure to the extracellular environment, thereby 
avoiding antibodies produced by the host’s adaptive immune system. For B. pseu- 
domallei, movement and collision of the bacteria with the plasma membrane pro- 
motes cell-cell fusion, which serves a similar purpose of immune avoidance while 
allowing continued bacterial replication. 
Figure 23-27 Complex strategies for
viral envelope acquisition. (A) Herpesvirus
nucleocapsids assemble in the nucleus
and then bud through the inner nuclear
membrane into the space between the
inner and outer nuclear membranes,
acquiring a lipid bilayer membrane coat.
The virus particles then apparently lose this
coat when they fuse with the endoplasmic
reticulum membrane to escape into the
cytosol. Subsequently, the nucleocapsids
bud into the Golgi apparatus and bud
out again on the other side, thereby
acquiring two new membrane coats in
the process. The virus then buds from
the cell surface with a single membrane
when its outer membrane fuses with the
plasma membrane. (B) Vaccinia virus
(which is closely related to the virus that 
causes smallpox and is used to vaccinate 
against smallpox) assembles in “replication 
factories” in the cytosol, far away from the 
plasma membrane. The immature virion, 
with one membrane, is then surrounded 
by two additional membranes, both 
acquired from the Golgi apparatus by a 
poorly understood wrapping mechanism, 
to form the intracellular enveloped virion. 
After fusion of the outermost membrane 
with the host-cell plasma membrane, the 
extracellular enveloped virion is released 
from the host cell. 


1288 Chapter 23: Pathogens and Infection 


Figure 23-28 The actin-based movement of bacterial pathogens
within and between host cells. (A) Following invasion, bacterial pathogens
such as Listeria monocytogenes, Shigella flexneri, Rickettsia rickettsii, and
Burkholderia pseudomallei induce the assembly of actin-rich tails in the
host-cell cytoplasm, which drives rapid bacterial movement. For most of
these pathogens, the moving bacteria collide with the host-cell plasma
membrane to form membrane-covered protrusions, which are engulfed by
neighboring cells, spreading the infection from cell to cell. In contrast, for
B. pseudomallei, collision with the plasma membrane promotes cell-cell
fusion, creating a conduit through which bacteria can invade neighboring
cells (Movie 23.7).

The molecular mechanisms of pathogen-induced actin assembly differ for the
different pathogens, suggesting that they evolved independently (Figure 23-29).
L. monocytogenes and baculovirus produce proteins that directly bind to and activate
the Arp 2/3 complex to initiate the formation of an actin tail and movement
(see Figure 16-16). S. flexneri produces an unrelated surface protein that binds to
and activates N-WASp, which then activates the Arp 2/3 complex. Rickettsia spe- 
cies produce a protein that directly polymerizes actin by mimicking the function 
of host formin proteins (see Figure 16-17). 

Many viral pathogens rely primarily on microtubule-dependent motor pro- 
teins rather than actin polymerization to move within the host-cell cytosol. 
Viruses that infect neurons, such as the neurotropic alpha herpesviruses, which 
include the virus that causes chickenpox, provide important examples. The virus 
enters sensory neurons at the tips of their axons, and microtubule-based retro- 
grade “backward” axonal transport carries the nucleocapsids down the axon to 
the nucleus. The transport is mediated by attachment of viral capsid proteins to
the motor protein dynein (see Figure 16-58). After replication and assembly in the
nucleus, the enveloped virions are then carried by antegrade “forward” axonal 
transport along microtubules to the axon tips, with the transport being mediated 
by the attachment of a different viral capsid protein to a kinesin motor protein (see 
Figure 16-56). A large number of viruses associate with either dynein or kinesin 
motor proteins to move along microtubules at some stage in their replication. As 
microtubules serve as oriented tracks for vesicular transport in eukaryotic cells, 
it is not surprising that many viruses have independently evolved the ability to 
exploit them for their own transport. 





Viruses Can Take Over the Metabolism of the Host Cell 


Viruses use basic host cell machinery for most aspects of their reproduction: they 
depend on host-cell ribosomes to produce their proteins, and most use host-cell 
DNA and RNA polymerases for their own replication and transcription. Many 
viruses encode proteins that modify the host transcription or translation appa- 
ratus to favor the synthesis of viral RNAs and proteins over those of the host cell, 
shifting the synthetic capacity of the cell toward the production of new virus par- 
ticles. Poliovirus, for example, encodes a protease that specifically cleaves the 
TATA-binding component of TFIID (see Figure 6-17), shutting off transcription 
of most of the host cell’s protein-coding genes. Influenza virus produces a protein 
that blocks both the splicing and the polyadenylation of host-cell RNA transcripts, 
preventing their export into the cytosol (see Figure 6-38). 

Viruses also alter translation by the host. Translation initiation for most host- 
cell mRNAs depends on recognition of their 5’ cap by translation initiation factors 
(see Figure 6-70). This initiation process is often inhibited during viral infection, 
so that the host-cell ribosomes can be used more efficiently for the synthesis of 
viral proteins. Some viral genomes encode endonucleases that cleave off the 5’ 
cap from host-cell mRNAs; some go even further by using the liberated 5’ caps as 
primers to synthesize viral mRNAs, a process called cap snatching. Several other 
viral RNA genomes encode proteases that cleave certain translation initiation fac- 
tors; these viruses rely on 5’ cap-independent translation of their own RNA, using 
internal ribosome entry sites (IRESs) (see Figure 7-68). 
A few DNA viruses use host-cell DNA polymerase to replicate their genome.
Unfortunately for these viruses, DNA polymerase is expressed at high levels only
during S phase of the cell cycle, and most cells that these viruses infect spend
most of their time in G1 phase. Adenovirus has evolved a mechanism to drive the
host cell into S phase, so that the cell produces large amounts of active DNA poly- 
merase, which then replicates the viral genome; to accomplish this, the adenovi- 
rus genome also encodes proteins that inactivate both Rb (see Figure 17-61) and 
p53 (see Figure 17-62), two key suppressors of cell-cycle progression. As might be 
expected for any mechanism that encourages unregulated DNA replication, these 
viruses can promote, under some circumstances, the development of cancer. 
Other DNA viruses, including poxviruses and mimivirus, encode their own DNA 
and RNA polymerases, as well as some transcription regulators, allowing them to 
bypass usual host pathways and replicate outside the nucleus. 

RNA viruses must always encode their own replication proteins because host 
cells lack polymerase enzymes that use RNA as a template. For RNA viruses with 
a single-stranded genome, the replication strategy depends on whether the RNA 
is a positive [+] strand, which contains translatable information like mRNA, or a 
complementary negative [-] strand. When the RNA is a positive [+] strand, the
incoming viral genome is used to produce the viral RNA polymerase and viral proteins;
the viral polymerase is then used to replicate the viral RNA and to generate
mRNAs for the production of more viral proteins. For viruses with a negative [-]
strand RNA genome (such as influenza and measles virus), an RNA polymerase 
enzyme is packaged as a structural protein of the incoming viral capsids. 

Retroviruses such as HIV-1, which have a positive [+] strand RNA genome, are 
a special class of RNA virus because they carry with them a viral reverse transcrip- 
tase enzyme. After entry to the host cell, the reverse transcriptase uses the viral 
RNA genome as a template to synthesize a double-stranded DNA copy of the viral 
genome, which enters into the nucleus and integrates into the host cell’s chromo- 
somes (see Figure 5-62). It is later transcribed by the cell’s DNA-dependent RNA 
polymerase to produce viral genomes and proteins. 


Pathogens Can Evolve Rapidly by Antigenic Variation 


The complexity and specificity of the interplay between pathogens and their host 
cells might suggest that virulence would be difficult to acquire by random muta- 
tion. Yet, new pathogens are constantly emerging, and old pathogens are con- 
stantly changing in ways that make familiar infections more difficult to prevent 
or treat. Pathogens have two advantages that enable them to evolve rapidly. First, 
they replicate very quickly, providing a great deal of material for natural selection 
to work with. Whereas humans and chimpanzees have acquired a 2% difference 
in genome sequences over about 8 million years of divergent evolution, poliovirus 
Figure 23-29 Molecular mechanisms
for actin nucleation by various bacterial
pathogens. Listeria monocytogenes and
Shigella flexneri induce actin nucleation
by recruiting and activating the host Arp
2/3 complex (see Figure 16-16), although
each uses a different recruitment strategy:
L. monocytogenes expresses a surface
protein, ActA, that directly binds
to and activates the Arp 2/3 complex;
S. flexneri expresses a surface protein,
IcsA (unrelated to ActA), that recruits the
host protein N-WASp, which in turn recruits
the Arp 2/3 complex, along with other host
proteins, including WIP (WASp-interacting
protein). Rickettsia rickettsii uses an entirely
different strategy; it expresses a surface
protein, Sca2, that directly nucleates actin
polymerization by mimicking the activity of
host formin proteins.
manages a 2% change in its genome in 5 days, about the time it takes the virus
to pass from the human mouth to the gut. Second, selective pressures act rapidly 
on this genetic variation. The host’s adaptive immune system and modern micro- 
bicidal drugs, both of which destroy pathogens that fail to change, are the main 
sources of these selective pressures. 
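
The contrast in evolutionary tempo quoted above can be put into rough numbers. The short Python sketch below simply divides the two time spans given in the text; the function name and the length of a year in days are assumptions introduced for illustration, and the comparison ignores differences in genome size and generation time.

```python
# Figures quoted in the text.
HUMAN_CHIMP_YEARS = 8_000_000   # years of divergent evolution for a 2% difference
POLIOVIRUS_DAYS = 5             # days for poliovirus to change 2% of its genome
DAYS_PER_YEAR = 365.25          # assumption: average length of a calendar year


def fold_difference(slow_years: float, fast_days: float) -> float:
    """Return how many times faster the fast lineage accumulates
    the same fractional sequence change as the slow lineage."""
    return slow_years * DAYS_PER_YEAR / fast_days


if __name__ == "__main__":
    ratio = fold_difference(HUMAN_CHIMP_YEARS, POLIOVIRUS_DAYS)
    print(f"Poliovirus changes roughly {ratio:.1e} times faster")
```

On these figures the virus accumulates the same fractional change several hundred million times faster than the primate lineages.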

An example of an adaptation to the selective pressure imposed by the adap- 
tive immune system is the phenomenon of antigenic variation. An important 
adaptive immune response against many pathogens is the host’s production of 
antibodies that recognize specific molecules (antigens) on the pathogen’s surface 
(discussed in Chapter 24). Many pathogens have evolved mechanisms that deliberately
change these antigens during the course of an infection, enabling them to
evade antibodies. Some eukaryotic parasites, for example, undergo programmed 
rearrangements of the genes encoding their surface antigens. A striking example 
occurs in Trypanosoma brucei, a protozoan parasite that causes African sleeping 
sickness and is spread by tsetse flies. (T. brucei is a relative of T. cruzi—see Fig- 
ure 23-21—but it replicates extracellularly rather than intracellularly.) T. brucei 
is covered with a single type of glycoprotein, called variant-specific glycoprotein 
(VSG), which elicits in the host a protective antibody response that rapidly clears 
most of the parasites. The trypanosome genome, however, contains about 1000 
different Vsg genes or pseudogenes, each encoding a VSG with a distinct amino 
acid sequence. Only one of these genes is expressed at any one time, from one of 
approximately 20 possible expression sites in the genome. Gene rearrangements 
that copy different Vsg genes into expression sites repeatedly change the VSG pro- 
tein displayed on the surface of the pathogen. In this way, a few trypanosomes 
with an altered VSG escape the initial antibody-mediated clearance, replicate, 
and cause the disease to recur, leading to a chronic cyclic infection (Figure 23-30). 
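
The cyclic course of a trypanosome infection can be illustrated with a deliberately simplified model. The Python sketch below is not taken from the text: the growth factor, switching fraction, antibody threshold, and the rule that each variant switches only to the next one in a short list are all hypothetical, and real switching is stochastic. It shows only how rare switching combined with variant-specific clearance can produce repeated waves of parasites.

```python
def simulate_vsg_switching(steps=80, n_variants=5, growth=2.0,
                           switch_fraction=1e-4, threshold=1e6, start=100.0):
    """Return total parasite numbers per time step for a toy VSG-switching model.

    All parameter values are hypothetical and chosen only for illustration.
    """
    pop = [0.0] * n_variants          # parasites expressing each VSG variant
    pop[0] = start
    cleared = [False] * n_variants    # True once antibodies against a variant exist
    totals = []
    for _ in range(steps):
        new = [0.0] * n_variants
        for v, n in enumerate(pop):
            if cleared[v]:
                continue              # antibodies eliminate this variant
            grown = n * growth
            # a small fraction switches to the next variant in the list
            switched = grown * switch_fraction if v + 1 < n_variants else 0.0
            new[v] += grown - switched
            if v + 1 < n_variants:
                new[v + 1] += switched
        # the host responds to any variant that has become abundant
        for v in range(n_variants):
            if new[v] >= threshold:
                cleared[v] = True
        pop = new
        totals.append(sum(pop))
    return totals


def count_relapses(totals):
    """Count the time steps at which the total parasite number falls."""
    return sum(1 for a, b in zip(totals, totals[1:]) if b < a)


if __name__ == "__main__":
    totals = simulate_vsg_switching()
    print("waves of parasitemia:", count_relapses(totals))
```

With only five variants the toy infection eventually burns out; a repertoire of about 1000 Vsg genes, as described above, would allow the cycle to continue far longer.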

Bacterial pathogens can also rapidly change their surface antigens. As dis- 
cussed in Chapter 5, Salmonella enterica bacteria switch between expressing 
either of two versions of the protein flagellin, the structural component of the bac- 
terial flagellum (see Figure 23-3D), in a process called phase variation (see Figure 
5-65). Species of the genus Neisseria are also champions at this. These Gram-neg- 
ative cocci can cause meningitis and sexually transmitted diseases. They undergo 
genetic recombination very similar to that just described for eukaryotic patho- 
gens, which enables them to vary the pilin protein they use to attach to host cells. 
By inserting one of the multiple silent copies of variant pilin genes into a single 
expression locus, they can express many slightly different versions of the protein 
and repeatedly change the amino acid sequence over time. Neisseria bacteria are 
Figure 23-30 Antigenic variation in
trypanosomes. (A) There are about 1000
distinct Vsg genes in Trypanosoma brucei,
and they are expressed one at a time from
approximately 20 expression sites in the
genome. To be expressed, an inactive
gene is copied and the copy is moved
into an expression site through DNA
recombination. Each Vsg gene encodes a
different surface protein (antigen). These
switching events allow the trypanosome
to repeatedly change the surface antigen
it expresses. (B) A person infected with
trypanosomes expressing VSG-a mounts
a protective antibody response, which
clears most of the parasites expressing
this antigen. However, a few of the
trypanosomes will have switched to
expression of VSG-b, which can now
proliferate until anti-VSG-b antibodies
clear them. By that time, however, some
parasites will have switched to VSG-c, and
so the cycle continues.
also extremely adept at taking up DNA from their environment by natural transformation
and incorporating it into their genomes, further contributing to their
extraordinary variability. The end result of this considerable variation is a pleth- 
ora of different surface compositions with which to bewilder the host adaptive 
immune system. It is therefore not surprising that it has been difficult to develop 
an effective vaccine against Neisseria infections, although there are now several 
that protect against Neisseria meningitidis, a common cause of fatal meningitis.


Error-Prone Replication Dominates Viral Evolution 


In contrast to the DNA rearrangements in bacteria and parasites, viruses rely on an 
error-prone replication mechanism for antigenic variation. Retroviral genomes, 
for example, acquire on average one point mutation every replication cycle, 
because the viral reverse transcriptase (see Figure 5-62) needed to produce DNA 
from the viral RNA genome lacks the proofreading activity of DNA polymerases. A 
typical, untreated HIV infection may eventually produce HIV genomes with every 
possible point mutation. By a process of mutation and selection within each host, 
most viruses change over time—from a form that is most efficient at infecting mac- 
rophages to one more efficient at infecting T cells, as described earlier (see Figure 
23-17). Similarly, once a patient is treated with an antiviral drug, the viral genome 
can quickly mutate and be selected for its resistance to the drug. Remarkably, only 
about one-third of the nucleotide positions in the coding sequence of the viral 
genome are invariant, and nucleotide sequences in some parts of the genome, 
such as the Env gene (see Figure 7-62), can differ by as much as 30% from one 
HIV isolate to another. This extraordinary genomic plasticity greatly complicates 
attempts to develop vaccines against HIV. It has also led to the rapid emergence 
of new HIV strains. Nucleotide sequence comparisons between various strains 
of HIV and the very similar simian immunodeficiency virus (SIV) isolated from a 
variety of monkey species suggest that the most virulent type of HIV, HIV-1, may 
have jumped from primates to humans multiple independent times, starting as 
long ago as 1908 (Figure 23-31). 
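
A back-of-the-envelope calculation suggests why an untreated infection can plausibly sample every possible point mutation. The Python sketch below assumes a genome of about 10,000 nucleotides, mutations distributed evenly over all sites and all three alternative bases, and a Poisson-distributed number of mutations per cycle; these are simplifying assumptions introduced here, not values from the text, and the number of replication events is hypothetical.

```python
import math

GENOME_LENGTH_NT = 10_000    # assumption: approximate retroviral genome length
MUTATIONS_PER_CYCLE = 1.0    # from the text: about one point mutation per cycle


def n_single_point_mutants(genome_length: int) -> int:
    """Each position can change to any of the three other nucleotides."""
    return 3 * genome_length


def p_at_least_one_mutation(mean_mutations: float) -> float:
    """Probability that a new genome carries one or more mutations,
    assuming the number of mutations per cycle is Poisson distributed."""
    return 1.0 - math.exp(-mean_mutations)


def expected_copies_per_mutant(n_replications: float, genome_length: int,
                               mean_mutations: float) -> float:
    """Average number of times each specific point mutation arises,
    assuming mutations fall evenly on all sites and all substitutions."""
    return n_replications * mean_mutations / n_single_point_mutants(genome_length)


if __name__ == "__main__":
    print(n_single_point_mutants(GENOME_LENGTH_NT))
    print(round(p_at_least_one_mutation(MUTATIONS_PER_CYCLE), 3))
    # hypothetical number of genome replication events in one infected person
    print(expected_copies_per_mutant(1e9, GENOME_LENGTH_NT, MUTATIONS_PER_CYCLE))
```

Under these assumptions there are only tens of thousands of distinct single-nucleotide variants, so any large number of replication events generates each of them many times over.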

Influenza viruses are an important exception to the rule that error-prone rep- 
lication dominates viral evolution. They are unusual in that their genome consists 
of several (usually eight) strands of RNA. When two strains of influenza infect the 
same host, the RNA strands of the two strains can reassort to form a new type 
of influenza virus. In normal years, influenza is a mild disease in healthy adults, 
although it can be life-threatening in the very young and very old. Different influ- 
enza strains infect fowl such as ducks and chickens, but only a subset of these 
strains can infect humans, and transmission from fowl to humans is rare. In 1918, 
however, a particularly virulent variant of avian influenza crossed the species 
barrier to infect humans, triggering the catastrophic pandemic of 1918 called 
the Spanish flu, which killed 20-50 million people worldwide. Subsequent influ- 
enza pandemics have been triggered by genome reassortment, in which a new 
RNA segment from an avian form of the virus replaced one or more of the viral 
RNA segments from the human form (Figure 23-32). In 2009, a new H1N1 swine 
virus emerged that derived genes from pig, avian, and human influenza viruses. 
Such recombination events allowed the new virus to replicate rapidly and spread 
through an immunologically naive human population. Generally, within two or 
three years, the human population develops immunity to a new recombinant 
strain of virus, and the infection rate drops to a steady-state level. Because the 
recombination events are unpredictable, it is not possible to know when the next 
influenza pandemic will occur or how severe it might be. 
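
The number of viruses that reassortment can generate follows directly from the segmented genome. The Python sketch below enumerates every way that eight RNA segments can be drawn from two co-infecting parental strains; the strain labels are placeholders, and the sketch assumes that every combination of segments is equally possible, which need not hold for real viruses.

```python
from itertools import product


def reassortants(n_segments=8, parents=("human", "avian")):
    """List every assignment of a parental origin to each genome segment."""
    return list(product(parents, repeat=n_segments))


def novel_reassortants(genotypes):
    """Keep only genotypes that mix segments from more than one parent."""
    return [g for g in genotypes if len(set(g)) > 1]


if __name__ == "__main__":
    all_genotypes = reassortants()
    print(len(all_genotypes))                      # 2**8 combinations
    print(len(novel_reassortants(all_genotypes)))  # excludes the two parental types
```

Two parental strains can therefore give rise to 256 segment combinations, of which 254 differ from both parents.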


Drug-Resistant Pathogens Are a Growing Problem 


The development of drugs that cure rather than prevent infections has had a 
major impact on human health. Antibiotics, which are either bactericidal (they 
kill bacteria) or bacteriostatic (they inhibit bacterial growth without killing), are 
the most successful class of such drugs. Penicillin was one of the first antibiotics 
Figure 23-31 Diversification of
HIV-1, HIV-2, and related strains
of SIV. HIV comprises different viral
families, all descended from SIV (simian 
immunodeficiency virus). On three separate 
occasions, SIV was passed from a 
chimpanzee to a human, resulting in three 
HIV-1 groups: major (M), outlier (O), and 
non-M non-O (N). The HIV-1 M group is the 
most common and is primarily responsible 
for the global AIDS epidemic. On two 
separate occasions, SIV was passed from 
a sooty mangabey monkey to a human, 
resulting in the two HIV-2 groups. In 2009, 
a new strain of HIV was discovered that 
appears to have resulted from SIV passage 
from a gorilla to a human. 
used to treat infections in humans, just in time to prevent tens of thousands of
deaths from infected battlefield wounds in World War II. Because bacteria (see 
Figure 1-17) are not closely related evolutionarily to the eukaryotes they infect, 
much of their basic machinery for DNA replication and transcription, RNA trans- 
lation, and metabolism differs from that of their host. These differences enable us 
to develop antibacterial drugs that exhibit selective toxicity, in that they specifi- 
cally inhibit these processes in bacteria without disrupting them in the host. Most 
of the antibiotics that we use to treat bacterial infections are small molecules that 
inhibit macromolecular synthesis in bacteria by targeting bacterial enzymes that 
either are distinct from their eukaryotic counterparts or are involved in pathways 
such as cell wall biosynthesis that are absent in animals (Figure 23-33 and see 
Table 6-4). 

However, bacteria continuously evolve and strains resistant to antibiotics rap- 
idly develop, often within a few years of the introduction of a new drug. Similar 
drug resistance also arises rapidly when treating viral infections with antiviral 
drugs. The virus population in an HIV-infected person treated with the reverse 
transcriptase inhibitor AZT, for example, will acquire complete resistance to the 
drug within a few months. The current protocol for treatment of HIV infections 
involves the simultaneous use of three drugs, which helps to minimize the acqui- 
sition of resistance for any one of them. 
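
The logic of combination therapy can be made concrete with a simple probability calculation. The Python sketch below assumes that resistance to each drug requires a separate mutation and that these mutations arise independently; the per-genome probabilities and the population size are hypothetical values chosen only to show the scale of the effect.

```python
def p_resistant_to_all(per_drug_probabilities):
    """Probability that one genome is resistant to every drug at once,
    assuming resistance to each drug arises independently."""
    p = 1.0
    for q in per_drug_probabilities:
        p *= q
    return p


def expected_resistant_genomes(population_size, per_drug_probabilities):
    """Expected number of fully resistant genomes in the viral population."""
    return population_size * p_resistant_to_all(per_drug_probabilities)


if __name__ == "__main__":
    population = 1e9   # hypothetical number of viral genomes in a patient
    p = 1e-4           # hypothetical chance of resistance to any one drug
    print(expected_resistant_genomes(population, [p]))        # one drug
    print(expected_resistant_genomes(population, [p, p, p]))  # three drugs
```

With these illustrative numbers, many genomes already resist a single drug, whereas a genome resistant to all three drugs at once is expected far less than once in the whole population.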

There are three general strategies by which a pathogen can develop drug resis- 
tance: (1) it can alter the molecular target of the drug so that it is no longer sensi- 
tive to the drug; (2) it can produce an enzyme that modifies or destroys the drug; 
or (3) it can prevent the drug’s access to the drug target by, for example, actively 
pumping the drug out of the pathogen (Figure 23-34). 

Once a pathogen has chanced upon an effective drug-resistance strategy, the 
newly acquired or mutated genes that confer the resistance are frequently spread 
throughout the pathogen population by horizontal gene transfer. They may even 
spread between pathogens of different species. The highly effective but expensive 
antibiotic vancomycin, for example, is used as a treatment of last resort for many 
severe, hospital-acquired, Gram-positive bacterial infections that are resistant 
to most other known antibiotics. Vancomycin prevents one step in bacterial cell 
wall synthesis—the cross-linking of peptidoglycan chains in the bacterial cell wall 
(see Figure 23-3B). Resistance can arise if the bacterium synthesizes a cell wall 
using different subunits that do not bind vancomycin. The most effective form of 
vancomycin resistance depends on the acquisition of a transposon (see Figure 
5-60) containing seven genes, the products of which work together to sense the 
Figure 23-32 Model for the evolution
of pandemic strains of influenza virus
by recombination. Influenza A virus is
a natural pathogen of birds, particularly 
waterfowl, and it is always present in wild 
bird populations. In 1918, a particularly 
virulent form of the virus crossed the 
species barrier from birds to humans and 
caused a devastating worldwide epidemic. 
This strain was designated H1N1, referring 
to the specific forms of its main antigens, 
hemagglutinin (H) and neuraminidase (N). 
Changes in the virus, rendering it less 
virulent, and the rise of adaptive immunity 
in the human population, prevented the 
pandemic from continuing in subsequent 
seasons, although H1N1 influenza strains 
continued to cause serious disease every 
year in very young and very old people. In 
1957, a new pandemic arose when three
genes were replaced by equivalent genes 
from an avian virus (green bars); the new 
strain (designated H2N2) was not effectively 
cleared by antibodies in people who had 
previously contracted only H1N1 forms of 
influenza. In 1968, another pandemic was 
triggered when two genes were replaced 
from another avian virus; the new virus 
was designated H3N2. In 1977, there was 
a resurgence of H1N1 influenza, which 
had previously been almost completely 
replaced by the N2 strains. Molecular 
sequence information suggests that this 
minor pandemic may have been caused by 
an accidental release of an influenza strain 
that had been held in a laboratory since 
about 1950. In 2009, a new H1N1 swine
virus emerged that had derived five genes 
from pig influenza viruses, two from avian 
influenza viruses, and one from a human 
influenza virus. As indicated, most human 
influenza today is caused by H1N1 and 
H3N2 strains.
presence of vancomycin, shut down the normal pathway for bacterial cell wall
synthesis, and produce a different type of cell wall. 

Drug-resistance genes acquired by horizontal transfer frequently come from 
environmental microbial reservoirs. Nearly all antibiotics used to treat bacterial 
infections today are based on natural products produced by fungi or bacteria. 
Penicillin, for example, is made by the mold Penicillium, and more than 50% of 
the antibiotics currently used in the clinic are made by Gram-positive bacteria of 
the genus Streptomyces, which reside in the soil. It is believed that microorgan- 
isms produce antimicrobial compounds, many of which have probably existed 
on Earth for hundreds of millions of years, as weapons in their competition with 
other microorganisms in the environment. Surveys of bacteria taken from soil 
samples that have never been exposed to antibiotic drugs used in modern medi- 
cine reveal that the bacteria are typically already resistant to about seven or eight 
of the antibiotics widely used in clinical practice. When pathogenic microorgan- 
isms are faced with the selective pressure provided by antibiotic treatments, they 
can apparently draw upon this immense source of genetic material to acquire 
resistance. 

Like most other aspects of infectious disease, human behavior has exacerbated 
the problem of drug resistance. Many patients take antibiotics for symptoms that 
are typically caused by viruses (flu-like illnesses, colds, sore throats, and earaches) 
and these drugs have no effects. Persistent and chronic misuse of antibiotics can 
eventually result in antibiotic-resistant normal flora, which can then transfer the 
resistance to pathogens. Antibiotics are also misused in agriculture, where they 
are commonly employed as food additives to promote the growth and health of 
farm animals. An antibiotic closely related to vancomycin was commonly added 
to cattle feed in Europe; the resulting resistance in the normal flora of these ani- 
mals is widely believed to be one of the original sources for vancomycin-resistant 
bacteria that now threaten the lives of hospitalized patients. 
Figure 23-33 Antibiotic targets. Although
there are many antibiotics in clinical use,
they have a narrow range of targets,
which are highlighted in yellow. A few
representative antibiotics in each class are 
listed. Nearly all antibiotics used to treat 
human infections fall into one of these 
categories. The vast majority inhibit either 
bacterial protein synthesis or bacterial cell 
wall synthesis. 


Figure 23-34 Three general mechanisms 
of antibiotic resistance. (A) A nonresistant 
wild-type bacterial cell bathed in a drug 
(red triangles) that binds to and inhibits an 
essential enzyme (light green) will be killed 
due to enzyme inhibition. (B) A bacterium 
that has altered the drug’s target enzyme 
so that the drug no longer binds to the 
enzyme will survive and proliferate. In many
cases, a single point mutation in the gene 
encoding the target protein can generate 
resistance. (C) A bacterium that expresses 
an enzyme (dark green) that either 
degrades or covalently modifies the drug 
will survive and proliferate. Some resistant
bacteria, for example, make β-lactamase
enzymes, which cleave penicillin and similar 
molecules. (D) A bacterium that expresses 
or up-regulates an efflux pump that ejects 
the drug from the bacterial cytoplasm 
(using energy derived from either ATP 
hydrolysis or the electrochemical gradient 
across the bacterial plasma membrane) will 
survive and proliferate. Some efflux pumps, 
such as the TetR efflux pump, are specific 
for a single drug (in this case, tetracycline), 
whereas others, called multidrug resistance 
(MDR) efflux pumps, are capable of 
exporting a wide variety of structurally 
dissimilar drugs. Upregulation of an MDR 
pump can render a bacterium resistant to a 
very large number of different antibiotics in 
a single step. 
Because the acquisition of drug resistance is almost inevitable, it is crucial that
we continue to develop innovative treatments for infectious diseases. We must 
also take additional measures to delay the onset of drug resistance. 


Summary 


All pathogens share the ability to interact with host cells in diverse ways that pro- 
mote the replication and spread of the pathogen. Pathogens often colonize the host 
by adhering to or invading the epithelial surfaces that line the respiratory, gastroin- 
testinal, and urinary tracts, as well as the other body surfaces in direct contact with 
the environment. Intracellular pathogens, including all viruses and many bacteria 
and protozoa, invade host cells by one of several mechanisms. Viruses rely largely on 
receptor-mediated endocytosis, whereas bacteria exploit cell adhesion and phago- 
cytic pathways; in both cases, the host cell provides the machinery and energy for 
the invasion. Protozoa, by contrast, employ unique invasion strategies that usually 
require significant metabolic expense on the part of the invader. Once inside, intra- 
cellular pathogens seek out a cell compartment that is favorable for their survival 
and replication, frequently altering host membrane traffic and exploiting the host- 
cell cytoskeleton for intracellular movement. Pathogens evolve rapidly, so that new 
infectious diseases frequently emerge, and old pathogens acquire new ways to evade
our attempts at treatment, prevention, and eradication.


WHAT WE DON'T KNOW


• What are the genetic and molecular
features that differ between pathogens
and members of the normal human
microbiota? How can our immune
system distinguish between the two?


• To what extent are common host-cell
biological pathways and molecules
hijacked by diverse microbes?


• Can host-cell defense molecules be
mobilized by drugs to fight infection?


PROBLEMS 


Which statements are true? Explain why or why not. 


23-1 Our adult bodies harbor about 10 times more 
microbial cells than human cells. 


23-2 The microbiomes from healthy humans are all 
very similar. 


23-3 Pathogens must enter host cells to cause disease. 


23-4 Viruses replicate their genomes in the nucleus of 
the host cell. 


23-5 You should not take antibiotics for diseases caused 
by viruses. 


Discuss the following problems. 


23-6 In order to survive and multiply, a successful 
pathogen must accomplish five tasks. Name them. 


23-7 Clostridium difficile infection is the leading cause 
of hospital-associated gastrointestinal illness. It is typically 
treated with a course of antibiotics, but the infection recurs 
in about 20% of cases. C. difficile infections are difficult to 
eradicate because the bacteria exist in two forms: a repli- 
cating, toxin-producing form and a spore form that is resis- 
tant to antibiotics. Fecal microbiota transplantation—the 
transfer of normal gut microbiota from a healthy individ- 
ual—can resolve >90% of recurrent infections, a much bet- 
ter cure rate than further antibiotic treatment alone. Why 
do you suppose microbiota transplantation is so effective? 


23-8 What are the three general mechanisms for hori- 
zontal gene transfer? 


23-9 The Gram-negative bacterium Yersinia pestis, the 
causative agent of the plague, is extremely virulent. Upon 
infection, Y. pestis injects a set of effector proteins into 
macrophages that suppresses their phagocytic behavior 
and also interferes with their innate immune responses. 
One of the effector proteins, YopJ, acetylates serines and 
threonines on various MAP kinases, including the MAP 
kinase kinase kinase TAK1, which controls a key signaling 
step in the innate immune response pathway. To determine 
how YopJ interferes with TAK1, you transfect human 
cells with active YopJ (YopJ^WT) or inactive YopJ (YopJ^CA) 
and with FLAG-tagged active TAK1 (TAK1^WT) or inactive 
TAK1 (TAK1^K63W), and assay for total TAK1 and for 
phosphorylated TAK1, using antibodies against the FLAG tag or 
against phosphorylated TAK1 (Figure Q23-1). How does 
YopJ block the TAK1 signaling pathway? How do you suppose 
the serine/threonine acetylase activity of YopJ might 
interfere with TAK1 activation? 


[Figure Q23-1 image: immunoblot. Lane labels recovered: TAK1 WT, WT, WT; YopJ −, CA, WT. IB: α-FLAG-TAK1. Molecular mass marker: 76.] 


Figure Q23-1 Effects of YopJ on TAK1 phosphorylation (Problem 
23-9). TAK1 was immunoprecipitated (IP) using antibodies against 
the FLAG tag (α-FLAG-TAK1). Total TAK1 in the immunoprecipitation 
was assayed by immunoblot (IB) using the same antibody. 
Phosphorylated TAK1 was assayed by IB using antibodies specific 
for phospho-TAK1 (α-pTAK1). A scale of protein molecular mass is 
shown at right in kilodaltons. (From N. Paquette et al., Proc. Natl 
Acad. Sci. USA 109:12710-12715, 2012. With permission from 
National Academy of Sciences.) 


23-10 The intracellular bacterial pathogen Salmonella 
typhimurium, which causes gastroenteritis, injects effector 
proteins to promote its invasion into nonphagocytic host 
cells by the trigger mechanism. S. typhimurium first stim- 
ulates membrane ruffling to promote invasion, and then 
suppresses membrane ruffling once invasion is complete. 
This behavior is mediated in part by injection of two effec- 
tor proteins: SopE, which promotes membrane ruffling 
and invasion, and SptP, which blocks the effects of SopE. 
Both effector proteins target the monomeric GTPase, Rac, 
which in its active form promotes membrane ruffling. How 
do you suppose SopE and SptP affect Rac activity? How do 
you suppose the effects of SopE and SptP are staggered in 
time if they are injected simultaneously? 


23-11 John Snow is widely regarded as the father of mod- 
ern epidemiology. Most famously, he investigated an out- 
break of cholera in London in 1854 that killed more than 
600 victims before it was finished. Snow recorded where 
the victims lived, and plotted the data on a map, along with 
the locations of the water pumps that served as the source 
of water for the public (Figure Q23-2). He concluded that 
the disease was most likely spread in the water, although 
he could find nothing suspicious-looking in it. His conclu- 
sion ran counter to the then-current belief that cholera was 
from “miasmas” in bad air. Very few believed his theory 
during the next 50 years, with the “bad air” theory persist- 
ing until at least 1901. What do you suppose Snow saw in 
the data that led him to his conclusion? Why do you think 
most scientists remained skeptical for so long? 


[Map labels: Oxford Circus; London Palladium; Piccadilly Circus.] 


Figure Q23-2 A map of where the victims of the 1854 cholera 
outbreak lived, superimposed on a modern Google map of the area 
(Problem 23-11). The locations of the victims’ houses are indicated 
in a gradient of colors from blue (indicating few cases) to orange 
(indicating many cases). Public water pumps are shown as red 
squares. 


23-12 Influenza epidemics account for 250,000 to 
500,000 deaths globally each year. These epidemics are 
markedly seasonal, occurring in temperate climates in the 
northern and southern hemispheres during their respective 
winters. By contrast, in the tropics, there is significant 
influenza activity year round, with a peak in the rainy season 
(Figure Q23-3). Can you suggest some possible explanations 
for the patterns of influenza epidemics in temperate 
zones and the tropics? 


Figure Q23-3 Seasonal patterns of influenza epidemics (Problem 
23-12). Cases of influenza at different times of the year are shown for 
the northern hemisphere (blue), the southern hemisphere (orange), 
and the tropics (red). 


23-13 Several negative-strand viruses carry their genome 
as a set of discrete RNA segments. Examples include influ- 
enza virus (eight segments), Rift Valley fever virus (three 
segments), Hantavirus (three segments), and Lassa virus 
(two segments), to name a few. Why does segmentation of 
the genome provide a strong evolutionary advantage for 
these viruses? 


23-14 Avian influenza viruses readily infect birds, but 
are transmitted to humans very rarely. Similarly, human 
influenza viruses spread readily to other humans, but 
have never been detected in birds. The key to this speci- 
ficity lies in the viral capsid protein, hemagglutinin, which 
binds to sialic acid residues on cell-surface glycoproteins, 
triggering virus entry into the cell (Movie 23.8). Hemag- 
glutinin on human viruses recognizes sialic acid in a 2-6 
linkage with galactose, whereas avian hemagglutinin rec- 
ognizes sialic acid in a 2-3 linkage with galactose. Humans 
make carbohydrate chains that have only the 2-6 linkage 
between sialic acid and galactose; birds make only the 
2-3 linkage; but pigs make carbohydrate chains with both 
linkages. How does this situation make pigs ideal hosts for 
generating new strains of human influenza viruses? 


23-15 The majority of antibiotics used in the clinic are 
made as natural products by bacteria. Why do you suppose 
bacteria make the very agents we use to kill them? 


23-16 In the early days of penicillin research, it was dis- 
covered that bacteria in the air could destroy the penicillin, 
a big problem for large-scale production of the drug. How 
do you suppose this occurs? 


23-17 When the Oxford team of Ernst Chain and Nor- 
man Heatley had laboriously collected their first two grams 
of penicillin (probably no more than 2% pure!), Chain 
injected two normal mice with 1 g each of this preparation, 
and waited to see what would happen. The mice survived 
with no apparent ill effects. Their boss, Howard Florey, was 
furious at what he saw as a waste of good antibiotic. Why 
was this experiment important? 


The Innate and Adaptive 
Immune Systems 


As we discussed in Chapter 23, all living organisms serve as hosts for other spe- 
cies, usually in relationships that are benign or even mutually helpful. But all 
organisms, and all cells in a multicellular organism, need to defend themselves 
against infection by harmful invaders, collectively called pathogens, which can 
be microbes (bacteria, viruses, or fungi), or larger parasites. Even bacteria defend 
themselves against viruses, using intracellular proteins called restriction factors, 
which block viral propagation. Invertebrates use a variety of defense strategies, 
including protective barriers, toxic molecules, restriction factors, and phagocytic 
cells that ingest and destroy invading pathogens. Vertebrates, too, depend on such 
innate immune responses, but they can also harness more sophisticated and spe- 
cific mechanisms, called adaptive immune responses. The innate responses occur 
first, calling the adaptive immune responses into play if required, in which case, 
both types of responses work together to eliminate the pathogen (Figure 24-1). 

Whereas innate immune responses are general defense reactions that can 
involve almost any cell type in an organism, the adaptive immune responses are 
highly specific to the particular pathogen that induced them and depend on a class 
of white blood cells (leukocytes) called lymphocytes. There are two major classes 
of lymphocytes that mount adaptive immune responses—B lymphocytes (B cells), 
which secrete antibodies that bind specifically to the pathogen, and T lymphocytes 
(T cells), which can either directly kill cells infected with the pathogen or produce 
secreted or cell-surface signal proteins that stimulate other host cells to help elim- 
inate the pathogen (Figure 24-2). Unlike innate immune responses, which are 
generally short-lasting, the adaptive responses provide long-lasting protection: 
a person who recovers from measles or is vaccinated against it, for example, is 
protected for life against measles by the adaptive immune system, although not 
against other common viruses, such as those that cause mumps or chickenpox. 

Both the innate and adaptive immune systems have evolved sensing mech- 
anisms that enable them to recognize harmful invaders (pathogens) and distin- 
guish them from both the host’s own cells and molecules and harmless or bene- 
ficial foreign organisms and their molecules. The innate system relies on sensor 
proteins that recognize particular types or patterns of molecules that are common 
to pathogens but are absent or sequestered in the host. The adaptive system, by 
contrast, uses unique genetic mechanisms to produce a virtually limitless diver- 
sity of related proteins—receptors on T and B cells and secreted antibodies—that, 
between them, can bind almost any foreign molecule. This remarkable strategy 
enables the adaptive immune system to react specifically against any pathogen, 
even if the animal has never encountered it before. But it also requires that the system 
learn not to react against self molecules or harmless foreign ones; if these learning 
mechanisms fail, harmful autoimmune or allergic responses result. 

In this chapter, we focus on vertebrate immune responses and the features 
that distinguish them from other kinds of cell responses. We begin with innate 
immune defenses and then discuss the highly specialized properties of the adap- 
tive immune system. 


Figure 24-1 Innate and adaptive immune 
responses. Innate immune responses are 
activated directly by pathogens and defend 
all multicellular organisms against infection. 
In vertebrates, pathogens, together 
with the innate immune responses they 
activate, also stimulate adaptive immune 
responses, which then work together with 
innate immune responses to help fight the 
infection. 


THE INNATE IMMUNE SYSTEM 


Adaptive immune responses are slow to develop when a vertebrate first encounters 
a new pathogen. This is because the specific B cells and T cells that can respond 
to a particular pathogen are initially few in number and must be stimulated to 
proliferate and differentiate before they can mount effective adaptive immune 
responses, which can take days. By contrast, a single bacterium that divides every 
hour can generate almost twenty million progeny in a single day, producing a full- 
blown infection. Vertebrates, therefore, rely on their innate immune system to 
defend them against infection during the first critical hours and days of exposure 
to a new pathogen. Plants and invertebrates lack adaptive immune systems and 
therefore rely entirely on innate immunity for protection against pathogens. 
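
The doubling arithmetic behind the "almost twenty million" figure can be checked with a short calculation. What follows is only a minimal sketch: it assumes ideal, unchecked doubling once per hour with no cell death or nutrient limits, and the function name `progeny_after` is a hypothetical helper introduced for illustration.

```python
def progeny_after(hours: float, doubling_time_hours: float = 1.0) -> int:
    """Cells descended from one bacterium under ideal, unchecked doubling.

    Assumes no cell death and no nutrient limits (an idealization).
    """
    generations = int(hours // doubling_time_hours)
    return 2 ** generations


one_day = progeny_after(24)
print(f"{one_day:,}")  # 16,777,216
```

Twenty-four doublings give roughly 17 million cells, the same order of magnitude as the figure quoted above, and far more than an adaptive response that takes days to develop could keep pace with.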

In this section, we consider some of the strategies the innate immune system 
uses to recognize pathogens and to provide a first line of defense against them. 


Epithelial Surfaces Serve as Barriers to Infection 


In vertebrates, the first encounters with infectious organisms are typically at the 
epithelial surfaces that form the skin and line the respiratory, digestive, urinary, 
and reproductive tracts. These epithelia provide both physical and chemical bar- 
riers to invasion by pathogens: tight junctions between epithelial cells bar entry 
between the cells, and a variety of substances secreted by the cells discourage the 
attachment and entry of pathogens. The keratinized epithelial cells of the skin, 
for example, form a thick physical barrier, and the sebaceous glands in the skin 
secrete fatty acids and lactic acid, which inhibit bacterial growth. In addition, epi- 
thelial cells in all tissues, including those in plants and invertebrates, secrete anti- 
microbial molecules called defensins. Defensins are positively charged, amphi- 
pathic peptides that bind to and disrupt the membranes of many pathogens, 
including enveloped viruses, bacteria, fungi, and parasites. 

The epithelial cells that line internal organs such as the respiratory and diges- 
tive tracts also secrete slimy mucus, which sticks to the epithelial surface and 
makes it difficult for pathogens to adhere. The beating of cilia on the surface of the 
epithelial cells lining the respiratory tract and the peristaltic action of the intestine 
also discourage the adherence of pathogens. Moreover, as we discuss in Chapter 
23, healthy skin and gut are populated by enormous numbers of harmless (and 
often helpful) commensal microbes, collectively called the normal flora, which 
compete for nutrients with pathogens; some also produce antimicrobial peptides 
that actively inhibit pathogen proliferation. 


Pattern Recognition Receptors (PRRs) Recognize Conserved 
Features of Pathogens 


Pathogens do occasionally breach the epithelial barricades, in which case under- 
lying, nonepithelial cells of the innate immune system provide the next line of 
defense. These cells sense the presence of pathogens largely through the use of 
receptor proteins that recognize microbe-associated molecules that either are not 
present or are sequestered in the host organism. Because these microbial mol- 
ecules often occur in repeating patterns, they are called pathogen-associated 
molecular patterns (PAMPs), even though they are not unique to microbes that 
can cause disease. PAMPs are present in various microbial molecules, including 
nucleic acids, lipids, polysaccharides, and proteins. 

The special receptor proteins that recognize PAMPs are called pattern recog- 
nition receptors (PRRs). Some PRRs are transmembrane proteins on the surface 
of many types of host cells, where they recognize extracellular pathogens; on 
professional phagocytic cells (phagocytes) such as macrophages and neutrophils 
(discussed in Chapter 22), they can mediate the uptake of the pathogens into pha- 
gosomes, which then fuse with lysosomes, where the pathogens are destroyed. 
Other PRRs are located intracellularly, where they can detect intracellular pathogens 
such as viruses; these PRRs are either free in the cytosol or associated with 
the membranes of the endolysosomal system (discussed in Chapter 13). Still other 
PRRs are secreted and bind to the surface of extracellular pathogens, marking 
them for destruction by either phagocytes or blood proteins that are part of the 
complement system (discussed later). 


Figure 24-2 The two main classes 
of adaptive immune responses. 
Lymphocytes carry out both classes of 
adaptive responses, shown here as 
responses to a viral infection. In one class, 
B cells secrete antibodies that specifically 
bind to and neutralize extracellular viruses, 
by preventing the viruses from infecting 
host cells. In the other, T cells mediate 
the response; in this example, they kill the 
virus-infected host cells. In both cases, 
innate immune responses help activate 
the adaptive immune responses through 
pathways that are not shown. 


There Are Multiple Classes of PRRs 


The first PRR identified was the Toll receptor in Drosophila, which was well-known 
for its role in fly development (see Figure 21-17). It was later discovered to be also 
required for the production of antimicrobial peptides that protect the fly against 
fungal infections (Figure 24-3). Toll is a transmembrane glycoprotein with a 
large extracellular domain that contains a series of leucine-rich repeats. Soon it 
was discovered that both plants and animals have a variety of Toll-like recep- 
tors (TLRs) that function as PRRs in innate immune responses against various 
pathogens. Mammals make at least 10 different TLRs, each recognizing distinct 
ligands: TLR3, for example, recognizes double-stranded viral RNA in the endosomal 
lumen (Figure 24-4); TLR4 recognizes lipopolysaccharide (LPS) on the outer 
membrane of Gram-negative bacteria; TLR5 recognizes the protein that forms 
the bacterial flagellum; and TLR9 recognizes short, unmethylated sequences of 
bacterial, viral, or protozoan DNA, called CpG motifs, which are uncommon in 
vertebrate DNA. 


Figure 24-3 A scanning electron 
micrograph of a mutant fruit fly that died 
from a fungal infection. The fly is covered 
with fungal hyphae, as it lacked a Toll 
receptor, which helps protect Drosophila 
from fungal infections. (From B. Lemaitre 
et al., Cell 86:973-983, 1996.) 


Figure 24-4 A Toll-like receptor. (A) The structure 
of human TLR3 is shown (green), with a double-stranded 
RNA molecule (dsRNA, blue) bound to 
it. The receptor is a homodimer in the membrane 
of endosomes. The binding of dsRNA to the 
two horseshoe-shaped domains on the lumenal 
side of the endosome brings the two cytosolic 
domains together, allowing adaptor proteins in the 
cytosol to assemble into a large signaling complex 
(not shown). (B) The crystal structure of a lumenal 
domain of the receptor, which contains 23 
conventional leucine-rich repeats, each of which 
contributes a β strand to the continuous 
β sheet (red) that lines the concave surface of the 
structure. (A, adapted from L. Liu et al., Science 
320:379-381, 2008. With permission from AAAS; 
B, adapted from J. Choe, M.S. Kelker and I.A. 
Wilson, Science 309:581-585, 2006; PDB: 1ZIW.) 
[Panel (A) labels: ligand-binding domains; TLR; bound double-stranded RNA; endosome; membrane; cytosolic domains.] 

In addition to TLRs, vertebrates use several other families of PRRs to detect 
pathogens. One is the large family of NOD-like receptors (NLRs). Like TLRs, 
NLRs have leucine-rich repeat motifs, but they are exclusively cytoplasmic and 
recognize a distinct set of bacterial molecules. Individuals who are homozygous 
for a particular mutant allele of the NLR gene NOD2 have a greatly increased 
risk of developing Crohn’s disease, a chronic inflammatory disease of the small 
intestine, possibly triggered by a bacterial infection. Another class of PRRs con- 
sists of RIG-like receptors (RLRs), which are members of the RNA helicase family 
of proteins. They are also exclusively cytoplasmic and detect viral pathogens. A 
fourth class of PRRs consists of C-type lectin receptors (CLRs), which are trans- 
membrane cell-surface proteins that recognize carbohydrates (which is why they 
are called lectins) on various microbes. Table 24-1 summarizes some PRRs and 
their ligands and locations in cells. Collectively, these and other PRRs act as an 
alarm system to alert the innate and adaptive immune systems that an infection 
is brewing (Movie 24.1). 
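
The PRR families and example ligands described above can be collected into a small lookup structure. This is only an illustrative sketch: the class `PRRFamily`, the dictionaries, and the helper `families_located` are hypothetical names, and the entries simply paraphrase the text rather than forming a complete catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PRRFamily:
    """One family of pattern recognition receptors, as described in the text."""
    full_name: str
    location: str
    recognizes: str


# Entries paraphrase the chapter text; this is not a complete catalog of PRRs.
PRR_FAMILIES = {
    "TLR": PRRFamily("Toll-like receptors", "transmembrane",
                     "various ligands (see TLR_LIGANDS)"),
    "NLR": PRRFamily("NOD-like receptors", "cytoplasmic",
                     "a distinct set of bacterial molecules"),
    "RLR": PRRFamily("RIG-like receptors", "cytoplasmic",
                     "viral pathogens"),
    "CLR": PRRFamily("C-type lectin receptors", "transmembrane",
                     "carbohydrates on various microbes"),
}

# Ligands for the individual mammalian TLRs named in the text.
TLR_LIGANDS = {
    "TLR3": "double-stranded viral RNA",
    "TLR4": "lipopolysaccharide (LPS) of Gram-negative bacteria",
    "TLR5": "the protein that forms the bacterial flagellum",
    "TLR9": "unmethylated CpG motifs in bacterial, viral, or protozoan DNA",
}


def families_located(location: str) -> list[str]:
    """Return the abbreviations of all families with the given location."""
    return sorted(abbr for abbr, fam in PRR_FAMILIES.items()
                  if fam.location == location)


print(families_located("cytoplasmic"))    # ['NLR', 'RLR']
print(families_located("transmembrane"))  # ['CLR', 'TLR']
```

Grouping by location makes the division of labor visible: the cytoplasmic families watch for pathogens that are already inside the cell, whereas the transmembrane families sample what lies outside the cell or within endosomes.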

When a cell-surface or intracellular PRR binds a PAMP, it stimulates the cell 
to secrete a variety of cytokines and other extracellular signal molecules. Some of 
these inhibit viral replication, but most induce a local inflammatory response that 
helps eliminate the pathogen, as we now discuss. 


Activated PRRs Trigger an Inflammatory Response at Sites of 
Infection 


When a pathogen invades a tissue, it activates PRRs on or in various cells of the 
innate immune system, resulting in an inflammatory response at the site of infec- 
tion. The inflammatory response depends on changes in local blood vessels and is 
characterized clinically by local pain, redness, heat, and swelling. The blood ves- 
sels dilate and become permeable to fluid and proteins, leading to local swelling 
and an accumulation of blood proteins that aid in defense. At the same time, the 
endothelial cells lining the local blood vessels are stimulated to express cell adhe- 
sion proteins, which promote the attachment and escape of white blood cells or 
leukocytes (see Figure 19-29B), adding to the local swelling; initially neutrophils 
escape, followed later by lymphocytes and monocytes (the blood-borne precur- 
sors of macrophages). 


TABLE 24-1 


Degradation products of peptidoglycans 


Retinoic acid-inducible gene I-like receptors (RLRs) 


C-type lectin receptors (CLRs) 


The activation of PRRs results in the production of a large variety of extracellular 
signal molecules that mediate the inflammatory response at the site of an 
infection. These include both lipid signal molecules, such as prostaglandins, and 
protein (or peptide) signal molecules called cytokines. Some of the most important 
pro-inflammatory cytokines are tumor necrosis factor-α (TNFα), interferon-γ 
(IFNγ), a variety of chemokines (which recruit leukocytes), and various interleukins 
(ILs) that we discuss later, including IL1, IL6, IL12, and IL17. In addition, a 
secreted PRR (mannose-binding lectin) activates the complement system when 
the PRR binds to a pathogen; fragments of complement proteins released during 
complement activation stimulate an inflammatory response (discussed shortly; 
see Figure 24-7). 

When activated by PAMPs, most cell-surface and intracellular PRRs stimulate 
the production of multiple pro-inflammatory cytokines by activating intracellular 
signaling pathways that switch on transcription regulators, including NFκB, 
to induce the transcription of the relevant cytokine genes (see Figure 15-62). 
Some PRRs, however, can also stimulate pro-inflammatory cytokine production 
by a different mechanism: when activated, several cytoplasmic NLRs assemble 
with adaptor proteins and specific protease precursors of the caspase family (discussed 
in Chapter 18) to form inflammasomes, in which pro-inflammatory cytokines 
such as IL1 are cleaved from their inactive precursor proteins by activated 
caspases. These cytokines are then released from the cell by a poorly understood, 
unconventional secretion pathway. Inflammasomes closely resemble apopto- 
somes in their assembly and structure, but, in apoptosomes, procaspases are 
activated to initiate a proteolytic caspase cascade that leads to apoptosis (see Fig- 
ure 18-7). 

NLR-dependent inflammasome assembly can also be triggered in the absence 
of infection if cells are damaged or stressed. Such cells produce “danger signals,” 
such as altered or misplaced self molecules, which can activate the relevant NLRs: 
the arthritis caused by uric acid crystals formed in the joints of individuals with 
gout, who have abnormally high uric acid levels in their blood, is a painful example. 


Phagocytic Cells Seek, Engulf, and Destroy Pathogens 


In all animals, the recognition of a microbial invader is usually quickly followed by 
its engulfment by a phagocytic cell. Macrophages are long-lived phagocytes that 
reside in most vertebrate tissues; they are among the first cells to encounter invad- 
ing microbes, whose PAMPs activate the macrophages to secrete pro-inflamma- 
tory signal molecules. Neutrophils are short-lived phagocytes that are abundant 
in blood but are not present in healthy tissues; they are rapidly recruited to sites of 
infection by various attractive molecules, including formylmethionine-contain- 
ing peptides (which are released by microbes but are not made by mammalian 
cells), chemokines secreted by activated macrophages, and peptide fragments 
produced from cleaved, activated complement proteins. The recruited neutro- 
phils contribute their own pro-inflammatory cytokines. 

In addition to their PRRs, macrophages and neutrophils display a variety of 
cell-surface receptors that recognize fragments of complement proteins or anti- 
bodies bound to the surface of a pathogen. The binding of such a pathogen to these 
receptors leads to its phagocytosis (Figure 24-5) and an attack on the ingested 
pathogen once inside a phagolysosome. The phagocytes possess an impres- 
sive armory of weapons to kill the invader, including enzymes such as lysozyme 
and acid hydrolases that can degrade the pathogen’s cell wall. The cells assem- 
ble NADPH oxidase complexes on the phagolysosomal membrane, where the 
complexes catalyze the production of highly toxic oxygen-derived compounds, 
including superoxide (O₂⁻), hydrogen peroxide, and hydroxyl radicals. A transient 
increase in oxygen consumption by the phagocytic cells, called the respiratory 
burst, accompanies the production of these toxic compounds. Whereas macro- 
phages generally survive this killing frenzy and live to kill again, neutrophils do 
not. Dead and dying neutrophils are a major component of the pus that forms 
in acute bacterially infected wounds; their half-life in the human bloodstream is 
only a few hours. 


Figure 24-5 Antibody-activated 
phagocytosis. Electron micrograph of 
a neutrophil phagocytosing an antibody-coated 
bacterium, which is in the process 
of dividing. The process in which antibody 
(or complement) coating of a pathogen 
increases the efficiency with which the 
pathogen is phagocytosed is called 
opsonization. (Courtesy of Dorothy F. 
Bainton, from R.C. Williams, Jr. and 
H.H. Fudenberg, Phagocytic Mechanisms 
in Health and Disease. New York: 
Intercontinental Medical Book 
Corporation, 1971.) 


If a pathogen is too large to be successfully phagocytosed (if it is a large parasite 
such as a worm, for example), a group of macrophages, neutrophils, or eosinophils 
(another type of leukocyte) will gather around the invader. They secrete 
defensins and other damaging agents and release the toxic products of the respi- 
ratory burst. This barrage is often sufficient to destroy the pathogen (Figure 24-6). 


Complement Activation Targets Pathogens for Phagocytosis or 
Lysis 


The blood and other extracellular fluids contain numerous proteins with antimi- 
crobial activity, some of which are produced in response to an infection, while 
others are produced constitutively. The most important of these are components 
of the complement system, which consists of about thirty interacting soluble 
proteins that are mainly made continuously by the liver and are inactive until an 
infection or another trigger activates them. They were originally identified by their 
ability to amplify and “complement” the action of antibodies made by B cells, but 
some are also secreted PRRs, which directly recognize PAMPs on microbes. 

The early complement components consist of three sets of proteins, belong- 
ing to three distinct pathways of complement activation—the classical pathway, 
the lectin pathway, and the alternative pathway. The early components of all 
three pathways act locally to cleave and activate C3, which is the pivotal comple- 
ment component (Figure 24-7); individuals with a C3 deficiency are subject to 
repeated bacterial infections. The early components are proenzymes, which are 
activated sequentially by proteolytic cleavage. The cleavage of each proenzyme 
in the series activates the next component to generate a serine protease, which 
cleaves the next proenzyme in the series, and so on. Since each activated enzyme 
cleaves many molecules of the next proenzyme in the chain, the activation of the 
early components consists of an amplifying proteolytic cascade. 
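
The amplification described above can be illustrated with a toy calculation. This is a minimal sketch under a strong simplifying assumption: every activated enzyme cleaves the same fixed number of molecules of the next proenzyme. The function name `activated_molecules` and the value of 10 cleavages per enzyme are hypothetical choices for illustration, not measured quantities.

```python
def activated_molecules(initial: int, cleavages_per_enzyme: int,
                        steps: int) -> list[int]:
    """Activated molecules at each step of an idealized proteolytic cascade.

    Assumes each activated enzyme cleaves a fixed number of molecules of
    the next proenzyme in the series (an illustrative simplification).
    """
    counts = [initial]
    for _ in range(steps):
        counts.append(counts[-1] * cleavages_per_enzyme)
    return counts


# One initiating enzyme, 10 cleavages per enzyme (arbitrary), three steps.
print(activated_molecules(1, 10, 3))  # [1, 10, 100, 1000]
```

Under this assumption the output grows geometrically with the number of steps, which is why a handful of initiating events on a pathogen surface can produce a large local response.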

Many of these protein cleavages liberate a biologically active small fragment 
that can attract neutrophils, plus a membrane-binding larger fragment. The bind- 
ing of the large fragment to a cell membrane, usually the surface of a pathogen, 
helps stimulate the next reaction in the sequence. In this way, complement acti- 
vation is largely kept confined to the cell surface where it began. In particular, the 
large fragment of C3, called C3b, binds covalently to the surface of the pathogen. 
Here, it recruits protein fragments produced by cleavage of other early complement 
components to form proteolytic complexes that catalyze the subsequent steps in 
the complement cascade. The early events in complement activation have diverse 
functions: C3b-binding receptors on phagocytic cells enhance the ability of these 
cells to phagocytose the pathogen, and similar receptors on B cells enhance the 
ability of these cells to make antibodies against various microbial molecules on 
C3b-coated pathogens; the smaller fragment of C3 (called C3a), as well as small 
fragments of C4 and C5, act independently as diffusible signals to promote an 
inflammatory response by recruiting leukocytes to the site of infection. 


Figure 24-6 Eosinophils attacking a 
parasite. Phagocytes cannot ingest large 
parasites such as the schistosome larva 
shown here. When the larva is coated with 
antibody or complement components, 
however, eosinophils (and other leukocytes) 
can recognize it and collectively kill it by 
secreting a large variety of toxic molecules. 
(Courtesy of Anthony Butterworth.) 


Figure 24-7 The principal stages in 
complement activation by the classical, 
lectin, and alternative pathways. In 
all three pathways, the reactions of 
complement activation usually take place 
on the surface of an invading microbe, such 
as a bacterium, and lead to the cleavage of 
C3 and the various consequences shown. 
As indicated, the complement proteins 
C1 to C9, mannose-binding lectin (MBL), 
MBL-associated serine protease (MASP), 
and factors B and D are the central 
components of the complement system. 
The early components are shown within 
gray arrows, while the late components are 
shown within a brown arrow. The functions 
of the protein fragments produced during 
complement activation are indicated by 
the black arrows. The various complement 
proteins that regulate the system are 
omitted. 

As indicated in Figure 24-7, antibodies bound to the surface of a pathogen 
activate the classical pathway. Mannose-binding lectin, mentioned earlier, is a 
secreted PRR that initiates the lectin pathway of complement activation when 
it recognizes bacterial or fungal glycolipids and glycoproteins bearing terminal 
mannose and fucose sugars in a particular spatial conformation. These initial 
binding events in the classical and lectin pathways cause the recruitment and 
activation of the early complement components. Finally, molecules on the sur- 
face of pathogens will often directly activate the alternative pathway. 

Host cells produce various plasma membrane molecules that prevent comple- 
ment reactions from proceeding on their cell surface. The most important of these 
is the carbohydrate moiety sialic acid, a common constituent of cell-surface glycoproteins 
and glycolipids (see Figure 10-16). Because pathogens generally lack 
sialic acid, they are singled out for complement-mediated destruction, while host 
cells are spared. Some pathogens, including the bacterium Neisseria gonorrhoeae 
that causes the sexually transmitted disease gonorrhea, coat themselves with a 
layer of sialic acid to effectively hide from the complement system. 

Membrane-immobilized C3b, produced by any of the three pathways, trig- 
gers a further cascade of reactions that leads to the assembly of the late comple- 
ment components to form membrane attack complexes. These protein complexes 
assemble in the pathogen membrane near the site of C3 activation, forming aque- 
ous pores through the membrane (Figure 24-8). For this reason, and because they 
perturb the structure of the lipid bilayer in their vicinity, they make the membrane 
leaky and can, in some cases, cause the microbe to lyse. 

The self-amplifying, inflammatory, and destructive properties of the comple- 
ment cascade make it essential that key activated components be rapidly inacti- 
vated after they are generated, ensuring that the attack does not spread to nearby 
host cells. Inactivation is achieved in at least two ways. First, specific inhibitor pro- 
teins in the blood or on the surface of host cells terminate the cascade, by either 
binding or cleaving certain complement components once the components have 
been activated by proteolytic cleavage. Second, many of the activated compo- 
nents in the cascade are unstable; unless they bind immediately to either the next 
component in the complement cascade or to a nearby membrane, they rapidly 
inactivate. 


Virus-Infected Cells Take Drastic Measures to Prevent Viral 
Replication 

Because host-cell ribosomes make a virus’s proteins and host-cell lipids form the 
membranes of enveloped viruses, PAMPs are generally not present on the surface 


of viruses. Therefore, the only general way that a host cell PRR can recognize the 
presence of a virus is to detect unusual elements of the viral genome, such as the 
Figure 24-8 Assembly of the late 
complement components to form a 
membrane attack complex. The cleavage 
of the early complement components 
(shown within gray arrows in Figure 

24-7) results in the formation of C3b- 
containing proteolytic complexes called 
C5 convertases (not shown). These then 
cleave the first of the late components, C5, 
to produce C5a and C5b. As illustrated, 
C5b rapidly assembles with C6 and C7 

to form C567, which then binds firmly via 
C7 to the membrane. One molecule of 
C8 binds to the complex to form C5678. 
The binding of a molecule of C9 to C5678 
induces a conformational change in C9 
that exposes a hydrophobic region and 
causes C9 to insert into the lipid bilayer of 
the target membrane. This starts a chain 
reaction in which the altered C9 binds a 
second molecule of C9, which can then 
bind another molecule of C9, and so on. 
In this way, a ring of C9 molecules forms 
a large transmembrane channel in the 
membrane. 
double-stranded RNA (dsRNA) that is an intermediate in the life cycle of many 
viruses and is recognized by several PRRs including the Toll-like receptor TLR3; 
in addition, DNA virus genomes frequently contain significant amounts of the 
CpG motifs discussed earlier, which can be recognized by TLR9 (see Table 24-1, 
p. 1300). 

Mammalian cells are particularly adept at recognizing the presence of dsRNA, 
which activates intracellular PRRs that induce the host cell to produce and secrete 
two antiviral cytokines: interferon-α (IFNα) and interferon-β (IFNβ). These 
interferons are referred to as type I interferons to distinguish them from IFNγ, 
which is a type II interferon and has different functions, as we discuss later. Type 
I interferons act in both an autocrine fashion on the infected cells that produced 
them and a paracrine fashion on uninfected neighbors. They bind to a common 
cell-surface receptor, which activates the JAK-STAT intracellular signaling path- 
way (see Figure 15-56) to stimulate specific gene transcription and thereby the 
production of more than 300 proteins, including many cytokines, reflecting the 
complexity of the cell’s acute response to a viral infection. 

The production of type I interferons appears to be a general response of mam- 
malian cells to a viral infection, and viral components other than dsRNA can trig- 
ger it. The type I interferons help block viral replication in multiple ways. They 
activate a latent ribonuclease that nonspecifically degrades single-stranded RNA. 
They also indirectly activate a protein kinase that phosphorylates and inactivates 
the protein synthesis initiation factor eIF2 (discussed in Chapter 6), thereby shut- 
ting down most protein synthesis in the infected host cell. Apparently, by destroy- 
ing most of its own RNA and transiently halting most of its protein synthesis, the 
host cell inhibits viral replication without killing itself. If these measures fail, the 
cell takes an even more extreme step to prevent the virus from replicating: it kills 
itself by undergoing apoptosis, often with the help of immune killer cells, as we 
discuss next. 


Natural Killer Cells Induce Virus-Infected Cells to Kill Themselves 


Type I interferons also have less direct ways of blocking viral replication. One of 
these is to enhance the activity of natural killer cells (NK cells), which are leu- 
kocytes related to T and B cells but are part of the innate immune system and 
are recruited early to sites of inflammation. Like cytotoxic T cells of the adaptive 
immune system (discussed later), NK cells destroy virus-infected cells by induc- 
ing the infected cells to kill themselves by undergoing apoptosis (discussed in 
Chapter 18). We consider how killer cells induce apoptosis later, when we dis- 
cuss how cytotoxic T cells do it (see Figure 24-43). Although they kill in the same 
way, the means by which cytotoxic T cells and NK cells distinguish the surface of 
virus-infected cells from that of uninfected cells are different (Movie 24.2). 

Both cytotoxic T cells and NK cells recognize the same special class of cell-sur- 
face proteins on a host cell to help determine if the cell is virus-infected, but they 
use distinct receptors to do so. The special cell-surface proteins recognized are 
called class I MHC proteins, because they are encoded by genes in the major his- 
tocompatibility complex; almost all nucleated cells in vertebrates express these 
genes, and we discuss them in detail later. Cytotoxic T cells use both T cell recep- 
tors (TCRs) and co-receptors to recognize peptide fragments of viral proteins 
bound to class I MHC proteins on the surface of virus-infected host cells and then 
induce the infected cells to kill themselves. By contrast, NK cells have cell-surface 
inhibitory receptors that monitor the level of class I MHC proteins on the surface 
of other host cells: the high levels of these MHC proteins normally present on 
healthy cells engage these receptors and thereby inhibit the killing activity of the 
NK cells. The NK cells thus focus primarily on host cells expressing abnormally 
low levels of class I MHC proteins and induce them to kill themselves; these are 
mainly virus-infected cells and some cancer cells (Figure 24-9). NK cell killing 
activity is stimulated when various activating receptors on the NK cell surface rec- 
ognize specific proteins that are greatly increased on the surface of virus-infected 
cells and some cancer cells. 



Figure 24-9 A natural killer (NK) cell 
attacking a cancer cell. This scanning 
electron micrograph was taken shortly 
after the NK cell attached to the cancer 
cell, but before it induced the cell to die 
by apoptosis. (Courtesy of J.C. Hiserodt, 
in Mechanisms of Cytotoxicity by Natural 
Killer Cells [R.B. Herberman and D. 
Callewaert, eds.]. New York: Academic 
Press, 1995.) 
Figure 24-10 How an NK cell recognizes 
its target. An NK cell preferentially attacks 
infected host cells and cancer cells 
The reason that class I MHC protein levels are often low on virus-infected 
cells is that many viruses have developed a variety of mechanisms to inhibit the 
expression of these proteins on the surface of the host cells they infect, in order 
to avoid detection by cytotoxic T cells: some viruses encode proteins that block 
class I MHC gene transcription; others block the intracellular assembly of pep- 
tide-MHC complexes; still others block the transport of these complexes to the 
cell surface. By evading recognition by cytotoxic T cells in these ways, however, 
a virus incurs the wrath of NK cells, which recognize the infected cells as being 
different—both because the infected cells express little class I MHC protein and 
because they express large amounts of other surface proteins that are recognized 
by the activating receptors on the NK cells (Figure 24-10). 
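
The two signals that an NK cell weighs can be summarized as a simple decision rule. The following Python sketch is a toy illustration only: the function name, the arbitrary units, and the assumption that the cell simply compares an activating signal with an inhibitory one are ours and are not a quantitative model of NK cell signaling.

```python
def nk_cell_attacks(class_i_mhc_level: float, activating_ligand_level: float) -> bool:
    """Toy rule (an assumption, not a measured model): the NK cell attacks when
    the activating signal outweighs the inhibition delivered by class I MHC."""
    inhibitory_signal = class_i_mhc_level        # sensed by inhibitory receptors
    activating_signal = activating_ligand_level  # sensed by activating receptors
    return activating_signal > inhibitory_signal


# Hypothetical levels in arbitrary units.
healthy_cell = nk_cell_attacks(class_i_mhc_level=1.0, activating_ligand_level=0.2)
infected_cell = nk_cell_attacks(class_i_mhc_level=0.2, activating_ligand_level=1.0)
print(healthy_cell, infected_cell)
```

In this sketch, a healthy cell with abundant class I MHC protein is spared, whereas a cell that has lost class I MHC protein and displays more activating ligands is attacked.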


Dendritic Cells Provide the Link Between the Innate and 
Adaptive Immune Systems 


Dendritic cells are crucially important components of the innate immune system. 
They are a heterogeneous class of cells that are widely distributed in the tissues 
and organs of vertebrates. They express a large variety of PRRs, which enable den- 
dritic cells to recognize and phagocytose invading pathogens and their products 
and to become activated in the process. The activated dendritic cells cleave the 
proteins of the pathogen into peptide fragments, which bind to newly synthesized 
MHC proteins, which then carry the fragments to the dendritic cell surface. The 
activated cells then migrate to a nearby lymphoid organ such as a lymph node 
(also called a lymph gland), where they present the peptide-MHC complexes to 
T cells of the adaptive immune system, activating the T cells to join in the battle 
against the specific pathogen (Figure 24-11). 

In addition to the complexes of MHC proteins and microbial peptides dis- 
played on their cell surface, activated dendritic cells also display cell-surface co- 
stimulatory proteins that help activate T cells (see Figure 24-11). As we discuss 
later, the activated dendritic cells also secrete a variety of cytokines that influ- 
ence the type of response that the T cells make, ensuring that it is appropriate to 
fight the particular pathogen. In these ways, dendritic cells serve as crucial links 
between the innate immune system, which provides a rapid first line of defense 
against invading pathogens, and the adaptive immune system, which mounts 
slower but more powerful and highly specific responses to attack an invader, as 
we now discuss. 


Summary 


All multicellular organisms possess innate immune defenses against invading 
pathogens; these defenses include physical and chemical barriers and various 
defensive cell responses. In vertebrates, these innate defense responses can also 
Figure 24-11 Dendritic cells as functional links between the innate and adaptive immune systems. Dendritic cells pick 
up invading microbes or their products at the site of an infection. The microbial PAMPs activate the dendritic cells to express 
co-stimulatory proteins and increased amounts of MHC proteins on their surface and to migrate via lymphatic vessels to a 
nearby lymph node. In the lymph node, the activated dendritic cells activate T cells that express appropriate receptors for the 
co-stimulatory proteins and the microbial peptides bound to MHC proteins on the dendritic cell surface. The activated T cells 
proliferate, and some of their progeny migrate to the original site of infection, where they help eliminate the microbes, either by 
activating local macrophages or by killing infected host cells (not shown). In addition, some of the activated T cells help stimulate 
specific B cells in the lymph node to secrete antibodies against the microbe (not shown). 

A crucial feature of dendritic cell activation is that the pathogen provides an individual dendritic cell with both the peptides 
for presentation to T cells and the PAMP signals that activate the dendritic cell to express co-stimulatory proteins. In this way, 
the individual dendritic cell has all it needs to activate specific T cells that recognize the peptide-MHC complexes on its surface 
(Movie 24.3). 


recruit specific and more powerful adaptive immune responses to help fight the 
infection. Innate immune responses rely on the ability of host cells to recognize 
characteristic features of microbial molecules called pathogen-associated molecu- 
lar patterns, or PAMPs, which can be associated with a pathogen’s proteins, lip- 
ids, sugars, or nucleic acids. PAMPs are mainly recognized by pattern recognition 
receptors (PRRs), including the Toll-like receptors (TLRs) found on or in both plant 
and animal cells. In vertebrates, some PRRs are secreted and can activate comple- 
ment when they bind microbial PAMPs. The complement system, which can also 
be activated by antimicrobial antibodies bound to pathogens, consists of a group 
of blood proteins that are activated in sequence to help fight infections, by disrupt- 
ing the pathogen’s membrane, stimulating an inflammatory response, or target- 
ing the microbe for phagocytosis—mainly by macrophages and neutrophils. The 
phagocytes use a combination of degradative enzymes, antimicrobial peptides, and 
oxygen-derived toxic molecules to kill invading pathogens; in addition, they secrete 
various signal molecules that help trigger an inflammatory response. 

Cells infected by a virus produce and secrete type I interferons (IFNα and IFNβ), 
which induce a complex set of host-cell responses that inhibit viral replication. The 
interferons also enhance the killing activity of natural killer (NK) cells. An NK cell 
kills infected host cells because they express large amounts of surface proteins that 
activate the NK cell; the killing is especially efficient when infected cells express 
reduced amounts of class I MHC proteins, which, when present in normal amounts 
on a host cell surface, inhibit the killing activity of NK cells. 


Dendritic cells of the innate immune system functionally link innate immune 
responses to adaptive immune responses. The cells become activated when their 
PRRs pick up microbes and their products at sites of infection and phagocytose them. 
The activated cells cleave the microbial proteins into peptide fragments, which bind 
to newly made MHC proteins, which transport the fragments to the cell surface. The 
activated dendritic cells then carry the peptide-MHC complexes to a lymph organ, 
where they activate appropriate T cells to make specific adaptive immune responses 
against the microbes. 


OVERVIEW OF THE ADAPTIVE IMMUNE SYSTEM 


A dramatic “big bang” in immune defense mechanisms occurred when jawed ver- 
tebrates evolved and acquired an adaptive immune system. This sophisticated 
defense system depends on B and T lymphocytes (B and T cells), which, during 
their development, rearrange particular DNA sequences in various combinations 
so that, together, the cells can produce an almost limitless variety of B and T cell 
receptors and antibodies. Collectively, these proteins can bind to essentially any 
molecule, including small chemicals, carbohydrates, lipids, and proteins; indi- 
vidually, they can distinguish between molecules that are very similar—such as 
between two proteins that differ in only a single amino acid, or between two opti- 
cal isomers of the same small molecule. By this strategy, the adaptive immune 
system can recognize and respond specifically to any pathogen, including new 
mutant forms. However, because the genetic rearrangement process produces 
receptors that can bind to self molecules as well as receptors that can bind to 
foreign molecules, vertebrates have had to evolve special mechanisms to ensure 
that B and T cells do not react against the host’s own molecules and cells—a pro- 
cess called immunological self-tolerance. 

Moreover, many harmless foreign substances enter the body, for example, as 
food or inhaled material, and it would be pointless and potentially dangerous to 
mount adaptive immune responses against them. Such inappropriate responses 
are normally avoided because innate immune responses are required to call 
adaptive immune responses into play and do so only when the innate cells’ PRRs 
recognize microbial PAMPs, as we discussed earlier. One can trick the adaptive 
immune system into responding to a harmless foreign molecule, such as a foreign 
protein, by co-injecting a molecule (often of microbial origin) called an adjuvant, 
which activates PRRs. This trick is called immunization and is the basis of vacci- 
nation. Any substance capable of stimulating B or T cells to make a specific adap- 
tive immune response against it is referred to as an antigen (antibody generator). 

There are two broad classes of adaptive immune responses: antibody 
responses and T-cell-mediated immune responses. Most pathogens induce 
both classes of responses. In antibody responses, B cells are activated to secrete 
antibodies, which are proteins that circulate in the bloodstream and permeate 
the other body fluids, where they can bind specifically to the foreign antigen that 
stimulated their production (see Figure 24-2). Binding of antibody neutralizes 
extracellular viruses and microbial toxins (such as tetanus toxin or cholera toxin) 
by blocking their ability to bind to receptors on host cells. Antibody binding also 
marks invading pathogens for destruction, both by making it easier for phago- 
cytes of the innate immune system to ingest and destroy them and by activating 
the complement system. 

In T-cell-mediated immune responses, T cells recognize foreign antigens that 
are bound to MHC proteins on the surface of host cells such as dendritic cells, 
which are specialized for presenting antigen to T cells and are therefore referred 
to as professional antigen-presenting cells (APCs). Because MHC proteins carry 
fragments of pathogen proteins from inside a host cell to the cell surface, T cells 
can detect pathogens hiding inside a host cell and either kill the infected cell (see 
Figure 24-2) or stimulate phagocytes or B cells to help eliminate the pathogens. 

In this section, we discuss the origins and general properties of B and T cells. 
In later sections, we consider the specific properties and functions of these cells. 

B Cells Develop in the Bone Marrow, T Cells in the Thymus 


There are about 2 × 10¹² lymphocytes in the human body, making the immune 
system comparable in cell mass to the liver or the brain. They occur in large num- 
bers in the blood and lymph (the colorless fluid in the lymphatic vessels, which 
connect the lymph nodes in the body to each other and to the bloodstream). They 
are also concentrated in lymphoid organs, such as the thymus, lymph nodes, and 
spleen (Figure 24-12), and many are also found in other organs, including skin, 
lung, and gut. 

T cells and B cells derive their names from the organs in which they develop: 
T cells develop in the thymus, and B cells, in adult mammals, develop in the bone 
marrow. Both types of cells develop from lymphoid progenitor cells that are pro- 
duced from multipotent hematopoietic stem cells, which are found mainly in the 
bone marrow (Figure 24-13). The hematopoietic stem cells give rise to more than 
just lymphocytes: as discussed in Chapter 22, they produce all of the cells of the 
Figure 24-12 Human lymphoid organs. 
Lymphocytes develop from lymphoid 
progenitor cells in the thymus and bone 
marrow (yellow), which are therefore called 
central (or primary) lymphoid organs. 

The newly formed lymphocytes migrate 
from these primary organs to peripheral 
(or secondary) lymphoid organs, where 
they can react with foreign antigen. Only 
some of the peripheral lymphoid organs 
(blue) and lymphatic vessels (green) are 
shown; many lymphocytes, for example, 
are found in the skin and respiratory tract. 
As we discuss later, the lymphatic vessels 
ultimately empty into the bloodstream (not 
shown). 


Figure 24-13 The development of 

B and T cells. The central lymphoid 
organs, where lymphocytes develop from 
lymphoid progenitor cells, are labeled in 
yellow boxes. The lymphoid progenitor cells 
develop from multipotent hematopoietic 
stem cells in the bone marrow. Some 
lymphoid progenitor cells develop locally 
in the bone marrow into immature B cells, 
while others migrate to the thymus (via 

the bloodstream) where they develop into 
thymocytes (developing T cells). Foreign 
antigens activate B cells and T cells mainly 
in peripheral lymphoid organs, such as 
lymph nodes or the spleen. 
hematopoietic system, including erythrocytes, leukocytes, and platelets (see 
Figure 22-32). 

Because they are sites where lymphocytes develop from lymphoid progeni- 
tor cells, the thymus and bone marrow are referred to as central (primary) lym- 
phoid organs (see Figure 24-12). As we discuss later, most B and T cells die in 
the central lymphoid organs soon after they develop, without ever functioning. 
Others, however, mature and migrate via the blood to the peripheral (secondary) 
lymphoid organs—mainly the lymph nodes, spleen, and epithelium-associated 
lymphoid tissues in the gastrointestinal tract, respiratory tract, and skin. It is in 
these peripheral lymphoid organs that foreign antigens activate B and T cells (see 
Figure 24-13). 

B and T cells become morphologically distinguishable from each other only 
after antigen has activated them: resting B and T cells look very similar, even in 
an electron microscope (Figure 24-14A). After activation by an antigen, both pro- 
liferate and mature into effector cells. Effector B cells secrete antibodies; in their 
most mature form, called plasma cells, they are filled with an extensive rough 
endoplasmic reticulum that is busily making antibodies (Figure 24-14B). In con- 
trast, effector T cells (Figure 24-14C) contain very little endoplasmic reticulum 
and secrete a variety of cytokines rather than antibodies. Whereas B-cell-derived 
antibodies are widely distributed by the bloodstream, T-cell-derived cytokines 
mainly act locally on neighboring cells, although some are carried via the blood 
and act on distant host cells. 


Immunological Memory Depends On Both Clonal Expansion and 
Lymphocyte Differentiation 


The most remarkable feature of the adaptive immune system is that it can respond 
to millions of different foreign antigens in a highly specific way. Human B cells, for 
example, collectively, can make more than 10¹² different antibody molecules that 
react specifically with the antigen that induced their production. How can B cells 
and T cells respond specifically to such an enormous diversity of foreign antigens? 
The answer for both B and T cells is the same. As each lymphocyte develops in 
a central lymphoid organ, it becomes committed to react with a particular anti- 
gen before ever being exposed to the antigen. It expresses this commitment in 
the form of cell-surface receptors that specifically bind the antigen. When a lym- 
phocyte encounters its antigen in a peripheral lymphoid organ, the binding of the 
antigen to the receptors (with help from co-stimulatory signals, discussed later) 
Figure 24-14 Electron micrographs of 
resting and effector lymphocytes. 

(A) This resting lymphocyte could be 

either a B cell or a T cell, as these cells 

are difficult to distinguish morphologically 
until antigen activates them to become 
effector cells. (B) An effector B cell (a 
plasma cell). It is filled with an extensive 
rough endoplasmic reticulum (ER), which is 
distended with antibody molecules that are 
secreted in large amounts. (C) An effector 
T cell, which has relatively little rough ER 
but is filled with free ribosomes; it secretes 
cytokines, but in relatively small amounts. 
The three cells are shown at the same 
magnification. (A, courtesy of Dorothy 
Zucker-Franklin; B, courtesy of Carlo 
Grossi; A and B, from D. Zucker-Franklin 
et al., Atlas of Blood Cells: Function and 
Pathology, 2nd ed. Milan, Italy: Edi. Ermes, 
1988; C, courtesy of Stefanello de Petris.) 
activates the lymphocyte; this causes the lymphocyte to proliferate, thereby 
producing many more cells with the same receptor, a process called clonal 
expansion. The encounter with antigen also causes some of the cells to differentiate into 
effector cells. An antigen therefore selectively stimulates those cells that express 
complementary antigen-specific receptors and are thus already committed to 
respond to it (Figure 24-15). This arrangement, called clonal selection, provides 
an explanation for immunological memory, whereby we develop lifelong immu- 
nity to many common infectious diseases after our initial exposure to the patho- 
gen—either through natural infection or vaccination. 

It is easy to demonstrate such immunological memory in experimental ani- 
mals. If an animal is immunized once with antigen A, an immune response (anti- 
body, T-cell-mediated, or both) can be detected after several days; the response 
rises rapidly and exponentially, and then, more gradually, declines. This is the 
characteristic course of a primary immune response, occurring on an animal’s 
first exposure to an antigen. If, after some weeks, months, or even years have 
elapsed, the animal is immunized again with antigen A, it will usually produce 
a secondary immune response that differs from the primary response: the lag 
period is shorter, because there are now many more preexisting B or T cells (or 
both) with specificity for antigen A, and the response is greater and more efficient. 
These differences indicate that the animal has “remembered” its first exposure 
to antigen A. If the animal is given a different antigen (for example, antigen B) 
instead of a second immunization with antigen A, the response is typical of a pri- 
mary, and not a secondary, immune response. The secondary response therefore 
reflects antigen-specific immunological memory for antigen A (Figure 24-16). 

Immunological memory depends on both lymphocyte proliferation and 
differentiation. In an adult animal, the peripheral lymphoid organs contain a 
mixture of lymphocytes in at least three stages of maturation: naive cells, effector 
cells, and memory cells. When naive cells encounter their specific foreign antigen 
for the first time, the antigen stimulates some of them to proliferate and differen- 
tiate into effector cells, which then carry out an immune response (effector B cells 
secrete antibody, whereas effector T cells either kill infected cells or influence 
the response of other immune cells—by secreting cytokines, for example). Some 
of the antigen-stimulated naive cells multiply and differentiate into memory 
cells, which are more easily and more quickly induced to become effector cells 
by a later encounter with the same antigen: like naive cells, when memory cells 
encounter their antigen, they give rise to either effector cells or more memory 
cells (Figure 24-17). 
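
The logic of clonal selection and memory described above can be sketched as a toy simulation. The Python example below is a minimal illustration under stated assumptions: the `Clone` class, the `expose` function, the number of cell divisions, and the fraction of progeny that become memory cells are all hypothetical choices made for the sketch, not measured values. It captures only the greater size of the secondary response, not its shorter lag period.

```python
from dataclasses import dataclass


@dataclass
class Clone:
    receptor: str      # the one antigen this clone is committed to
    naive: int = 1     # naive cells present before any exposure
    memory: int = 0    # memory cells left by earlier responses


def expose(repertoire, antigen, divisions=10, memory_one_in=10):
    """Expose the repertoire to an antigen and return the effector cells made.

    divisions and memory_one_in are arbitrary illustrative parameters.
    """
    effectors = 0
    for clone in repertoire:
        if clone.receptor != antigen:
            continue  # clonal selection: uncommitted clones are not stimulated
        responders = clone.naive + clone.memory
        progeny = responders * 2 ** divisions    # clonal expansion
        new_memory = progeny // memory_one_in    # a minority become memory cells
        effectors += progeny - new_memory        # the rest become effector cells
        clone.naive, clone.memory = 0, new_memory
    return effectors


repertoire = [Clone("A"), Clone("B"), Clone("C")]
primary_a = expose(repertoire, "A")      # first exposure to antigen A
secondary_a = expose(repertoire, "A")    # second exposure to antigen A
primary_b = expose(repertoire, "B")      # first exposure to antigen B
print(primary_a, secondary_a, primary_b)
```

Because the memory cells left by the first exposure to antigen A enlarge the pool of responders, the second exposure to A yields far more effector cells, whereas a first exposure to antigen B yields only a primary-sized response.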


Figure 24-15 Clonal selection. An antigen 
activates only those lymphocytes that are 
already committed to respond to it. The 
committed cell expresses cell-surface 
receptors that specifically recognize the 
antigen. The human adaptive immune 
system consists of many millions of 
different T and B lymphocyte clones, 

with cells within a clone expressing the 
same unique antigen receptor. Before 

its first encounter with antigen, a clone 
would usually contain only one or a small 
number of cells. A particular antigen may 
activate hundreds of different clones, each 
expressing a different antigen receptor that 
binds either a different part of the antigen 
or the same part with a different binding 
affinity. Although only B cells are shown 
here, T cells are selected in a similar way. 
Note that the antigen receptors on the 

B cells labeled Bβ in this diagram have the 
same antigen-binding site as the antibodies 
secreted by the effector Bβ cells. As we 
discuss later, B cells require co-stimulatory 
signals from T cells to become activated by 
antigen to proliferate and differentiate into 
antibody-secreting cells (not shown). 
Thus, during the primary response, clonal expansion and differentiation of 
antigen-stimulated naive cells creates many memory cells, which are able to 
respond to the same antigen more sensitively, rapidly, and effectively. And, unlike 
most effector cells, which die within days or weeks, memory cells can persist for 
the lifetime of the animal, even in the absence of their specific antigen, thereby 
providing lifelong immunological memory. Although most effector B and T cells 
die after an immune response is over, some survive as effector cells and help pro- 
vide long-term protection against the pathogen. A small proportion of the plasma 
cells produced in a primary B cell response, for example, can survive for many 
months or years in the bone marrow, where they continue to secrete their specific 
antibodies into the bloodstream. 


Lymphocytes Continuously Recirculate Through Peripheral 
Lymphoid Organs 


Pathogens generally enter the body through an epithelial surface, usually through 
the skin, gut, or respiratory tract. To induce an adaptive immune response, 
microbes or their products must travel from these entry points to a peripheral 
lymphoid organ, such as a lymph node or the spleen, which are the sites where 
lymphocytes are activated (see Figure 24-11). The route and destination depend 
on the site of entry. Lymphatic vessels carry antigens that enter through the skin 
or respiratory tract to local lymph nodes; antigens that enter through the gut end 
up in gut-associated peripheral lymphoid organs such as Peyer’s patches; and the 
spleen filters out antigens that enter the blood (see Figure 24-12). As discussed 
earlier (see Figure 24-11), in many cases, activated dendritic cells will carry the 
antigen from the site of infection to the peripheral lymphoid organ, where they 
play a crucial part in activating T cells, as we discuss later. 

But only a tiny fraction of naive B and T cells can recognize a particular micro- 
bial antigen in a peripheral lymphoid organ, a reasonable estimate being between 
1/10,000 and 1/1,000,000 of each class of lymphocyte, depending on the antigen. 
How do these rare cells find an antigen-presenting cell displaying their specific 
antigen? The answer is that the lymphocytes continuously recirculate between 
one peripheral lymphoid organ and another via the lymph and blood. In a lymph 
node, for example, lymphocytes continually leave the bloodstream by squeezing 
out between specialized endothelial cells lining small veins called postcapillary 


Figure 24-17 A model for the cellular basis of immunological memory. 
When stimulated by their specific antigen and co-stimulatory signals, naive 
lymphocytes proliferate and differentiate. Most become effector cells, which 
function and then usually die, while others become memory cells. During a 
subsequent exposure to the same antigen, the memory cells respond more 
readily, rapidly, and efficiently than did the naive cells: they proliferate and give 
rise to effector cells and to more memory cells. Some memory T cells also 
develop from a minority of effector T cells (not shown). It is not known how 
the decision to become an effector cell versus a memory cell is made. 
Figure 24-16 Immunological memory: 
primary and secondary antibody 
responses. The secondary response 
induced by a second exposure to antigen 
A is faster and greater than the primary 
response and is specific for A, indicating 
that the adaptive immune system has 
specifically remembered its previous 
encounter with antigen A. The same type 
of immunological memory is observed in 
T-cell-mediated responses (not shown). As 
we discuss later, the types of antibodies 
produced in the secondary response 

are different from those produced in the 
primary response, and these antibodies 
bind the antigen more tightly. 
venules. After percolating through the node, they accumulate in small lymphatic 
vessels that leave the node and connect with other lymphatic vessels that pass 
through other lymph nodes downstream (see Figure 24-12). Passing into larger 
and larger vessels, the lymphocytes eventually enter the main lymphatic vessel 
(the thoracic duct), which carries them back into the blood (Figure 24-18). 

The continuous recirculation of a lymphocyte between the blood and lymph 
ends only if its specific antigen activates it in a peripheral lymphoid organ. In that 
case, the lymphocyte remains in the peripheral lymphoid organ, where it prolifer- 
ates and differentiates into either effector cells or memory cells. Many of the effec- 
tor T cells leave the lymphoid organ via the lymph and migrate through the blood 
to the site of infection (see Figure 24-11), whereas others stay in the lymphoid 
organ and help activate (or suppress) other immune cells there. Some effector 
B cells (plasma cells) remain in the peripheral lymphoid organ and secrete anti- 
bodies into the blood for days until they die; others migrate to the bone marrow, 
where they secrete antibodies into the blood for months or years. The memory T 
and B cells produced join the recirculating pool of lymphocytes. 

Lymphocyte recirculation depends on specific interactions between the lym- 
phocyte cell surface and the surface of the endothelial cells lining the blood ves- 
sels in the peripheral lymphoid organs. Lymphocytes that enter a lymph node 
via the blood, for example, adhere weakly to specialized endothelial cells lining 
the postcapillary venules via homing receptors that belong to the selectin family 
of cell-surface lectins that bind to specific sugar groups on the endothelial cell 
surface (see Figure 19-28). The lymphocytes roll slowly along the surface of the 
endothelial cells until another, much stronger adhesion system, dependent on 
an integrin protein, is called into play by chemokines secreted by the endothe- 
lial cells. Now, the lymphocytes stop rolling, and they crawl out of the blood ves- 
sel into the lymph node by using yet another cell adhesion protein called CD31 
(Figure 24-19). Although B and T cells initially enter the same region of a lymph 
node, different chemokines guide them to separate regions of the node—B cells to 
lymphoid follicles and T cells to the paracortex (Figure 24-20). 
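
The sequence of adhesion steps just described can be pictured as an ordered series of states, each of which requires a particular signal before the cell can advance. The following Python sketch is only an illustration under that simplifying assumption: the names `Stage`, `REQUIRED_SIGNAL`, `next_state`, and `run` are hypothetical labels invented here, and the model ignores timing, blood flow, and the guidance of B and T cells to separate regions.

```python
from enum import Enum


class Stage(Enum):
    CIRCULATING = "in the bloodstream"
    ROLLING = "weak adhesion and rolling (selectin-dependent)"
    ARRESTED = "strong adhesion (integrin-dependent)"
    MIGRATED = "crossed the endothelium into the lymph node (CD31-dependent)"


# Order in which the stages occur, as described in the text.
ORDER = [Stage.CIRCULATING, Stage.ROLLING, Stage.ARRESTED, Stage.MIGRATED]

# Signal assumed to be needed to leave each stage (illustrative labels only).
REQUIRED_SIGNAL = {
    Stage.CIRCULATING: "selectin",
    Stage.ROLLING: "chemokine",  # chemokines call the integrin system into play
    Stage.ARRESTED: "CD31",
}


def next_state(stage, signals):
    """Advance one step only if the signal needed at this stage is available."""
    needed = REQUIRED_SIGNAL.get(stage)
    if needed is None or needed not in signals:
        return stage
    return ORDER[ORDER.index(stage) + 1]


def run(signals):
    """Follow a lymphocyte until it can advance no further."""
    stage = Stage.CIRCULATING
    while True:
        new_stage = next_state(stage, signals)
        if new_stage is stage:
            return stage
        stage = new_stage


if __name__ == "__main__":
    print(run({"selectin", "chemokine", "CD31"}).value)
    print(run({"selectin"}).value)
```

In this toy model a cell that has selectin-mediated adhesion but receives no chemokine signal keeps rolling and never stops, which mirrors the dependence of the strong adhesion step on chemokines.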


[Figure 24-19 labels: basal lamina; postcapillary venule; specialized endothelial cell of postcapillary venule; lymphocyte; chemokine; WEAK ADHESION AND ROLLING (selectin-dependent)]


Figure 24-19 Migration of a lymphocyte out of the bloodstream into a lymph node. 

A circulating lymphocyte adheres weakly to the surface of the specialized endothelial cells lining a 
postcapillary venule in a lymph node. This initial adhesion is mediated by L-selectin (discussed in 
Chapter 19) on the lymphocyte surface. The adhesion is sufficiently weak to enable the lymphocyte, 
pushed by the flow of blood, to roll along the surface of the endothelial cells. Stimulated by 
chemokines secreted by specialized endothelial cells in the node (curved red arrow), the lymphocyte 
rapidly activates a stronger adhesion system, mediated by an integrin. This strong adhesion 
enables the cell to stop rolling. The lymphocyte then uses an immunoglobulin-like cell adhesion 
protein (CD31) to bind to the junctions between adjacent endothelial cells and migrate out of the 
venule. The subsequent migration of the lymphocyte in the lymph node depends on chemokines 
produced within the node (straight red arrow). The migration of other types of leukocytes out of the 
bloodstream into sites of infection occurs in a similar way (see Figure 19-28 and Movie 19.2). 
Figure 24-18 The path followed by 
lymphocytes as they continuously 
recirculate between the lymph and 
blood. The circulation through a lymph 
node (yellow) is shown here. Microbial 
antigens are usually carried into the 
lymph node by activated dendritic cells 
(not shown), which enter the node via 
afferent lymphatic vessels draining an 
infected tissue (green). B and T cells, by 
contrast, enter via the blood, migrating out 
of the bloodstream into the lymph node 
through postcapillary venules. Unless 
they encounter their antigen, the B and 
T cells leave the lymph node via efferent 
lymphatic vessels, which eventually join the 
thoracic duct. The thoracic duct empties 
into a large vein carrying blood to the heart, 
completing the circulation cycle for T and 
B cells. A typical circulation cycle for these 
lymphocytes takes about 12-24 hours. 
Unless they encounter their antigen, both B and T cells soon leave the lymph 
node via efferent lymphatic vessels. If they encounter their antigen, however, they 
are stimulated to display adhesion receptors that trap the cells in the node; the 
cells accumulate at the junction between the B cell and T cell areas, where the 
rare antigen-specific B and T cells can interact, leading to their proliferation and 
differentiation into either effector cells or memory cells. Many of the effector cells 
leave the node, expressing different chemokine receptors that help guide them to 
their new destinations—effector plasma B cells to the bone marrow and effector 
T cells to sites of infection. 


Immunological Self-Tolerance Ensures That B and T Cells Do Not 
Attack Normal Host Cells and Molecules 


As discussed earlier, cells of the innate immune system use PRRs to distinguish 
microbial molecules from self molecules made by the host. The adaptive immune 
system has the far more difficult recognition task of responding specifically to an 
almost unlimited number of foreign molecules while not responding to the large 
number of self molecules. How does it accomplish this feat? It helps that self mol- 
ecules normally do not induce the innate immune reactions required to activate 
adaptive immune responses. But even when an infection or tissue injury triggers 
innate reactions, the vast majority of self molecules normally still fail to induce an 
adaptive immune response. Why? 

One important reason is that the adaptive immune system “learns” not to 
respond to self molecules. Normal mice, for example, cannot mount an immune 
response against one of their own protein components of the complement system 
called C5 (see Figure 24-7). However, mutant mice that lack the gene encoding C5 
but are otherwise genetically identical to normal mice of the same strain can make 
a strong immune response to this blood protein when immunized with it. The 
immunological self-tolerance exhibited by normal mice persists only for as long 
as the self molecule remains in the body: if a self molecule such as C5 is experi- 
mentally removed from an adult mouse, the animal gains the ability to respond 
to it after a few weeks or months, as new B and T cells develop in the absence of 
C5. Thus, the adaptive immune system is genetically capable of responding to self 
molecules, but it learns not to do so. 

Self-tolerance depends on a number of distinct mechanisms, including the 
following (Figure 24-21): 


1. In receptor editing, developing B cells that recognize self molecules change 
their antigen receptors so that the cells no longer do so. 


2. In clonal deletion, potentially self-reactive B and T cells die by apoptosis 
when they encounter their particular self molecule. 
Figure 24-20 A simplified drawing of a 
human lymph node. B cells are primarily 
clustered in structures called lymphoid 
follicles, whereas T cells are found mainly 
in the paracortex. Chemokines attract both 
types of lymphocytes into the lymph node 
from the blood via postcapillary venules 
(see Figure 24-19). B and T cells then 
migrate to their respective areas, attracted 
by different chemokines. If they do not 
encounter their specific antigen, both B 
cells and T cells then enter the medullary 
sinuses and leave the node via the efferent 
lymphatic vessel. This vessel ultimately 
empties into the bloodstream, allowing 
the lymphocytes to begin another cycle of 
circulation through a peripheral lymphoid 
organ (see Figure 24-18). During an 
infection, proliferation of pathogen-specific 
B cells produces a germinal center in some 
lymphoid follicles. 
3. In clonal inactivation (also called clonal anergy), self-reactive B and T cells 
become functionally inactivated when they encounter their self molecule. 


4. In clonal suppression, self-reactive regulatory T cells (discussed later) sup- 
press the activity of other types of potentially self-reactive lymphocytes. 


Some of these mechanisms—especially the first two, receptor editing in B cells 
and clonal deletion of B and T cells—operate in central lymphoid organs when 
newly formed self-reactive B and T cells first encounter their self molecules, and 
they are largely responsible for the process called central tolerance. Clonal inac- 
tivation and clonal suppression, by contrast, operate mainly when mature B and 
T cells encounter their self molecules in peripheral lymphoid organs, and they 
are largely responsible for the process called peripheral tolerance. Clonal deletion, 
however, can also operate peripherally, and clonal inactivation can also operate 
centrally. 
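
As a study aid, the four mechanisms and the sites where they mainly operate can be collected into a small lookup structure. The Python sketch below is a hedged summary of the preceding paragraphs only: the class `ToleranceMechanism`, its field names, and the function `mechanisms_at` are hypothetical names chosen here, and the short outcome strings are paraphrases rather than formal terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToleranceMechanism:
    name: str
    acts_on: str       # which lymphocytes the text says are affected
    main_site: str     # "central" or "peripheral": where it mainly operates
    other_site: str    # additional site named in the text, or "" if none
    outcome: str


MECHANISMS = (
    ToleranceMechanism("receptor editing", "developing B cells",
                       "central", "", "antigen receptor changed"),
    ToleranceMechanism("clonal deletion", "B and T cells",
                       "central", "peripheral", "death by apoptosis"),
    ToleranceMechanism("clonal inactivation", "B and T cells",
                       "peripheral", "central", "functional inactivation"),
    ToleranceMechanism("clonal suppression", "other self-reactive lymphocytes",
                       "peripheral", "", "suppression by regulatory T cells"),
)


def mechanisms_at(site, include_secondary=False):
    """List mechanism names for a site; optionally include secondary sites."""
    names = []
    for m in MECHANISMS:
        if m.main_site == site or (include_secondary and m.other_site == site):
            names.append(m.name)
    return names


if __name__ == "__main__":
    print("central:", mechanisms_at("central"))
    print("peripheral (incl. secondary):", mechanisms_at("peripheral", True))
```

Keeping the main and secondary sites in separate fields reflects the point that clonal deletion and clonal inactivation are not confined to a single location.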

Why does the binding of a self molecule lead to tolerance rather than activa- 
tion? The answer is still not completely known. As we discuss later, the activation 
of a B or T cell by its antigen in a peripheral lymphoid organ requires more than 
just antigen binding: it requires co-stimulatory signals, which are provided by a 
helper T cell (discussed later) in the case of a B cell and by an activated dendritic 
cell in the case of a naive T cell. The production of such signals is usually triggered 
by exposure to a pathogen, but a self-reactive lymphocyte normally encounters 
its self antigen in the absence of such signals. Under these conditions, the lym- 
phocyte will not only fail to be activated, it will often be rendered tolerant—being 
either killed or inactivated, or actively suppressed by a regulatory T cell (see Fig- 
ure 24-21). In peripheral lymphoid organs, both T cell tolerance and activation 
usually occur on the surface of a dendritic cell. 

For reasons that are usually unknown, self-tolerance mechanisms sometimes 
fail, causing T or B cells (or both) to react against the animal’s own molecules. 
Figure 24-21 Mechanisms of immunological self-tolerance. When a self-reactive immature B cell binds its self molecule in the central lymphoid 
organ where the cell is produced, it may alter its antigen receptor so that it is no longer self-reactive (cell 1); this process is called receptor editing. 
Alternatively, when either an immature B or T cell binds its self molecule in a central lymphoid organ, it may die by apoptosis, a process called clonal 
deletion (cell 2). Because these two forms of tolerance (shown on the left) occur in central lymphoid organs, they are called central tolerance. 

When a self-reactive naive B or T cell escapes tolerance in the central lymphoid organ and binds its self molecule in a peripheral lymphoid organ 
(cell 4), or in another peripheral tissue, it will generally not be activated, because the binding usually occurs in the absence of sufficient co-stimulatory 
signals; instead, the cell may die by apoptosis (often after a period of proliferation), be inactivated, or be suppressed by a regulatory T cell. These 
forms of tolerance (shown on the right) are called peripheral tolerance. As discussed later, the cells providing the co-stimulatory signals are 
T lymphocytes for B cells and usually dendritic cells for T cells (not shown). For T cells at least, both activation and tolerance in a peripheral lymphoid 
organ usually occur on the surface of a dendritic cell, although the dendritic cells are different in the two cases. 
Myasthenia gravis is an example of such an autoimmune disease. Most of the 
affected individuals make antibodies against the acetylcholine receptors on their 
own skeletal muscle cells; these receptors are required for the muscle to contract 
normally in response to nerve stimulation, which releases acetylcholine (see Fig- 
ure 11-39). The antibodies interfere with the normal functioning of the receptors 
so that the patients become weak and may die because they cannot breathe. Similarly, 
in juvenile (type 1) diabetes, adaptive immune reactions against insulin-secreting 
β cells in the pancreas kill these cells, leading to severe insulin deficiency. 


Summary 


Innate immune responses triggered by pathogens at sites of infection help activate 
adaptive immune responses in peripheral lymphoid organs. The adaptive immune 
system is composed of many millions of B and T cell clones, with the cells in each 
clone sharing a unique cell-surface receptor that enables them to bind a particu- 
lar pathogen antigen. The binding of antigen to these receptors, with the help of 
co-stimulatory signals, stimulates the lymphocyte to proliferate and differentiate 
into an effector cell that can help eliminate the pathogen. Effector B cells secrete 
antibodies, which can act over long distances to help eliminate extracellular 
pathogens and their toxins. Effector T cells, by contrast, produce cell-surface and 
secreted co-stimulatory molecules, which mainly act locally to help other immune 
cells eliminate the pathogen; in addition, some T cells can induce infected host cells 
to kill themselves. 

During a primary adaptive immune response to an antigen, lymphocytes that 
recognize the antigen proliferate so that there are more of them to respond the next 
time, during a secondary response to the same antigen; moreover, during a primary 
response, some lymphocytes differentiate into memory cells, which can respond 
faster and more efficiently the next time the same pathogen invades. These two 
mechanisms are largely responsible for immunological memory. Both B and T cells 
circulate continuously between one peripheral lymphoid organ and another via the 
blood and lymph; only if they encounter their specific foreign antigen in a periph- 
eral lymphoid organ do they stop migrating, proliferate, and differentiate into effec- 
tor cells or memory cells. Lymphocytes that would react against self molecules either 
alter their receptors (in the case of B cells) or are eliminated or inactivated; they can 
also be suppressed by regulatory T cells. These mechanisms collectively are respon- 
sible for immunological self-tolerance, which ensures that the adaptive immune 
system normally avoids attacking the molecules and cells of the host. 
Vertebrates inevitably die of infection if they are unable to make antibodies. 
Antibodies are secreted proteins that defend us against extracellular pathogens 
in several ways. They bind to viruses and microbial toxins, thereby preventing 
them from binding to and damaging host cells (see Figure 24-2). When bound 
to an extracellular pathogen or its products, antibodies also recruit some of the 
components of the innate immune system, including various types of leukocytes 
and components of the complement system, which work together to inactivate or 
eliminate the invaders. 

Synthesized exclusively by B cells, antibodies are produced in billions of forms, 
each with a different amino acid sequence. They belong to the class of proteins 
called immunoglobulins (abbreviated as Igs) and are among the most abundant 
protein components in the blood. In this section, we discuss the structure and 
function of immunoglobulins and how they are made in so many different forms. 


B Cells Make Immunoglobulins (Igs) as Both Cell-Surface Antigen 
Receptors and Secreted Antibodies 

The first Igs made by a newly formed B cell are not secreted but are instead 
inserted into the plasma membrane, where they serve as receptors for antigen. 
They are called B cell receptors (BCRs), and each B cell has approximately 10⁵ of 
them in its plasma membrane. Each BCR is stably associated with invariant transmembrane 
proteins that activate intracellular signaling pathways when antigen 
binds to the BCR; we discuss these invariant proteins later, when we consider how 
B cells are activated with the assistance of helper T cells. 

Each B cell clone produces a single species of BCR, with a unique antigen- 
binding site. When an antigen and a helper T cell activate a naive or a memory 
B cell, the B cell proliferates and differentiates into an effector cell, which then 
produces and secretes large amounts of soluble (rather than membrane-bound) 
Ig. The secreted Ig is now called an antibody, and it has the same unique antigen- 
binding site as the BCR (Figure 24-22). 

A typical Ig molecule is bivalent, with two identical antigen-binding sites. It 
consists of four polypeptide chains—two identical light chains and two identical 
heavy chains. The N-terminal parts of both light and heavy chains usually cooper- 
ate to form the antigen-binding surface, while the more C-terminal parts of the 
heavy chains form the tail of the Y-shaped protein (Figure 24-23). The tail medi- 
ates many of the activities of antibodies, and antibodies with the same antigen- 
binding sites can have any one of a number of different tail regions, each of which 
gives the antibody different functional properties, such as the ability to activate 
complement or to bind to receptor proteins on phagocytic cells that bind a spe- 
cific type of antibody tail. 


Mammals Make Five Classes of Igs 


In mammals, there are five major classes of Igs, each of which mediates a characteristic 
biological response following antigen binding to an antibody: IgA, IgD, 
IgE, IgG, and IgM, each with its own class of heavy chain (α, δ, ε, γ, and μ, respectively). 
IgA molecules have α chains, IgG molecules have γ chains, and so on. 
Moreover, there are four human IgG subclasses (IgG1, IgG2, IgG3, and IgG4), with 
γ1, γ2, γ3, and γ4 heavy chains, respectively. There are also two IgA subclasses in 
humans. In addition to the various classes and subclasses of heavy chains, higher 
vertebrates have two types of light chains, κ and λ, which seem to be functionally 
indistinguishable. Either type of light chain may be associated with any of the 
heavy chains, but an individual Ig molecule always contains identical light chains 
and identical heavy chains: an IgG molecule, for instance, may have either κ or 
λ light chains, but not one of each. As a result, an Ig’s antigen-binding sites are 
always identical (see Figure 24-22). 

The various heavy chains give a distinctive conformation to the tail region of 
antibodies, so that each class (and subclass) has characteristic properties of its 
own. IgM is always the first class of Ig that a developing B cell in the bone marrow 
makes. It forms the BCRs on the surface of immature naive B cells. After these cells 
Figure 24-22 The B cell receptors 
(BCRs) and secreted antibodies made 
by a B cell clone. The binding of an 
antigen to BCRs on either a naive or 
memory B cell (together with co-stimulatory 
signals provided by helper T cells, not 
shown) activates the cell to proliferate 
and differentiate into effector B cells. 
The effector cells produce and secrete 
antibodies with a unique antigen-binding 
site, which is the same as that of the cell-surface 
BCRs. Because antibodies have 
two identical antigen-binding sites, they 
can cross-link antigens, as shown for an 
antigen with multiple identical antigenic 
determinants. 


Figure 24-23 A schematic drawing of 
a bivalent antibody molecule. The two 
heavy chains each have a hinge region, 
which, because of its flexibility, improves 
the efficiency with which the antibody can 
cross-link antigens (see Figure 24-22). The 
two heavy chains also form the tail of the 
antibody, which determines its functional 
properties. The heavy and light chains are 
held together by both covalent S-S bonds 
(red) and noncovalent bonds (not shown). 
leave the bone marrow, they start to produce IgD BCRs as well, with the same 
antigen-binding site as the IgM BCRs. These cells are now called mature naive B 
cells, as they can now respond to their specific foreign antigen in peripheral lym- 
phoid organs (Figure 24-24). IgM is also the major class of antibody secreted into 
the blood in the early stages of a primary antibody response on first exposure to 
an antigen. In its secreted form, IgM is a wheel-like pentamer composed of five 
four-chain units, giving it a total of 10 antigen-binding sites that allow it to bind 
strongly to pathogens; in its antigen-bound form, IgM is highly efficient at activat- 
ing complement, which is important in early antibody responses to pathogens. 
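
The figure of 10 antigen-binding sites follows directly from the bivalent four-chain unit described earlier. A minimal Python sketch of that arithmetic is given below; the function names are hypothetical, and the chain count considers only light and heavy chains, as in the text.

```python
# Each four-chain Ig unit (two light + two heavy chains) is bivalent.
SITES_PER_UNIT = 2
CHAINS_PER_UNIT = 4


def binding_sites(units):
    """Antigen-binding sites in an Ig built from `units` four-chain units."""
    return units * SITES_PER_UNIT


def light_and_heavy_chains(units):
    """Light plus heavy chains in an Ig built from `units` four-chain units."""
    return units * CHAINS_PER_UNIT


if __name__ == "__main__":
    # A single four-chain unit versus the secreted IgM pentamer.
    for label, units in (("four-chain monomer", 1), ("IgM pentamer", 5)):
        print(label, binding_sites(units), light_and_heavy_chains(units))
```

With five units, `binding_sites(5)` gives the 10 sites quoted for secreted IgM, compared with 2 for a single four-chain unit.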

The major antibody class in the blood is IgG. These antibodies are four-chain 
monomers (see Figure 24-23), and they are produced in especially large quan- 
tities during secondary antibody responses. The tail region of some subclasses 
of IgG antibodies that are bound to antigen can activate complement and also 
bind to specific receptors on macrophages and neutrophils. Largely by means of 
such Fc receptors (so-named because antibody tails are called Fc regions), these 
phagocytic cells bind, ingest, and destroy infecting microorganisms that have 
become coated with the IgG antibodies produced in response to the infection; 
the activated Fc receptors also signal the phagocyte to secrete pro-inflammatory 
cytokines (Movie 24.4). 

The tail region of IgE antibodies binds to another class of Fc receptors on the 
surface of mast cells in tissues and of basophils in the blood. Because antigen-free 
IgE antibodies bind with high affinity to such Fc receptors, the antibodies act as 
antigen receptors on these cells. Antigen binding to the bound antibodies acti- 
vates the Fc receptors and stimulates the cells to secrete a variety of cytokines 
and biologically active amines, especially histamine, which causes blood vessels 
to dilate and become leaky; this helps leukocytes, antibodies, and complement 
components to enter sites where mast cells have been activated. The release of 
amines from mast cells and basophils is largely responsible for the symptoms 
of such allergic reactions as hay fever, asthma, and hives. In addition, mast cells 
secrete factors that attract and activate leukocytes called eosinophils, which also 
have Fc receptors that bind IgE molecules and can kill extracellular parasitic 
worms, especially if the worms are coated with IgE antibodies (see Figure 24-6). 

IgA is the principal antibody class in secretions, including saliva, tears, milk, 
and respiratory and intestinal secretions. Yet another class of Fc receptors, located 
on the relevant epithelial cells, guides the secretion by binding antigen-free IgA 
dimers and transporting them across the epithelium. The properties of the various 
classes of antibodies in humans are summarized in Table 24-2. 

All classes of Ig can be made in a membrane-bound form, as well as in a sol- 
uble, secreted form. The two forms differ only in the C-terminus of their heavy 
chain. The heavy chains of membrane-bound Ig molecules (BCRs) have a trans- 
membrane hydrophobic C-terminus, which anchors them in the lipid bilayer of 
the B cell’s plasma membrane. The heavy chains of secreted antibody molecules, 
by contrast, have instead a hydrophilic C-terminus, which allows them to escape 
from the cell. The switch in the character of the Ig molecules made occurs because 
the activation of B cells by antigen and helper T cells induces a change in the way 
in which the heavy-chain RNA transcripts are made and processed in the nucleus 
(see Figure 7-59). 
Figure 24-24 Stages of B cell 
development. All of the stages shown 
occur before the cells bind their specific 
antigen. The first cells in the B cell lineage 
that make Ig are called pro-B cells; they 
make μ heavy chains, which remain in the 
endoplasmic reticulum until a special type 
of light chain is made called a surrogate 
light chain. The surrogate light chains 
substitute for genuine light chains and 
assemble with μ chains to form a receptor 
molecule that inserts into the plasma 
membrane. The cells are now called pre-B 
cells. Signaling from this pre-B cell receptor 
allows the cells to make bona fide light 
chains, which combine with μ chains to 
form four-chain IgM molecules that serve 
as cell-surface BCRs on immature naive 
B cells. After these cells leave the bone 
marrow, they start to express IgD BCRs as 
well, which have the same antigen-binding 
sites as the IgM BCRs; it is this mature 
naive B cell that reacts with its specific 
foreign antigen in peripheral lymphoid 
organs. 
TABLE 24-2 





Ig Light and Heavy Chains Consist of Constant and Variable 
Regions 


Both light and heavy chains have a variable amino acid sequence at their N-termi- 
nal ends but a constant sequence at their C-terminal ends. Whereas the constant 
region and variable region of a light chain are the same size, the constant region 
of a heavy chain is about three or four times longer, depending on the class (Fig- 
ure 24-25). 

The variable regions of the light and heavy chains come together to form the 
antigen-binding sites, and the variability of their amino acid sequences provides 
the structural basis for the diversity of these binding sites. The greatest diversity 
occurs in three small hypervariable regions in the variable regions of both light 
and heavy chains. Only about 5-10 amino acids in each hypervariable region form 
the actual antigen-binding site (Figure 24-26). As a result, the size of the anti- 
genic determinant that an Ig molecule recognizes is generally comparably small: 
it can consist of fewer than 10 amino acids on the surface of a globular protein, for 
example (see Figure 24-22). 

Both light and heavy chains are made up of repeating segments, each about 
110 amino acids long and each containing one intrachain disulfide bond. Each 
repeating segment folds independently to form a compact functional unit called 
an immunoglobulin (Ig) domain. As shown in Figure 24-27A, a light chain consists 
of one variable (VL) and one constant (CL) domain, whereas a heavy chain 
has one variable and three or four constant domains: the variable domains of the 
light and heavy chains pair to form the antigen-binding region. Each Ig domain 
has a very similar three-dimensional structure, consisting of a sandwich of two β 
Figure 24-25 Constant and variable regions of Ig chains. The variable regions of the light 
and heavy chains form the antigen-binding sites, while the constant regions of the heavy chains 
determine the other biological properties of an Ig protein. The different subclasses of IgG antibodies 
have different γ-chain constant regions. 
sheets held together by a disulfide bond; the variable domains are unique in that 
each has its particular set of hypervariable regions, which are arranged in three 
hypervariable loops that cluster together at the ends of the variable domains to 
form the antigen-binding site (Figure 24-27B). 


Ig Genes Are Assembled From Separate Gene Segments During 
B Cell Development 


Even in the absence of antigen stimulation, a human can probably make more 
than 10¹² different Ig molecules (its preimmune, primary Ig repertoire). The primary 
repertoire consists of IgM and IgD proteins and is apparently large enough 
to ensure that there will be an antigen-binding site to fit almost any potential 
Figure 24-27 Ig domains. (A) The light and heavy chains in an Ig protein are each folded into similar repeating domains. The variable domains 
(shaded in blue) of the light and heavy chains (VL and VH) make up the antigen-binding sites, while the constant domains (shaded in gray) of the 
heavy chains (mainly CH2 and CH3) determine the other biological properties of the protein. The heavy chains of IgM and IgE do not have a hinge 
region and have an extra constant domain (CH4). Hydrophobic interactions between domains on adjacent chains help hold the chains together in the 
Ig molecule: VL binds to VH, CL binds to CH1, and so on. (B) X-ray crystallography-based structures of the Ig domains of a light chain (Movie 24.5). 
Both the variable and constant domains have a similar overall structure, consisting of two β sheets joined by a disulfide bond (red). Note that all the 
hypervariable regions (black) form loops at the far end of the variable domain, where they come together to form part of the antigen-binding site. All 
Igs are glycosylated on their CH2 domains (not shown); the attached oligosaccharide chains vary from Ig to Ig and can greatly influence the biological 
properties of the protein, largely by affecting its binding to Fc receptors on immune cells. 
antigenic determinant, albeit with low affinity (Ka = 10⁵-10⁷ liters/mole). After 
stimulation by antigen and helper T cells, B cells can switch from making IgM and 
IgD to making other classes of Ig, a process called class switching. In addition, the 
binding affinity of these Igs for their antigen progressively increases over time, a 
process called affinity maturation. Thus, antigen stimulation generates a secondary 
Ig repertoire, with a greatly increased affinity (Ka up to 10¹¹ liters/mole) and 
diversity of both Ig classes and antigen-binding sites. 

How can each of us make so many different Igs? The problem is not quite as 
formidable as it might first appear. Recall that the variable regions of the Ig light 
and heavy chains usually combine to form the antigen-binding site. Thus, if we 
had 1000 genes encoding light chains and 1000 genes encoding heavy chains, we 
could, in principle, combine their products in 1000 × 1000 different ways to make 
10⁶ different antigen-binding sites. Nonetheless, we have evolved special genetic 
mechanisms to enable our B cells to generate an almost unlimited number of dif- 
ferent light and heavy chains in a remarkably economical way. We do so in two 
steps. First, before antigen stimulation, developing B cells join together separate 
gene segments in DNA to create the genes that encode the primary repertoire of 
low-affinity IgM and IgD proteins. Second, after antigen stimulation, the assem- 
bled Ig genes can undergo two further changes—mutations that can increase the 
affinity of their antigen-binding site and DNA rearrangements that switch the 
class of Ig made. Together, these changes produce the secondary repertoire of 
high-affinity IgG, IgE, and IgA proteins. 
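
The counting argument at the start of this passage is a simple product: every light chain can, in principle, pair with every heavy chain. The short Python sketch below checks the arithmetic and confirms, on a deliberately tiny and purely hypothetical set of chains, that enumerating the pairs gives the same number as multiplying the two counts. The function name `pairing_count` and the chain labels are illustrative assumptions.

```python
from itertools import product


def pairing_count(n_light, n_heavy):
    """Number of distinct light/heavy pairings if any light can join any heavy."""
    return n_light * n_heavy


# Hypothetical toy repertoire, small enough to enumerate explicitly.
toy_light = ["L1", "L2", "L3"]
toy_heavy = ["H1", "H2", "H3", "H4"]
toy_pairs = list(product(toy_light, toy_heavy))

if __name__ == "__main__":
    print(pairing_count(1000, 1000))           # the hypothetical case in the text
    print(len(toy_pairs), pairing_count(3, 4))  # enumeration agrees with the product
```

The same multiplicative logic is what makes a modest number of inherited components sufficient to specify a very large number of binding sites.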

We produce our primary Ig repertoire by joining separate Ig gene segments 
together during B cell development. Each type of Ig chain (κ light chains, λ light 
chains, and heavy chains) is encoded by a separate locus on a separate chromosome. 
Each locus contains a large number of gene segments encoding the V region 
of an Ig chain, and one or more gene segments encoding the C region. During the 
development of a B cell in the bone marrow, a complete coding sequence for each 
of the two Ig chains to be synthesized is assembled by site-specific genetic recom- 
bination (discussed in Chapter 5). Once a V-region coding sequence is assembled 
next to a C-region sequence, it can then be co-transcribed and the resulting RNA 
transcript processed to produce an mRNA molecule that codes for the complete 
Ig polypeptide chain. 

Each light-chain V region, for example, is encoded by a DNA sequence assem- 
bled from two gene segments—a long V gene segment and a short joining or 
J gene segment (Figure 24-28). Each heavy-chain V region is similarly con- 
structed by combining gene segments, but here an additional diversity segment, or 
D gene segment, is also required (Figure 24-29). In addition to bringing together 
the separate gene segments of the Ig gene, these rearrangements also activate 
transcription from the gene promoter through changes in the relative positions of 
the cis-regulatory DNA sequences acting on the gene. Thus, a complete Ig chain 
can be synthesized only after the DNA has been rearranged. 

The large number of inherited V, J, and D gene segments available for encod- 
ing Ig chains contributes substantially to Ig diversity, and the combinatorial join- 
ing of these segments (called combinatorial diversification) greatly increases this 
contribution. Any of the 35 or so functional V segments in our κ light-chain locus, 
for example, can be joined to any of the 5 J segments (see Figure 24-28), so that 
this locus can encode at least 175 (35 × 5) different κ-chain V regions. Similarly, 
any of the 40 V segments in the human heavy-chain locus can be joined to any of 
the 23 or so D segments and to any of the 6 J segments to encode at least 5520 
(40 × 23 × 6) different heavy-chain V regions. By this mechanism alone, called V(D)J 
recombination, a human can produce 295 different VL regions (175 κ and 120 λ) 
and 5520 different VH regions. In principle, these could then be combined to make 
over 1.5 × 10⁶ (295 × 5520) different antigen-binding sites. 
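
The combinatorial arithmetic above can be checked with a short Python sketch. The segment counts are the approximate figures quoted in the text, the count of 120 λ-chain V regions is taken as given rather than derived, and the variable names are illustrative only.

```python
# Approximate numbers of functional human gene segments quoted in the text.
KAPPA_V, KAPPA_J = 35, 5
HEAVY_V, HEAVY_D, HEAVY_J = 40, 23, 6
LAMBDA_V_REGIONS = 120  # taken directly from the text, not derived here

kappa_v_regions = KAPPA_V * KAPPA_J                  # V-J joining
heavy_v_regions = HEAVY_V * HEAVY_D * HEAVY_J        # V-D-J joining
light_v_regions = kappa_v_regions + LAMBDA_V_REGIONS
# Any light-chain V region can, in principle, pair with any heavy-chain V region.
binding_sites = light_v_regions * heavy_v_regions

print(kappa_v_regions, heavy_v_regions, light_v_regions, binding_sites)
# prints: 175 5520 295 1628400
```

The product, about 1.6 × 10⁶, is the "over 1.5 × 10⁶" figure in the text; it counts segment combinations only and ignores junctional diversification.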

V(D)J recombination is mediated by an enzyme complex called V(D)J recom- 
binase, which recognizes recombination signal sequences in the DNA that flanks 
each gene segment to be joined. Although the process ensures that only appro- 
priate gene segments recombine, a variable number of nucleotides are often lost 
from the ends of the recombining gene segments, and one or more randomly 
chosen nucleotides are also inserted. This random loss and gain of nucleotides 
at joining sites is called junctional diversification, and it enormously increases 
the diversity of V-region coding sequences created by V(D)J recombination (up 
to about 10⁸-fold), specifically in the third hypervariable region. This increased 
diversification comes at a price, however. In many cases, it shifts the reading frame 
to produce a nonfunctional gene, in which case the developing B cell fails to make 
a functional Ig molecule and consequently dies in the bone marrow. Once a B cell 
makes a functional heavy chain and light chain that form an antigen-binding site, it 
turns off the V(D)J recombination process, thereby ensuring that the cell makes Ig of 
only one antigen-binding specificity. 
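
Why junctional diversification so often yields a nonfunctional gene can be seen from the reading frame alone. The sketch below is a simplification: it assumes a joint stays in frame only when the net change in length is a multiple of three nucleotides, it ignores stop codons created within the junction, and it weights all outcomes equally. The function name and the 0-9 nucleotide range are hypothetical choices for illustration.

```python
def junction_in_frame(nucleotides_lost: int, nucleotides_added: int) -> bool:
    """Return True if the net length change at a joint preserves the reading frame."""
    net_change = nucleotides_added - nucleotides_lost
    return net_change % 3 == 0

# Enumerate every combination of 0-9 nucleotides lost and 0-9 added
# (an arbitrary illustrative range, not measured values).
outcomes = [junction_in_frame(lost, added)
            for lost in range(10) for added in range(10)]
in_frame_fraction = sum(outcomes) / len(outcomes)
print(f"{in_frame_fraction:.2f} of these joints stay in frame")
```

Under this equal-weighting assumption only about one joint in three remains in frame, which is consistent with the point that many rearrangements are nonproductive.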

B cells making BCRs that bind strongly to self antigens in the bone marrow 
would be dangerous. Such B cells maintain expression of an active V(D)J recombinase 
and are activated by such self-binding to undergo a second round of 
V(D)J recombination in a light-chain locus, thereby changing the specificity of their 
BCRs (the process of receptor editing discussed earlier). Self-reactive B cells that 
fail to change their specificity die by apoptosis, in the process of clonal deletion 
(see Figure 24-21). 


Antigen-Driven Somatic Hypermutation Fine-Tunes Antibody 
Responses 
As mentioned earlier, with the passage of time following an infection or vaccination, 
there is usually a progressive increase in the affinity of the antibodies produced 
against the pathogen. This phenomenon of affinity maturation is due to 


[Diagram for Figure 24-29, germ-line DNA of the heavy-chain locus: V1, V2 ... V40; D1, D2 ... D~23; J1 ... J6; Cμ, Cδ, Cγ, Cε, Cα] 




Figure 24-28 The V-J joining process involved in making a human κ light chain. 
In the “germ-line” DNA (where the Ig gene segments are not rearranged and are 
therefore not being expressed), the cluster of five J gene segments is separated 
from the C-region coding sequence by a short intron and from the 35 or so 
functional V gene segments by thousands of nucleotide pairs. During the 
development of a B cell, a randomly chosen V gene segment (V3 in this case) is 
moved to lie precisely next to one of the J gene segments (J3 in this case). The 
“extra” J gene segments (J4 and J5) and the intron sequence are transcribed (along 
with the joined V3 and J3 gene segments and the C-region coding sequence) and 
then removed by RNA splicing to generate mRNA molecules with contiguous V3, 
J3, and C sequences, as shown. These mRNAs are then translated into κ light 
chains. A J gene segment encodes the 15 or so C-terminal amino acids of the 
V region, and a short sequence containing the V-J segment junction encodes the 
third hypervariable region, which is the most variable part of the light-chain 
V region. 


Figure 24-29 The human heavy-chain locus. There are 40 V segments, about 23 
D segments, 6 J segments, and an ordered cluster of C-region coding sequences, 
each cluster encoding a different class of heavy chain. The D segment (and part 
of the J segment) encodes amino acids in the third hypervariable region, which 
is the most variable part of the heavy-chain V region. The genetic mechanisms 
involved in producing a heavy chain are the same as those shown in Figure 
24-28 for light chains, except that two DNA rearrangement steps are required 
instead of one: first a D segment joins to a J segment, and then a V segment 
joins to the rearranged DJ segment. The rearrangements lead to the production of 
a VDJC mRNA that encodes a complete Ig heavy chain. The figure is not drawn to 
scale: the total length of the heavy-chain locus is over two megabases. Moreover, a 
number of details are omitted: for example, the exons encoding each C-region Ig 
domain and the hinge region (see Figure 24-27) and the different subclasses of 
Cγ-coding segments are not shown. 

the accumulation of point mutations in both heavy-chain and light-chain V-region 
coding sequences. The mutations occur long after the coding regions have 
been assembled. After B cells have been stimulated by antigen and helper T cells 
in a peripheral lymphoid organ, some of the activated B cells proliferate rapidly 
in the lymphoid follicles and form germinal centers (see Figure 24-20). Here, the 
B cells mutate at the rate of about one mutation per V-region coding sequence 
per cell generation. Because this is about a million times greater than the sponta- 
neous mutation rate in other genes and occurs in somatic cells rather than germ 
cells, the process is called somatic hypermutation. 

Very few of the altered Igs generated by hypermutation will have an increased 
affinity for the antigen. But, because the same Ig genes produce both BCRs and 
secreted antibodies, the antigen will stimulate preferentially those few B cells that 
do make BCRs with increased affinity for the antigen. Clones of these altered B 
cells will preferentially survive and proliferate, especially as the amount of antigen 
decreases to very low levels late in the response. Most other B cells in the germinal 
center will die by apoptosis. Thus, as a result of repeated cycles of somatic hyper- 
mutation followed by antigen-driven proliferation of selected clones of effector 
and memory B cells, antibodies of increasingly higher affinity become abundant 
during an adaptive immune response, providing progressively better protection 
against the pathogen (Movie 24.6). 
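
The cycle of hypermutation followed by selection described above can be illustrated with a toy simulation. This is only a sketch under loudly stated assumptions: affinity is a unitless score rather than a measured Ka, each mutation shifts the score by a random amount that is harmful on average, and the fixed number of highest-scoring cells (parents or mutants) survives each round. The function name and all parameter values are hypothetical.

```python
import random

def affinity_maturation(rounds=10, clone_count=50, keep=10, seed=0):
    """Toy mutate-and-select loop; returns the best affinity score after each round."""
    rng = random.Random(seed)
    # Start with low, arbitrary affinity scores (unitless, not Ka values).
    clones = [rng.uniform(0.0, 1.0) for _ in range(clone_count)]
    best_per_round = [max(clones)]
    for _ in range(rounds):
        # Hypermutation: most changes are neutral or harmful, a few are beneficial.
        mutants = [score + rng.gauss(-0.1, 0.3) for score in clones]
        # Selection: only the highest-affinity cells (parents or mutants) survive...
        survivors = sorted(clones + mutants, reverse=True)[:keep]
        # ...and proliferate to refill the germinal center; the rest die.
        clones = [survivors[i % keep] for i in range(clone_count)]
        best_per_round.append(max(clones))
    return best_per_round

history = affinity_maturation()
print([round(score, 2) for score in history])
```

Because unmutated parents compete alongside their mutants, the best score in this sketch can never fall from one round to the next, even though most individual mutations lower affinity.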

A breakthrough in understanding the molecular mechanism of somatic hypermutation 
came with the identification of an enzyme that is required for the process. 
It is called activation-induced deaminase (AID) because it is expressed specifi- 
cally in activated B cells and deaminates cytosine (C) to uracil (U) during tran- 
scription of V-region coding DNA. The deamination produces U:G mismatches in 
the DNA double helix, and the repair of these mismatches produces various types 
of mutations, depending on the repair pathway used. Somatic hypermutation 
affects only actively transcribed DNA, because AID works only on single-stranded 
DNA (which is transiently exposed during transcription) and because proteins 
involved in the transcription of V-region coding sequences are required to recruit 
the AID enzyme. AID is also required for activated B cells to switch from IgM and 
IgD production to the production of the other classes of Ig, as we now discuss. 


B Cells Can Switch the Class of Ig They Make 


After a developing B cell leaves the bone marrow, before it interacts with antigen, it 
expresses both IgM and IgD BCRs on its surface, both with the same antigen-bind- 
ing sites (see Figure 24-24). Stimulation by antigen and helper T cells activates 
many of these mature naive B cells to become IgM-secreting effector cells, so that 
IgM antibodies dominate the primary antibody response. Later in the immune 
response, however, when activated B cells are undergoing somatic hypermuta- 
tion, the combination of antigen and helper-T-cell-derived cytokines (discussed 
later) stimulates many of the B cells to switch from making membrane-bound 
IgM and IgD to making IgG, IgE, or IgA, in the process of class switching. Some 
of these cells become memory cells that express the corresponding class of Ig 
as BCRs on their surface, while others become effector cells that secrete the Ig 
molecules as antibodies. The IgG, IgE, and IgA molecules retain their original 
antigen-binding site and are collectively referred to as secondary classes of Igs, 
because they are produced only after antigen stimulation, dominate secondary 
antibody responses, and make up the secondary Ig repertoire. 

As discussed earlier, the constant region of an Ig heavy chain determines 
the class of the Ig. Thus, the ability of B cells to switch the class of antibody they 
make without changing the antigen-binding site implies that the same assembled 
VH-region coding sequence (which specifies the antigen-binding part of the 
heavy chain) can sequentially associate with different CH-coding sequences. This 
has important functional implications. It means that, in an individual animal, a 
particular antigen-binding site that has been selected by environmental antigens 
can be distributed among the various classes of antibodies, thereby acquiring the 
different biological properties of each class. 

When a B cell switches from making IgM and IgD to one of the secondary classes 
of Ig, an irreversible change occurs in the DNA, a process called class switch 
recombination. It entails the deletion of all the CH-coding sequences between 
the assembled VDJ-coding sequence and the particular CH-coding sequence 
that the cell is destined to express. Class switch recombination differs from V(D)J 
recombination in several ways. (1) It happens after antigen stimulation, mainly in 
germinal centers, and depends on helper T cells. (2) It uses different recombination 
signal sequences, called switch sequences, which flank the different CH-coding 
segments. (3) It involves cutting and joining the switch sequences, which are 
noncoding sequences, and leaves the assembled VH-region coding sequence 
unchanged (Figure 24-30). (4) Most importantly, the molecular mechanism is 
different. It depends on AID, which is also involved in somatic hypermutation, 
rather than on the V(D)J recombinase. The cytokines that activate class switching 
induce the production of transcription regulators that activate transcription from 
the relevant switch sequences, allowing the recruitment of AID to these sites. 
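
The deletion involved in class switch recombination, and why it is irreversible, can be sketched by treating the heavy-chain locus as an ordered list. This is a deliberately simplified model: the labels and their order are hypothetical stand-ins for the C-region coding sequences, subclasses are omitted, and the sketch does not model the fact that B cells do not switch to the Cδ sequence.

```python
# Simplified order of heavy-chain coding sequences (hypothetical labels; real
# loci contain several C-gamma and C-alpha subclasses that are omitted here).
GERMLINE_LOCUS = ["VDJ", "Cmu", "Cdelta", "Cgamma", "Cepsilon", "Calpha"]

def class_switch(locus, target):
    """Delete every C-region sequence between VDJ and the target C region."""
    if target not in locus[1:]:
        raise ValueError(f"{target} is not available in this locus")
    # Keep the assembled VDJ sequence unchanged and everything from the target on.
    return ["VDJ"] + locus[locus.index(target):]

iga_locus = class_switch(GERMLINE_LOCUS, "Calpha")
print(iga_locus)  # prints: ['VDJ', 'Calpha']
```

Once the intervening sequences are gone they cannot be recovered, so calling `class_switch(iga_locus, "Cgamma")` raises an error in this sketch, mirroring the irreversible change in the DNA.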

Once bound, AID initiates switch recombination by deaminating some cyto- 
sines to uracil in the vicinity of these switch sequences. Excision of these uracils is 
thought to lead to double-strand breaks in the participating switch regions, which 
are then joined by a form of nonhomologous end joining (discussed in Chapter 5). 

Thus, whereas the primary Ig repertoire in humans (and mice) is generated by 
V(D)J joining mediated by V(D)J recombinase, the secondary antibody repertoire 
is generated by somatic hypermutation and class switch recombination, both of 
which are mediated by AID. Figure 24-31 lists the main mechanisms that we have 
discussed in this chapter that diversify Igs. 


Summary 


Each B cell clone makes Ig molecules with a unique antigen-binding site. Initially, 
the Ig molecules are inserted into the plasma membrane and serve as B cell recep- 
tors (BCRs) for antigen. Antigen binding to the BCRs, together with co-stimulatory 
signals from helper T cells, activates the B cells to proliferate and differentiate into 
either memory cells or antibody-secreting effector cells. The effector cells secrete large 
amounts of antibodies with the same antigen-binding site as the BCRs. 

A typical Ig molecule is composed of four polypeptide chains—two identical 
heavy chains and two identical light chains. Parts of both the heavy and light chains 
form the two identical antigen-binding sites. There are multiple classes of Ig (IgA, 
IgD, IgE, IgG, and IgM), each with a distinctive heavy chain, which determines the 

Figure 24-30 An example of the DNA rearrangement that occurs in class 
switch recombination. A B cell making IgM molecules with a V region encoded 
by a particular assembled VDJ DNA sequence is stimulated to switch to 
making IgA molecules with the same V region. In the process, it deletes the 
DNA between the VDJ sequence and the Cα-coding sequence. Specific DNA 
sequences (switch sequences) located upstream of each CH-coding sequence 
(except Cδ, as B cells don't switch from Cμ to Cδ) can recombine with one another, 
with the deletion of the intervening DNA, as shown here. As discussed in the text, 
the recombination process depends on AID, the same enzyme that is involved in 
somatic hypermutation. When switching from IgM to IgG or IgE, the C-region coding 
sequences downstream of Cγ or Cε, which remain after the DNA deletion, are removed 
during RNA splicing. 





[Diagram labels for Figure 24-31: somatic hypermutation + class switch recombination] 





Figure 24-31 The main mechanisms of 
Ig diversification in mice and humans. 
Those shaded in green occur during B cell 
development in the bone marrow, whereas 
the two mechanisms shaded in red occur 
when B cells are stimulated by foreign 
antigen and helper T cells in germinal 
centers in peripheral lymphoid organs, 
either late in a primary response or in a 
secondary response. 

biological properties of the Ig class. Each light and heavy chain is composed of a 
number of Ig domains. The amino acid sequence variation in the variable domains 
of both light and heavy chains is concentrated in several small hypervariable 
regions, which form loops at one end of these domains to produce the antigen-bind- 
ing site. 

Igs are encoded by loci on three different chromosomes, each of which is responsible 
for producing a different polypeptide chain: a κ light chain, a λ light chain, 
or a heavy chain. Each locus contains separate gene segments that encode different 
parts of the variable region of the particular Ig chain. Each light-chain locus con- 
tains one or more constant- (C-) region coding sequences and sets of variable (V) 
and joining (J) gene segments. The heavy-chain locus contains sets of C-region coding 
sequences and sets of V, diversity (D), and J gene segments. 

During B cell development in the bone marrow, before antigen stimulation, sepa- 
rate gene segments are brought together by site-specific recombination that depends 
on a V(D)J recombinase. A VL gene segment recombines with a JL gene segment to 
produce a DNA sequence coding for the V region of a light chain, and a VH gene segment 
recombines with a D and a JH gene segment to produce a DNA sequence coding 
for the V region of a heavy chain. Each of the newly assembled V-region coding 
sequences is then co-transcribed with the appropriate C-region sequence to produce 
an RNA molecule that codes for the complete Ig polypeptide chain. 

By randomly combining inherited gene segments that code for the variable 
regions during B cell development, humans can make hundreds of different light 
chains and thousands of different heavy chains. Because the antigen-binding site 
is formed where the hypervariable loops of the VL and VH domains come together 
in the final Ig molecule, the heavy and light chains can potentially pair to form 
Igs with millions of different antigen-binding sites. A loss or gain of nucleotides at 
the site of gene-segment joining increases this number enormously. The Igs made 
by such V(D)J recombination before antigen stimulation are IgMs and IgDs with low 
affinity for binding antigen, and they constitute the primary Ig repertoire. 

Igs are further diversified following antigen stimulation in peripheral lymphoid 
organs by the AID- and helper-T-cell-dependent processes of somatic hypermuta- 
tion and class switch recombination, which together produce the high-affinity IgG, 
IgE, and IgA Igs that constitute the secondary Ig repertoire. The process of class 
switching allows the same antigen-binding site to be incorporated into antibodies 
that have different tails and therefore different biological properties. 

Like antibody responses, T-cell-mediated immune responses are exquisitely antigen-specific, 
and they are at least as important as antibodies in defending vertebrates 
against infection. Indeed, most adaptive immune responses, including 
most antibody responses, require helper T cells for their initiation. Most impor- 
tantly, unlike B cells, T cells can help eliminate pathogens that have entered the 
interior of host cells, where they are invisible to B cells and antibodies. Much of 
the rest of this chapter is concerned with how T cells accomplish this feat. 

T cell responses differ from B cell responses in at least two crucial ways. First, 
a T cell is activated by foreign antigen to proliferate and differentiate into effector 
cells only when the antigen is displayed on the surface of an antigen-presenting 
cell (APC), usually a dendritic cell in a peripheral lymphoid organ. One reason 
T cells require APCs for activation is that the form of antigen they recognize is 
different from that recognized by the Igs produced by B cells. Whereas Igs can 
recognize antigenic determinants on the surface of pathogens and soluble folded 
proteins, for example, T cells can only recognize fragments of protein antigens 
that have been produced by partial proteolysis inside a host cell. As mentioned 
earlier, newly formed MHC proteins capture these peptide fragments and carry 
them to the surface of the host cell, where T cells can recognize them. 

The second difference is that, once activated, effector T cells act mainly at short 
range, either within a secondary lymphoid organ or after they have migrated to a 
site of infection. Effector B cells, by contrast, secrete antibodies that can act far 
away. Effector T cells interact directly with another host cell in the body, which 
they either kill (if it is an infected host cell, for example) or signal in some way (if 
it is a B cell or macrophage, for example). We will refer to such host cells as target 
cells. As is the case with APCs, target cells must display an antigen bound to an 
MHC protein on their surface for a T cell to recognize them. 

There are three main classes of T cells—cytotoxic T cells, helper T cells, and 
regulatory T cells. When activated, they function as effector cells (see Figure 
24-17), each with their own distinct activities. Effector cytotoxic T cells directly kill 
cells that are infected with a virus or some other intracellular pathogen. Effector 
helper T cells help stimulate the responses of other immune cells—mainly mac- 
rophages, dendritic cells, B cells, and cytotoxic T cells; as we will see, there are a 
variety of functionally distinct subtypes of helper T cells. Effector regulatory T cells 
suppress the activity of other immune cells. 

In this section, we describe these classes and subclasses of T cells and their 
respective functions. We discuss how they recognize foreign antigens on the sur- 
face of APCs or target cells and the crucial part played by MHC proteins in the rec- 
ognition process. We begin by considering the cell-surface receptors that T cells 
use to recognize antigen. 


T Cell Receptors (TCRs) Are Ig-like Heterodimers 


T cell receptors (TCRs), unlike Igs made by B cells, exist only in membrane-bound 
form. They are composed of two transmembrane, disulfide-linked polypeptide 
chains, each of which contains two Ig-like domains, one variable and one constant. 
On most T cells, the TCRs have one α chain and one β chain (Figure 24-32). 

The genetic loci that encode the α and β chains are located on different chromosomes. 
Like an Ig heavy-chain locus (see Figure 24-29), the TCR loci contain 
separate V, D, and J gene segments (or just V and J gene segments in the case 
of the α-chain locus), which are brought together by site-specific recombination 
during T cell development in the thymus. With one exception, T cells use the same 
mechanisms to generate antigen-binding site diversity of their TCRs as B cells 
use to generate antigen-binding site diversity of their Igs, and they use the same 
V(D)J recombinase; thus, humans or mice deficient in this recombinase can- 
not make functional B or T cells. The mechanism that does not operate in TCR 
diversification is antigen-driven somatic hypermutation. Thus, the affinities of 
TCRs tend to be low (Ka = 10⁵–10⁷ liters/mole). Various co-receptors and cell-cell 
adhesion proteins, however, greatly strengthen the binding of a T cell to an APC 
or target cell. 

Figure 24-32 A T cell receptor (TCR) heterodimer. (A) Schematic drawing 
showing that the receptor is composed of an α and a β polypeptide chain. Each chain 
has a large extracellular part that is folded into two Ig-like domains, one variable 
(V) and one constant (C). A Vα and a Vβ domain (shaded in blue) form the 
antigen-binding site. Unlike Igs, which have two binding sites for antigen, TCRs 
have only one. The αβ heterodimer is noncovalently associated with a large set 
of invariant membrane-bound proteins (not shown), which help activate the T cell when 
the TCRs bind their specific antigen (see Figure 24-45B). A typical T cell has about 
30,000 TCRs on its surface. (B) The three-dimensional structure of the extracellular 
part of a TCR. The antigen-binding site is formed by the hypervariable loops of 
both the Vα and Vβ domains (black), and it is similar in its overall dimensions and 
geometry to the antigen-binding site of an Ig molecule. (B, based on K.C. Garcia et 
al., Science 274:209-219, 1996.) 

Instead of making α and β chains, a minority of T cells makes a different but 
related type of TCR heterodimer, composed of γ chains and δ chains. Although 
these γ/δ T cells normally make up only 5-10% of the T cells in human blood, 
they can be the dominant T cell population in epithelia (in the skin and gut, for 
example). They have some properties in common with natural killer (NK) cells 
and with an enlarging category of T-like cells that have features of both innate 
and adaptive immune cells, which are sometimes collectively referred to as innate 
lymphoid cells. The cells in all these categories tend to be enriched in mucosal tis- 
sues, respond early to infection, display little immunological memory, and, com- 
pared with B and T cells, have surface receptors of restricted diversity. We will not 
discuss them further. 

As with BCRs, TCRs are tightly associated in the plasma membrane with a 
number of invariant membrane-bound proteins that are involved in passing the 
signal from an antigen-activated receptor to the cell interior. We will discuss these 
proteins in more detail later, when we consider some of the molecular events 
involved in T and B cell activation. First, we consider the special ways in which T 
cells recognize foreign antigen on the surface of an APC or target cell. 


Activated Dendritic Cells Activate Naive T Cells 


Generally, naive T cells, including naive helper and cytotoxic T cells, proliferate 
and differentiate into effector cells and memory cells only when they see their 
specific antigen on the surface of an activated dendritic cell in a peripheral lym- 
phoid organ (Figure 24-33). The activated dendritic cell displays the antigen in 
a complex with MHC proteins on its surface, along with co-stimulatory proteins 
(see Figure 24-11). The memory T cells that develop, however, can be activated 
by the same antigen-MHC complex on the surface of other types of APCs (target 
cells), including macrophages and B cells—as well as by dendritic cells. 

Immature dendritic cells are located in most tissues—underlying epithelial 
layers of the skin and gut, for example—where they are constantly sampling and 
processing proteins in their environment. They become activated to mature when 
their pattern recognition receptors (PRRs) encounter pathogen-associated molecular 
patterns (PAMPs) on an invading pathogen or its products. The pathogen or 
products are ingested, and the microbial proteins are cleaved into peptide frag- 
ments, which are loaded onto MHC proteins, as we discuss later. The activated 
dendritic cells then migrate via the lymph from the site of infection to local lymph 
nodes or gut-associated lymphoid organs, where they present the foreign anti- 
gens, displayed as peptide-MHC complexes on the dendritic cell surface, for rec- 
ognition by the relevant T cells (see Figure 24-11). 

Activated dendritic cells display three types of protein molecules on their sur- 
face that have a role in activating a T cell to become an effector cell or memory 
cell (Figure 24-34): (1) MHC proteins, which present foreign peptides to the TCRs; 
(2) co-stimulatory proteins, which bind to complementary receptors on the T cell 
surface; and (3) cell-cell adhesion molecules, which enable a T cell to bind to the 
dendritic cell for long enough to become activated, typically several hours. In 
addition, activated dendritic cells secrete a variety of cytokines that influence the 
type of effector helper T cell that develops, and different types of dendritic cells 
promote different outcomes (discussed later). 


T Cells Recognize Foreign Peptides Bound to MHC Proteins 


MHC proteins capture and display peptide fragments of foreign proteins for 
presentation to T cells. There are two main classes of MHC proteins, which are 
structurally and functionally distinct. Class I MHC proteins mainly present for- 
eign peptides to cytotoxic T cells, whereas class II MHC proteins mainly present 
foreign peptides to helper and regulatory T cells (Figure 24-35). Some class-I-like 
MHC proteins present microbial lipid and glycolipid antigens to T cells, but they 
are not encoded within the MHC region of the genome, and we will not consider 
them further. 

Figure 24-33 Immunofluorescence micrograph of a dendritic cell in culture. 
These APCs derive their name from their long processes, or “dendrites.” The cell has 
been labeled with a monoclonal antibody that recognizes a surface antigen on these 
cells. (Courtesy of David Katz.) 

Both class I and class II MHC proteins are heterodimers, in which two extracellular 
domains form a peptide-binding groove, which always has a variable small 
peptide bound in it. In class I MHC proteins, the two domains that form the 
peptide-binding groove are provided by the transmembrane α chain, which is 
noncovalently associated with a small subunit called β2-microglobulin; in class II MHC 
proteins, a different α chain and a large noncovalently associated β chain each 
contribute an extracellular domain to form the peptide-binding groove (Figure 
24-36). A TCR binds to both the peptide and the ridges of the binding groove. 
Humans have three major class I proteins, called HLA-A, HLA-B, and HLA-C, 
and three class II proteins, called HLA-DR, HLA-DP, and HLA-DQ (HLA stands 
for human-leukocyte-associated, as these proteins were first demonstrated on 
human leukocytes). Figure 24-37 shows how the genes that encode these pro- 
teins are arranged on human chromosome 6. 

There are important differences between the class I and class II MHC pro- 
teins with regard to the cell types that express them and the origin of the pep- 
tides in their peptide-binding grooves. Almost all of our nucleated cells express 
class I proteins. Their peptide-binding groove displays one of a diverse collection 
of peptides (typically 8-10 amino acids in length). In a healthy cell, the peptides 
originate from the cell’s own cytosolic and nuclear proteins that have undergone 
partial degradation in proteasomes in the processes of normal protein turnover 
and quality control mechanisms. Some of the peptide fragments produced in this 
way are actively transported into the lumen of the endoplasmic reticulum (ER), 
through a specialized transporter in the ER membrane, where they are loaded 
onto newly synthesized class I MHC α chains; once a peptide binds, the α chain 
can assemble with its partner chain. The resulting self-peptide-MHC complex is 
then transported through the Golgi apparatus to the cell surface. Such complexes 
are not dangerous, however, because the cytotoxic T cells that could recognize 

Figure 24-34 The three general types of proteins on the surface of an activated 
dendritic cell involved in activating a T cell. Although only membrane-bound 
co-stimulatory molecules are shown, activated dendritic cells also secrete 
soluble co-stimulatory molecules. The invariant polypeptide chains that are 
always stably associated with the TCR are not shown; they are illustrated 
in Figure 24-45B and Movie 24.7. 

Figure 24-35 Recognition by T cells of foreign peptides bound to MHC 
proteins. Cytotoxic T cells recognize foreign peptides in association with class 
I MHC proteins, whereas helper T cells and regulatory T cells recognize foreign 
peptides in association with class II MHC proteins. In both cases, the T cell 
recognizes the peptide-MHC complexes on the surface of an APC, either a 
dendritic cell or a target cell. Some regulatory T cells recognize self peptides 
in association with class II MHC proteins 
(not shown). 

Figure 24-36 Class I and class II MHC proteins. (A) The α chain of the class I molecule has three extracellular domains, α1, α2, and 
α3, each encoded by a separate exon. The α chain is noncovalently associated with a smaller polypeptide chain, β2-microglobulin, 
which is not encoded within the MHC region of the genome. The α3 domain and β2-microglobulin are Ig-like. While β2-microglobulin is 
invariant, the α chain is extremely polymorphic, mainly in the α1 and α2 domains. (B) In class II MHC proteins, both the α chain and the 
β chain are encoded within the MHC and are polymorphic, mainly in the α1 and β1 domains; the α2 and β2 domains are Ig-like. Thus, 
there are striking similarities between class I and class II MHC proteins. In both, the two outermost domains (shaded in blue) are 
polymorphic and interact to form a groove that binds peptide fragments. (C) The three-dimensional structure of the peptide-binding 
groove of a human class I MHC protein is viewed from above, with bound peptide shown schematically; a peptide must be bound in the 
groove for the MHC protein to assemble and be transported to the cell surface. The sides of the groove are formed by two α helices, 
and the floor is formed by a β pleated sheet. The S-S disulfide bond is shown in red (Movie 24.8 and Movie 24.9). (C, adapted from 
P.J. Bjorkman et al., Nature 329:506-512, 1987. With permission from Macmillan Publishers Ltd.) 


them have been either eliminated or inactivated, or suppressed by regulatory T 
cells in the process of self-tolerance. By contrast, in a cell infected by a pathogen 
such as a virus, the pathogen proteins will be processed in the same way, and 
peptides derived from them will be displayed on the infected cell surface bound 
to class I MHC proteins; there, they are recognized by cytotoxic T cells expressing 
the appropriate TCRs, thereby targeting the infected cell for destruction (Figure 
24-38). 

In general, only antigen-presenting cells (APCs) express class II MHC pro- 
teins. Dendritic cells are referred to as professional APCs, as they are specialized 
for this function and only they can activate naive T cells. Other immune cells that 
are targets of effector T cell regulation, including B cells and macrophages, are 
nonprofessional APCs. All APCs load their newly synthesized class II MHC pro- 
teins with peptides derived mainly from extracellular proteins that are endocy- 
tosed and delivered to endosomes. The newly synthesized class II MHC proteins 
initially contain an invariant chain, which occupies the peptide-binding groove 

Figure 24-37 Human MHC genes. This 
simplified schematic drawing shows the 
location of the genes that encode the 
transmembrane subunits of class I (light 
green) and class II (dark green) MHC 
proteins. The genes shown encode three 
types of class I MHC proteins (HLA-A, 
HLA-B, and HLA-C) and three types of 
class II MHC proteins (HLA-DP, HLA-DQ, 
and HLA-DR). An individual can therefore 
make six types of class I MHC proteins 
(three encoded by maternal genes and 
three by paternal genes) and more than six 
types of class II MHC proteins. Because 
of the extreme polymorphism of the MHC 
genes, the chances are very low that 
the maternal and paternal alleles will be 
the same. The number of class II MHC 
proteins that can be made is greater than 
six because there are two DR β genes 
and because maternally encoded and 
paternally encoded polypeptide chains can 
sometimes pair. The entire region shown 
spans about seven million base pairs and 
contains other genes that are not shown. 

and prevents it from prematurely binding a peptide until the class II MHC protein 
reaches specialized vesicles, which fuse with endosomes. Here, the invariant 
chain is removed and peptide fragments (typically 12-20 amino acids long) 
produced from endocytosed proteins can bind to the groove of the class II MHC 
proteins, which are then transported to the plasma membrane for display on the 
surface of the APC. In a healthy host cell, class II MHC protein grooves are loaded 
with self-peptides derived from normal proteins and will be ignored by T cells 
because of self-tolerance mechanisms. During an infection, however, pathogen 
proteins are also endocytosed and processed in the same way, enabling APCs to 
present pathogen peptides bound to class II MHC proteins to T cells expressing an 
appropriate TCR (Figure 24-39). 

The distinction just discussed between the antigen-processing pathways for 
loading peptides onto class I and class II MHC proteins is not absolute. Dendritic 
cells, for example, need to be able to activate cytotoxic T cells to kill virus-in- 
fected cells even when the virus does not infect dendritic cells themselves. To 
do so, specialized subsets of dendritic cells use a process called cross-presenta- 
tion, which begins when these noninfected dendritic cells phagocytose virus-in- 
fected host cells or their fragments. The ingested viral proteins are then released 
by an unknown mechanism from phagolysosomes into the cytosol, where they 
are degraded in proteasomes; the resulting protein fragments are then trans- 
ported into the ER lumen, where they load onto assembling class I MHC proteins. 
Cross-presentation in dendritic cells is not confined to endocytosed pathogens 
and their products: it also operates to activate cytotoxic T cells against tumor anti- 
gens of cancer cells and the MHC proteins of foreign organ grafts. 

Figure 24-38 The processing of a viral protein for presentation to cytotoxic T cells. An effector cytotoxic T cell kills a 
virus-infected cell when it recognizes fragments of an internal viral protein bound to class I MHC proteins on the surface of the infected cell. Not all viruses 
enter the cell in the way that this enveloped RNA virus does, but fragments of internal viral proteins always follow the pathway shown. Only a small 
proportion of the viral proteins synthesized in the cytosol are degraded and transported to the cell surface, but this is sufficient to attract an attack 
by a cytotoxic T cell. Several chaperone proteins in the ER lumen aid the folding and assembly of class I MHC proteins (not shown). The assembly 
of class I MHC proteins and their transport to the cell surface require the binding of either a self or foreign peptide (Movie 24.10). 

Figure 24-39 The processing of an extracellular foreign protein for presentation to 
a helper T cell. This simplified depiction shows how peptide-class-II-MHC complexes 
are formed in endosomes and delivered via vesicles to the cell surface. Viral envelope 
glycoproteins can also be processed by this pathway for presentation to helper T cells 
(not shown): these glycoproteins are normally made in the ER and transported via the 
Golgi for insertion into the plasma membrane; although most of these glycoproteins will 
be incorporated into the envelope of budding viral particles, some will be endocytosed 
and enter endosomes, from where they can enter the class II MHC processing pathway. 


During an infection, only a small fraction of the many thousands of MHC pro- 
teins on the surface of an APC or target cell will have pathogen peptides bound 
to them. This is sufficient, however: fewer than 50 copies of such a peptide-MHC 
complex on a dendritic cell, for example, can activate a helper T cell that has a 
TCR that binds the complex with a high-enough affinity. Table 24-3 compares the 
properties of class I and class II MHC proteins. 


MHC Proteins Are the Most Polymorphic Human Proteins Known 


Although any individual can make only a small number of different class I and class 
II MHC proteins, together, these proteins must be able to present peptide frag- 
ments from almost any foreign protein to T cells. Thus, unlike the antigen-binding 
site of an Ig protein, the peptide-binding groove of each MHC protein must be 


TABLE 24-3 Properties of Class I and Class II MHC Proteins 

                             Class I                       Class II 
Genetic loci                 HLA-A, HLA-B, HLA-C           HLA-DP, HLA-DQ, HLA-DR 
Chain structure              α chain + β2-microglobulin    α chain + β chain 
Cell distribution            Most nucleated cells          Dendritic cells, B cells, 
Source of peptide fragments  Mainly proteins made in       Mainly endocytosed plasma membrane 
                             cytoplasm                     and extracellular proteins 

able to bind a very large number of different peptides. The genes encoding class 
I and class II MHC proteins (see Figure 24-37) are the most polymorphic known 
in higher vertebrates: in the human population, for example, there are more than 
2000 allelic variants of these genes. The corresponding variations in the MHC pro- 
teins are concentrated in the floor and walls of the peptide-binding grooves and 
allow MHC molecules in different individuals to bind different arrays of peptides. 

It is thought that infectious diseases have been an important driving force for 
generating this remarkable MHC polymorphism. In the evolutionary war between 
pathogens and the adaptive immune system, pathogens will tend to change their 
proteins through mutation so that the peptides derived from them will not fit 
in the MHC peptide-binding grooves. When a pathogen succeeds, it can sweep 
through a population as an epidemic. In such circumstances, the few individuals 
who produce a new allelic form of MHC protein that can bind peptides derived 
from the altered pathogen will have a large selective advantage. This type of 
selection will tend to promote and maintain a large diversity of MHC proteins 
in the population. In West Africa, for example, individuals with a specific MHC 
allele (HLA-B53) have a reduced susceptibility to a severe form of malaria that is 
endemic there; although this allele is rare elsewhere, it is found in 25% of the West 
African population. 

The extensive diversity of human MHC proteins is the main reason that indi- 
viduals who receive a foreign organ transplant must be treated with strong immu- 
nosuppressive drugs to prevent the immunological rejection of the grafted organ. 
Of all the foreign proteins that the graft expresses, the MHC proteins are by far the 
most powerful stimulators of the recipient’s T cells, which would rapidly destroy 
the graft if they were not prevented from doing so by such drugs. Foreign MHC 
proteins are powerful T cell stimulants because T cells respond to them in the 
same way they respond to self MHC proteins that have foreign peptides bound 
to them; for this reason, the proportion of a person’s T cells that can specifically 
recognize any foreign MHC protein is relatively high. 


CD4 and CD8 Co-receptors on T Cells Bind to Invariant Parts of 
MHC Proteins 


The affinity of TCRs for peptide-MHC complexes on an APC is usually too low by 
itself to mediate a functional interaction between the two cells. T cells normally 
require accessory receptors to help stabilize the interaction by increasing the over- 
all strength of the cell-cell adhesion. Unlike TCRs or MHC proteins, the accessory 
receptors are invariant and do not bind to foreign peptides. Once bound to the 
surface of a dendritic cell, for example, a T cell increases the strength of the bind- 
ing by activating an integrin adhesion protein (discussed in Chapter 19), which 
then binds more strongly to an Ig-like protein on the surface of the dendritic 
cell. This increased adhesion enables the T cell to remain bound long enough to 
become activated. 

When an accessory receptor has a direct role in activating the T cell by gener- 
ating its own intracellular signals, it is called a co-receptor. The most important 
and best understood of the co-receptors on T cells are the CD4 and CD8 proteins, 
both of which are single-pass transmembrane proteins with extracellular Ig-like 
domains. Like TCRs, they recognize MHC proteins, but, unlike TCRs, they bind 
to invariant parts of the MHC protein, far away from the peptide-binding groove. 
CD4 is expressed on both helper T cells and regulatory T cells and binds to class II 
MHC proteins, whereas CD8 is expressed on cytotoxic T cells and binds to class I 
MHC proteins (Figure 24-40). 

CD4 and CD8 contribute to T cell recognition by helping the T cell to focus on 
particular MHC proteins, and thereby on particular types of target cells. Thus, the 
recognition of class I MHC proteins by CD8 allows cytotoxic T cells to focus on 
any type of infected host cell, while the recognition of class II MHC proteins by 
CD4 allows helper and regulatory T cells to focus on the target immune cells that 
they help or suppress, respectively. The cytoplasmic tail of the CD4 and CD8 pro- 
teins is associated with a member of the Src family of cytoplasmic tyrosine kinases 

Figure 24-40 CD4 and CD8 co-receptors 
on the surface of T cells. Cytotoxic T cells 
(Tc) express CD8, which recognizes class I 
MHC proteins, whereas helper T cells 
(TH) and regulatory T cells (not shown) 
express CD4, which recognizes class II 
MHC proteins. Note that the co-receptors 
bind to the same MHC protein that the 
TCR has engaged, so that they are brought 
together with TCRs during the antigen-recognition 
process. Whereas the TCR 
binds to the variable (polymorphic) parts 
of the MHC protein that form the peptide-binding 
groove, the co-receptor binds 
to the invariant part, well away from the 
binding groove. 


1332 Chapter 24: The Innate and Adaptive Immune Systems 


(discussed in Chapter 15) called Lck, which phosphorylates various intracellular 
proteins on tyrosines and thereby participates in the activation of the T cell (dis- 
cussed later). 

The AIDS virus (HIV) uses CD4 molecules (as well as chemokine receptors) to 
enter helper T cells (see Figure 23-17). AIDS patients are susceptible to infection 
by microbes that are not normally dangerous because HIV depletes helper T cells. 
As a result, most AIDS patients die of infection within several years of the onset of 
symptoms, unless they are treated with a combination of anti-HIV drugs. HIV also 
uses CD4 and chemokine receptors to enter macrophages, which also have both 
types of receptors on their surface. 


Developing Thymocytes Undergo Negative and Positive Selection 


T cell development begins when bone-marrow-derived lymphoid progenitor 
cells enter the thymus from the bloodstream. There, the cells receive a variety of 
signals from thymus stromal cells, epithelial cells, macrophages, and dendritic 
cells, which promote their stepwise development into mature thymocytes. At one 
step, the progenitor cells are induced to express V(D)J recombinase and begin to 
rearrange their TCR gene segments. Soon thereafter, the cells express both CD4 
and CD8 co-receptors, and these so-called double-positive thymocytes migrate 
inward and interact with thymus dendritic cells or epithelial cells expressing self 
peptides bound to class I and class II MHC proteins. If the TCR on the thymocyte 
binds with high affinity to these complexes, a strong signal will be transmitted, 
causing the cell to undergo apoptosis. This process, called negative selection, is 
an example of clonal deletion (see Figure 24-21), and it eliminates thymocytes 
that could potentially attack normal host cells and tissues and thereby cause an 
autoimmune disease if the cells were to continue to mature and leave the thymus. 
If its TCR is unable to bind at all to a self-peptide-MHC complex in the thymus, 
the thymocyte will fail to receive the signals it needs to survive and will die 
of “neglect;” without the ability to recognize self-MHC proteins, a T cell would 
generally be of no use, as T cells can only see pathogen-derived peptides in the 
context of self-MHC proteins. Thymocytes that express a TCR that binds with an 
appropriate affinity to a self peptide bound to either a class I MHC protein (using 
CD8 as a co-receptor) or a class II MHC protein (using CD4 as a co-receptor) will 
receive an optimal signal to survive and continue to mature, a process called pos- 
itive selection (Figure 24-41). As part of this maturation process, and depending 
on the TCR’s preference for class I or class II MHC proteins, the CD4 or CD8 co-re- 
ceptor that is not needed is silenced by DNA methylation of the respective gene; 
this results in the development of CD4 or CD8 single-positive thymocytes, which 
exit the thymus as naive T cells and enter the recirculating pool of T cells—the CD4 
cells as either helper or regulatory T cells and the CD8 cells as cytotoxic T cells. 
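
The selection rules just described can be read as a simple decision on TCR affinity for self-peptide-MHC complexes. The following is a minimal sketch of that logic, not a quantitative model: the 0-to-1 affinity scale, the two threshold values, and the function names are assumptions introduced purely for illustration (the text gives no numerical cutoffs).

```python
from enum import Enum


class Fate(Enum):
    DEATH_BY_NEGLECT = "death by neglect"
    POSITIVE_SELECTION = "positive selection"
    NEGATIVE_SELECTION = "negative selection"


# Hypothetical cutoffs on an arbitrary 0-1 affinity scale (illustrative only).
LOW_THRESHOLD = 0.2
HIGH_THRESHOLD = 0.8


def thymocyte_fate(affinity: float) -> Fate:
    """Classify a double-positive thymocyte by TCR affinity for self-peptide-MHC."""
    if affinity < LOW_THRESHOLD:
        # Too little binding: no survival signal is received.
        return Fate.DEATH_BY_NEGLECT
    if affinity >= HIGH_THRESHOLD:
        # Strong binding: strong signal, apoptosis (clonal deletion).
        return Fate.NEGATIVE_SELECTION
    # Intermediate binding: optimal signal to survive and mature.
    return Fate.POSITIVE_SELECTION


def retained_coreceptor(mhc_class: str) -> str:
    """Co-receptor kept by a positively selected cell, by the MHC class its TCR prefers."""
    return {"I": "CD8", "II": "CD4"}[mhc_class]
```

For example, `thymocyte_fate(0.5)` returns `Fate.POSITIVE_SELECTION`, and `retained_coreceptor("I")` returns `"CD8"`, corresponding to a cell that leaves the thymus as a naive cytotoxic T cell.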

Although naive helper and cytotoxic T cells constantly receive survival signals 
in the form of self peptides bound to MHC proteins that the T cells bind weakly, 
a T cell is only activated to proliferate and mount an immune response if its TCR 
binds with high affinity to a peptide-MHC complex and receives co-stimulatory 
signals at the same time. Generally, this happens only when the T cell encoun- 
ters an activated dendritic cell (in a peripheral lymphoid organ) that expresses 
an MHC protein with a foreign peptide derived from a pathogen in its binding 
groove. Only then will the naive T cell proliferate and differentiate into an effector 
or memory T cell. 

Negative selection in the thymus is a major mechanism for ensuring that 
peripheral T cells do not react with host cells expressing MHC proteins with pep- 
tides derived from self proteins in their peptide-binding grooves. This mecha- 
nism, however, requires that the APCs in the thymus display an array of peptides 
on their MHC molecules that will reflect the self proteins in peripheral tissues, as 
well as in the thymus. The thymus, however, would not be expected to produce 
many of the proteins that are specifically expressed in other organs. As an exam- 
ple, it would not be expected to produce insulin, and yet it is crucial to delete thy- 
mocytes with TCRs that could recognize insulin-derived peptides bound to MHC 
proteins on the surface of insulin-secreting β cells in the pancreas. Any failure to 
do so would result in the T-cell-dependent destruction of the β cells and, as a 
consequence, cause type 1 (or juvenile) diabetes. 

The mechanism that enables the deletion of all such cells in the thymus 
depends on a subpopulation of epithelial cells in the thymus that express a tran- 
scriptional regulator called AIRE (autoimmune regulator). By a poorly under- 
stood mechanism, the AIRE protein promotes the production of small amounts 
of mRNA from many genes that encode such “organ-specific” proteins, including 
the insulin gene. When the peptides derived from the proteins encoded by these 
genes are bound by MHC proteins and displayed on the surface of the epithelial 
cells in the thymus medulla, this is sufficient to provoke the deletion of the poten- 
tially self-reactive thymocytes. Mutations that inactivate the AIRE gene cause a 
severe multiorgan autoimmune disease in both mice and humans, indicating the 
importance of AIRE in self-tolerance. 


Cytotoxic T Cells Induce Infected Target Cells to Kill Themselves 


Cytotoxic T cells (Tc cells), like the NK cells discussed earlier, protect us against 
intracellular pathogens, including viruses, bacteria, and parasites, that multiply 
in the cytoplasm of a host cell. Tc cells kill infected host cells before the pathogen 
can escape to infect neighboring host cells. Before it can kill, however, a naive 
Tc cell has to become an effector cell by activation on an APC, usually an acti- 
vated dendritic cell that has pathogen-derived peptides bound to class I MHC 
proteins—a process that depends on helper T cells. The effector Tc cell can then 
recognize any target cell harboring the same pathogen and expressing some of 
the same peptide-MHC complexes on its surface: its TCRs cluster, along with CD8 
co-receptors, adhesion molecules, and intracellular signaling proteins (discussed 
later), at the interface between the two cells, forming an immunological synapse. 
In this process, the effector Tc cell reorganizes its cytoskeleton to focus its kill- 
ing apparatus on the target cell, secreting its toxic proteins into a confined space 
(Figure 24-42); in this way, it avoids killing neighboring cells. A similar synapse 
forms when an effector helper T cell interacts with its target cell, except that the 
co-receptor is CD4 (Movie 24.11). 

Figure 24-41 Positive and negative 
selection in the thymus. Developing 
thymocytes with TCRs that would 
potentially enable them to respond to 
peptides in association with self MHC 
proteins after they leave the thymus are 
positively selected: the binding of their 
TCRs to self peptides bound to self 
MHC proteins in the thymus signals such 
cells to survive, mature, and migrate to 
peripheral lymphoid organs. All of the other 
thymocytes undergo apoptosis, either 
because they do not express TCRs that 
recognize self MHC proteins with self 
peptides bound or because they recognize 
such complexes too well and undergo 
negative selection. 

The regulatory T cells (Treg cells) that 
are positively selected in the thymus are 
called natural Treg cells to distinguish them 
from induced Treg cells, which develop 
in peripheral lymphoid organs from naive 
helper T cells (TH cells), as we discuss 
shortly. 

Figure 24-42 Effector cytotoxic T cells killing target cells in culture. (A) Electron micrograph showing an effector 
cytotoxic T cell (Tc cell) binding to a target cell. The Tc cells were obtained from mice immunized with the target cells, which 
are foreign tumor cells. (B) Electron micrograph showing a Tc cell and a tumor cell that the Tc cell has killed. In an animal, 
as opposed to a culture dish, the killed target cell would be phagocytosed by neighboring cells (especially macrophages) 
long before it disintegrated in the way that it has here. (C) Immunofluorescence micrograph of a Tc cell and tumor cell after 
immunofluorescence staining with anti-tubulin antibodies. Note that the centrosome in the Tc cell is located at the point of 
cell-cell contact with the target cell, an immunological synapse. The secretory granules (not visible) in the Tc cell are initially 
transported along microtubules to the centrosome, which then moves to the synapse, delivering the granules to where they can 
release their contents. (A and B, from D. Zagury et al., Eur. J. Immunol. 5:818-822, 1975. With permission from John Wiley & 
Sons, Inc.; C, from B. Geiger, D. Rosen and G. Berke, J. Cell Biol. 95:137-148, 1982. With permission from the authors.) 


An effector Tc cell (or an NK cell) can employ one of two strategies to kill the 
target, both of which operate by inducing the target cell to activate caspases and 
kill itself by undergoing apoptosis. One mechanism uses a protein called Fas 
ligand on the killer-cell surface, which binds to a transmembrane receptor pro- 
tein called Fas on the target cell; this mechanism is discussed in Chapter 18 (see 
Figure 18-5). The other mechanism is the main one used by both NK cells and 
Tc cells to kill an infected target cell. The killer cell stores various toxic proteins 
within secretory vesicles in its cytoplasm that it releases into the synaptic space 
by exocytosis. The toxic proteins include perforin and proteases called granzymes. 
The perforin is homologous to complement component C9 and polymerizes in 
the target-cell plasma membrane (see Figure 24-8), forming a transmembrane 
pore that disrupts the membrane and allows the granzymes to enter the target 
cell. Once in the cytosol, the granzymes help activate caspases, thereby inducing 
apoptosis (Figure 24-43). 

Figure 24-43 The main way that an effector Tc cell (or NK cell) kills an 
infected target cell. This simplified drawing shows how the killer cell releases 
perforin and granzymes onto the surface of an infected target cell by localized 
exocytosis at an immunological synapse. The high concentration of Ca²⁺ in the 
extracellular fluid causes the perforin to assemble into transmembrane channels in 
the target-cell plasma membrane, allowing the granzymes to enter the 
target cell (Movie 24.12 and Movie 24.13). 

Effector Helper T Cells Help Activate Other Cells of the Innate and 
Adaptive Immune Systems 


In contrast to Tc cells, helper T cells (TH cells) are crucial for defense against both 
extracellular and intracellular pathogens, and they express CD4 rather than CD8 
co-receptors and recognize foreign peptides bound to class II rather than class 
I MHC proteins. Once naive TH cells are induced on activated dendritic cells to 
become effector cells, they can help activate other cells: they help activate B cells 
to become antibody-secreting cells and later to undergo Ig class switching and 
somatic hypermutation; they help activate macrophages to destroy any intracellular 
pathogens multiplying within the macrophage’s phagosomes; they help 
induce naive Tc cells to become effector cells that can kill infected target cells; 
and they stimulate the activated dendritic cell that activated them to maintain the 
dendritic cell in an activated state. In each case, the effector TH cell recognizes the 
same complex of foreign peptide and class II MHC protein on the target-cell surface 
that it initially recognized on the activated dendritic cell. As discussed later, 
the TH cell stimulates the target cell both by secreting a variety of cytokines and by 
displaying co-stimulatory proteins on its surface. 


Naive Helper T Cells Can Differentiate Into Different Types of 
Effector T Cells 


When activated by binding to a foreign peptide bound to a class II MHC protein 
on an activated dendritic cell, a naive TH cell can differentiate into several distinct 
types of effector T cells, depending on the nature of the pathogen and the cytokines 
they encounter. These cells include four subtypes of helper cells (TH1, TH2, 
TFH, and TH17 cells) and regulatory (suppressor) T cells. Figure 24-44 summarizes 
both the cytokines that induce these effector T cells and some of the cytokines 
the effector cells secrete, as well as the master transcription regulators that 
control the effector cell’s development. 

Naive TH cells activated by dendritic cells secreting the cytokine interleukin-12 
(IL12) develop into TH1 cells. These effector cells produce interferon-γ (IFNγ), 

Figure 24-44 Differentiation of naïve 
helper T cells into different types of 
effector helper cells or regulatory 
T cells in a peripheral lymphoid organ. 
The cytokines produced by the activating 
dendritic cell (and by other cells in the 
environment) mainly determine which type 
of effector T cell develops, as indicated. 
Some of the main cytokines produced by 
each type of effector cell are also shown, 
and the master transcription regulator for 
each subset is indicated in the nucleus. 
There is increasing evidence that some 
of the effector cells are plastic and can 
change the cytokines they produce in 
response to changes in their environment 
(not shown). 




which is critical for the activation of macrophages to destroy pathogens that either 
invaded the macrophage or were ingested by it; the IFNγ can also induce B cells 
to switch the class of Ig they are making. Naive TH cells activated in the presence 
of IL4 develop into TH2 cells. These effector cells are important for the control 
of extracellular pathogens, including parasites. They stimulate B cells to undergo 
somatic hypermutation and to switch the class of Ig they produce: for example, 
the TH2 cells themselves produce IL4, which can induce B cells to switch from 
making IgM and IgD to making IgE antibodies, which can bind to mast cells, as 
discussed earlier. Naive TH cells activated in the presence of IL6 and IL21 develop 
into follicular helper T cells (TFH), which are located in lymphoid follicles and 
secrete a variety of cytokines, including IL4 and IL21; these cells are especially 
important for stimulating B cells to undergo Ig class switching and somatic 
hypermutation. Naive TH cells activated in the presence of IL6 and TGFβ develop into 
TH17 cells. These effector cells secrete IL17, which recruits neutrophils and 
stimulates epithelial cells and fibroblasts in the skin and gut to produce 
pro-inflammatory cytokines. TH17 cells are important in controlling extracellular bacterial 
and fungal infections and in wound healing, but they can also have a major role in 
autoimmune diseases and allergy. 

In some cases, naive TH cells that encounter their antigen in a peripheral lymphoid 
organ in the presence of TGFβ and the absence of IL6 develop into induced 
regulatory T cells (Treg cells), which suppress rather than help immune cells; as 
mentioned earlier, natural Treg cells develop in the thymus during thymocyte 
development (see Figure 24-41). In either case, the Treg cells suppress the 
development, activation, or function of most other types of immune cells, by means of 
both secreted suppressive cytokines such as IL10 and TGFβ and inhibitory proteins 
on the Treg cell surface. Induced Treg cells seem mainly to suppress immune 
responses to foreign antigens, preventing responses to harmless ingested or 
inhaled antigens and limiting responses against pathogens to avoid excessive 
responses that cause unwanted pathology; natural Treg cells are needed to prevent 
immune responses to self molecules (see Figure 24-21). Treg cells express the 
transcription regulator FoxP3, which serves as both a marker of these cells and a 
master controller of their development: if the gene encoding this protein is 
inactivated in mice or humans, the individuals fail to produce Treg cells and develop a 
fatal autoimmune disease involving multiple organs, findings that establish the 
crucial importance of Treg cells in self-tolerance. 
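
The cytokine conditions described in this section can be summarized as a small rule table. The sketch below is only a restatement of those rules in code, under stated assumptions: the function name, the string labels (including the ASCII spelling "TGFb" for TGFβ), and the order in which the rules are checked are illustrative choices, and real differentiation depends on many more signals and is partly plastic.

```python
from typing import Iterable, Optional


def effector_subset(cytokines: Iterable[str]) -> Optional[str]:
    """Map the cytokines present during activation of a naive TH cell to a subset.

    Labels and rule order are assumptions made for this sketch.
    """
    present = set(cytokines)
    if "IL6" in present and "TGFb" in present:
        return "TH17"
    if "IL6" in present and "IL21" in present:
        return "TFH"
    if "TGFb" in present:
        # IL6 is necessarily absent here, because IL6 + TGFb was handled above.
        return "induced Treg"
    if "IL12" in present:
        return "TH1"
    if "IL4" in present:
        return "TH2"
    # None of the conditions named in the text is met.
    return None
```

For example, `effector_subset({"IL6", "TGFb"})` returns `"TH17"`, whereas `effector_subset({"TGFb"})` returns `"induced Treg"`, reflecting the role of IL6 in switching the outcome.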


Both T and B Cells Require Multiple Extracellular Signals 
For Activation 


Foreign antigen binding to BCRs or TCRs initiates the process whereby the T and 
B cells are stimulated to proliferate and differentiate into effector or memory 
cells. As mentioned earlier, these antigen receptors do not act on their own: they 
are stably associated with invariant transmembrane polypeptide chains that are 
required to relay the signal into the cell. In B cells, these are called Igα and Igβ 
(Figure 24-45A), while in T cells they exist in a complex called CD3, composed 
of four types of polypeptide chains (Figure 24-45B). In both cases, the associated 
proteins help convert extracellular antigen binding to the TCR or BCR into intra- 
cellular signals, and they do so in similar ways. 

Antigen binding to BCRs or TCRs clusters these receptors and their associated 
invariant chains (and CD4 or CD8 co-receptors in the case of TCRs). This clus- 
tering activates a Src family cytoplasmic tyrosine kinase to phosphorylate tyro- 
sines on the cytoplasmic tails of some of the invariant chains. The phosphotyro- 
sines then serve as docking sites for a second cytoplasmic tyrosine kinase, which 
becomes phosphorylated and activated by the first kinase; the second kinase then 
relays the signal downstream by phosphorylating other intracellular signaling 
proteins on tyrosines. Some of these early events in the signaling pathway acti- 
vated by BCRs are shown in Figure 24-46. 

Signaling through BCRs or TCRs and their associated proteins alone is not 
sufficient to activate a lymphocyte to proliferate and differentiate. Extracellular 
co-stimulatory signals produced by another cell are also required, and they are 
provided by membrane-bound proteins (see Figure 24-34) and secreted cyto- 
kines. Indeed, signaling through the BCR or TCR with insufficient co-stimulation 
can either eliminate the lymphocyte (clonal deletion) or inactivate it, with both of 
these mechanisms contributing to self-tolerance (see Figure 24-21). For a naïve T 
cell, an activated dendritic cell provides the co-stimulatory signals; these include 
the transmembrane B7 proteins, which are recognized by the co-receptor protein 
CD28 on the surface of the T cell (Figure 24-47A). For a B cell, an effector TH 
cell provides the co-stimulatory signals; these include the transmembrane CD40 
ligand, which binds to CD40 receptors on the B cell (Figure 24-47B). The CD40 
ligand on effector TH cells acts in two other situations: (1) it acts back on CD40 
receptors on the dendritic cell surface to increase and sustain the activation of the 
dendritic cell, creating a positive feedback loop; and (2) it acts as a co-stimulatory 
signal on the surface of an effector TH1 cell, allowing the T cell to help activate an 
infected macrophage to destroy the pathogens it harbors. 
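
The two-signal requirement described above can be summarized as a small decision rule. The Python sketch below is only a toy illustration: the function name `lymphocyte_outcome` and its boolean inputs are hypothetical simplifications, and the sketch ignores cytokines, signal strength, and timing. The outcome for co-stimulation without antigen is an assumption, since the text does not address that case.

```python
def lymphocyte_outcome(antigen_signal: bool, co_stimulation: bool) -> str:
    """Toy summary of the two-signal rule for a naive lymphocyte."""
    if antigen_signal and co_stimulation:
        # Antigen receptor signaling plus co-stimulation from another cell.
        return "activated: proliferates and differentiates"
    if antigen_signal:
        # Antigen receptor signaling with insufficient co-stimulation:
        # clonal deletion or inactivation, both contributing to self-tolerance.
        return "tolerized: deleted or inactivated"
    # Assumption (not stated in the text): without antigen binding,
    # the lymphocyte does not respond, with or without co-stimulation.
    return "no response"


for antigen in (True, False):
    for costim in (True, False):
        print(f"antigen={antigen}, co-stimulation={costim}: "
              f"{lymphocyte_outcome(antigen, costim)}")
```

The point of the sketch is that antigen binding alone is not a neutral event: without the second signal it pushes the cell toward tolerance rather than activation.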

In addition to receptors for co-stimulatory proteins, both B and T cells have 
inhibitory proteins on their surface that help regulate the cell’s activity, preventing 
excessive or inappropriate responses. Two such proteins expressed by T cells have 
attracted great attention because of their roles in suppressing the ability of T cells 
to inhibit cancer progression: CTLA4 and PD1 proteins inhibit T cell activity in 
different ways, and monoclonal antibodies against either or especially both can 
relieve the inhibition and allow T cells to dramatically destroy the tumors in some 
patients with metastatic cancer (see Figure 20-45). 

Figure 24-45 The invariant chains 
associated with BCRs and TCRs. 

(A) Each BCR is associated with two 
invariant heterodimers, each composed of 
an Igα and an Igβ polypeptide chain linked 
by a disulfide bond (red). (B) Each TCR is 
associated with an invariant CD3 complex 
composed of two disulfide-bonded ζ 
chains, two ε chains, and one δ and one 
γ chain; these chains form homodimers or 
heterodimers, as shown. 


Figure 24-46 Early signaling events 
in a B cell activated by the binding of 
specific foreign antigen to its BCRs. If 
the antigen is on the surface of a pathogen 
or is a soluble macromolecule with two 
or more identical antigenic determinants 
(as shown), it cross-links adjacent BCRs, 
causing them and their associated invariant 
chains to cluster, as shown. A Src-like 
cytoplasmic tyrosine kinase (which can 
be Fyn or Lyn) is associated with the 
cytosolic tail of Igβ; it joins the cluster 
and phosphorylates both the Igα and 
Igβ invariant chains (for simplicity, only 
the phosphorylation on Igβ is shown). 
A transmembrane protein tyrosine 
phosphatase called CD45 is also required 
to remove inactivating phosphates from 
these Src-like kinases (not shown). The 
resulting phosphotyrosines on Igα and Igβ 
serve as docking sites for another Src-like 
tyrosine kinase called Syk, which becomes 
phosphorylated and thereby activated to 
relay the signal downstream. 

The pathway from TCRs is similar 
(including a requirement for CD45), except 
that the first Src-like kinase is Lck, which is 
associated with a CD4 or CD8 co-receptor 
and phosphorylates tyrosines on all the 
CD3 polypeptide chains shown in Figure 
24-45B; the second Src-like kinase is 
ZAP70, which is homologous to the Syk 
kinase in B cells (Movie 24.14). 


Many Cell-Surface Proteins Belong to the Ig Superfamily 


Most of the proteins that mediate antigen recognition and cell-cell recognition 
in the immune system contain one or more Ig or Ig-like domains, suggesting that 
the proteins have a common evolutionary history. Included in this very large Ig 
superfamily are antibodies, TCRs, MHC proteins, the CD4, CD8, and CD28 co-re- 
ceptors, the B7 co-stimulatory proteins, and most of the invariant polypeptide 
chains associated with TCRs and BCRs, as well as the various Fc receptors on 
lymphocytes and other leukocytes. Many of these proteins are dimers or higher 
oligomers, in which Ig or Ig-like domains of one chain interact with those in another 
(Figure 24-48). 

Figure 24-47 Comparison of the co- 
stimulatory proteins required to activate 
a helper T cell and a B cell in response 
to the same foreign protein. (A) A naive 
helper T cell is activated by a peptide 
fragment of a foreign protein bound to a 
class II MHC protein on the surface of an 
activated dendritic cell. The co-stimulatory 
protein on the dendritic cell (a B7 protein, 
either CD80 or CD86) binds to the CD28 
co-receptor on the T cell, providing a 
necessary co-stimulatory signal to the 
T cell; in addition, cytokines secreted by the 
dendritic cell (or other nearby cells) influence 
what subtype of effector helper cell the 
T cell becomes (see Figure 24-44). (B) Once 
activated to become an effector cell, the 
helper T cell can help activate 
B cells that have the same peptide-MHC 
protein complexes on their surface as 
the dendritic cell that activated the T cell. 
These B cells have BCRs that bind an 
antigenic determinant on the surface of 
a folded foreign protein and endocytose 
the protein (red arrow); the protein is then 
cleaved into peptides, which are carried to 
the B cell surface by class II MHC proteins, 
where some of them can be recognized by 
the TCRs on the helper T cell (see Figure 
24-39). Note that the BCRs and TCRs 
recognize different antigenic determinants of 
the protein. As indicated, the co-stimulatory 
protein used by the effector helper T cell 
is CD40 ligand, which binds to the CD40 
co-receptor on the B cell; the T cell also 
secretes cytokines such as IL4 to help 
stimulate the B cell to undergo somatic 
hypermutation and class switching (not 
shown). The CD4 co-receptor on TH cells is 
omitted in both (A) and (B) for simplicity. 


Figure 24-48 Some of the cell-surface 
proteins discussed in this chapter that 
belong to the Ig superfamily. The Ig 
and Ig-like domains are shaded in gray, 
except for the antigen-binding domains 
(not all of which are Ig domains; the 
class I and class II MHC proteins are the 
exception), which are shaded in blue. 
The Ig superfamily also includes many 
cell-surface proteins involved in cell-cell 
interactions outside the immune system, 
such as the neural cell adhesion molecule 
(N-CAM) discussed in Chapter 19 and the 
receptors for various protein growth factors 
discussed in Chapter 15 (not shown). There 
are more than 750 members of the Ig 
superfamily in humans. 




In both vertebrates and invertebrates, many proteins in the Ig superfamily are 
also found outside immune systems, where they often function in cell-cell rec- 
ognition and adhesion processes, both during development and in adult tissues. 
It seems likely that the entire gene superfamily evolved from a primordial gene 
coding for a single Ig-like domain, similar to that encoding β2-microglobulin (see 
Figure 24-36). In present-day family members, a separate exon usually encodes 
the amino acids in each Ig-like domain, consistent with the likelihood that new 
family members arose during evolution by exon and gene duplications. 


Summary 


There are three main functionally distinct classes of T cells. Cytotoxic T cells (Tc 
cells) directly kill infected cells by secreting perforins and granzymes that induce the 
infected cells to undergo apoptosis. Helper T cells (TH cells) help activate cytotoxic 
T cells to kill their target cells, B cells to make antibody responses, macrophages 
to destroy the microorganisms they harbor, and dendritic cells to activate T cells. 
Regulatory T cells (Treg cells) produce suppressive proteins (such as the cytokines 
IL10 and TGFβ) to inhibit other immune cells. 

All T cells express cell-surface antigen receptors (TCRs), which are encoded by 
genes that are assembled from multiple gene segments during T cell development 
in the thymus. TCRs recognize peptide fragments of foreign proteins that are dis- 
played in association with MHC proteins on the surface of antigen-presenting cells 
(APCs) and target cells. Naive T cells are activated in peripheral lymphoid organs 
by activated dendritic cells, which secrete cytokines and express peptide-MHC com- 
plexes, co-stimulatory proteins, and various cell-cell adhesion molecules on their 
cell surface. 

Class I MHC proteins present foreign peptides to Tc cells, whereas class II MHC 
proteins present foreign peptides to TH cells and Treg cells. Whereas class I MHC 
proteins are expressed on almost all nucleated vertebrate cells, class II MHC pro- 
teins are normally restricted to APCs, including dendritic cells, macrophages, and 
B lymphocytes. Both classes of MHC proteins have a single peptide-binding groove, 
which binds a large set of small peptide fragments produced intracellularly by nor- 
mal protein-degradation processes: class I MHC proteins mainly bind fragments 
produced in the cytosol, whereas class II MHC proteins mainly bind fragments pro- 
duced in endocytic compartments. The peptide-MHC complexes are transported 
to the cell surface, where complexes that contain a peptide derived from a foreign 
protein are recognized by TCRs, which interact with both the peptide and the walls 
of the peptide-binding groove. T cells also express CD4 or CD8 co-receptors, which 
recognize invariant regions of MHC proteins: TH cells and Treg cells express CD4, 
which recognizes class II MHC proteins; Tc cells express CD8, which recognizes 
class I MHC proteins. 

A combination of positive and negative selection operates during T cell devel- 
opment in the thymus to help ensure that only T cells with potentially useful TCRs 
survive, mature, and emigrate, while all of the others die by apoptosis. The naive 
TH and Tc cells that leave the thymus constantly receive survival signals when their 
TCRs recognize self-peptide-MHC complexes, but they can only be activated when 
their TCRs encounter foreign peptides in the grooves of MHC proteins on an acti- 
vated dendritic cell. The natural Treg cells that leave the thymus suppress self-reac- 
tive lymphocytes to help maintain self-tolerance. 

The production of an effector T cell from a naive T cell requires multiple signals 
from an activated dendritic cell. MHC-peptide complexes on the dendritic cell 
surface provide one signal, by binding to both TCRs and a CD4 co-receptor on a TH or 
Treg cell. Co-stimulatory proteins on the dendritic cell surface and secreted cytokines 
are the other signals. When naive TH cells are initially activated on a dendritic cell, 
they differentiate into TH1, TH2, TFH, or TH17 effector helper cells or into induced 
Treg cells, depending mainly on the cytokines in their environment. TH1 cells secrete 
interferon-γ (IFNγ) to activate macrophages and to induce B cells to switch the class 
of Ig they make; TH2 and TFH cells secrete other cytokines that also induce B cells 
to switch Ig class; and TH17 cells secrete IL17 to promote inflammatory responses 
and wound healing. The effector TH cells recognize the same complex of foreign 
peptide and class II MHC protein on the target-cell surface as they initially 
recognized on the dendritic cell that activated them. They activate their target cells 
by producing a combination of membrane-bound and secreted co-stimulatory 
proteins. Treg cells suppress immune cells using cell-surface and secreted inhibitory 
proteins. 

Both T cells and B cells require multiple signals for activation. Antigen binding 
to the TCRs or BCRs provides one signal, while co-stimulatory proteins binding to 
co-receptors and cytokines binding to their complementary receptors provide the 
others. Effector TH cells provide the co-stimulatory signals for B cells, whereas APCs 
provide them for T cells. 


WHAT WE DON’T KNOW 

• What initiates an autoimmune disease such as type 1 diabetes or 
multiple sclerosis? 

• When a naive or memory T or B cell is activated by antigen and 
co-stimulatory signals, how does it decide whether to become an effector 
cell or memory cell? Are there cells that are pre-committed to becoming 
either effector or memory cells, for example, or is the decision determined 
solely by extracellular signals? 

• Why do some of us make IgE antibodies against harmless antigens 
and thereby develop hay fever and allergic asthma, while most of us do 
not, and why is the proportion of such allergic individuals increasing? 

• How does a cytotoxic T cell (or NK cell) avoid being killed by the perforin 
and granzymes that it secretes to kill a target cell? 


PROBLEMS 


Which statements are true? Explain why or why not. 


24-1 T cells whose receptors strongly bind a self-pep- 
tide-MHC complex are killed off in peripheral lymphoid 
organs when they encounter the self peptide on an anti- 
gen-presenting dendritic cell. 


24-2 To guarantee that the antigen-presenting cells in 
the thymus will display a complete repertoire of self pep- 
tides to allow elimination of self-reactive T cells, the thy- 
mus recruits dendritic cells from all over the body. 


24-3 The antibody diversity created by the combinato- 
rial joining of V, D, and J segments by V(D)J recombination 
pales in comparison to the enormous diversity created by 
the random gain and loss of nucleotides at V, D, and J join- 
ing sites. 


Discuss the following problems. 


24-4 Why do living trees not rot? Redwood trees, for 
example, can live for centuries, but once they die they 
decay fairly quickly. What might this suggest? 


24-5 It would be disastrous if a complement attack were 
not confined to the surface of the pathogen that is the tar- 
get of the attack. Yet, the proteolytic cascade involved in 
the attack liberates biologically active molecules at several 
steps: one that diffuses away and one that remains bound 
to the target surface. How does the complement reaction 
remain localized when active products leave the surface? 


24-6 Based on its sequence similarity to Apobec1, 
which deaminates Cs to Us in RNA, activation-induced 
deaminase (AID) was originally proposed to work on RNA. 
But definitive experiments in E. coli demonstrated that 
AID deaminates Cs to Us in DNA. The authors of the paper 
expressed AID in bacteria and followed mutations in a 
selectable gene. They found that AID expression increased 
mutations about fivefold above the background level in 
the absence of AID expression. More importantly, they 
found that 80% of the induced mutations were G→A or 
C→T. Does this fit with your expectation if AID-induced 
mutations arose by deamination of C to U in the DNA? 


[Hint: imagine what would happen if the G:U mismatch 
created by AID was replicated several times; how would 
the sequences of the final mutations relate to the original 
G-C base pair?] 
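
The reasoning suggested in the hint can be traced with a short simulation. The following Python sketch is an illustration under stated assumptions, not the analysis used by the authors of the paper: the six-base example sequence and the helper names (`copy_strand`, `deaminate`, `differences`) are hypothetical, U is assumed to template like T, and no repair is assumed to occur before the damaged strand is copied.

```python
# Templating rules: U (made by deaminating C) pairs with A, as T does.
# Assumption: the U is copied before any repair enzyme removes it.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def copy_strand(template: str) -> str:
    """Synthesize the partner strand, written so index i pairs with index i."""
    return "".join(PAIR[base] for base in template)


def deaminate(strand: str, index: int) -> str:
    """Convert the C at `index` to U."""
    if strand[index] != "C":
        raise ValueError("only C can be deaminated to U")
    return strand[:index] + "U" + strand[index + 1:]


def differences(before: str, after: str) -> list:
    """List (position, old base, new base) wherever two strands differ."""
    return [(i, a, b) for i, (a, b) in enumerate(zip(before, after)) if a != b]


top = "ATGCCA"              # hypothetical top strand
bottom = copy_strand(top)   # its partner strand

# Case 1: a C on the top strand is deaminated; follow the damaged strand
# through two rounds of copying to obtain a fully mutant duplex.
top_after_top_hit = copy_strand(copy_strand(deaminate(top, 3)))

# Case 2: a C on the bottom strand (paired with the G at top position 2)
# is deaminated; copying the damaged bottom strand gives a new top strand.
top_after_bottom_hit = copy_strand(deaminate(bottom, 2))

print(differences(top, top_after_top_hit))
print(differences(top, top_after_bottom_hit))
```

Read along the top strand, the first case records a C→T change and the second a G→A change, so a single chemical event can appear as either mutation depending on which strand carried the deaminated C.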


24-7 For many years it was a complete mystery how 
cytotoxic T cells could see a viral protein that seemed to be 
present only in the nucleus of the virus-infected cell. The 
answer was revealed in a classic paper that took advan- 
tage of a clone of T cells whose T cell receptor was directed 
against an antigen associated with the nuclear protein of 
the 1968 strain of influenza virus. The authors of the paper 
found that when they incubated target cells with high 
concentrations of certain peptides derived from the viral 
nuclear protein, the cells became sensitive to lysis by 
subsequent incubation with the cytotoxic T cells. Using 
various peptides from the 1968 strain and the 1934 strain 
(with which the cytotoxic T cells did not react), the authors 
defined the particular peptide responsible for the T cell 
response (Figure Q24-1). 


A. Which part of the viral protein gives rise to the 
peptide that is recognized by the clone of cytotoxic T cells? 
Why do not all viral peptides sensitize the target cells for 
lysis by the cytotoxic T cells? 

B. It is thought the MHC molecules come to the cell 
surface with peptides already bound. If that is so, how do 
you imagine that these experiments worked? 


(A) Peptides marked on the sequences: 345-360, 365-380, and 369-382 

1968 DLRVLSFIRGTKVSPRGKLSTRGVQIASNENMDAMESSTLELRS 
1934 DLRVLSFIKGTKVVPRGKLSTRGVQIASNENMETMESSTLELRS 

(B) [Bar chart, scale 0 to 80: lysis of target cells that were untreated 
(none), infected with the 1968 or 1934 strain, or preincubated with 
peptides 345-360, 365-380, and 369-382 (1968 strain) or peptides 
365-380 and 369-382 (1934 strain).] 

Figure Q24-1 Viral nuclear protein recognition by cytotoxic T cells 
(Problem 24-7). (A) Sequences of a segment of the nuclear protein 
from the 1968 and 1934 strains of influenza virus. Peptides used 
in the experiments in (B) are highlighted by pink bars. The amino 
acid differences between the viral proteins are highlighted in blue. 
(B) Cytotoxic T-cell-mediated lysis of target cells. The target cells 
were untreated (none), infected with virus (1968 or 1934 strain), or 
preincubated with high concentrations of the indicated viral peptide. 


24-8 Working out the rules by which T cells interact 
with their target cells was complicated. Some of the key 
observations came from studying the way cytotoxic T cells 
killed cells infected with lymphocytic choriomeningitis virus (LCMV). 
Cytotoxic T cells derived from mice expressing “k-type” 
class I MHC proteins lysed LCMV-infected cells express- 
ing the same k-type MHC protein, but they did not lyse 
infected cells from mice expressing “d-type” class I MHC 
proteins (Figure Q24-2). Similarly, cytotoxic T cells from 
d-type mice lysed infected d-type cells, but not infected 
k-type cells. LCMV can kill both k-type and d-type mice. 





[Diagram: cytotoxic T cells from an LCMV-infected k-type mouse are added to 
LCMV-infected fibroblasts from k-type and d-type mice.] 

Figure Q24-2 Pattern of killing of LCMV-infected fibroblasts by 
cytotoxic T cells from an LCMV-infected k-type mouse (Problem 24-8). 


A. If homozygous d-type mice were bred to homozy- 
gous k-type mice to generate d-type/k-type heterozygous 
progeny, would you expect cytotoxic T cells from these 
heterozygotes, when infected with LCMV, to be able to lyse 
infected d-type cells? How about infected k-type cells? 
Explain your answers. 

B. Oddly enough, LCMV infection does not kill mice 
that lack a thymus—such as “nude” mice, so called because 
they also lack hair. If a thymus is transplanted back into a 
nude mouse, it will die when infected with LCMV. Suppose 
that a d-type/k-type heterozygous nude mouse was given 
a thymus from a d-type donor. Would you expect its cyto- 
toxic T cells to be able to lyse infected d-type cells? How 
about infected k-type cells? Explain your answers. 


24-9 Before exposure to a foreign antigen, T cells with 
receptors specific for the antigen are a tiny fraction of the T 
cells, on the order of 1 in 10⁵ or 1 in 10⁶ T cells. After expo- 
sure to the antigen, only a small number of dendritic cells 
typically display the antigen on their surface. How long 
does it take for such antigen-presenting dendritic cells to 
interact with the antigen-specific T cells, which is the key 
first step in T cell activation and clonal expansion? The 
dynamics of the search process were examined by labeling 
dendritic cells red and T cells green, so that contacts in an 
intact lymph node could be scored visually using two-pho- 
ton fluorescence microscopy (Figure Q24-3A). The fre- 
quency of contacts between dendritic cells and T cells 
from such experiments is given in Figure Q24-3B. Assuming 
that 100 dendritic cells present the specific antigen, how 
long would it take them to scan 10⁵ T cells? How long for 
10⁶ T cells? 
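
The arithmetic for this estimate can be set up as follows. This Python sketch is only a template: the contact rate has to be read from the plot of contacts over time, so the value of `ASSUMED_CONTACTS_PER_DC_PER_HOUR` used here is a placeholder assumption and not data from the figure. The function name is hypothetical, and the T cell numbers are example inputs. The sketch also assumes that every contact is with a different T cell and that the dendritic cells scan independently.

```python
def hours_to_scan(n_t_cells: float, n_dendritic_cells: int,
                  contacts_per_dc_per_hour: float) -> float:
    """Hours needed for the dendritic cells to make n_t_cells contacts in total."""
    total_contacts_per_hour = n_dendritic_cells * contacts_per_dc_per_hour
    return n_t_cells / total_contacts_per_hour


# Placeholder only: replace with the rate estimated from the contact plot.
ASSUMED_CONTACTS_PER_DC_PER_HOUR = 100.0
N_DENDRITIC_CELLS = 100  # number of antigen-presenting dendritic cells

for n_t_cells in (1e5, 1e6):
    hours = hours_to_scan(n_t_cells, N_DENDRITIC_CELLS,
                          ASSUMED_CONTACTS_PER_DC_PER_HOUR)
    print(f"{n_t_cells:.0e} T cells: {hours:.1f} hours")
```

Because the scanning time scales inversely with both the number of presenting dendritic cells and their contact rate, a tenfold larger T cell pool takes ten times as long to scan under these assumptions.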


Figure Q24-3 Scanning 
of the T cell repertoire by 
dendritic cells (Problem 24-9). 
(A) Contacts between different 
T cells and one dendritic 
cell. T cells are green and 
dendritic cells are red. The 
dendritic cell labeled with 
an asterisk contacts a total 
of three T cells (numbered) 
over time in this sequence 
of images. Times are shown 
as hours: minutes. (B) Plot of 
T cell contacts for individual 
dendritic cells over time. 
(A, from P. Bousso and 
E. Robey, Nat. Immunol. 
4:579-581, 2003. With 
permission from Macmillan 
Publishers Ltd.) 


24-10 At first glance, it would seem a dangerous strategy 
for the thymus to actively promote the survival, matura- 
tion, and emigration of developing T cells that bind weakly 
to self peptides bound to self MHC molecules. Would it 
not be safer to get rid of these T cells, along with those that 
bind strongly to such self-peptide-MHC complexes, as this 
would seem a more secure way to avoid autoimmune reac- 
tions? 


24-11 CD4 proteins on helper and regulatory T cells 
serve as co-receptors that bind to invariant parts of class 
II MHC proteins. CD4 is thought to increase the adhesion 
between T cells and antigen-presenting cells (APCs) that 
are initially connected only weakly by the T cell receptor 
bound to its specific peptide-MHC complex. To test this 
possibility, you label cell-surface MHC molecules with a 
fluorescently labeled peptide so that you can detect indi- 
vidual peptide-MHC complexes at the interface between 
the APCs and the T cells in a culture dish. To detect T cell 
responses (the sign of a productive contact), you load 
them with a Ca²⁺ indicator dye, as cytosolic Ca²⁺ increases 
when lymphocytes are active. You now count the peptide- 
MHC complexes at a large number of interfaces (immuno- 
logical synapses) and measure the resulting uptake of Ca²⁺ 
in the adherent T cells (Figure Q24-4, red circles). When 
you repeat the experiment in the presence of blocking anti- 
bodies against CD4, you get a different result (blue circles). 
Do these results support or refute the notion that CD4 aug- 
ments T cell receptor binding? Explain your answer. 

[Plot: x axis, number of peptide-MHC complexes (0 to 80); y axis scale 0 to 40.] 

Figure Q24-4 Role of 
CD4 in the T cell response 
(Problem 24-11). The uptake 
of Ca²⁺ in cells with different 
numbers of fluorescently 
labeled peptide-MHC 
complexes at the interface 
between the T cells and the 
antigen-presenting cells. 
The results in the absence 
of CD4-blocking antibodies 
are shown by the red curve; 
results in the presence of 
CD4 antibodies are shown 
by the blue curve. 


Glossary 


ABC transporters A large family of membrane transport 
proteins that use the energy of ATP hydrolysis to transfer 
peptides or small molecules across membranes. (Figure 11-16) 


acetyl CoA Small water-soluble activated carrier molecule. 
Consists of an acetyl group linked to coenzyme A (CoA) by an 
easily hydrolyzable thioester bond. (Figure 2-38) 


acetylcholine receptor (AChR) Membrane protein that 
responds to binding of acetylcholine (ACh). The nicotinic AChR 
is a transmitter-gated ion channel that opens in response 
to ACh. The muscarinic AChR is not an ion channel, but a 
G-protein-coupled cell-surface receptor. 


acid A proton donor. Substance that releases protons (H⁺) 
when dissolved in water, forming hydronium ions (H₃O⁺) and 
lowering the pH. (Panel 2-2, pp. 92-93) 


acid hydrolases Hydrolytic enzymes—including proteases, 
nucleases, glycosidases, lipases, phospholipases, 
phosphatases, and sulfatases—that work best at acidic pH; 
these enzymes are found within the lysosome. 


action potential Rapid, transient, self-propagating electrical 
excitation in the plasma membrane of a cell such as a neuron 
or muscle cell. Action potentials, or nerve impulses, make 
possible long-distance signaling in the nervous system. 
(Figure 11-31) 


activated carrier Small diffusible molecule that stores easily 
exchangeable energy in the form of one or more energy-rich 
covalent bonds. Examples are ATP, acetyl CoA, FADH₂, 
NADH, and NADPH. (Figure 2-31) 


activation energy The extra energy that must be acquired by 
atoms or molecules in addition to their ground-state energy in 
order to reach the transition state required for them to undergo 
a particular chemical reaction. (Figure 2-21) 


activation-induced deaminase (AID) The enzyme catalyzing 
the processes of somatic hypermutation and immunoglobulin 
class switching in activated B cells. 


active site Region of an enzyme surface to which a substrate 
molecule binds in order to undergo a catalyzed reaction. 
(Figure 1-7) 


active transport Movement of a molecule across a 
membrane or other barrier driven by energy other than that 
stored in the electrochemical or concentration gradient of the 
transported molecule. 


adaptation (1) Adaptation (desensitization): adjustment of 
sensitivity following repeated stimulation. The mechanism 
that allows a cell to react to small changes in stimuli even 
against a high background level of stimulation. (2) Evolutionary 
adaptation: an evolved trait. 




adaptive immune system System of lymphocytes providing 
highly specific and long-lasting defense against pathogens in 
vertebrates. It consists of two major classes of lymphocytes: 
B lymphocytes (B cells), which secrete antibodies that bind 
specifically to the pathogen or its products, and T lymphocytes 
(T cells), which can either directly kill cells infected with the 
pathogen or produce secreted or cell-surface signal proteins 
that stimulate other host cells to help eliminate the pathogen. 
(Figure 24-2) 


adaptor protein, adaptor General term for a protein that 
functions solely to link two or more different proteins together in 
an intracellular signaling pathway or protein complex. 
(Figure 15-11) 


adenylyl cyclase (adenylate cyclase) Membrane-bound 
enzyme that catalyzes the formation of cyclic AMP from ATP. An 
important component of some intracellular signaling pathways. 


adherens junction Cell junction in which the cytoplasmic 
face of the plasma membrane is attached to actin filaments. 
Examples include adhesion belts linking adjacent epithelial cells 
and focal contacts on the lower surface of cultured fibroblasts. 


adhesins Specific proteins or protein complexes of 
pathogenic bacteria that recognize and bind cell-surface 
molecules on the host cells to enable tight adhesion and 
colonization of tissues. 


adhesion belt Adherens junctions in epithelia that form a 
continuous belt (zonula adherens) just beneath the apical face 
of the epithelium, encircling each of the interacting cells in the 
sheet. 


ADP (adenosine 5’-diphosphate) Nucleotide produced by 
hydrolysis of the terminal phosphate of ATP. Regenerates ATP 
when phosphorylated by an energy-generating process such as 
oxidative phosphorylation. (Figure 2-33) 


aerobic respiration Process by which a cell obtains energy 
from sugars or other organic molecules by allowing their carbon 
and hydrogen atoms to combine with the oxygen in air to 
produce CO₂ and H₂O, respectively. 


affinity maturation Progressive increase in the affinity of 
antibodies for the immunizing antigen with the passage of time 
after immunization. 


Agrin Signal protein released by an axonal growth cone 
during formation of the synapse between it and a muscle cell. 


AIRE (autoimmune regulator) A protein expressed by a 
subpopulation of epithelial cells in the thymus that stimulates 
the production of small amounts of self proteins characteristic 
of other organs, exposing developing thymocytes to these 
proteins for the purpose of self-tolerance. 




Akt Serine/threonine protein kinase that acts in the PI-3- 
kinase/Akt intracellular signaling pathway involved especially in 
signaling cells to grow and survive. Also called protein kinase B 
(PKB). 


allele One of several alternative forms of a gene. In a diploid 
cell, each gene will typically have two alleles, occupying the 
corresponding position (locus) on homologous chromosomes. 


allosteric protein A protein that can adopt at least two 
distinct conformations, and for which the binding of a ligand at 
one site causes a conformational change that alters the activity 
of the protein at a second site; this allows one type of molecule 
in a cell to alter the fate of a molecule of another type, a feature 
widely exploited in enzyme regulation. 


allostery (adjective allosteric) Change in a protein’s 
conformation brought about by the binding of a regulatory 
ligand (at a site other than the protein’s catalytic site), or by 
covalent modification. The change in conformation alters 
the activity of the protein and can form the basis of directed 
movement. (Figures 3-57 and 16-29) 


alpha helix (α helix) Common folding pattern in proteins, 
in which a linear sequence of amino acids folds into a right- 
handed helix stabilized by internal hydrogen-bonding between 
backbone atoms. (Figure 3-7) 


alternative RNA splicing Production of different RNAs from 
the same gene by splicing the transcript in different ways. 
(Figure 7-57) 


amino acid Organic molecule containing both an amino 
group and a carboxyl group. Those that serve as building 
blocks of proteins are alpha amino acids, having both the 
amino and carboxyl groups linked to the same carbon atom. 
(NH₂CHRCOOH, Panel 3-1, pp. 112-113) 


aminoacyl-tRNA synthetase Enzyme that attaches the 
correct amino acid to a tRNA molecule to form an aminoacyl- 
tRNA. (Figure 6-54) 


AMPA receptor Glutamate-gated ion channel in the 
mammalian central nervous system that carries most of the 
depolarizing current responsible for excitatory postsynaptic 
potentials. 


amphiphilic Having both hydrophobic and hydrophilic 
regions, as in a phospholipid or a detergent molecule. 


amyloid fibrils Self-propagating, stable β-sheet aggregates 
built from hundreds of identical polypeptide chains that become 
layered one over the other to create a continuous stack of 
β sheets. The unbranched fibrous structure can contribute to 
human diseases when not controlled. 


anaphase (1) Stage of mitosis during which sister chromatids 
separate and move away from each other. (2) Anaphase I and 
II: stages of meiosis during which chromosome homolog pairs 
separate (I), and then sister chromatids separate (II). (Panel 
17-1, pp. 980-981) 


anaphase A Stage of mitosis during which chromosome 
segregation occurs as chromosomes move toward the two 
spindle poles. 


anaphase B Stage of mitosis during which chromosome 
segregation occurs as spindle poles separate and move apart. 


anaphase-promoting complex (APC/C; cyclosome) 
Ubiquitin ligase that catalyzes the ubiquitylation and destruction 
of securin and M- and S-cyclins, initiating the separation of 
sister chromatids in the metaphase-to-anaphase transition 
during mitosis. 


anchorage dependence Dependence of cell growth, 
proliferation, and survival on attachment to a substratum. 


anchoring junction Cell junction that attaches cells to 
neighboring cells or to the extracellular matrix. 
(Table 19-1, p. 1037) 


angiogenesis Growth of new blood vessels by sprouting 
from existing ones. 


antenna complex Part of a photosystem that captures light 
energy and channels it into the photochemical reaction center. 
It consists of protein complexes that bind large numbers of 
chlorophyll molecules and other pigments. 


Antennapedia complex One of two gene clusters in 
Drosophila that contain Hox genes; genes in the Antennapedia 
complex control the differences among the thoracic and head 
segments of the body. 


anti-apoptotic Bcl2 family proteins Proteins (e.g., Bcl2, 
BclXL) on the cytosolic surface of the outer mitochondrial 
membrane that bind and inhibit pro-apoptotic Bcl2 family 
proteins and thereby help prevent inappropriate activation of the 
intrinsic pathway of apoptosis. 


anti-IAP Protein produced in response to various apoptotic stimuli 
that, by binding to IAPs and preventing their binding to a 
caspase, neutralizes the inhibition of apoptosis provided by IAPs. 


antibiotic Substance such as penicillin or streptomycin that is 
toxic to microorganisms. Often a natural product of a particular 
microorganism or plant. 


antibody Protein secreted by activated B cells in response to 
a pathogen or foreign molecule. Binds tightly to the pathogen 
or foreign molecule, inactivating it or marking it for destruction 
by phagocytosis or complement-induced lysis. (Figure 24-23) 


antibody response Adaptive immune response in which 
B cells are activated to secrete antibodies that circulate in 
the bloodstream or enter other body fluids, where they can 
bind specifically to the foreign antigen that stimulated their 
production. 


anticodon Sequence of three nucleotides in a transfer RNA 
(tRNA) molecule that is complementary to a three-nucleotide 
codon in a messenger RNA (mRNA) molecule. 
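
The pairing rule in this entry can be sketched in a few lines of Python. This is an illustrative sketch only: the names RNA_PAIR and anticodon_for are hypothetical, and the code assumes strict Watson-Crick pairing (no wobble pairing) with both sequences written 5' to 3'.

```python
# Watson-Crick pairing rules for RNA; wobble pairing is ignored in this sketch.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Return the anticodon (5'->3') that pairs with an mRNA codon (5'->3')."""
    if len(codon) != 3 or any(base not in RNA_PAIR for base in codon):
        raise ValueError("codon must be three RNA bases (A, U, G, C)")
    # The two strands pair in antiparallel orientation, so the codon is
    # read in reverse while each base is replaced by its partner.
    return "".join(RNA_PAIR[base] for base in reversed(codon))


print(anticodon_for("AUG"))
```

Because the strands are antiparallel, the anticodon is the reverse complement of the codon rather than a simple base-by-base complement.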


antigen A molecule that can induce an adaptive immune 
response or that can bind to an antibody or T cell receptor. 


antigen-presenting cell Cell that displays foreign antigen 
complexed with an MHC protein on its surface for presentation 
to T lymphocytes. 


antigenic determinant Specific region of an antigen that 
binds to an antibody or a complementary receptor on the 
surface of a B cell (BCR) or T cell (TCR). 


antigenic variation Ability to change the antigens 
displayed on the cell surface; a property of some pathogenic 
microorganisms that enables them to evade attack by the 
adaptive immune system. 


antiparallel Describes the relative orientation of the two 
strands in a DNA double helix or two paired regions of a 
polypeptide chain; the polarity of one strand is opposite to that 
of the other. 


antiporter Carrier protein that transports two different ions 
or small molecules across a membrane in opposite directions, 
either simultaneously or in sequence. (Figure 11-8) 


Apaf1 Adaptor protein of the intrinsic apoptotic pathway; on 
binding cytochrome c, oligomerizes to form an apoptosome. 


apical Referring to the tip of a cell, a structure, or an organ. 
The apical surface of an epithelial cell is the exposed free 
surface, opposite to the basal surface. The basal surface rests 
on the basal lamina that separates the epithelium from other 
tissue. 


apoptosis Form of programmed cell death, in which a 
“suicide” program is activated within an animal cell, leading to 
rapid cell death mediated by intracellular proteolytic enzymes 
called caspases. 


apoptosome Heptamer of Apaf1 proteins that forms 
on activation of the intrinsic apoptotic pathway; it recruits 
and activates initiator caspases that subsequently activate 
downstream executioner caspases to induce apoptosis. 


aquaporin (water channel) Channel protein embedded 
in the plasma membrane that greatly increases the cell’s 
permeability to water, allowing transport of water, but not ions, 
at a high rate across the membrane. 


archaeon (plural archaea) (archaebacterium) Single-celled 
organism without a nucleus, superficially similar 
to bacteria. At a molecular level, more closely related to 
eukaryotes in genetic machinery than are bacteria. Archaea and 
bacteria together make up the prokaryotes. (Figure 1-17) 


ARF proteins Monomeric GTPase in the Ras superfamily 
responsible for regulating both COPI coat assembly and clathrin 
coat assembly. (Table 15-5, p. 854) 


ARP (actin-related protein) complex (Arp 2/3 complex) 
Complex of proteins that nucleates actin filament growth from 
the minus end. 


arrestin Member of a family of proteins that contributes to 
GPCR desensitization by preventing the activated receptor from 
interacting with G proteins and serving as an adaptor to couple 
the receptor to clathrin-dependent endocytosis. (Figure 15-42) 


astral microtubule In the mitotic spindle, any of the 
microtubules radiating from the aster which are not attached to 
a kinetochore of a chromosome. 


asymmetric cell division Cell division in which some 
important molecule or molecules are distributed unequally 
between the two daughter cells, causing these cells to become 
different from each other. 


ATM (ataxia telangiectasia mutated protein) Protein 
kinase activated by double-strand DNA breaks. If breaks are 
not repaired, ATM initiates a signal cascade that culminates in 
cell cycle arrest. Related to ATR. 


ATP (adenosine 5’-triphosphate) Nucleoside triphosphate 
composed of adenine, ribose, and three phosphate groups. 
The principal carrier of chemical energy in cells. The terminal 
phosphate groups are highly reactive in the sense that their 
hydrolysis, or transfer to another molecule, takes place with the 
release of a large amount of free energy. (Figure 2-33) 


ATP synthase (F1Fo ATPase) Transmembrane enzyme 
complex in the inner membrane of mitochondria and the 
thylakoid membrane of chloroplasts. Catalyzes the formation 
of ATP from ADP and inorganic phosphate during oxidative 
phosphorylation and photosynthesis, respectively. Also present 
in the plasma membrane of bacteria. 


ATR (ataxia telangiectasia and Rad3 related protein) 
Protein kinase activated by DNA damage. If damage remains 
unrepaired, ATR helps initiate a signal cascade that culminates 
in cell cycle arrest. Related to ATM. 


autoimmune disease Pathological state in which the body 
mounts a disabling adaptive immune response against one or 
more of its own molecules. 


autophagosome Organelle surrounded by a double 
membrane that contains engulfed cytoplasmic cargo in the initial 
stages of autophagy. 


autophagy Digestion of cytoplasm and worn-out organelles 
by the cell’s own lysosomes. 


auxin Plant hormone, commonly indole-3-acetic acid, with 
numerous roles in plant growth and development. 


axon Long nerve cell projection that can rapidly conduct 
nerve impulses over long distances so as to deliver signals to 
other cells. 


axoneme Bundle of microtubules and associated proteins 
that forms the core of a cilium or a flagellum in eukaryotic cells 
and is responsible for their movements. 


bacterial artificial chromosome (BAC) Cloning vector that 
can accommodate large pieces of DNA, typically up to 1 million 
base pairs. 


bacteriorhodopsin Pigmented protein found in the plasma 
membrane of a salt-loving archaeon, Halobacterium salinarium 
(Halobacterium halobium). Pumps protons out of the cell in 
response to light. 


bacterium (plural bacteria) (eubacterium) Member of 
the domain bacteria, one of the three main branches of the 
tree of life (archaea, bacteria, and eukaryotes). Bacteria and 
archaea both lack a distinct nuclear compartment, and together 
comprise the prokaryotes. (Figure 1-17) 


Bak A main effector Bcl2 family protein of the intrinsic 
pathway of apoptosis in mammalian cells that is bound to 
the mitochondrial outer membrane even in the absence of an 
apoptotic signal; activation is usually by activated pro-apoptotic 
BH3-only proteins. 


basal Situated near the base. Opposite the apical surface. 


basal lamina (plural basal laminae) Thin mat of extracellular 
matrix that separates epithelial sheets, and many other types 
of cells such as muscle or fat cells, from connective tissue. 
Sometimes called basement membrane. (Figure 19-51) 


base (1) A substance that can reduce the number of protons 
in solution, either by accepting H+ ions directly, or by releasing 
OH- ions, which then combine with H+ to form H2O. 
(2) The purines and pyrimidines in DNA and RNA are organic 
nitrogenous bases and are often referred to simply as bases. 
(Panel 2-2, pp. 92-93) 


base excision repair DNA repair pathway in which single 
faulty bases are removed from the DNA helix and replaced. 
Compare nucleotide excision repair. (Figure 5-41) 


base pair Two nucleotides in an RNA or DNA molecule that 
are held together by hydrogen bonds—for example, G paired 
with C, and A paired with T or U. 


basement membrane Thin mat of extracellular matrix that 
separates epithelial sheets, and many other types of cells such 
as muscle or fat cells, from connective tissue. Also called basal 
lamina. (Figure 19-51) 


Bax A main effector Bcl2 family protein of the intrinsic 
pathway of apoptosis in mammalian cells; located mainly 
in the cytosol and translocates to the mitochondria only 
after activation, usually by activated pro-apoptotic BH3-only 
proteins. 


B cell receptor (BCR) The transmembrane immunoglobulin 
protein on the surface of a B cell that serves as its receptor for 
antigen. 


Bcl2 Anti-apoptotic Bcl2 family protein of the outer 
mitochondrial membrane that binds and inhibits pro-apoptotic 
Bcl2 family proteins and prevents inappropriate activation of the 
intrinsic pathway of apoptosis. 


Bcl2 family Family of intracellular proteins that either promote 
or inhibit apoptosis by regulating the release of cytochrome 
c and other mitochondrial proteins from the intermembrane 
space into the cytosol. 


BclXL Anti-apoptotic Bcl2 family protein of the outer 
mitochondrial membrane that binds and inhibits pro-apoptotic 
Bcl2 family proteins and prevents inappropriate activation of the 
intrinsic pathway of apoptosis. 


benign Of tumors: self-limiting in growth, and noninvasive. 


beta sheet (β sheet) Common structural motif in proteins in 
which different sections of the polypeptide chain run alongside 
each other, joined together by hydrogen-bonding between 
atoms of the polypeptide backbone. Also known as a β-pleated 
sheet. (Figure 3-7) 


beta-catenin (β-catenin) Multifunctional cytoplasmic protein 
involved in cadherin-mediated cell-cell adhesion, linking 
cadherins to the actin cytoskeleton. Can also act independently 
as a transcription regulatory protein. Has an important role in 
animal development as part of a Wnt signaling pathway. 


BH3-only proteins The largest subclass of Bcl2 family 
proteins. Produced or activated in response to an apoptotic 
stimulus and promote apoptosis mainly by inhibiting anti- 
apoptotic Bcl2 family proteins. 


bi-orientation The attachment of sister chromatids to 
opposite poles of the mitotic spindle, so that they move to 
opposite ends of the cell when they separate in anaphase. 


binding site Region on the surface of one molecule (usually a 
protein or nucleic acid) that can interact with another molecule 
through noncovalent bonding. 


BiP Endoplasmic reticulum (ER)-resident chaperone protein 
member of the family of hsp70-type chaperone proteins. 


Bithorax complex One of two gene clusters in Drosophila 
that contain Hox genes; genes in the Bithorax complex control 
the differences among the abdominal and thoracic segments of 
the body. 


bivalent A four-chromatid structure formed during meiosis, 
consisting of a duplicated chromosome tightly paired with its 
homologous duplicated chromosome. 


blastomere One of the many cells formed by the cleavage of 
a fertilized egg. 


blastula Early stage of an animal embryo, usually consisting 
of a hollow ball of epithelial cells surrounding a fluid-filled cavity, 
before gastrulation begins. 


blebbing Membrane protrusion formed when the plasma 
membrane detaches locally from the underlying actin cortex, 
allowing cytoplasmic flow and hydrostatic pressure within the 
cell to push the membrane outward. 


bone Dense and rigid connective tissue comprising a mixture 
of tough fibers (type I collagen fibrils), which resist pulling forces, 
and solid particles (calcium phosphate as hydroxylapatite 
crystals), which resist compression. 


brassinosteroids Class of steroid signal molecules in plants 
that regulate the growth and differentiation of plants throughout 
their life cycle via binding to a cell-surface receptor kinase to 
initiate a signaling cascade. 


bright-field microscope Normal light microscope in which 
the image is obtained by simple transmission of light through 
the object being viewed. 


buffer Solution of weak acid or weak base that resists the pH 
change that would otherwise occur when small quantities of 
acid or base are added. 


C3 The pivotal complement protein that is activated by the 
early components of all three complement pathways (the 
classical pathway, the lectin pathway, and the alternative 
pathway). (Figure 24-7) 


Ca2+ pump (calcium pump, Ca2+ ATPase) Transport protein 
in the membrane of sarcoplasmic reticulum of muscle cells 
(and elsewhere). Pumps Ca2+ out of the cytoplasm into the 
sarcoplasmic reticulum using the energy of ATP hydrolysis. 


Ca2+-activated K+ channel Opens in response to the raised 
concentration of Ca2+ in nerve cells that occurs in response 
to an action potential. Increased K+ permeability makes the 
membrane harder to depolarize, increasing the delay between 
action potentials and decreasing the response of the cell to 
constant, prolonged stimulation (adaptation). 


Ca2+/calmodulin-dependent kinase (CaM-kinase) Serine/ 
threonine protein kinase that is activated by Ca2+/calmodulin. 
Indirectly mediates the effects of an increase in cytosolic Ca2+ 
by phosphorylating specific target proteins. (Figure 15-33) 


cadherin Member of the large cadherin superfamily of 
transmembrane adhesion proteins. Mediates homophilic 
Ca2+-dependent cell-cell adhesion in animal tissues. (Figure 
19-3 and Table 19-1, p. 1037) 


cadherin superfamily Family of classical and nonclassical 
cadherin proteins with more than 180 members in humans. 


calmodulin Ubiquitous intracellular Ca2+-binding protein that 
undergoes a large conformation change when it binds Ca2+, 
allowing it to regulate the activity of many target proteins. In 
its activated (Ca2+-bound) form, it is called Ca2+/calmodulin. 
(Figure 15-33) 


calnexin Carbohydrate-binding chaperone protein in 
the endoplasmic reticulum (ER) membrane that binds to 
oligosaccharides on incompletely folded proteins and retains 
them in the ER. 


calreticulin Carbohydrate-binding chaperone protein 
in the endoplasmic reticulum (ER) lumen that binds to 
oligosaccharides on incompletely folded proteins and retains 
them in the ER. 


CaM-kinase II Multifunctional Ca2+/calmodulin-dependent 
protein kinase that phosphorylates itself and various target 
proteins when activated. Found in most animal cells but is 
especially abundant at synapses in the brain, and is involved in 
some forms of synaptic plasticity in vertebrates. (Figure 15-34) 


cancer stem cells Rare cancer cells capable of dividing 
indefinitely. 


cancer-critical genes Genes whose alteration contributes to 
the causation or evolution of cancer by driving tumorigenesis. 


capsid Protein coat of a virus, formed by the self-assembly of 
one or more types of protein subunit into a geometrically regular 
structure. (Figure 3-27) 


carbohydrate layer The carbohydrate-rich zone on the 
eukaryotic cell surface attributable to glycoproteins, glycolipids, 
and proteoglycans of the plasma membrane. 


carbon-fixation reaction Process by which inorganic carbon 
(as atmospheric CO2) is incorporated into organic molecules. 
The second stage of photosynthesis. (Figure 14-40) 


carcinogenesis The generation of cancer. 


carcinoma Cancer of epithelial cells. The most common form 
of human cancer. 


cargo The membrane components and soluble molecules 
carried by transport vesicles. 


cartilage Form of connective tissue composed of cells 
(chondrocytes) embedded in a matrix rich in type II collagen 
and chondroitin sulfate proteoglycan. 


caspase Intracellular protease that is involved in mediating 
the intracellular events of apoptosis. 


catalyst Substance that can lower the activation energy of a 
reaction (thus increasing its rate), without itself being consumed 
by the reaction. 


caveola (plural caveolae) Invaginations at the cell surface 
that bud off internally to form pinocytic vesicles. Thought to 
form from lipid rafts, regions of membrane rich in certain lipids. 


caveolins Family of unusual integral membrane proteins that 
are the major structural proteins in caveolae. 


CD4 Co-receptor protein on helper T cells and regulatory 
T cells that binds to a nonvariable part of class II MHC proteins 
(on antigen-presenting cells) outside the peptide-binding 
groove. (Figure 24-40) 


CD8 Co-receptor protein on cytotoxic T cells that binds to a 
nonvariable part of class I MHC proteins (on antigen-presenting 
cells and infected target cells) outside the peptide-binding 
groove. (Figure 24-40) 


Cdc20 Activating subunit of the anaphase-promoting 
complex (APC/C). 


Cdc25 Protein phosphatase that dephosphorylates Cdks and 
increases their activity. 


Cdc42 Member of the Rho family of monomeric GTPases 
that regulate the actin and microtubule cytoskeletons, cell-cycle 
progression, gene transcription, and membrane transport. 


Cdc6 Protein essential in the preparation of DNA for 
replication. With Cdt1 it binds to an origin recognition complex 
on chromosomal DNA and helps load the Mcm proteins onto 
the complex to form the prereplicative complex. 


Cdh1 Activating subunit of the anaphase-promoting complex 
(APC/C). 


Cdk inhibitor protein (CKI) Protein that binds to and inhibits 
cyclin-Cdk complexes, primarily involved in the control of G1 
and S phases. 


Cdk-activating kinase (CAK) Protein kinase that 
phosphorylates Cdks in cyclin-Cdk complexes, activating the 
Cdk. 


cDNA clone Clone containing double-stranded cDNA 
molecules derived from the protein-coding mRNA molecules 
present in a cell. 


cDNA library Collection of cloned DNA molecules 
representing complementary DNA copies of the mRNA 
produced by a cell. 


Cdt1 Protein essential in the preparation of DNA for 
replication. With Cdc6 it binds to origin recognition complexes 
on chromosomes and helps load the Mcm proteins on to the 
complex, forming the prereplicative complex. 


cell cortex Specialized layer of cytoplasm on the inner face 
of the plasma membrane. In animal cells it is an actin-rich layer 
responsible for movements of the cell surface. 


cell cycle (cell-division cycle) Reproductive cycle of a cell: 
the orderly sequence of events by which a cell duplicates its 
chromosomes and, usually, the other cell contents, and divides 
into two. (Figure 17-4) 


cell determination Process whereby a cell progressively 
loses the potential to form other cell types, as development 
proceeds. 


cell doctrine The proposal that all living organisms are 
composed of one or more cells and that all cells arise from the 
division of other living cells. 


cell memory Retention by cells and their descendants of 
persistently altered patterns of gene expression, without any 
change in DNA sequence. See also epigenetic inheritance. 


cell plate Flattened membrane-bounded structure that forms 
by fusing vesicles in the cytoplasm of a dividing plant cell and is 
the precursor of the new cell wall. 


cell-cycle control system Network of regulatory proteins 
that governs progression of a eukaryotic cell through the cell 
cycle. 


cellulose Long, unbranched chains of glucose; major 
constituent of plant cell walls. 


cellulose microfibril Highly ordered crystalline aggregate 
formed from bundles of about 40 cellulose chains, arranged 
with the same polarity and stuck together in overlapping 
parallel arrays by hydrogen bonds between adjacent cellulose 
molecules. 


central (primary) lymphoid organ Organ in which T or 
B lymphocytes are produced from precursor cells. In adult 
mammals, these are the thymus and bone marrow, respectively. 
(Figure 24-12) 


centriole Short cylindrical array of microtubules, closely 
similar in structure to a basal body. A pair of centrioles is usually 
found at the center of a centrosome in animal cells. (Figure 
16-48) 


centromere Constricted region of a mitotic chromosome 
that holds sister chromatids together. This is also the site 
on the DNA where the kinetochore forms so as to capture 
microtubules from the mitotic spindle. (Figure 4-43) 


centrosome Centrally located organelle of animal cells that is 
the primary microtubule-organizing center (MTOC) and acts as 
the spindle pole during mitosis. In most animal cells it contains 
a pair of centrioles. (Figures 16-47 and 17-24) 


cerebral cortex Outermost layer of the hemispheres of the 
brain; the most complex structure in the human body. 


CG island Region of DNA in vertebrate genomes with a 
greater than average density of CG sequences; these regions 
generally remain unmethylated. 


channel (membrane channel) Transmembrane protein 
complex that allows inorganic ions or other small molecules to 
diffuse passively across the lipid bilayer. (Figure 11-3) 


channelrhodopsin Photosensitive protein forming a cation 
channel across the membrane that opens in response to light. 


charge separation In photosynthesis, the light-induced 
transfer of a high-energy electron from chlorophyll to an 
acceptor molecule resulting in the formation of a positive 
charge on the chlorophyll and a negative charge on a mobile 
electron carrier. 


chemical biology Name given to a strategy that uses large- 
scale screening of hundreds of thousands of small molecules 
in biological assays to identify chemicals that affect a particular 
biological process and that can then be used to study it. 


chemical carcinogens Disparate chemicals that are 
carcinogenic, due to their ability to cause mutations, when fed 
to experimental animals or painted repeatedly on their skin. 


chemical group Certain combinations of atoms, such as 
methyl (-CH3), hydroxyl (-OH), carboxyl (-COOH), carbonyl 
(-C=O), phosphate (-PO3^2-), sulfhydryl (-SH), and amino (-NH2) 
groups, that have distinct chemical and physical properties 
and influence the behavior of the molecule in which the group 
occurs. 


chemiosmotic coupling (chemiosmosis) Mechanism in 
which an electrochemical proton gradient across a membrane 
(composed of a pH gradient plus a membrane potential) is used 
to drive an energy-requiring process, such as ATP production 
or the rotation of bacterial flagella. 


chemotaxis Movement of a cell toward or away from some 
diffusible chemical. 


chiasma (plural chiasmata) X-shaped connection visible 
between paired homologous chromosomes during meiosis. 
Represents a site of chromosomal crossing-over, a form of 
genetic recombination. 


chlorophyll Light-absorbing green pigment that plays a 
central part in photosynthesis in bacteria, plants, and algae. 


chloroplast Organelle in green algae and plants that contains 
chlorophyll and carries out photosynthesis. 


cholera toxin Secreted toxic protein of Vibrio cholerae 
responsible for causing the watery diarrhea associated with 
cholera. Comprises an A subunit with enzymatic activity and a 
B subunit that binds to host-cell receptors to direct subunit A to 
the host-cell cytosol. 


cholesterol An abundant lipid molecule with a characteristic 
four-ring steroid structure. An important component of the 
plasma membranes of animal cells. (Figure 10-4) 


chromatin Complex of DNA, histones, and non-histone 
proteins found in the nucleus of a eukaryotic cell. The material 
of which chromosomes are made. 


chromatin immunoprecipitation Technique by which 
chromosomal DNA bound by a particular protein can be 
isolated and identified by precipitating it by means of an 
antibody against the protein. (Figures 8-66 and 8-67) 


chromosome Structure composed of a very long DNA 
molecule and associated proteins that carries part (or all) of 
the hereditary information of an organism. Especially evident 
in plant and animal cells undergoing mitosis or meiosis, during 
which each chromosome becomes condensed into a compact 
rodlike structure visible in the light microscope. 


cilium (plural cilia) Hairlike extension of a eukaryotic cell 
containing a core bundle of microtubules. Many cells contain a 
single nonmotile cilium, while others contain large numbers that 
perform repeated beating movements. Compare flagellum. 


circadian clock Internal cyclical process that produces a 
particular change in a cell or organism with a period of around 
24 hours, for example the sleep-wakefulness cycle in humans. 


cis face Face on the same or near side. 


cis Golgi network (CGN) Network of fused vesicular tubular 
clusters that is closely associated with the cis face of the Golgi 
apparatus and is the compartment at which proteins and lipids 
enter the Golgi. 


cis-regulatory sequences DNA sequences to which 
transcription regulators bind to control the rate of gene 
transcription. In nearly all cases, these sequences must be on 
the same chromosome (that is, in cis) to the genes they control. 
(Figure 7-18) 


cisternal maturation model One hypothesis for how the 
Golgi apparatus achieves and maintains its polarized structure 
and how molecules move from one cisterna to another. This 
model views the cisternae as dynamic structures that mature 
from early to late by acquiring and then losing specific Golgi- 
resident proteins as they move through the Golgi stack with 
cargo. 


citric acid cycle [tricarboxylic acid (TCA) cycle, Krebs 
cycle] Central metabolic pathway found in aerobic organisms. 
Oxidizes acetyl groups derived from food molecules, generating 
the activated carriers NADH and FADH2, some GTP, and waste 
CO2. In eukaryotic cells, it occurs in the mitochondria. 
(Panel 2-9, pp. 106-107) 


clamp loader Protein complex that utilizes ATP hydrolysis to 
load the sliding clamp on to a primer-template junction in the 
process of DNA replication. 


class I MHC protein One of two classes of major 
histocompatibility complex (MHC) protein. Found on the surface 
of almost all vertebrate cell types, where it can present foreign 
peptides derived from a pathogen such as a virus to cytotoxic 
T cells. (Figures 24-35 and 24-36A) 


class II MHC protein One of two classes of major 
histocompatibility complex (MHC) protein. Found on the surface 
of various antigen-presenting cells, where it presents peptides 
to helper and regulatory T cells. (Figures 24-35 and 24-36B) 


class switching Change from making one class of 
immunoglobulin (for example, IgM) to making another class (for 
example, IgG) that many B cells undergo during the course of 
an adaptive immune response. Involves DNA rearrangements 
called class-switch recombination. (Figure 24-30) 


class-switch recombination An irreversible change at the 
DNA level when a B cell switches from making IgM and IgD to 
making one of the secondary classes of immunoglobulin. 


classical cadherins Family of cadherin proteins, including 
E-cadherin, N-cadherin, and P-cadherin, that are closely related 
in sequence throughout their extracellular and intracellular 
domains. 


clathrin Protein that assembles into a polyhedral cage on the 
cytosolic side of a membrane so as to form a clathrin-coated 
pit, which buds off by endocytosis to form an intracellular 
clathrin-coated vesicle. (Figure 13-6) 


clathrin-coated pits Specialized regions typically occupying 
about 2% of the total plasma membrane area at which the 
endocytic pathway often begins. 


clathrin-coated vesicles Coated vesicles that transport 
material from the plasma membrane and between endosomal 
and Golgi compartments. 


cleavage (1) Physical splitting of a cell into two. 

(2) Specialized type of cell division seen in many early embryos 
whereby a large cell becomes subdivided into many smaller 
cells without growth. 


clonal selection From a population of T and B lymphocytes 
with a vast repertoire of randomly generated antigen-specific 
receptors, a given foreign antigen activates (selects) only those 
lymphocyte clones that display a receptor that fits the antigen. 
Explains how the adaptive immune system can respond to 
millions of different antigens in a highly specific way. 
(Figure 24-15) 


co-receptor In immunology: an accessory receptor on 
B cells or T cells that does not bind antigen but binds to a 
co-stimulatory signal and helps activate the lymphocyte, by 
helping to activate an intracellular signaling pathway. 


co-stimulatory signal In immunology: a secreted or 
membrane-bound signal protein that helps activate an antigen- 
responding B cell or T cell. 


co-translational Occurring as translation proceeds. 
Examples include the import of a protein into the endoplasmic 
reticulum before the polypeptide chain is completely 
synthesized (co-translational translocation, Figure 12-32), and 
the folding of a nascent protein into its secondary and tertiary 
structure as it emerges from a ribosome. (Figure 6-79) 


coat-recruitment GTPases Members of a family of 
monomeric GTPases that have important roles in vesicle 
transport, being responsible for coat assembly at the 
membrane. 


coated vesicle Small membrane-enclosed organelle with a 
cage of proteins (the coat) on its cytosolic surface. Formed by 
the pinching off of a coated region of membrane (coated pit). 
Some coats are made of clathrin, others are made from other 
proteins. 


codon Sequence of three nucleotides in a DNA or mRNA 
molecule that represents the instruction for incorporation of a 
specific amino acid into a growing polypeptide chain. 
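
How codons map to amino acids can be illustrated with a short Python sketch. It assumes the standard genetic code (variant codes are ignored), reads the mRNA from its first base without searching for a start codon, and uses hypothetical names (CODON_TABLE, translate) chosen only for this example.

```python
from itertools import product

# Standard genetic code, listed with bases in the order U, C, A, G at each
# codon position (third position varying fastest). '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA sequence (5'->3') codon by codon until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: end of the polypeptide
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGGCCUAA"))
```

Here AUG specifies methionine (M), GCC specifies alanine (A), and UAA is a stop codon, so the sketch yields the two-residue peptide MA in one-letter code.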


coenzyme Small molecule tightly associated with an enzyme 
that participates in the reaction that the enzyme catalyzes, often 
by forming a covalent bond to the substrate. Examples include 
biotin, NAD+, and coenzyme A. 


cohesin, cohesin complex Complex of proteins that holds 
sister chromatids together along their length before their 
separation. (Figure 17-19) 


coiled-coil Especially stable rodlike protein structure formed 
by two or more α helices coiled around each other. (Figure 3-9) 


collagen Fibrous protein rich in glycine and proline that 
is a major component of the extracellular matrix in animals, 
conferring tensile strength. Exists in many forms: type I, the 
most common, is found in skin, tendon, and bone; type II is 
found in cartilage; type IV is present in basal laminae. 
(Figures 3-23 and 19-40) 


collagen fibril A higher-order collagen polymer of fibrillar 
collagens that assemble into thin structures (10-300 nm in 
diameter) many hundreds of micrometers long in mature 
tissues. 


colony-stimulating factor (CSF) General name for 
numerous signal molecules that control differentiation of blood 
cells. 


colorectal cancer Cancer arising from the epithelium lining 
the colon (the large intestine) and rectum (the terminal segment 
of the gut). 


column chromatography Technique for separation of a 
mixture of substances in solution by passage through a column 
containing a porous solid matrix. Substances are retarded to 
different extents by their interaction with the matrix and can be 
collected separately from the column. Depending on the matrix, 
separation can be according to charge, hydrophobicity, size, or 
the ability to bind to other molecules. 


commensalism Ecological relationship between microbes 
and their host in which the microbe benefits but offers no 
benefit and causes no harm. 


committed precursor Cell derived from a stem cell 
that divides for a limited number of times before terminally 
differentiating; also known as a transit amplifying cell. 


complement system System of blood proteins that can 
be activated by antibody-antigen complexes or pathogens to 
help eliminate the pathogens, by directly causing their lysis, by 
promoting their phagocytosis, or activating an inflammatory 
response. (Figure 24-7) 


complementary (1) Of nucleic acid sequences: capable of 
forming a perfect base-paired duplex with each other. 
(Figure 4-4) (2) Of other interacting molecules, such as an
enzyme and its substrate: having biochemical or structural 
features that marry up, so that noncovalent bonding is 
facilitated. (Figure 2-3) 
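
Sense (1) can be illustrated with a minimal Python sketch. It assumes DNA strands written 5'-to-3' using only the letters A, T, G, and C; the function names are hypothetical and chosen for illustration. Two strands count as complementary here when one is exactly the reverse complement of the other, so that every position pairs (A with T, G with C) in an antiparallel duplex.

```python
# Watson-Crick pairing partners for DNA bases.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the strand that pairs perfectly with seq, written 5'-to-3'."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def is_complementary(strand_a, strand_b):
    """True if the two strands can form a duplex with no mismatched positions."""
    return (len(strand_a) == len(strand_b)
            and reverse_complement(strand_a) == strand_b.upper())


print(reverse_complement("ATGC"))        # GCAT
print(is_complementary("ATGC", "GCAT"))  # True
print(is_complementary("ATGC", "GCAA"))  # False
```

A single mismatch makes the check fail, which matches the "perfect" duplex required by the definition.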


complementation test Test to determine whether two 
mutations that produce similar phenotypes are in the same or 
different genes. (Panel 8-2, pp. 487) 


complex oligosaccharides Broad class of N-linked 
oligosaccharides, attached to mammalian glycoproteins in the 
endoplasmic reticulum and modified in the Golgi apparatus, 
containing N-acetylglucosamine, galactose, sialic acid, and 
fucose residues. 


condensin, condensin complex Complex of proteins 
involved in chromosome condensation prior to mitosis. Target 
for M-Cdk. (Figure 17-22) 


conditional mutation Mutation that changes a protein or 
RNA molecule so that its function is altered only under some 
conditions, such as at an unusually high or unusually low 
temperature. 


cone photoreceptor (cone) Photoreceptor cell in the 
vertebrate retina that is responsible for color vision in bright 
light. 


confocal microscope Type of light microscope that 
produces a clear image of a given plane within a solid object. 
It uses a laser beam as a pinpoint source of illumination and 
scans across the plane to produce a two-dimensional “optical 
section.” (Figure 9-19) 


conformation The folded, three-dimensional structure of a 
polypeptide chain. 


connective tissue Any supporting tissue that lies between 
other tissues and consists of cells embedded in a relatively 
large amount of extracellular matrix. Includes bone, cartilage, 
and loose connective tissue. 


connexin Protein component of gap junctions, a four-pass 
transmembrane protein. Six connexins assemble in the plasma 
membrane to form a connexon, or “hemichannel.” (Figure 
19-25) 


connexon Water-filled pore in the plasma membrane formed 
by a ring of six connexin protein subunits. Half of a gap 
junction: connexons from two adjoining cells join to form a 
continuous channel through which ions and small molecules 
can pass. (Figure 19-25) 


consensus nucleotide sequence A summary or “average” 
of a large number of individual nucleotide sequences derived 
by comparing many sequences with the same basic function 
and tallying up the most common nucleotides found at each 
position. (Figure 6-12)
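
The tallying described above can be sketched in a few lines of Python. This is only an illustration: it assumes the sequences are already aligned and of equal length, the example sites are made up, and ties are broken alphabetically, which is an arbitrary choice rather than a standard convention.

```python
from collections import Counter


def consensus(sequences):
    """Return the most common nucleotide at each position of aligned sequences."""
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*sequences):  # one column per position
        counts = Counter(column)
        # Highest count wins; ties go to the alphabetically first base.
        best = min(counts, key=lambda base: (-counts[base], base))
        result.append(best)
    return "".join(result)


# Hypothetical aligned sites, for illustration only.
sites = ["TATAAT", "TATGAT", "TACAAT", "TATACT", "GATAAT"]
print(consensus(sites))  # TATAAT
```

Note that the consensus need not match most of the individual input sequences exactly; it only reports the most frequent choice at each position.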


consensus sequence Average or most typical form of a 
sequence that is reproduced with minor variations in a group 
of related DNA, RNA, or protein sequences. Indicates the 
nucleotide or amino acid most often found at each position.
Preservation of a sequence implies that it is functionally 
important. 


conservative site-specific recombination A type of
DNA recombination that takes place between short, specific 
sequences of DNA and occurs without the gain or loss of 
nucleotides. It does not require extensive homology between 
the recombining DNA molecules. 


constant region In immunology: region of an immunoglobulin 
or T cell receptor chain that has a constant amino acid 
sequence. 


constitutive secretory pathway Pathway present in all cells 
by which molecules such as plasma membrane proteins are 
continually delivered to the plasma membrane from the Golgi 
apparatus in vesicles that fuse with the plasma membrane. 
The default route to the plasma membrane if no other sorting 
signals are present. (Figure 13-63) 


contact-dependent signaling Form of intercellular signaling 
in which signal molecules remain bound to the surface of the 
signaling cell and influence only cells that contact it. 


contractile ring Ring containing actin and myosin that forms 
under the surface of animal cells undergoing cell division. 
Contracts to pinch the two daughter cells apart. (Figure 17-42) 


convergent extension Rearrangement of cells within a tissue 
that causes it to extend in one dimension and shrink in another. 
(Figure 21-50) 


COPI-coated vesicles Coated vesicles that transport 
material early in the secretory pathway, budding from Golgi 
compartments. 


COPII-coated vesicles Coated vesicles that transport
material early in the secretory pathway, budding from the 
endoplasmic reticulum. 


copy number variations (CNVs) A difference between two 
individuals in the same population in the number of copies of 
a particular block of DNA sequence. This variation arises from 
occasional duplications and deletions of these sequences. 


cortex The cytoskeletal network in the cortical region of the
cytosol just beneath the plasma membrane. 


coupled reaction Linked pair of chemical reactions in which 
the free energy released by one serves to drive the other. 
(Figure 2-29)


covalent bond Stable chemical link between two atoms 
produced by sharing one or more pairs of electrons. 
(Panel 2-1, pp. 90-91) 


CRE-binding (CREB) protein Transcription regulator that 
recognizes the cyclic AMP response element (CRE) in the 
regulatory region of genes activated by cAMP. On activation 
by PKA, phosphorylated CREB recruits a transcriptional 
coactivator (CREB-binding protein; CBP) to stimulate 
transcription of target genes. 


CRISPR A defense mechanism in bacteria using small 
noncoding RNA molecules (crRNAs) to seek out and destroy 
invading viral genomes through complementary base-pairing 
and targeted nuclease digestion. 


crista (plural cristae) A specialized invagination of the inner 
mitochondrial membrane. 


cross-linking glycan One of a heterogeneous group of
branched polysaccharides that help to cross-link cellulose
microfibrils into a complex network. Has a long linear backbone
of one sugar type (glucose, xylose, or mannose) with short side
chains of other sugars.


cross-presentation A process in which extracellular proteins
taken up by specialized dendritic cells can give rise to peptides
that can be presented by class I MHC proteins to cytotoxic
T cells.


crRNAs Small noncoding RNAs (≈30 nucleotides) that are the
effectors of CRISPR-mediated immunity in bacteria. 


cryoelectron microscopy Technique for examining a thin
film of an aqueous suspension of biological material that
has been frozen rapidly enough to create vitreous ice. The
specimen is then kept frozen and transferred to the electron
microscope. Image contrast is low, but is generated solely by
the macromolecular structures present.


cryptochrome Plant flavoprotein sensitive to blue light. 
Structurally related to blue-light-sensitive enzymes called 
photolyases (involved in the repair of ultraviolet-induced DNA
damage) but do not have a role in DNA repair. Also found in 
animals, where they have an important role in circadian clocks. 


Cubitus interruptus (Ci) Latent transcription regulator that 
mediates the effects of Hedgehog. 


cyclic AMP (cAMP) Nucleotide that is generated from
ATP by adenylyl cyclase in response to various extracellular
signals. It acts as a small intracellular signaling molecule, mainly
by activating cAMP-dependent protein kinase (PKA). It is
hydrolyzed to AMP by a phosphodiesterase. (Figure 15-25)


cyclic AMP phosphodiesterase Specific enzyme that 
rapidly and continuously destroys cyclic AMP, forming 5'-AMP. 
(Figure 15-25)


cyclic AMP-dependent protein kinase (protein kinase A, 
PKA) Enzyme that phosphorylates target proteins in response 
to a rise in intracellular cyclic AMP. (Figure 15-26)


cyclic GMP (cGMP) Nucleotide that is generated from GTP 
by guanylyl cyclase in response to various extracellular signals. 


cyclic GMP phosphodiesterase Specific enzyme that 
rapidly hydrolyzes and degrades cyclic GMP. 


cyclin Protein that periodically rises and falls in concentration 
in step with the eukaryotic cell cycle. Cyclins activate crucial 
protein kinases (called cyclin-dependent protein kinases, or
Cdks) and thereby help control progression from one stage of 
the cell cycle to the next. 


cyclin-dependent kinase (Cdk) Protein kinase that has to 
be complexed with a cyclin protein in order to act. Different 
cyclin-Cdk complexes trigger different steps in the cell-division 
cycle by phosphorylating specific target proteins. (Figure 17-10)


cyclin-Cdk complex Protein complex formed periodically
during the eukaryotic cell cycle as the level of a particular cyclin
increases. A cyclin-dependent kinase (Cdk) then becomes
partially activated. (Figures 17-10 and 17-11, and
Table 17-1, p. 969)


cyclosome see anaphase-promoting complex


cytochrome Colored heme-containing protein that transfers 
electrons during respiration and photosynthesis. 


cytochrome c Soluble component of the mitochondrial 
electron-transport chain. Its release into the cytosol from the 
mitochondrial intermembrane space also initiates apoptosis. 
(Figure 14-26) 


cytochrome c oxidase complex Third of the three
electron-driven proton pumps in the respiratory chain. It accepts
electrons from cytochrome c and generates water using
molecular oxygen as an electron acceptor. (Figure 14-18)


cytochrome c reductase Second of the three
electron-driven proton pumps in the respiratory chain. Accepts electrons
from ubiquinone and passes them to cytochrome c.
(Figure 14-18)


cytokine Extracellular signal protein or peptide that acts as a 
local mediator in cell-cell communication. 


cytokine receptor Cell-surface receptor that binds a specific 
cytokine or hormone and acts through the JAK-STAT signaling 
pathway. (Figure 15-56) 


cytokinesis Division of the cytoplasm of a plant or animal cell 
into two, as distinct from the associated division of its nucleus 
(which is mitosis). Part of M phase. (Panel 17-1, pp. 980-981) 


cytoplasm Contents of a cell that are contained within its 
plasma membrane but, in the case of eukaryotic cells, outside 
the nucleus. 


cytoplasmic tyrosine kinase Enzyme activated by certain 
cell-surface receptors (tyrosine-kinase-associated receptors) 
that transmits the receptor signal onward by phosphorylating 
target cytoplasmic proteins on tyrosine side chains. 


cytoskeleton System of protein filaments in the cytoplasm of 
a eukaryotic cell that gives the cell shape and the capacity for 
directed movement. Its most abundant components are actin 
filaments, microtubules, and intermediate filaments. 


cytosol Contents of the main compartment of the cytoplasm, 
excluding membrane-bounded organelles such as endoplasmic 
reticulum and mitochondria. 


cytotoxic T cell (Tc cell) Type of T cell responsible for killing 
host cells infected with a virus or another type of intracellular 
pathogen. (Figure 24-42)


dark-field microscopy Type of light microscopy in which 
oblique rays of light focused on the specimen do not enter the 
objective lens, but light that is scattered by components in the 
living cell can be collected to produce a bright image on a dark 
background. (Figure 9-7) 


death receptor Transmembrane receptor protein that 
can signal the cell to undergo apoptosis when it binds its 
extracellular ligand. (Figure 18-5) 


death-inducing signaling complex (DISC) Activation 
complex in which initiator caspases interact and are activated 
following binding of extracellular ligands to cell-surface death 
receptors in the extrinsic pathway of apoptosis. 


deep RNA sequencing see RNA-seq 


default pathway The transport pathway of proteins directly 
to the cell surface via the nonselective constitutive secretory 
pathway, entry into which does not require a particular signal. 


defensin Positively charged, amphipathic, antimicrobial
peptide, secreted by epithelial cells, that binds to and
disrupts the membranes of many pathogens.


delayed K+ channel Neuronal voltage-gated K+ channel
that opens following membrane depolarization but during the
falling phase of an action potential due to slower activation
kinetics than Na+ channels; opening permits K+ efflux, driving
the membrane potential back toward its original negative value, 
ready to transmit a second impulse. 


Delta Single-pass transmembrane signal protein displayed 
on the surface of cells that binds to the Notch receptor protein 
on a neighboring cell, activating a contact-dependent signaling 
mechanism. 


dendrite Extension of a nerve cell, often elaborately
branched, that receives stimuli from other nerve cells. 


dendritic cell The most potent type of antigen-presenting
cell, which takes up antigen and processes it for presentation
to T cells. It is required for activating naive T cells.
(Figure 24-11)


deoxyribonucleic acid (DNA) Polynucleotide formed 
from covalently linked deoxyribonucleotide units. The store 
of hereditary information within a cell and the carrier of this 
information from generation to generation. (Figure 4-3 and
Panel 2-6, pp. 100-101) 


depolarization Deviation in the electric potential across the 
plasma membrane towards a positive value. A depolarized cell
has a membrane potential that is less negative inside (relative
to outside) than at rest.


desensitization see adaptation 


desmosome Anchoring cell-cell junction, usually formed 
between two epithelial cells. Characterized by dense plaques 
of protein into which intermediate filaments in the two adjoining 
cells insert. (Figure 19-2) 


detergent Small amphiphilic molecule, more soluble in water 
than lipids, that disrupts hydrophobic associations and destroys 
the lipid bilayer thereby solubilizing membrane proteins. 


D gene segment A short DNA sequence that encodes a part 
of the variable region of an immunoglobulin heavy chain or the 
β chain of a T cell receptor (TCR).


diacylglycerol (DAG) Lipid produced by the cleavage of 
inositol phospholipids in response to extracellular signals. 
Composed of two fatty acid chains linked to glycerol, it serves 
as a small signaling molecule to help activate protein kinase C 
(PKC). (Figure 15-28) 


dideoxy sequencing The standard enzymatic method of
DNA sequencing. (Panel 8-1, p. 478) 


differential-interference-contrast microscope Type of 
light microscope that exploits the interference effects that occur 
when light passes through parts of a cell of different refractive 
indices. Used to view unstained living cells. 


differentiation Process by which a cell undergoes a change 
to an overtly specialized cell type. 


diffusion The net drift of molecules through space due to 
random thermal movements. 


Dishevelled Scaffold protein recruited to the Frizzled family of 
cell-surface receptors upon their activation by Wnt binding that 
helps relay the signal to other signaling molecules. 


DNA cloning (1) The act of making many identical copies 
(typically billions) of a DNA molecule—the amplification of a 
particular DNA sequence. (2) Also, the isolation of a particular 
stretch of DNA (often a particular gene) from the rest of the 
cell’s genome. 


DNA helicase Enzyme that is involved in opening the DNA 
helix into its single strands for DNA replication. 


DNA library Collection of cloned DNA molecules, 
representing either an entire genome (genomic library) or 
complementary DNA copies of the mRNA produced by a cell 
(cDNA library).


DNA ligase Enzyme that joins the ends of two strands of 
DNA together with a covalent bond to make a continuous DNA 
strand. 


DNA methylation Addition of methyl groups to DNA. 
Extensive methylation of the cytosine base in CG sequences
is used in plants and animals to help keep genes in an inactive 
state. 


DNA microarray A large array of short DNA molecules 
(each of known sequence) bound to a glass microscope
slide or other suitable support. Used to monitor expression of
thousands of genes simultaneously: mRNA isolated from test
cells is converted to cDNA, which in turn is hybridized to the
microarray. (Figure 8-64) 


DNA polymerase Enzyme that synthesizes DNA by joining 
nucleotides together using a DNA template as a guide. 


DNA primase Enzyme that synthesizes a short strand of RNA 
on a DNA template, producing a primer for DNA synthesis. 
(Figure 5-10)


DNA repair A set of processes for repairing the many 
accidental lesions that occur continually in DNA. 


DNA replication Process by which a copy of a DNA molecule 
is made. 


DNA supercoiling A conformation with loops or coils that 
DNA adopts in response to superhelical tension; conversely, 
creating various loops or coils in the helix can create such 
tension. 


DNA topoisomerase (topoisomerase) Enzyme that binds
to DNA and reversibly breaks a phosphodiester bond in one
or both strands. Topoisomerase I creates transient
single-strand breaks, allowing the double helix to swivel and relieving
superhelical tension. Topoisomerase II creates transient
double-strand breaks, allowing one double helix to pass through
another and thus resolving tangles. (Figures 5-21 and 5-22)


DNA tumor virus General term for a variety of different DNA 
viruses that can cause tumors. 


DNA-only transposon Transposable element that exists as 
DNA throughout its life cycle. Many move by cut-and-paste 
transposition. See also transposon.


dolichol Isoprenoid lipid molecule that anchors the precursor
oligosaccharide in the endoplasmic reticulum membrane during 
protein glycosylation. 


domain (protein domain) Portion of a protein that has
a tertiary structure of its own. Larger proteins are generally
composed of several domains, each connected to the next
by short flexible regions of polypeptide chain. Homologous
domains are recognized in many different proteins.


Dorsal protein Transcription regulator of the NFκB family
regulating gene expression and involved in establishing the 
dorsoventral axis in the embryo. 


double helix The three-dimensional structure of DNA, in
which two antiparallel DNA chains, held together by
hydrogen-bonding between the bases, are wound into a helix.
(Figure 4-5)


drivers Mutations that are causal factors in the development
of cancer. 


dynamic instability Sudden conversion from growth to 
shrinkage, and vice versa, in a protein filament such as a 
microtubule or actin filament. (Panel 16-2, pp. 902-903) 


dynamin Cytosolic GTPase that binds to the neck of a 
clathrin-coated vesicle in the process of budding from the 
membrane, and which is involved in completing vesicle 
formation. 


dynein Large motor protein that undergoes ATP-dependent 
movement along microtubules. 


E2F protein Transcription regulatory protein that switches on 
many genes that encode proteins required for entry into the 
S phase of the cell cycle. 


early endosome Common receiving compartment with 
which most endocytic vesicles fuse and where internalized 
cargo is sorted either for return to the plasma membrane or for 
degradation by inclusion in a late endosome. 


ectoderm Embryonic epithelial tissue that is the precursor of 
the epidermis and nervous system. 


edema factor One of the two A subunits of anthrax toxin; an 
adenylyl cyclase that catalyzes production of cAMP, leading to 
ion imbalance and consequent edema in the skin or lung. 


effector Bcl2 family proteins Pro-apoptotic proteins of the 
intrinsic pathway of apoptosis that in response to an apoptotic 
stimulus become activated and aggregate to form oligomers 
in the mitochondrial outer membrane, inducing the release of 
cytochrome c and other intermembrane proteins. Bax and Bak 
are the main effector Bcl2 family proteins in mammalian cells. 


effector cell Cell that carries out the final response or 
function in a particular process. The main effector cells of the 
immune system, for example, are activated lymphocytes and 
phagocytes that help eliminate pathogens. 


egg-polarity genes Genes in the Drosophila egg that define 
the anteroposterior and dorsoventral axes of the future embryo 
through the creation of landmarks (mRNA or protein) in the egg 
that provide signals organizing the developmental process.


elastic fiber Extensible fiber formed by the protein elastin in 
many animal connective tissues, such as in skin, blood vessels, 
and lungs, which gives them their stretchability and resilience. 


elastin Extracellular protein that forms extensible fibers 
(elastic fibers) in connective tissues. 


electrochemical gradient Combined influence of a 
difference in the concentration of an ion on two sides of a 
membrane and the electrical charge difference across the 
membrane (membrane potential). Ions or charged molecules
can move passively only down their electrochemical gradient. 


electron microscope Microscope that uses a beam of 
electrons to create the image. 


electron microscope (EM) tomography Technique
for viewing three-dimensional specimens in the electron
microscope in which multiple views are taken from different
directions by tilting the specimen holder. The views are
combined computationally to give a three-dimensional image.


electron-transport chain Series of reactions in which 
electron carrier molecules pass electrons “down the chain” 
from higher to successively lower energy levels. The energy 
released during such electron movement can be used to power 
various processes. Electron-transport chains present in the 
inner mitochondrial membrane (called the respiratory chain) 
and in the thylakoid membrane of chloroplasts generate a 
proton gradient across the membrane that is used to drive ATP 
synthesis. See especially Figures 14-18 and 14-52. 


electrostatic attraction A noncovalent, ionic bond between 
two molecules carrying groups of opposite charge. 
(Panel 2-3, pp. 94-95) 


embryonic stem cells (ES cells) Cells derived from the
inner cell mass of the early mammalian embryo. Capable of
giving rise to all the cells in the body. Can be grown in culture,
genetically modified, and inserted into a blastocyst to develop a
transgenic animal.


endocrine cell Specialized animal cell that secretes a 
hormone into the blood. Usually part of a gland, such as the 
thyroid or pituitary gland. 


endocytic vesicle Vesicle formed as material ingested by 
the cell during endocytosis is progressively enclosed by a small 
portion of the plasma membrane, which first invaginates and 
then pinches off to form the vesicle. 


endocytosis Uptake of material into a cell by an invagination 
of the plasma membrane and its internalization in a membrane- 
enclosed vesicle. See also pinocytosis and phagocytosis.


endoderm Embryonic tissue that is the precursor of the gut 
and associated organs. 


endoplasmic reticulum (ER) Labyrinthine membrane- 
bounded compartment in the cytoplasm of eukaryotic cells, 
where lipids are synthesized and membrane-bound proteins 
and secretory proteins are made. (Figure 12-33) 


endosome maturation Process by which early endosomes 
mature to late endosomes and endolysosomes; in the 
conversion process, the endosome membrane protein 
composition changes, the endosome moves from the cell 
periphery to close to the nucleus, and the endosome ceases 
to recycle material to the plasma membrane and irreversibly 
commits its remaining contents to degradation. 


endothelial cell Flattened cell type that forms a sheet (the 
endothelium) lining all blood and lymphatic vessels. 


entropy (S) Thermodynamic quantity that measures the 
degree of disorder or randomness in a system; the higher the 
entropy, the greater the disorder. (Panel 2-7, pp. 102-103) 


enveloped virus Virus with a capsid surrounded by a lipid 
bilayer membrane (the envelope), which is often derived from 
the host-cell plasma membrane when the virus buds from the 
cell. (Figure 23-12)


enzyme Protein that catalyzes a specific chemical reaction. 


enzyme-coupled receptor A major type of cell-surface 
receptor that has a cytoplasmic domain that either has 
enzymatic activity or is associated with an intracellular enzyme. 
In either case, the enzymatic activity is stimulated by an 
extracellular ligand binding to the receptor. (Figure 15-6) 


ephrin One of a family of membrane-bound protein ligands 
for the Eph receptor tyrosine kinases (RTKs) that, among many 
other functions, stimulate repulsion or attraction responses that 
guide the migration of cells and nerve cell axons during animal 
development. 


epidermis Epithelial layer covering the outer surface of the 
body. Has different structures in different animal groups. The 
outer layer of plant tissue is also called the epidermis. 


epigenetic inheritance Inheritance of phenotypic changes 
in a cell or organism that do not result from changes in the 
nucleotide sequence of DNA. Can be due to positive feedback 
loops of transcription regulators or to heritable modifications in 
chromatin such as DNA methylation or histone modifications. 
(Figure 7-53) 


epistasis analysis Analysis to discover the order in which the 
genes act, by investigating if a mutation in one gene can mask 
the effect of a mutation in another gene when both mutations 
are present in the same organism or cell. 


epithelial tissues Tissues, such as the lining of the gut or the 
epidermal covering of the skin, in which cells are closely bound
together into sheets called epithelia. 


epithelium (plural epithelia) Sheet of cells covering the outer 
surface of a structure or lining a cavity. 


equilibrium State in a chemical reaction where there is no net 
change in free energy to drive the reaction in either direction. 
The ratio of product to substrate reaches a constant value at 
chemical equilibrium. (Figure 2-30) 


equilibrium constant (K) The ratio of forward and reverse 
rate constants for a reaction. Equal to the association or affinity 
constant (Ka) for a simple binding reaction (A + B ⇌ AB). See
also affinity constant, dissociation constant. (See page 62) 
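
For the simple binding reaction named above, the relationship can be written out as a short derivation. The symbols k_on and k_off (forward and reverse rate constants) and K_d (dissociation constant) are labels assumed here for illustration. At equilibrium the forward and reverse rates are equal, which gives the ratio directly:

```latex
\begin{align*}
  \text{association rate} &= k_{\mathrm{on}}[\mathrm{A}][\mathrm{B}], &
  \text{dissociation rate} &= k_{\mathrm{off}}[\mathrm{AB}] \\
  \text{at equilibrium:}\quad
  k_{\mathrm{on}}[\mathrm{A}][\mathrm{B}] &= k_{\mathrm{off}}[\mathrm{AB}] \\
  K = K_a &= \frac{k_{\mathrm{on}}}{k_{\mathrm{off}}}
           = \frac{[\mathrm{AB}]}{[\mathrm{A}][\mathrm{B}]}
           = \frac{1}{K_d}
\end{align*}
```

A larger K therefore means a larger fraction of A and B is found in the complex AB at equilibrium, that is, tighter binding.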


ER lumen Space enclosed by the membrane of the 
endoplasmic reticulum (ER). 


ER resident protein Protein that remains in the endoplasmic 
reticulum (ER) or its membranes and carries out its function 
there, as opposed to proteins that are present in the ER only in 
transit. 


ER retention signal Short amino acid sequence on a protein 
that prevents it from moving out of the endoplasmic reticulum 
(ER). Found on proteins that are resident in the ER and function 
there. 


ER signal sequence N-terminal signal sequence that directs 
proteins to enter the endoplasmic reticulum (ER). Cleaved off by 
signal peptidase after entry. 


ER tail-anchored proteins Membrane proteins anchored 
in the endoplasmic reticulum (ER) membrane by a single 
transmembrane α helix contained at their C-terminus.


erythrocyte Small hemoglobin-containing blood cell of 
vertebrates that transports oxygen to, and carbon dioxide from, 
tissues. Also called a red blood cell. 


erythropoietin A hormone produced by the kidney that 
stimulates the production of red blood cells in bone marrow. 


ESCRT protein complexes Four protein complexes 
(ESCRT-0, ESCRT-1, ESCRT-2, and ESCRT-3) that act 
sequentially to shepherd mono-ubiquitylated membrane 
proteins on endosomal membranes into intralumenal vesicles. 
The ESCRT-3 complex catalyzes the pinching-off reaction.


ethylene Small gas molecule that is a plant growth regulator 
influencing plant development in various ways including
promoting fruit ripening, leaf abscission, and plant senescence
and functioning as a stress signal in response to wounding,
infection, and flooding. 


euchromatin Region of an interphase chromosome that
stains diffusely; “normal” chromatin, as opposed to the more 
condensed heterochromatin. 


eukaryote Organism composed of one or more cells that 
have a distinct nucleus. Member of one of the three main 
divisions of the living world, the other two being bacteria and 
archaea. (Figure 1-17)


eukaryotic initiation factor (eIF) Protein that helps load
initiator tRNA on to the ribosome, thus initiating translation. 


excitatory neurotransmitter Neurotransmitter that opens 
cation channels in the postsynaptic membrane, causing an 
influx of Na+, and in many cases Ca2+, that depolarizes the
postsynaptic membrane toward the threshold potential for firing 
an action potential. 


executioner caspases Apoptotic caspases that catalyze the 
widespread cleavage events during apoptosis that kill the cell. 


exocytosis Excretion of material from the cell by vesicle
fusion with the plasma membrane; can occur constitutively or 
be regulated. 


exon Segment of a eukaryotic gene that consists of a 
sequence of nucleotides that will be represented in mRNA or 
in a final transfer, ribosomal, or other mature RNA molecule. 
In protein-coding genes, exons encode the amino acids in 
the protein. An exon is usually adjacent to a noncoding DNA 
segment called an intron. (Figure 4-15)


exosome Large protein complex with an interior rich in 
3’-to-5’ RNA exonucleases; degrades RNA molecules to 
produce ribonucleotides. 


extracellular pathogens Pathogens that disturb host cells 
and can cause serious disease without replicating in host cells. 


extracellular signal molecule Any secreted or cell-surface 
chemical signal that binds to receptors and regulates the 
activity of the cell expressing the receptor. 


extrinsic pathway Pathway of apoptosis triggered by 
extracellular signal proteins binding to cell-surface death 
receptors. 


facultative pathogens Bacteria that replicate in an 
environmental reservoir such as water or soil and only cause 
disease if they happen to encounter a susceptible host. 


FAD/FADH2 (flavin adenine dinucleotide/reduced flavin
adenine dinucleotide) Electron carrier system that functions
in the citric acid cycle and fatty acid oxidation. One molecule
of FAD gains two electrons plus two protons in becoming the
activated carrier FADH2. (Figure 2-39)


Fas (Fas protein, Fas death receptor) Transmembrane 
death receptor that initiates apoptosis when it binds its 
extracellular ligand (Fas ligand). (Figure 18-5) 


Fas ligand Ligand that activates the cell-surface death 
receptor, Fas, triggering the extrinsic pathway of apoptosis. 


fat Energy-storage lipid in cells. Composed of triglycerides
(fatty acids esterified with glycerol).


fate map Representation showing which cell types will later 
derive from which regions of a tissue; e.g. from the blastula. 
(Figure 21-28) 


Fc receptor One of a family of cell-surface receptors that 
bind the tail region (Fc region) of an antibody molecule. Different 
Fc receptors are specific for different classes of antibodies, 
such as IgG, IgA, or IgE. 


feedback inhibition The process in which a product of a 
reaction feeds back to inhibit a previous reaction in the same 
pathway. (Figures 3-55 and 3-56) 


fermentation Anaerobic energy-yielding metabolic pathway 
involving the oxidation of organic molecules. Anaerobic 
glycolysis refers to the process whereby pyruvate is converted 
into lactate or ethanol, with the conversion of NADH to NAD+.
(Figure 2-47)


fibril-associated collagen Mediates the interactions
of collagen fibrils with one another and with other matrix
macromolecules to help determine the organization of the fibrils
in the matrix. This collagen (including types IX and XII) has a
flexible triple-stranded helical structure and binds to the surface
of the fibrils rather than forming aggregates.


fibrillar collagen Class of fibril-forming collagens (including
type I collagen, the most common type and the principal
collagen of skin and bone) that have long ropelike structures
with few or no interruptions and which assemble into collagen
fibrils.


fibroblast Common cell type found in connective tissue. 
Secretes an extracellular matrix rich in collagen and other 
extracellular matrix macromolecules. Migrates and proliferates 
readily in wounded tissue and in tissue culture. 


fibronectin Extracellular matrix protein involved in adhesion 
of cells to the matrix and guidance of migrating cells during 
embryogenesis. Integrins on the cell surface are receptors for 
fibronectin. 


filopodium (plural filopodia) (microspike) Thin, spike-like 
protrusion with an actin filament core, generated on the leading 
edge of a crawling animal cell. (Figure 16-21) 


firing rule Important principle governing synapse 
reinforcement and elimination during development of the 
nervous system: when two (or more) neurons synapsing on 
the same target cell fire at the same time, they reinforce their 
connections to that cell; when they fire at different times, they 
compete, so that all but one of them tend to be eliminated. 


flagellum (plural flagella) Long, whiplike protrusion whose 
undulations drive a cell through a fluid medium. Eukaryotic 
flagella are longer versions of cilia. Bacterial flagella are smaller 
and completely different in construction and mechanism of 
action. Compare cilium. 


fluorescence microscope Microscope designed to view 
material stained with fluorescent dyes or proteins. Similar to a 
light microscope but the illuminating light is passed through one 
set of filters before the specimen, to select those wavelengths 
that excite the dye, and through another set of filters before 
it reaches the eye, to select only those wavelengths emitted 
when the dye fluoresces. (Figure 9-12) 


fluorescence recovery after photobleaching (FRAP) 
Technique for monitoring the kinetic parameters of a protein by 
analyzing how fluorescent protein molecules move into an area 
of the cell bleached by a beam of laser light. (Figure 9-29) 


fluorescence resonance energy transfer (FRET) 
Technique for monitoring the closeness of two fluorescently 
labeled molecules (and thus their interaction) in cells. Also 
known as Förster resonance energy transfer. (Figure 9-26) 


focal adhesion kinase (FAK) Cytoplasmic tyrosine kinase 
present at cell-matrix junctions (focal adhesions) in association 
with the cytoplasmic tails of integrins. 


follicular helper T cell (TFH) Type of T cell located in 
lymphoid follicles that secretes various cytokines to stimulate 
B cells to undergo antibody class switching and somatic 
hypermutation. 


formin Dimeric protein that nucleates the growth of straight, 
unbranched actin filaments that can be cross-linked by other 
proteins to form parallel bundles. 


Förster resonance energy transfer see fluorescence 
resonance energy transfer (FRET) 


FRAP see fluorescence recovery after photobleaching 


free energy (G) (Gibbs free energy) The energy that can be 
extracted from a system to drive reactions. Takes into account 
changes in both energy and entropy. (Panel 2-7, pp. 102-108) 


free ribosome Ribosome that is free in the cytosol, 
unattached to any membrane. 


free-energy change (ΔG) see ΔG. 


FRET see fluorescence resonance energy transfer 


Frizzled Family of cell-surface receptors that are seven-pass 
transmembrane proteins that resemble GPCRs in structure 
but do not generally work through the activation of G proteins. 
Activated by Wnt binding to recruit the scaffold protein 
Dishevelled, which helps relay the signal to other signaling 
molecules. 


fungus (plural fungi) Kingdom of eukaryotic organisms that 
includes the yeasts, molds, and mushrooms. Many plant 
diseases and a relatively small number of animal diseases are 
caused by fungi. 


fusion protein Engineered protein that combines two or more 
normally separate polypeptides. Produced from a recombinant 
gene. 


ΔG Change in the free energy during a reaction: the free 
energy of the product molecules minus the free energy of the 
starting molecules. A large negative value of ΔG indicates that 
the reaction has a strong tendency to occur. (Panel 2-7, 
pp. 102-108) 
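
The definition above can be written compactly as follows. This is only a sketch in standard thermodynamic notation; the subscripted symbols are introduced here for illustration and do not appear in the glossary itself.

```latex
\Delta G = G_{\text{products}} - G_{\text{starting molecules}},
\qquad
\Delta G < 0 \;\Rightarrow\; \text{the reaction tends to occur}
```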


G0 State of withdrawal from the eukaryotic cell-division cycle 
by entry into a quiescent digression from the G1 phase. A 
common, sometimes permanent, state for differentiated cells. 


G1 phase Gap 1 phase of the eukaryotic cell-division cycle, 
between the end of mitosis and the start of DNA synthesis. 
(Figure 17-4) 


G1-Cdk Cyclin-Cdk complex formed in vertebrate cells by 
a G1-cyclin and the corresponding cyclin-dependent kinase 
(Cdk). (Table 17-1, p. 969) 


G1-cyclin Cyclin present in the G1 phase of the eukaryotic cell 
cycle. Forms complexes with Cdks that help govern the activity 
of the G1/S-cyclins, which control progression to S phase. 


G1/S-Cdk Cyclin-Cdk complex formed in vertebrate cells by 
a G1/S-cyclin and the corresponding cyclin-dependent kinase 
(Cdk). (Figure 17-11 and Table 17-1, p. 969) 


G1/S-cyclin Cyclin that activates Cdks in late G1 of the 
eukaryotic cell cycle and thereby helps trigger progression 
through Start, resulting in a commitment to cell-cycle entry. Its 
level falls at the start of S phase. (Figure 17-11) 


G2 phase Gap 2 phase of the eukaryotic cell-division cycle, 
between the end of DNA synthesis and the beginning of 
mitosis. (Figure 17-4) 


G2/M transition Point in the eukaryotic cell cycle at which the 
cell checks for completion of DNA replication before triggering 
the early mitotic events that lead to chromosome alignment on 
the spindle. (Figure 17-9) 


ganglioside Any glycolipid having one or more sialic acid 
residues in its structure. Found in the plasma membrane of 
eukaryotic cells and especially abundant in nerve cells. 
(Figure 10-16) 


gap gene In Drosophila development, a gene that is 
expressed in specific broad regions along the anteroposterior 
axis of the early embryo, and which helps designate the main 
divisions of the insect body. (Figure 21-20) 


gap junction Communicating channel-forming cell-cell 
junction present in most animal tissues that allows ions and 
small molecules to pass from the cytoplasm of one cell to the 
cytoplasm of the next. 


gastrulation Important stage in animal embryogenesis 
during which the embryo is transformed from a ball of cells to a 
structure with a gut (a gastrula). 


gated transport Movement of proteins between the cytosol 
and the nucleus through nuclear pore complexes in the nuclear 
envelope that function as selective gates. 


geminin Protein that prevents the formation of new 
prereplicative complexes during S phase and mitosis, thus 
ensuring that the chromosomes are replicated only once in 
each cell cycle. 


gene Region of DNA that is transcribed as a single unit and 
carries information for a discrete hereditary characteristic, 
usually corresponding to (1) a single protein (or set of related 
proteins generated by variant post-transcriptional processing), 
or (2) a single RNA (or set of closely related RNAs). 


gene control region The set of linked DNA sequences 
regulating expression of a particular gene. Includes promoter 
and cis-regulatory sequences required to initiate transcription of 
the gene and control the rate of transcription. (Figure 7-17) 


gene conversion Process by which DNA sequence 
information can be transferred from one DNA helix (which 
remains unchanged) to another DNA helix whose sequence is 
altered. It often accompanies general recombination events. 
(Figure 5-59) 


gene family The set of genes in an organism related in DNA 
sequence due to their derivation from the same ancestor. 


gene segments In immunology: short DNA sequences that 
are joined together during B cell and T cell development to 
produce the coding sequences for immunoglobulins and T cell 
receptors, respectively. (Figure 24-28) 


general transcription factor Any of the proteins whose 
assembly at all promoters of a given type is required for the 
binding and activation of RNA polymerase and the initiation of 
transcription. (Table 6-3, p. 311) 


genetic code The set of rules specifying the correspondence 
between nucleotide triplets (codons) in DNA or RNA and amino 
acids in proteins. (Figure 6-48) 
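
As a rough illustration of how such a set of rules is applied, the sketch below reads an mRNA string three nucleotides at a time and looks each codon up in a table. The table is deliberately partial (a handful of codons from the standard genetic code), and the names `CODON_TABLE` and `translate` are hypothetical, chosen only for this example.

```python
# Partial codon table: a few entries from the standard genetic code,
# listed only for illustration (the full code has 64 codons).
CODON_TABLE = {
    "AUG": "Met",  # also the usual start codon
    "UUU": "Phe", "UUC": "Phe",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "AAA": "Lys", "AAG": "Lys",
    "UGG": "Trp",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def translate(mrna):
    """Read an mRNA string codon by codon until a stop codon or the end."""
    peptide = []
    usable_length = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        codon = mrna[i:i + 3]
        # "???" marks codons that are missing from this partial table
        residue = CODON_TABLE.get(codon, "???")
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


print(translate("AUGUUUGGCAAAUGGUAA"))
```

Several codons map to the same amino acid in the table (for example, the four GG_ codons all give Gly), which reflects the redundancy of the code.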


genetic instability Abnormally increased spontaneous 
mutation rate, such as occurs in cancer cells. 


genetic screen Procedure for discovery of genes affecting a 
specific phenotype by surveying large numbers of mutagenized 
individuals. 


genetics The study of the genes of an organism on the basis 
of heredity and variation. 


genome The totality of genetic information belonging to a 
cell or an organism; in particular, the DNA that carries this 
information. 


genome annotation Process attempting to mark out all 
the genes (protein-coding and noncoding) in a genome and 
ascribing functions to each. 


genomic imprinting Phenomenon in which a gene is either 
expressed or not expressed in the offspring depending on 
which parent it is inherited from. (Figure 7-49) 


genomic library Collection of cloned DNA molecules 
representing an entire genome. 


genotype Genetic constitution of an individual cell or 
organism. The particular combination of alleles found in a 
specific individual. (Panel 8-2, p. 486) 


germ cell A cell in the germ line of an organism, which 
includes the haploid gametes and their specified diploid 
precursor cells. Germ cells contribute to the formation of a new 
generation of organisms and are distinct from somatic cells, 
which form the body and leave no descendants. 


germ layer One of the three primary tissue layers (endoderm, 
mesoderm, and ectoderm) of an animal embryo. (Figure 21-3) 


glial cell Supporting non-neural cell of the nervous system. 
Includes oligodendrocytes and astrocytes in the vertebrate 
central nervous system and Schwann cells in the peripheral 
nervous system. 


glycogen Polysaccharide composed exclusively of glucose 
units. Used to store energy in animal cells. Large granules of 
glycogen are especially abundant in liver and muscle cells. 
(Figure 2-51 and Panel 2-4, pp. 96-97) 


glycolipid Lipid molecule with a sugar residue or 
oligosaccharide attached. (Panel 2-5, pp. 98-99) 


glycolysis Ubiquitous metabolic pathway in the cytosol in 
which sugars are incompletely degraded with production of 
ATP. Literally, “sugar splitting.” (Figure 2-46 and Panel 2-8, 
pp. 104-105) 


glycoprotein Any protein with one or more saccharide or 
oligosaccharide chains covalently linked to amino acid side 
chains. Most secreted proteins and most proteins exposed on 
the outer surface of the plasma membrane are glycoproteins. 


glycosaminoglycan (GAG) Long, linear, highly charged 
polysaccharide composed of a repeating pair of sugars, one of 
which is always an amino sugar. Mainly found covalently linked 
to a protein core in extracellular matrix proteoglycans. Examples 
include chondroitin sulfate, hyaluronan, and heparin. (Figure 
19-32) 


glycosylphosphatidylinositol anchor (GPI anchor) Lipid 
linkage by which some membrane proteins are bound to the 
membrane. The protein is joined, via an oligosaccharide linker, 
to a phosphatidylinositol anchor during its travel through the 
endoplasmic reticulum. (Figure 12-52) 


Golgi apparatus (Golgi complex) Complex organelle in 
eukaryotic cells, centered on a stack of flattened, membrane- 
enclosed spaces, in which proteins and lipids transferred from 
the endoplasmic reticulum are modified and sorted. It is the 
site of synthesis of many cell wall polysaccharides in plants and 
extracellular matrix glycosaminoglycans in animal cells. (Figure 
13-26) 


GPCR kinase (GRK) Member of a family of enzymes that 
phosphorylates multiple serines and threonines on a GPCR to 
produce receptor desensitization. (Figure 15-42) 


G protein (trimeric GTP-binding protein) A trimeric GTP- 
binding protein with intrinsic GTPase activity that couples 
GPCRs to enzymes or ion channels in the plasma membrane. 
(Table 15-3, p. 846) 


G-protein-coupled receptor (GPCR) A seven-pass cell- 
surface receptor that, when activated by its extracellular ligand, 
activates a G protein, which in turn activates either an enzyme 
or ion channel in the plasma membrane. (Figures 15-6 and 
15-21) 


Gq Class of G protein that couples GPCRs to phospholipase 
C-β to activate the inositol phospholipid signaling pathway. 


Gram negative Description for bacteria that do not stain 
with Gram stain as a result of having a thinner peptidoglycan 
cell wall outside their inner (plasma) membrane, and an 
additional outer membrane. 


Gram positive Description for bacteria that stain positive 
with Gram stain due to a thick layer of peptidoglycan cell wall 
outside their inner (plasma) membrane. 


Gram staining A technique for classifying bacteria based on 
differences in the structure of the bacterial cell wall and outer 
surface. 


granulocyte Category of white blood cell distinguished 
by conspicuous cytoplasmic granules. Includes neutrophils, 
basophils, and eosinophils. Arises from a granulocyte/ 
macrophage (GM) progenitor cell. (Figure 22-27) 


granulocyte/macrophage (GM) progenitor cell Committed 
progenitor cell in the bone marrow that gives rise to neutrophils 
and macrophages. (Figure 22-31) 


green fluorescent protein (GFP) Fluorescent protein 
isolated from a jellyfish. Widely used as a marker in cell biology. 
(Figure 9-24) 


growth cone Migrating motile tip of a growing nerve cell axon 
or dendrite. (Figure 21-72) 


growth factor Extracellular signal protein that can stimulate a 
cell to grow. They often have other functions as well, including 
stimulating cells to survive or proliferate. Examples include 
epidermal growth factor (EGF) and platelet-derived growth 
factor (PDGF). 


growth hormone (GH) Mammalian hormone secreted by 
the pituitary gland into the bloodstream that stimulates growth 
throughout the body. 


GTP (guanosine 5′-triphosphate) Nucleoside triphosphate 
produced by the phosphorylation of GDP (guanosine 
diphosphate). Like ATP, it releases a large amount of free 
energy on hydrolysis of its terminal phosphate group. Has a 
special role in microtubule assembly, protein synthesis, and cell 
signaling. (Figure 2-58) 


GTP-binding protein Also called GTPase; an enzyme that 
converts GTP to GDP. 


GTPase An enzyme that converts GTP to GDP. GTPases 
fall into two large families. Large trimeric G proteins are 
composed of three different subunits and mainly couple GPCRs 
to enzymes or ion channels in the plasma membrane. Small 
monomeric GTP-binding proteins (also called monomeric 
GTPases) consist of a single subunit and help relay signals 
from many types of cell-surface receptors and have roles in 
intracellular signaling pathways, regulating intracellular vesicle 
trafficking, and signaling to the cytoskeleton. Both trimeric 
G proteins and monomeric GTPases cycle between an active 
GTP-bound form and an inactive GDP-bound form and 
frequently act as molecular switches in intracellular signaling 
pathways. See page 820. 


GTPase-activating protein (GAP) Protein that binds to 
a GTPase and inhibits it by stimulating its GTPase activity, 
causing the enzyme to hydrolyze its bound GTP to GDP. 
(Figure 15-8) 


guanine nucleotide exchange factor (GEF) Protein that 
binds to a GTPase and activates it by stimulating it to release its 
tightly bound GDP, thereby allowing it to bind GTP in its place. 
(Figure 15-8) 


haplotype block Combination of alleles and DNA markers 
that has been inherited in a large, linked block on one 
chromosome of a homologous pair—undisturbed by genetic 
recombination — across many generations. 


Hedgehog protein Secreted extracellular signal molecule 
that has many different roles controlling cell differentiation 
and gene expression in animal embryos and adult tissues. 
Excessive Hedgehog signaling can lead to cancer. 


helper T cell (TH cell) Type of T cell that helps activate B cells 
to make antibodies, cytotoxic T cells to become effector cells, 
and macrophages to kill ingested pathogens. They can also 
help activate dendritic cells. 


heterochromatin Chromatin that is highly condensed even 
in interphase; generally transcriptionally inactive. (Compare with 
euchromatin.) 


heterochronic Describes genes involved in developmental 
timing; mutation results in cells of a specific fate behaving as 
cells at a different stage of development. 


high-mannose oligosaccharides Broad class of N-linked 
oligosaccharides, attached to mammalian glycoproteins in the 
endoplasmic reticulum, containing two N-acetylglucosamine 
residues and many mannose residues. 


high-performance liquid chromatography (HPLC) Type of 
chromatography that uses columns packed with tiny beads of 
matrix; the solution to be separated is pushed through under 
high pressure. 


histone One of a group of small abundant proteins, rich in 
arginine and lysine, that combine to form the nucleosome cores 
around which DNA is wrapped in eukaryotic chromosomes. 
(Figure 4-24) 


histone chaperone (chromatin assembly factor) Protein 
that binds free histones, releasing them once they have been 
incorporated into newly replicated chromatin. (Figure 4-27) 


histone H1 “Linker” (as opposed to “core”) histone protein 
that binds to DNA where it exits from a nucleosome and helps 
package nucleosomes into the 30-nm chromatin fiber. 
(Figure 4-30) 


Holliday junction (cross-strand exchange) X-shaped 
structure observed in DNA undergoing recombination, in 
which the two DNA molecules are held together at the site of 
crossing-over, also called a cross-strand exchange. 
(Figure 5-55) 


homeotic selector gene In Drosophila development, a 
gene that defines and preserves the differences between body 
segments. 


homolog One of two or more genes that are similar in 
sequence as a result of derivation from the same ancestral 
gene. The term covers both orthologs and paralogs. 
(Figure 1-21) See homologous chromosomes. 


homologous Genes, proteins, or body structures that are 
similar as a result of a shared evolutionary origin. 


homologous chromosomes (homologs) The maternal and 
paternal copies of a particular chromosome in a diploid cell. 


homologous recombination (general recombination) 
Genetic exchange between a pair of identical or very similar 
DNA sequences, typically those located on two copies of the 
same chromosome. Also a DNA repair mechanism for double- 
strand breaks. (Figures 5-48, 5-50, and 5-54) 


homophilic Binding between molecules of the same kind, 
especially those involved in cell-cell adhesion. (Figure 19-5) 


horizontal gene transfer Gene transfer between bacteria via 
natural transformation by released naked DNA, transduction by 
bacteriophages, or sexual exchange by conjugation. 


hormone Signal molecule secreted by an endocrine cell into 
the bloodstream, which can then carry the signal to distant 
target cells. 


Hox complex A gene complex consisting of a series of Hox 
genes. 


Hox genes Genes coding for transcription regulators, 
each gene containing a homeodomain, and specifying body- 
region differences. Hox mutations typically cause homeotic 
transformations. 


Hox proteins Transcription regulator proteins encoded by 
Hox genes; possess a highly conserved, 60-amino-acid-long 
DNA-binding homeodomain. 


HPV Human papillomavirus; infects the cervical epithelium 
and is important as a cause of carcinoma of the uterine cervix. 


hyaluronan (hyaluronic acid) Type of nonsulfated 
glycosaminoglycan with a regular repeating sequence of up to 
25,000 identical disaccharide units, not linked to a core protein. 
Found in the fluid lubricating joints and in many other tissues. 
(Figures 19-33 and 19-34) 


hybridization In molecular biology, the process whereby two 
complementary nucleic acid strands form a base-paired duplex 
DNA-DNA, DNA-RNA, or RNA-RNA molecule. Forms the 
basis of a powerful technique for detecting specific nucleotide 
sequences. (Figures 5-47 and 8-33) 


hybridoma Hybrid cell line generated by fusion of a tumor cell 
and another cell type. Monoclonal antibodies are produced by 
hybridoma lines obtained by fusing antibody-secreting B cells 
with cells of a B lymphocyte tumor. (Figure 8-4) 


hydrogen bond Noncovalent bond in which an 
electropositive hydrogen atom is partially shared by two 
electronegative atoms. (Panel 2-3, pp. 94-95) 


hydronium ion (H3O+) Water molecule associated with 
an additional proton. The form generally taken by protons in 
aqueous solution. 


hydrophilic Dissolving readily in water. Literally, “water-loving.” 


hydrophobic (lipophilic) Not dissolving readily in water. 
Literally, “water-fearing.” 


hydrophobic force Force exerted by the hydrogen-bonded 
network of water molecules that brings two nonpolar 
surfaces together by excluding water between them. 
(Panel 2-3, pp. 94-95) 


hypervariable region In immunology: any of the three small 
parts of the variable region of an immunoglobulin or T cell 
receptor chain that show the highest variability from molecule 
to molecule and contribute to the antigen-binding site. 
(Figure 24-26) 


hypoxia-inducible factor 1α (HIF1α) Transcription 
regulator, the intracellular levels of which increase in response 
to a shortage of oxygen, that stimulates transcription of the 
VEGF gene to promote angiogenesis. 


IκB Inhibitory proteins that bind tightly to NFκB dimers 
and hold them in an inactive state within the cytoplasm of 
unstimulated cells. 


Ig superfamily Large and diverse family of proteins that 
contain immunoglobulin or immunoglobulin-like domains. Most 
are involved in cell-cell interactions or antigen recognition. 
(Figure 24-48) 


IgA Immunoglobulin A; the principal class of antibody in 
secretions, including saliva, tears, milk, and respiratory and 
intestinal secretions. 


IgD Immunoglobulin D; produced by immature naive B cells 
after leaving the bone marrow. Transmembrane IgD and IgM 
proteins, with the same antigen-binding site, form the B cell 
receptors (BCRs) on these cells. 


IgE Immunoglobulin E; binds with high affinity via its tail region 
to a class of Fc receptors on the surface of mast cells (tissues) 
or basophils (blood), where it acts as an antigen receptor; 
antigen binding stimulates the secretion of cytokines and 
biologically active amines, which help attract white blood cells, 
antibodies, and complement proteins to the site of activation. 


IgG Immunoglobulin G; the major antibody class in the blood, 
produced in especially large quantities during secondary 
antibody responses. The tail region of some IgG subclasses 
can bind to specific Fc receptors on macrophages and 
neutrophils. Antigen-IgG complexes can activate complement. 


IgM Immunoglobulin M; the first class of immunoglobulin that 
a developing B cell in the bone marrow makes, forming B-cell 
receptors on its surface. IgM antibodies are the major class of 
antibody secreted into the blood in the early stages of a primary 
antibody response on first exposure to an antigen, where their 
pentameric structure (with 10 antigen-binding sites) allows 
strong binding to pathogens. When bound to antigen, it is 
highly efficient at activating complement. 


iHog Protein with four or five immunoglobulin-like domains 
and two or three fibronectin-type-III-like domains; located 
on the cell surface and thought to serve as co-receptors for 
Hedgehog proteins. 


image processing Computer-based techniques in 
microscopy that process digital images in order to extract latent 
information. Enables compensation for some optical faults in 
microscopes, enhanced contrast to improve detection of small 
differences in light intensity, and subtraction of background 
irregularities in the optical system. 


imaginal disc Group of cells that are set aside, apparently 
undifferentiated, in the Drosophila embryo and which will 
develop into an adult structure, e.g., eye, leg, wing. Overt 
differentiation occurs at metamorphosis. (Figure 21-60) 


immunization Method of inducing adaptive immune 
responses to pathogens or foreign molecules, usually involving 
the co-injection of an adjuvant, a molecule (often of microbial 
origin) that helps activate innate immune responses required for 
the adaptive responses. 


immunoblotting see Western blotting 


immunoglobulin (Ig) superfamily Large and diverse 
family of proteins that contain immunoglobulin domains or 
immunoglobulin-like domains. Most are involved in cell-cell 
interactions or antigen recognition. (Figure 24-48) 


immunoglobulin domain (Ig domain) Characteristic 
protein domain of about 100 amino acids that is found in 
immunoglobulin light and heavy chains. Similar domains, 
known as immunoglobulin-like (Ig-like) domains, are present in 
many other proteins, which, together with Igs, constitute the 
Ig superfamily. (Figure 24-27) 


immunogold electron microscopy Method to localize 
specific macromolecules using a primary antibody that binds to 
the molecule of interest and is then detected with a secondary 
antibody to which a colloidal gold particle has been attached. 
The gold particle is electron-dense and can be seen as a black 
dot in the electron microscope. (Figure 9-45) 


immunological memory Long-lived property of the adaptive 
immune system that follows a primary immune response to 
many antigens, such that a subsequent encounter with the 
same antigen will provoke a more rapid and stronger secondary 
immune response. (Figure 24-16) 


immunological self-tolerance The lack of response of 
the adaptive immune system to an antigen. Tolerance to self 
molecules is crucial to avoid autoimmune diseases. (Figure 
24-21) 


immunological synapse The highly organized interface that 
develops between a T cell and an antigen-presenting cell (APC) 
or target cell it is in contact with, formed by T-cell receptors 
binding to antigen-MHC complexes on the APC and cell- 
adhesion proteins binding to their counterparts on the APC. 


induced fit A principle for increasing the specificity of 
substrate recognition by proteins and RNAs. In protein 
synthesis, a ribosome or enzyme folds around a codon- 
anticodon interaction and only when the match is correct is the 
subsequent reaction allowed to proceed. 


induced pluripotent stem cells (iPS cells) Cells that 
are induced by artificial expression of specific transcription 
regulators to look and behave like the pluripotent embryonic 
stem cells that are derived from embryos. 


induced regulatory T cell A regulatory T cell (Treg cell) that 
develops from naive helper T cells when they are activated in 
the presence of TGFβ in the absence of IL6. 


inflammasome Intracellular protein complex formed after 
activation of cytoplasmic NOD-like receptors with adaptor 
proteins. It contains a caspase enzyme that cleaves pro- 
inflammatory cytokines from their precursor proteins. 


inflammatory response Local response of a tissue to injury 
or infection — characterized clinically by redness, swelling, heat, 
and pain. Caused by invasion of white blood cells, which are 
attracted by and secrete various cytokines. 


inhibitors of apoptosis (IAPs) Intracellular protein inhibitors 
of apoptosis. 


inhibitory G protein (Gi) Trimeric G protein that can regulate 
ion channels and inhibit the enzyme adenylyl cyclase in the 
plasma membrane. See also G protein. (Table 15-3, p. 846) 


inhibitory neurotransmitter Neurotransmitter that opens 
transmitter-gated Cl- or K+ channels in the postsynaptic 
membrane of a nerve or muscle cell and thus tends to inhibit 
the generation of an action potential. 


initial segment Specialized membrane region at the base of 
a nerve axon (adjacent to the cell body) that is rich in voltage- 
gated Na+ channels plus other classes of ion channels that all 
contribute to the encoding of membrane depolarization into 
action potential frequency. 


initiator caspases Apoptotic caspases that begin the 
apoptotic process, activating the executioner caspases. 


initiator tRNA Special tRNA that initiates translation. It always 
carries the amino acid methionine, forming the complex Met- 
tRNAi. (Figure 6-70) 


innate immune response An early immune response in all 
organisms to a pathogen, which includes the production of 
antimicrobial molecules and the activation of phagocytic cells. 
Such a response is not specific for the pathogen, in contrast to 
an adaptive immune response. 


inner membrane Mitochondrial membrane that encloses the 
matrix space and forms extensive invaginations called cristae. 


inner mitochondrial membrane Mitochondrial membrane 
that encloses the matrix space and forms extensive 
invaginations called cristae. 


inner nuclear membrane One of two concentric membranes 
comprising the nuclear envelope; continuous with the outer 
nuclear membrane; contains specific proteins as anchoring 
sites for chromatin and the nuclear lamina. 


inositol 1,4,5-trisphosphate (IP3) Small intracellular signaling 
molecule produced during activation of the inositol phospholipid 
signaling pathway. Acts to release Ca2+ from the endoplasmic 
reticulum. (Figures 15-28 and 15-29) 


inositol phospholipid signaling pathway Intracellular 
signaling pathway that starts with the activation of 
phospholipase C and the generation of IP3 and diacylglycerol 
(DAG) from inositol phospholipids in the plasma membrane. 
The DAG helps to activate protein kinase C. (Figures 15-28 and 
15-29) 


integrin Transmembrane adhesion protein that is involved in 
the attachment of cells to the extracellular matrix and to each 
other. (Figure 19-3 and Table 19-1, p. 1037) 


interaction domain Compact protein module, found in 
many intracellular signaling proteins, that binds to a particular 
structural motif (e.g., a short peptide sequence, a covalent 
modification, or another protein domain) in another protein or 
lipid. 

interferon-α (IFNα) and interferon-β (IFNβ) Cytokines 
(type I interferons) produced by mammalian cells as a general 
response to a viral infection. 


intermembrane space Compartment of mitochondrion 
between the outer and inner mitochondrial membranes. 


internal ribosome entry site (IRES) Specific site in a 
eukaryotic mRNA, other than at the 5′ end, at which translation 
can be initiated. (Figure 7-68) 


interphase Long period of the cell cycle between one mitosis 
and the next. Includes G1 phase, S phase, and G2 phase. 
(Figure 17-4) 


interpolar microtubule In the mitotic or meiotic spindle, a 
microtubule interdigitating at the equator with the microtubules 
emanating from the other pole. (Figure 17-23) 


intracellular pathogens Pathogens, including all viruses and 
many bacteria and protozoa, that enter and replicate inside 
host cells to cause disease. 


intrinsic pathway (mitochondrial pathway) Pathway of 
apoptosis activated from inside the cell in response to stress or 
developmental signals; depends on the release into the cytosol 
of mitochondrial proteins normally resident in the mitochondrial 
intermembrane space. 


intron Noncoding region of a eukaryotic gene that is 
transcribed into an RNA molecule but is then excised by RNA 
splicing during production of the mRNA or other functional 
RNA. (Figure 4-15) 


invadopodia Actin-rich protrusions extending in three 
dimensions that are important for cells to cross tissue barriers 
by degrading the extracellular matrix. 


ion channel Transmembrane protein complex that forms 
a water-filled channel across the lipid bilayer through which 
specific inorganic ions can diffuse down their electrochemical 
gradients. (Figure 11-22) 


ion-channel-coupled receptor (transmitter-gated ion 
channel, ionotropic receptor) Ion channel found at 
chemical synapses in the postsynaptic plasma membranes of 
nerve and muscle cells. Opens only in response to the binding 
of a specific extracellular neurotransmitter. The resulting inflow 
of ions leads to the generation of a local electrical signal in the 
postsynaptic cell. (Figures 15-6 and 11-35) 


ion-sensitive indicators Molecules whose light emission 
reflects the local concentration of a particular ion; some are 
luminescent (emitting light spontaneously) while others are 
fluorescent (emitting light on exposure to light). 


IP₃-gated Ca²⁺-release channel (IP₃ receptor) Gated Ca²⁺ 
channel in the ER membrane that opens on binding cytosolic 
IP₃, releasing stored Ca²⁺ into the cytosol. (Figure 15-29) 


iron-sulfur cluster Electron-transporting group consisting 
of either two or four iron atoms bound to an equal number of 
sulfur atoms, found in a class of electron-transport proteins. 
(Figure 14-16) 


J gene segment Short DNA sequence that encodes part 
of the variable region of light and heavy immunoglobulin chains 
and of α and β chains of T cell receptors. (Figures 24-28 and 
24-29) 


JAK-STAT signaling pathway Signaling pathway activated 
by cytokines and some hormones, providing a rapid route 
from the plasma membrane to the nucleus to alter gene 
transcription. Involves cytoplasmic Janus kinases (JAKs), and 
signal transducers and activators of transcription (STATs). 


Janus kinases (JAKs) Cytoplasmic tyrosine kinases 
associated with cytokine receptors, which phosphorylate and 
activate transcription regulators called STATs. 


junctional diversification The random loss and gain of 
nucleotides at joining sites during V(D)J recombination that 
occurs during B and T cell development when the cells are 
assembling the gene segments that encode their antigen 
receptors. It enormously increases the diversity of V-region 
coding sequences. 


K⁺ leak channel K⁺-transporting ion channel in the plasma 
membrane of animal cells that remains open even in a “resting” 
cell. 


karyotype Display of the full set of chromosomes of a cell, 
arranged with respect to size, shape, and number. 


keratin Type of intermediate filament, commonly produced by 
epithelial cells. 


kinase cascade Intracellular signaling pathway in which one 
protein kinase, activated by phosphorylation, phosphorylates 
the next protein kinase in the sequence, and so on, relaying the 
signal onward. 


kinesin Member of one of the two main classes of motor 
proteins that use the energy of ATP hydrolysis to move along 
microtubules. (Figure 16-56) 


kinesin-1 Motor protein associated with microtubules that 
transports cargo within the cell; also called “conventional 
kinesin.” 


kinetic proofreading A principle for increasing the specificity 
of catalysis. In the synthesis of DNA, RNA, and proteins, it 
refers to a time delay that begins with an irreversible step (such 
as ATP or GTP hydrolysis) and during which incorrect base 
pairs are more likely to dissociate than correct pairs. 


kinetochore Large protein complex that connects the 
centromere of a chromosome to microtubules of the mitotic 
spindle. (Figure 17-30) 


kinetochore microtubule In the mitotic or meiotic spindle, a 
microtubule that connects the spindle pole to the kinetochore 
of a chromosome. 


lagging strand One of the two newly synthesized strands of 
DNA found at a replication fork. The lagging strand is made in 
discontinuous lengths that are later joined covalently. 
(Figure 5-7) 


lamellipodium (plural lamellipodia) Flattened, sheetlike 
protrusion supported by a meshwork of actin filaments, which 
is extended at the leading edge of a crawling animal cell. 
(Figures 16-77 and 16-79) 


laminin Extracellular matrix fibrous protein found in basal 
laminae, where it forms a sheetlike network. (Figures 19-52 and 
19-53) 


lampbrush chromosome Huge chromosome paired in 
preparation for meiosis, found in immature amphibian eggs; 
consisting of large loops of chromatin extending out from a 
linear central axis. (Figure 4-47) 


late endosome Compartment formed from a bulbous, 
vacuolar portion of early endosomes by a process called 
endosome maturation; late endosomes fuse with one another 
and with lysosomes to form endolysosomes that degrade their 
contents. 


LDL-receptor-related protein (LRP) Co-receptor bound by 
Wnt proteins in the regulation of β-catenin proteolysis. 


leading strand One of the two newly synthesized strands of 
DNA found at a replication fork. The leading strand is made by 
continuous synthesis in the 5′-to-3′ direction. (Figure 5-7) 


lectin Protein that binds tightly to a specific sugar. Abundant 
lectins from plant seeds are used as affinity reagents to purify 
glycoproteins or to detect them on the surface of cells. 


Legionnaire’s disease Type of pneumonia resulting from 
infection with Legionella pneumophila, a parasite of freshwater 
amoebae that is spread to humans by air-conditioning systems 
that harbor infected amoebae and produce microdroplets of 
water that are easily inhaled. 


lethal factor One of the two A subunits of anthrax toxin; a 
protease that cleaves several activated members of the MAP 
kinase kinase family and causes a large fall in blood pressure 
and death on entry into the bloodstream of an animal. 


leucine-rich repeat (LRR) receptor kinases Common 
type of receptor serine/threonine kinase in plants that contains 
a tandem array of leucine-rich repeat sequences in its 
extracellular portion. 


leukemia Cancer of white blood cells. 


leukocyte General name for all the nucleated blood cells 
lacking hemoglobin. Also called white blood cells. Includes 
lymphocytes, granulocytes, and monocytes. (Figure 22-27) 


ligand Any molecule that binds to a specific site on a protein 
or other molecule. From Latin ligare, to bind. 


light microscope One of a class of microscopes that uses 
visible light to create the image. 


lignin Network of cross-linked phenolic compounds that 
forms a supporting network throughout the cell walls of xylem 
and woody tissue in plants. 


limit of resolution In microscopy, the smallest distance 
apart at which two point objects can be resolved as separate. 
Just under 0.2 µm for conventional light microscopy, a limit 
determined by the wavelength of light. 
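
The dependence on wavelength can be made explicit with the standard diffraction-limit expression, given here as a reference sketch. The symbols are not defined in this entry and are assumptions of the sketch: λ is the wavelength of the light, n the refractive index of the medium between specimen and objective, and θ half the angular width of the cone of light collected by the objective. The numerical values are illustrative (violet light and a high-aperture oil-immersion lens).

```latex
% Diffraction-limited resolution of a light microscope
d = \frac{0.61\,\lambda}{n \sin\theta}

% Illustrative values: \lambda = 0.4\ \mu\text{m}, \; n\sin\theta = 1.4
d \approx \frac{0.61 \times 0.4\ \mu\text{m}}{1.4} \approx 0.17\ \mu\text{m}
```

The result is consistent with the figure of just under 0.2 µm quoted in the definition.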


linkage In ligand binding, the conformational coupling 
between two separate ligand-binding sites on a protein, such 
that a conformational change in the protein induced by binding 
of one ligand affects the binding of a second ligand. 


lipid bilayer (phospholipid bilayer) Thin double sheet of 
phospholipid molecules that forms the core structure of all cell 
membranes. The two layers of lipid molecules are packed with 
their hydrophobic tails pointing inward and their hydrophilic 
heads outward, exposed to water. (Figure 10-1 and 
Panel 2-5, pp. 98-99) 


lipid droplets Storage form in cells for excess lipids; 
comprised of a single monolayer of phospholipids and proteins 
that surrounds neutral lipids that can be retrieved from droplets 
as required by the cell. 


lipid raft Small region of a membrane enriched in 
sphingolipids and cholesterol. (Figure 10-13) 


liposome Artificial phospholipid bilayer vesicle formed from an 
aqueous suspension of phospholipid molecules. (Figure 10-9) 


local mediator Extracellular signal molecule that acts on 
neighboring cells. 


long noncoding RNA (lncRNA) One of a large group 
(~8000 in humans) of RNAs longer than 200 nucleotides and 
not coding for protein. The functions, if any, of most lncRNAs 
are unknown, but individual lncRNAs are known to play important 
roles in the cell, for example, in telomerase function and 
genomic imprinting. In a general sense, lncRNAs are believed to 
act as scaffolds, holding together proteins and nucleic acids to 
speed up a wide variety of reactions in the cell. 


long-term depression (LTD) A long-lasting (hours or more) 
decrease in the sensitivity of certain synapses in the brain 
triggered by NMDA receptor activation. As the opposing 
process to long-term potentiation, it is thought to be involved in 
learning and memory. 


long-term potentiation (LTP) Long-lasting increase (days 
to weeks) in the sensitivity of certain synapses in the brain, 
induced by a short burst of repetitive firing in the presynaptic 
neurons. (Figure 11-44) 


loss of heterozygosity The result of errant homologous 
recombination that uses the homolog from the other parent 
instead of the sister chromatid as the template, converting the 
sequence of the repaired DNA to that of the other homolog. 


low-density lipoprotein (LDL) Large complex composed 
of a single protein molecule and many esterified cholesterol 
molecules, together with other lipids. The form in which 
cholesterol is transported in the blood and taken up into cells. 
(Figure 13-51) 


lumen The space inside a hollow structure. In cells: the cavity 
enclosed by an organelle membrane. In tissues: the cavity 
enclosed by a sheet of cells. 


lymphocyte White blood cell responsible for the specificity 
of adaptive immune responses. Two main types: B cells, which 
produce antibody, and T cells, which interact directly with 
other effector cells of the immune system and with infected 
cells. T cells develop in the thymus and are responsible for 
cell-mediated immunity. B cells develop in the bone marrow in 
mammals and are responsible for the production of circulating 
antibodies. 


lymphoid organ An organ containing large numbers of 
lymphocytes. Lymphocytes are produced in primary lymphoid 
organs and respond to antigen in peripheral lymphoid organs. 
(Figure 24-12) 


lymphoma Cancer of lymphocytes, in which the cancer cells 
are mainly found in lymphoid organs (rather than in the blood, 
as in leukemias). 


lysosomal storage diseases Genetic diseases resulting 
from defects in or a lack of one or more functional hydrolases in 
lysosomes of some cells, leading to accumulation of undigested 
substrates in lysosomes and consequent cell pathology. 


lysosome Membrane-enclosed organelle in eukaryotic cells 
containing digestive enzymes, which are typically most active at 
the acid pH found in the lumen of lysosomes. (Figure 13-37) 


lysozyme Enzyme that catalyzes the cutting of 
polysaccharide chains in the cell walls of bacteria. 


M-Cdk (M-phase Cdk) Cyclin-Cdk complex formed in 
vertebrate cells by an M-cyclin and the corresponding cyclin- 
dependent kinase (Cdk). (Figure 17-11 and Table 17-1, p. 969) 


M-cyclin A cyclin found in all eukaryotic cells that promotes 
the events of mitosis. (Figure 17-11) 


M6P receptor proteins Transmembrane receptor proteins 
present in the trans Golgi network that recognize the mannose 
6-phosphate (M6P) groups added exclusively to lysosomal 
enzymes, marking the enzymes for packaging and delivery to 
early endosomes. 


macromolecule Polymers constructed of long chains of 
covalently linked, small organic (carbon-containing) molecules. 
The principal building blocks from which a cell is constructed 
and the components that confer the most distinctive properties 
of living things. 


macrophage Phagocytic cell derived from blood monocytes, 
resident in most tissues but able to roam. It has both scavenger 
and antigen-presenting functions in immune responses. 


macropinocytosis Clathrin-independent, dedicated 
degradative endocytic pathway induced in most cell types by 
cell-surface receptor activation by specific cargoes. 


malaria Protozoal disease caused by four species of 
Plasmodium, which are transmitted to humans by the bite of 
the female Anopheles mosquito. 


malignant Of tumors and tumor cells: invasive and/or 
able to undergo metastasis. A malignant tumor is a cancer. 
(Figure 20-3) 


MAP kinase module (mitogen-activated protein kinase 
module) An intracellular signaling module composed of 
three protein kinases, acting in sequence, with MAP kinase as 
the third. Typically activated by a Ras protein in response to 
extracellular signals. (Figure 15-49) 


master transcription regulator A transcription regulator 
specifically required for formation of a particular cell type. 
Artificial expression of master transcription regulators (alone or 
in combination with others) will often convert one cell type into 
another. 


maternal inheritance A form of inheritance observed 
when following mitochondria in animals and plants, where 
mitochondrial DNA is inherited only through the female 
germ line. 


maternal-effect gene Gene that acts in the mother to 
specify maternal mRNAs and proteins in the egg. Maternal- 
effect mutations affect the development of the embryo even if 
the embryo itself has not inherited the mutated gene. 


maternal-zygotic transition (MZT) Event in animal 
development where the embryo’s own genome largely takes 
over control of development from maternally deposited 
macromolecules. 


matrix metalloprotease Ca²⁺- or Zn²⁺-dependent 
proteolytic enzyme present in the extracellular matrix that 
degrades matrix proteins. Includes the collagenases. 


matrix space Large internal compartment of the 
mitochondrion. 


mechanosensitive channels Transmembrane ion channels 
that open in response to a mechanical stress on the lipid bilayer 
in which they are embedded. 


megakaryocyte Large myeloid cell with a multilobed nucleus 
that remains in the bone marrow when mature. Buds off 
platelets from long cytoplasmic processes. (Figure 22-29) 


meiosis I The first of two rounds of chromosome segregation 
following meiotic chromosome duplication; segregates the 
homologs, each composed of a tightly linked pair of sister 
chromatids. 


meiosis II The second of two rounds of chromosome 
segregation following meiotic chromosome duplication; 
segregates the sister chromatids of each homolog. 


membrane potential Voltage difference across a membrane 
due to a slight excess of positive ions on one side and of 
negative ions on the other. A typical membrane potential for 
an animal cell plasma membrane is -60 mV (inside negative 
relative to the surrounding fluid). (Figure 11-23) 


membrane protein Amphiphilic protein of diverse structure 
and function that associates with the lipid bilayer of cell 
membranes. (Figure 10-17) 


membrane transport protein Membrane protein that 
mediates the passage of ions or molecules across a 
membrane. The two main classes are transporters (also called 
carriers or permeases) and channels. (Figure 11-4) 


membrane-associated protein Membrane protein not 
extending into the hydrophobic interior of the lipid bilayer 
but bound to either face of the membrane by noncovalent 
interactions with other membrane proteins. (Figure 10-17) 


membrane-bending proteins Attach to specific membrane 
regions as needed and act to control local membrane curvature 
and thus confer on membranes their characteristic three- 
dimensional shapes. 


membrane-bound ribosome Ribosome attached to 
the cytosolic face of the endoplasmic reticulum. The site of 
synthesis of proteins that enter the endoplasmic reticulum. 
(Figure 12-38) 


memory cell In immunology: a T or B lymphocyte generated 
following antigen stimulation that is more easily and more 
quickly induced to become an effector cell or another memory 
cell by a later encounter with the same antigen. (Figure 24-17) 


mesoderm Embryonic tissue that is the precursor to muscle, 
connective tissue, skeleton, and many of the internal organs. 
(Figure 21-3) 


messenger RNA (mRNA) RNA molecule that specifies the 
amino acid sequence of a protein. Produced in eukaryotes by 
processing of an RNA molecule made by RNA polymerase as 
a complementary copy of DNA. It is translated into protein in a 
process catalyzed by ribosomes. (Figure 6-20) 


metabolism The sum total of the chemical processes that 
take place in living cells. All of catabolism plus anabolism. 
(Figure 2-14) 


metabotropic receptors Neurotransmitter receptors that 
regulate ion channels indirectly through the activation of 
second-messenger molecules. 


metaphase plate Imaginary plane at right angles to the 
mitotic spindle and midway between the spindle poles; the 
plane in which chromosomes are positioned at metaphase. 
(Panel 17-1, pp. 980-981) 


metaphase-to-anaphase transition Transition in the 
eukaryotic cell cycle preceding sister-chromatid separation at 
anaphase. If the cell is not ready to proceed to anaphase, the 
cell cycle is halted at this point. (Figure 17-9, and Panel 17-1, 
pp. 980-981) 


metastases Secondary tumors, at sites in the body additional 
to that of the primary tumor, resulting from cancer cells breaking 
loose, entering blood or lymphatic vessels, and colonizing 
separate environments. 


metastasis The spread of cancer cells from their site of origin 
to other sites in the body. (Figures 20-1 and 20-16) 


MHC complex (major histocompatibility complex) Cluster 
of genes in one vertebrate chromosome (chromosome 6 in 
humans) that code for a set of highly polymorphic cell-surface 
glycoproteins (MHC proteins). (Figure 24-37) 


microbiome The combined genomes of the various species 
of a defined microbiota. 


microbiota The collective of microorganisms that reside in or 
on an organism. 


microelectrode A piece of fine glass tubing, pulled to an 
even finer tip, that is used to inject electric current into cells or 
to study the intracellular concentrations of common inorganic 
ions (such as H⁺, Na⁺, K⁺, Cl⁻, and Ca²⁺) in a single living cell 
by insertion of its tip directly into the cell interior through the 
plasma membrane. 


microRNAs (miRNAs) Short (~21 nucleotide) eukaryotic 
RNAs, produced by the processing of specialized RNA 
transcripts coded in the genome, that regulate gene expression 
through base-pairing with mRNA. (Figure 7-75) 


microsome Small vesicle derived from endoplasmic reticulum 
that is produced by fragmentation when cells are homogenized. 
(Figure 12-34) 


microtubule flux Movement of individual tubulin molecules 
in the microtubules of the spindle toward the poles by loss of 
tubulin at their minus ends. Helps to generate the poleward 
movement of sister chromatids after they separate in anaphase. 
(Figure 17-35) 


microtubule-associated protein (MAP) Any protein that 
binds to microtubules and modifies their properties. Many 
different kinds have been found, including structural proteins, 
such as MAP2, and motor proteins, such as dynein. [Not to be 
confused with the “MAP” (mitogen-activated protein kinase) of 
“MAP kinase.”] 


microtubule-organizing center (MTOC) Region in a 
cell, such as a centrosome or a basal body, from which 
microtubules grow. 


midbody Structure formed at the end of cleavage that can 
persist for some time as a tether between the two daughter 
cells in animals. (Figure 17-43) 


mitochondrial hsp70 Part of a multisubunit protein assembly 
bound to the matrix side of the TIM23 complex that acts as a 
motor to pull mitochondrial precursor proteins into the matrix 
space. 


mitochondrial matrix Large internal compartment of 
the mitochondrion. The corresponding compartment in a 
chloroplast is known as the stroma. 


mitochondrial precursor proteins Proteins first fully 
synthesized in the cytosol and then translocated into 
mitochondrial subcompartments as directed by one or more 
signal sequences. 


mitochondrion (plural mitochondria) Membrane-bounded 
organelle, about the size of a bacterium, that carries out 
oxidative phosphorylation and produces most of the ATP in 
eukaryotic cells. (Figure 1-28) 


mitogen Extracellular signal molecule that stimulates cells to 
proliferate. 


mitotic chromosome Highly condensed duplicated 
chromosome as seen at mitosis, consisting of two sister 
chromatids held together at the centromere. 


mitotic spindle Bipolar array of microtubules and associated 
molecules that forms in a eukaryotic cell during mitosis and 
serves to move the duplicated chromosomes apart. 
(Figure 17-23 and Panel 17-1, pp. 980-981) 


model organism A species that has been studied intensively 
over a long period and thus serves as a “model” for deriving 
fundamental biological principles. 


molecular chaperone (chaperone) Protein that helps 
guide the proper folding of other proteins, or helps them avoid 
misfolding. Includes heat-shock proteins (hsp). 


monoallelic gene expression Expression of only one of 
the two copies of a gene in a diploid genome, occurring, 
for example, as a result of imprinting or X-chromosome 
inactivation. 


monoclonal antibody Antibody secreted by a hybridoma 
cell line. Because the hybridoma is generated by the fusion of a 
single B cell with a single tumor cell, each hybridoma produces 
antibodies that are all identical. (Page 444) 


monocyte Type of white blood cell that leaves the 
bloodstream and matures into a macrophage in tissues. 
(Figure 22-27) 


monomeric GTPase A single-subunit enzyme that converts 
GTP to GDP (also called small monomeric GTP-binding 
proteins). Cycles between an active GTP-bound form and an 
inactive GDP-bound form and frequently acts as a molecular 
switch in intracellular signaling pathways. 


morphogen Diffusible signal molecule that can impose a 
pattern on a field of cells by causing cells in different places to 
adopt different fates. (Figure 21-8) 


morphogenesis Developmental process in which cells 
undergo movements and deformations in order to assemble 
into tissues and organs with specific shapes and sizes. 


motor protein Protein that uses energy derived from 
nucleoside triphosphate hydrolysis to propel itself along a linear 
track (protein filament or other polymeric molecule). 


mRNA degradation control Regulation by a cell of gene 
expression by selectively preserving or destroying certain 
mRNA molecules in the cytoplasm. 


mTOR The mammalian version of the large protein kinase 
called TOR, involved in cell signaling; mTOR exists in two 
functionally distinct multiprotein complexes. 


multidrug resistance An observed phenomenon in which 
cells exposed to one anticancer drug evolve a resistance not 
only to that drug, but also to other drugs to which they have 
never been exposed. 


multidrug resistance (MDR) protein Type of ABC 
transporter protein that can pump hydrophobic drugs (such as 
some anticancer drugs) out of the cytoplasm of eukaryotic cells. 


multipass transmembrane protein Membrane protein in 
which the polypeptide chain crosses the lipid bilayer more than 
once. (Figure 10-17) 


multivesicular bodies Intermediates in the endosome 
maturation process; early endosomes that are on their way to 
becoming late endosomes. 


mutation Heritable change in the nucleotide sequence of a 
chromosome. (Panel 8-2, pp. 486-487) 


mutation rate The rate at which changes (mutations) occur in 
DNA sequences. 


mutualism Ecological relationship between microbes and 
their host in which both the microbe and host benefit. 


Myc Transcription regulatory protein that is activated when a 
cell is stimulated to grow and divide by extracellular signals. It 
activates the transcription of many genes, including those that 
stimulate cell growth. (Figure 17-61) 


myelin sheath Insulating layer of specialized cell 
membrane wrapped around vertebrate axons. Produced 
by oligodendrocytes in the central nervous system and by 
Schwann cells in the peripheral nervous system. (Figure 11-33) 


myeloid cell Any white blood cell other than a lymphocyte. 
(Figure 22-31) 


myoblast Mononucleated, undifferentiated muscle precursor 
cell. A skeletal muscle cell is formed by the fusion of multiple 
myoblasts. (Figure 22-19) 


myofibril Long, highly organized bundle of actin, myosin, and 
other proteins in the cytoplasm of muscle cells that contracts 
by a sliding filament mechanism. 


myosin Type of motor protein that uses the energy of ATP 
hydrolysis to move along actin filaments. 


Na⁺-K⁺ pump (Na⁺-K⁺ ATPase) Transmembrane carrier 
protein found in the plasma membrane of most animal cells that 
pumps Na⁺ out of and K⁺ into the cell, using energy derived 
from ATP hydrolysis. (Figure 11-15) 


NAD⁺/NADH (nicotinamide adenine dinucleotide/reduced 
nicotinamide adenine dinucleotide) Electron carrier system 
that participates in oxidation-reduction reactions, such as the 
oxidation of food molecules. NAD⁺ accepts the equivalent of 
a hydride ion (H⁻, a proton plus two electrons) to become the 
activated carrier NADH. The NADH formed donates its high- 
energy electrons to the ATP-generating process of oxidative 
phosphorylation. (Figure 2-36) 


NADH dehydrogenase complex First of the three electron- 
driven proton pumps in the mitochondrial respiratory chain, 
also known as Complex I. It accepts electrons from NADH and 
passes them to a quinone. (Figure 14-18) 


NADP⁺/NADPH (nicotinamide adenine dinucleotide 
phosphate/reduced nicotinamide adenine dinucleotide 
phosphate) Electron carrier system closely related to 
NAD⁺/NADH, but used almost exclusively in reductive biosynthetic, 
rather than catabolic, pathways. (Figure 2-36) 


naive cell In immunology: a T or B lymphocyte that 
proliferates and differentiates into an effector cell or memory cell 
when it encounters its specific foreign antigen for the first time. 
(Figure 24-17) 


natural killer cell (NK cell) Cytotoxic cell of the innate 
immune system that can kill virus-infected cells and some 
cancer cells. 


natural regulatory T cell A regulatory T cell (Treg cell) that 
develops in the thymus and helps maintain self-tolerance. 


negative selection Process by which thymocytes expressing 
a T cell receptor with high affinity for a self peptide bound to a 
self-MHC protein are eliminated by undergoing apoptosis. 


negative staining A technique in electron microscopy 
enabling fine detail of isolated macromolecules to be seen. 
Samples are prepared such that a very thin film of heavy-metal 
salt covers everywhere except where excluded by the presence 
of macromolecules, which allow electrons to pass through, 
creating a reverse or negative image of the molecule. 


Nernst equation Equation that computes the 
electrical potential (voltage) generated by differences in ion 
concentrations across a membrane. 
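
For a single ion species, the relation is commonly written in the form below. This is the standard textbook form, included as a reference sketch; the symbols are not defined in this entry and are assumptions of the sketch: V is the equilibrium potential (inside relative to outside), R the gas constant, T the absolute temperature, z the charge of the ion, F Faraday's constant, and C_o and C_i the ion concentrations outside and inside the cell. The K⁺ concentrations in the worked line are illustrative values, not taken from this glossary.

```latex
% Nernst equation for one ion species
V = \frac{RT}{zF}\,\ln\frac{C_o}{C_i}

% At 37 °C with z = +1, 2.303\,RT/F \approx 61.5\ \text{mV}, so
V \approx 61.5\ \text{mV} \times \log_{10}\frac{C_o}{C_i}

% Illustrative K^+ values: C_o = 5\ \text{mM}, \; C_i = 140\ \text{mM}
V_{K} \approx 61.5\ \text{mV} \times \log_{10}\frac{5}{140} \approx -89\ \text{mV}
```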


Netrin Signal protein, secreted by cells of the neural tube floor 
plate, responsible for attracting growth cones of commissural 
axons toward and across the midline. 


neural crest Collection of cells located along the line where 
the neural tube pinches off from the surrounding epidermis in 
the vertebrate embryo. Neural crest cells migrate to give rise to 
a variety of tissues, including neurons and glia of the peripheral 
nervous system, pigment cells of the skin, and the bones of the 
face and jaws. (Figure 19-8) 


neural map Regular mapping of neurons of a similar type 
from one territory to another, such that there are orderly 
projections of one array of neurons onto another. 


neural tube Tube of ectoderm that will form the brain and 
spinal cord in a vertebrate embryo. (Figure 21-56) 


neurofilament Type of intermediate filament found in nerve 
cells. (Figure 16-72) 


neuromuscular junction Specialized chemical synapse 
between an axon terminal of a motor neuron and a skeletal 
muscle cell. (Figure 11-37) 


neuron (nerve cell) Impulse-conducting cell of the nervous 
system, with extensive processes specialized to receive, 
conduct, and transmit signals. (Figures 11-28 and 21-66) 


neuronal specificity Nonequivalence among neurons; an 
intrinsic characteristic that guides axons to their appropriate 
target sites. 


neurotransmitter Small signal molecule secreted by the 
presynaptic nerve cell at a chemical synapse to relay the 
signal to the postsynaptic cell. Examples include acetylcholine, 
glutamate, GABA, glycine, and many neuropeptides. 


neurotrophic factor Factor released in limited amounts by a 
target tissue that the neurons innervating that tissue require to 
survive. 


neurotrophin Family of signal proteins that promote the 
survival and growth of specific classes of neurons. 


neutrophil White blood cell that is specialized for the uptake 
of particulate material by phagocytosis. Enters tissues that 
become infected or inflamed. (Figure 24-5) 


NFκB protein Latent transcription regulator that is activated 
by various intracellular signaling pathways when cells are 
stimulated during immune, inflammatory, or stress responses. 
Also has important roles in animal development. (Figure 15-62) 


nitric oxide (NO) Gaseous signal molecule that is widely used 
in cell-cell communication in both animals and plants. 
(Figure 15-40) 


nitrogen fixation Biochemical process carried out by certain 
bacteria that reduces atmospheric nitrogen (N2) to ammonia, 
leading eventually to various nitrogen-containing metabolites. 


NMDA receptor Subclass of glutamate-gated ion channel 
in the mammalian central nervous system critical for long- 
term potentiation and long-term depression. NMDA-receptor 
channels are doubly gated, opening only when glutamate is 
bound to the receptor and, simultaneously, the membrane is 
strongly depolarized. 


NO synthase (NOS) Enzyme that synthesizes nitric oxide 
(NO) by the deamination of arginine. (Figure 15-40B) 


NOD-like receptors (NLRs) Large family of pattern 
recognition receptors (PRRs) with leucine-rich repeat motifs; 
they are exclusively cytoplasmic and recognize a distinct set of 
microbial molecules. 


nonclassical cadherins Large family of cadherins that are 
more distantly related in sequence than classical cadherins and 
include proteins involved in adhesion (including protocadherins, 
desmocollins, and desmogleins) and signaling. 


noncoding RNA An RNA molecule that is the final product 
of a gene and does not code for protein. These RNAs serve as 
enzymatic, structural, and regulatory components for a wide 
variety of processes in the cell. 


nondisjunction Event occurring occasionally during meiosis 
in which a pair of homologous chromosomes fails to separate 
so that the resulting germ cell has either too many or too few 
chromosomes. 


nonenveloped virus Virus consisting of a nucleic acid core 
and a protein capsid only. (Figure 23-18C,D) 


nonhomologous end joining A DNA repair mechanism for 
double-strand breaks in which the broken ends of DNA are 
brought together and rejoined by DNA ligation, generally with 
the loss of one or more nucleotides at the site of joining. 


nonretroviral retrotransposons Type of transposable 
element that moves by being first transcribed into an RNA copy 
that is converted to DNA by reverse transcriptase then inserted 
elsewhere in the genome. The mechanism of insertion differs 
from that of the retroviral-like transposons. (Table 5-4, p. 288) 


nonsense-mediated mRNA decay Mechanism for 
degrading aberrant mRNAs containing in-frame internal stop 
codons before they can be translated into protein. (Figure 6-76) 


normal flora The human microbiota consisting of 
approximately 10¹⁴ bacterial, fungal, and protozoan cells, 
representing thousands of microbial species. 


Notch Transmembrane receptor protein (and latent 
transcription regulator) involved in many cell-fate choices in 
animal development, for example in the specification of nerve 
cells from ectodermal epithelium. Its ligands are cell-surface 
proteins such as Delta and Serrate. (Figure 15-59) 


NSF Hexameric ATPase that disassembles a complex of a 
v-SNARE and a t-SNARE. (Figure 13-20) 


nuclear envelope Double membrane (two bilayers) 
surrounding the nucleus. Consists of an outer and inner 
membrane and is perforated by nuclear pores. The outer 
membrane is continuous with the endoplasmic reticulum. 
(Figures 4-9 and 12-7) 


nuclear export receptors Bind to both the export signal and 
nuclear pore complex proteins to guide their cargo through the 
nuclear pore complex to the cytosol. 


nuclear export signal Sorting signal contained in the 
structure of molecules and complexes, such as nuclear RNPs 
and new ribosomal subunits, that are transported from the
nucleus to the cytosol through nuclear pore complexes.
(Figure 12-13) 


nuclear import receptors Recognize nuclear localization 
signals to initiate nuclear import of proteins containing the 
appropriate nuclear localization signal. 


nuclear lamin Protein subunit of the intermediate filaments 
that form the nuclear lamina. 


nuclear lamina Fibrous meshwork of proteins on the inner 
surface of the inner nuclear membrane. It is made up of a 
network of intermediate filaments formed from nuclear lamins. 


nuclear localization signal (NLS) Signal sequence or signal 
patch found in proteins destined for the nucleus that enables 
their selective transport into the nucleus from the cytosol 
through the nuclear pore complexes. (Figures 12-9 and 12-13) 


nuclear magnetic resonance (NMR) spectroscopy NMR 
is the resonant absorption of electromagnetic radiation at a 
specific frequency by atomic nuclei in a magnetic field, due to 
flipping of the orientation of their magnetic dipole moments. 
The NMR spectrum provides information about the chemical 
environment of the nuclei. NMR is used widely to determine the 
three-dimensional structure of small proteins and other small 
molecules. The principles of NMR are also used for medical 
diagnostic purposes in magnetic resonance imaging (MRI). 
(Figure 8-22) 


nuclear pore complex (NPC) Large multiprotein structure 
forming an aqueous channel (the nuclear pore) through the 
nuclear envelope that allows selected molecules to move 
between nucleus and cytoplasm. (Figure 12-8) 


nuclear receptor superfamily Intracellular receptors for 
hydrophobic signal molecules such as steroid and thyroid 
hormones and retinoic acid. The receptor-ligand complex acts 
as a transcription factor in the nucleus. (Figure 15-65) 


nuclear transport receptor (karyopherin) Protein that 
escorts macromolecules either into or out of the nucleus: 
nuclear import receptor or nuclear export receptor. (Figure 
12-13) 


nucleolus A prominent structure in the nucleus where rRNA is 
transcribed and ribosomal subunits are assembled. (Figure 4-9) 


nucleoporin Any of a number of different proteins that make
up nuclear pore complexes. 


nucleosome Beadlike structure in eukaryotic chromatin, 
composed of a short length of DNA wrapped around an 
octameric core of histone proteins. The fundamental structural 
unit of chromatin. (Figures 4-22 and 4-23) 


nucleotide Nucleoside with one or more phosphate groups 
joined in ester linkages to the sugar moiety. DNA and RNA are 
polymers of nucleotides. (Panel 2-6, pp. 100-101) 


nucleotide excision repair Type of DNA repair that corrects 
damage of the DNA double helix, such as that caused by 
chemicals or UV light, by cutting out the damaged region on 
one strand and resynthesizing it using the undamaged strand 
as template. Compare base excision repair. (Figure 5-41)


O-linked glycosylation Addition of one or more sugars to a 
hydroxyl group on a protein. 


obligate pathogens Bacteria that can only replicate inside 
their host. 


olfactory receptors G-protein-coupled receptors on the 
modified cilia of olfactory receptor neurons that recognize 
odors. The receptors activate adenylyl cyclase via an olfactory-specific
G protein (G_olf), and resultant increases in cAMP open
cyclic-AMP-gated cation channels, allowing Na+ influx and
depolarization and initiation of a nerve impulse.


oligodendrocyte Glial cell in the vertebrate central nervous 
system that forms a myelin sheath around axons. Compare 
Schwann cell. 


oncogene An altered gene whose product can act in a
dominant fashion to help make a cell cancerous. Typically, an 
oncogene is a mutant form of a normal gene (proto-oncogene) 
involved in the control of cell growth or division. (Figure 20-17)


open reading frame (ORF) A continuous nucleotide 
sequence free from stop codons in at least one of the three 
reading frames (and thus with the potential to code for protein). 
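
The definition above can be illustrated with a minimal Python sketch. It scans each of the three forward reading frames of a DNA sequence and reports the longest run of codons that contains no stop codon. The function names and the example sequence are hypothetical, chosen only for illustration; the sketch ignores the reverse strand and does not require a start codon, so it shows the idea rather than a gene-finding method.

```python
# Standard stop codons, written as DNA (coding strand).
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_stop_free_run(seq, frame):
    """Return the longest stretch of complete codons without a stop codon
    when seq is read in the given frame (0, 1, or 2)."""
    seq = seq.upper()
    best, current = [], []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            current = []          # a stop codon ends the current open stretch
        else:
            current.append(codon)
            if len(current) > len(best):
                best = list(current)
    return "".join(best)

def open_reading_frames(seq):
    """Map each forward frame (0, 1, 2) to its longest stop-free stretch."""
    return {frame: longest_stop_free_run(seq, frame) for frame in range(3)}

if __name__ == "__main__":
    for frame, orf in open_reading_frames("ATGAAATAGCCC").items():
        print(frame, orf)
```

In this toy sequence, frame 0 is interrupted by TAG and frame 1 begins with TGA, whereas frame 2 is open along its whole length.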


opportunistic pathogens Microbes of the normal flora that 
can cause disease only if the immune systems are weakened or
if they gain access to a normally sterile part of the body. 


optogenetics Use of genetically engineered 
channelrhodopsin and other light-responsive ion channels
and transporters to modulate neuron function and hence
analyze the neurons and circuits underlying complex functions, 
including behaviors in whole animals. (Figure 11-32) 


organelle Subcellular compartment or large macromolecular 
complex, often membrane-enclosed, that has a distinct 
structure, composition, and function. Examples are nucleus, 
nucleolus, mitochondrion, Golgi apparatus, and centrosomes. 
(Figure 1-25)


Organizer Specialized tissue at the dorsal lip of the 
blastopore in an amphibian embryo; a source of signals that 
help to orchestrate formation of the embryonic body axis. 


origin recognition complex (ORC) Large protein complex 
that is bound to the DNA at origins of replication in eukaryotic 
chromosomes throughout the cell cycle. (Figure 5-31) 


orthologs Genes or proteins from different species that are 
similar in sequence because they are descendants of the same
gene in the last common ancestor of those species. Compare 
paralogs. (Figure 1-21) 


osteoblast Cell that secretes matrix of bone. (Figure 22-14) 


osteoclast Macrophage-like cell that erodes bone, enabling 
it to be remodeled during growth and in response to stresses 
throughout life. (Figure 22-16) 


osteocyte Nondividing cell in bone that develops from an 
osteoblast and is embedded in bone matrix. (Figure 22-14) 


outer membrane Mitochondrial membrane that is in contact 
with the cytosol. 


outer mitochondrial membrane Membrane that separates 
the organelle from the cytosol. 


outer nuclear membrane One of two concentric 
membranes comprising the nuclear envelope; surrounds the 
inner nuclear membrane and is continuous with the inner 
nuclear membrane and the membrane of the endoplasmic 
reticulum. 


OXA complex Protein translocator in the inner mitochondrial 
membrane that mediates insertion of inner membrane proteins. 


oxidation (verb oxidize) Loss of electrons from an atom, as 
occurs during the addition of oxygen to a molecule or when a 
hydrogen is removed. Opposite of reduction. (Figure 2-20)


oxidative phosphorylation Process in bacteria and
mitochondria in which ATP formation is driven by the transfer
of electrons through the electron transport chain to molecular
oxygen. Involves the intermediate generation of a proton
gradient (pH gradient) across a membrane and a chemiosmotic
coupling of that gradient to the ATP synthase. (Figure 14-12)


oxidative phosphorylation Process in bacteria and 
mitochondria in which ATP formation is driven by the 
transfer of electrons through the electron-transport chain to 
molecular oxygen. Involves the intermediate generation of an 
electrochemical proton gradient across a membrane and a 
chemiosmotic coupling of that gradient to the ATP synthase. 
(Figure 14-10)


P-type pumps A class of ATP-driven pumps comprising
structurally and functionally related multipass transmembrane
proteins that phosphorylate themselves during the pumping
cycle. The class includes many of the ion pumps responsible
for setting up and maintaining gradients of Na+, K+, H+, and
Ca2+ across cell membranes. (Figure 11-12)


p53 A transcription regulatory protein that is activated by 
damage to DNA and is involved in blocking further progression 
through the cell cycle. (Figures 20-37 and 20-40)


p53 Tumor suppressor gene that is mutated in about half
of human cancers. Encodes a transcription regulator that is
activated by damage to DNA and is involved in blocking further 
progression through the cell cycle. (Figure 20-27) 


pair-rule gene In Drosophila development, a gene expressed 
in a series of regular transverse stripes along the body of the 
embryo and which helps to determine its segments.
(Figure 21-19)


pairing In meiosis, the lining up of the two homologous 
chromosomes along their length. (Figure 17-54) 


papillomaviruses Class of viruses responsible for human 
warts and a prime example of DNA tumor viruses, being a 
cause of cancer of the uterine cervix. 


paracrine signaling Short-range cell-cell communication 
via secreted signal molecules that act on neighboring cells. 
(Figure 15-2) 


paralogs Genes or proteins that are similar in sequence 
because they are the result of a gene duplication event 
occurring in an ancestral organism. Those in two different 
organisms are less likely to have the same function than are 
orthologs. Compare orthologs. (Figure 1-21) 


parasitism Ecological relationship between microbes and 
their host in which the microbe benefits to the detriment of the 
host, as is often the case for pathogens. 


passengers Mutations that have occurred in the same cell as 
driver mutations, but which are irrelevant to the development of 
the cancer. 


passive transport (facilitated diffusion) Transport of a 
solute across a membrane down its concentration gradient or 
its electrochemical gradient, using only the energy stored in the 
gradient. (Figure 11-4) 


patch-clamp recording Electrophysiological technique 
in which a tiny electrode tip is sealed onto a patch of cell 
membrane, thereby making it possible to record the flow of 
current through individual ion channels in the patch.
(Figure 11-34)


Patched Transmembrane protein predicted to cross the 
plasma membrane 12 times; much is in intracellular vesicles 
and some is on the cell surface where it binds the Hedgehog 
protein.


pathogen (adjective pathogenic) An organism, cell, virus, or
prion that causes disease. 


pathogen-associated molecular patterns (PAMPs) 
Microbe-associated molecules, either not present or 
sequestered in the host organism, that often occur in repeating 
patterns that are recognized by pattern recognition receptors 
(PRRs) in or on cells of the innate immune system. PAMPs are 
present in various microbial molecules, including nucleic acids, 
lipids, polysaccharides, and proteins. 


pattern recognition receptor (PRR) Receptor present on 
or in cells of the innate immune system that recognizes and is 
activated by microbial pathogen-associated molecular patterns 
(PAMPs). 


PDZ domain Protein-binding domain present in many 
scaffold proteins, and often used as a docking site for 
intracellular tails of transmembrane proteins. (Figure 19-22) 


pectin Mixture of polysaccharides rich in galacturonic acid 
which forms a highly hydrated matrix in which cellulose is 
embedded in plant cell walls. (Figure 19-63) 


peripheral (secondary) lymphoid organ Lymphoid organ 
in which T cells and B cells interact and respond to foreign 
antigens. Examples are spleen, lymph nodes, and
mucosal-associated lymphoid organs. (Figure 24-12)


peroxins Form a protein translocator that participates in the 
import of proteins into peroxisomes. 


peroxisome Small membrane-bounded organelle that uses 
molecular oxygen to oxidize organic molecules. Contains some 
enzymes that produce and others that degrade hydrogen 
peroxide (H2O2). (Figure 12-27)


pH scale Common measure of the acidity of a solution: “p”
refers to power of 10, “H” to hydrogen. Defined as the negative
logarithm of the hydrogen ion concentration in moles per liter
(M). pH = -log10[H+]. Thus a solution of pH 3 will contain 10^-3 M
hydrogen ions. pH less than 7 is acidic and pH greater than 7 is
alkaline.
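
As a small worked illustration of this definition, the following Python sketch converts between hydrogen ion concentration (in moles per liter) and pH. The function names are hypothetical, and the sketch assumes the simple concentration-based definition given in this entry, not activity-based corrections.

```python
import math

def ph_from_concentration(h_molar):
    """pH is the negative base-10 logarithm of [H+] in moles per liter.
    math.log10 raises ValueError for zero or negative input."""
    return -math.log10(h_molar)

def concentration_from_ph(ph):
    """Inverse relationship: [H+] = 10^(-pH), in moles per liter."""
    return 10.0 ** (-ph)

if __name__ == "__main__":
    print(ph_from_concentration(1e-3))   # a 10^-3 M solution of H+
    print(concentration_from_ph(7.0))    # [H+] at neutral pH
```

Because the scale is logarithmic, each one-unit drop in pH corresponds to a tenfold increase in hydrogen ion concentration.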


phagocytosis Process by which unwanted cells, debris, and 
other bulky particulate material is endocytosed (“eaten”) by a 
cell. Prominent in carnivorous cells, such as Amoeba proteus, 
and in vertebrate macrophages and neutrophils. From Greek 
phagein, to eat. 


phagosome Large intracellular membrane-enclosed vesicle 
that is formed as a result of phagocytosis. Contains ingested 
extracellular material. (Figure 13-61) 


phase variation The random switching of phenotype and 
expression of proteins involved in infection at frequencies much 
higher than mutation rates. 


phase-contrast microscope Type of light microscope that 
exploits the interference effects that occur when light passes 
through material of different refractive indices. Used to view 
living cells. (Figure 9-7) 


phenotype The observable character (including both physical 
appearance and behavior) of a cell or organism. 
(Panel 8-2, p. 486) 


phosphatidylinositol 4,5-bisphosphate [PI(4,5)P2, PIP2] 
Membrane inositol phospholipid (a phosphoinositide) that is 
cleaved by phospholipase C into IP3 and diacylglycerol at the 
beginning of the inositol phospholipid signaling pathway. It can 
also be phosphorylated by PI 3-kinase to produce PIP3 docking 
sites for signaling proteins in the PI-3-kinase-Akt signaling
pathway. (Figures 15-28 and 15-53) 


phosphoglyceride Phospholipid derived from glycerol, 
abundant in biomembranes. (Figures 10-2 and 10-3) 


phosphoinositide A lipid containing a phosphorylated inositol 
derivative. Minor component of the plasma membrane, but 
important in demarking different membranes and for intracellular 
signal transduction in eukaryotic cells. (Figure 15-52) 


phosphoinositide 3-kinase (PI 3-kinase) Membrane-bound 
enzyme that is a component of the PI-3-kinase-Akt intracellular
signaling pathway. It phosphorylates phosphatidylinositol
4,5-bisphosphate at the 3 position on the inositol ring to
produce PIP3 docking sites in the membrane for other
intracellular signaling proteins. (Figure 15-53) 


phosphoinositides (PIPs; phosphatidylinositol 
phosphates) A lipid containing a phosphorylated inositol 
derivative. Minor component of the plasma membrane, but 
important in demarking different membranes and for intracellular 
signal transduction in eukaryotic cells. (Figure 13-10) 


phospholipase C (PLC) Membrane-bound enzyme that 
cleaves inositol phospholipids to produce IP3 and diacylglycerol
in the inositol phospholipid signaling pathway. PLCβ is activated
by GPCRs via specific G proteins, while PLCγ is activated by
RTKs. (Figure 15-55) 


phospholipid The main category of lipids used to construct 
biomembranes. Generally composed of two fatty acids linked 
through glycerol (or sphingosine) phosphate to one of a variety 
of polar groups. (Figure 10-3, and Panel 2-5, pp. 98-99)


phosphorylation Reaction in which a phosphate group is 
covalently coupled to another molecule. 


photoactivation Technique for studying intracellular 
processes in which an inactive form of a molecule of interest 
is introduced into the cell, and is then activated by a focused 
beam of light at a precise spot in the cell. (Figure 9-28) 


photochemical reaction center The part of a photosystem 
that converts light energy into chemical energy in 
photosynthesis. (Figure 14-44)


photosynthetic electron-transfer reactions Light-driven 
reactions in photosynthesis in which electrons move along an 
electron-transport chain in a membrane, generating ATP and 
NADPH. 


photosystem Multiprotein complex involved in 
photosynthesis that captures the energy of sunlight and 
converts it to useful forms of energy: a reaction center plus an 
antenna. (Figure 14-45)


phototropin Photoprotein associated with the plant plasma 
membrane that senses blue light and is partly responsible for 
phototropism. 


phragmoplast Structure made of microtubules and actin 
filaments that forms in the prospective plane of division of a 
plant cell and guides formation of the cell plate. (Figure 17-49)


phytochrome Plant photoprotein that senses light via a 
covalently attached light-absorbing chromophore, which 
changes its shape in response to light and then induces a 
change in the protein’s conformation. Plant phytochromes are 
dimeric, cytoplasmic serine/threonine kinases, which respond 
differentially and reversibly to red and far-red light to alter cell 
behavior. 


PI-3-kinase-Akt pathway Intracellular signaling pathway 
that stimulates animal cells to survive and grow. (Figure 15-53) 


pinocytosis Literally, “cell drinking.” Type of endocytosis 
in which soluble materials are continually taken up from the
environment in small vesicles and moved into endosomes
along with the membrane-bound molecules. Compare 
phagocytosis. (Figure 13-48) 


piRNAs (piwi-interacting RNAs) A class of small noncoding 
RNAs made in the germ line that, in complex with Piwi proteins, 
keep in check the movement of transposable elements by 
transcriptionally silencing transposon genes and destroying 
RNAs produced by them. 


planar cell polarity Type of cellular asymmetry seen in some 
epithelia, such that each cell has a polarity vector oriented in 
the plane of the epithelium. (Figure 21-51)


plant growth regulator (plant hormone) Signal molecule 
that helps coordinate growth and development. Examples are 
ethylene, auxins, gibberellins, cytokinins, abscisic acid, and the 
brassinosteroids. 


plasma membrane The membrane that surrounds a living
cell. (Figure 10-1) 


plasmid vector Small, circular molecules of double-stranded 
DNA derived from plasmids that occur naturally in bacterial 
cells; widely used for gene cloning. 


plasmodesma (plural plasmodesmata) Plant equivalent 
of a gap junction. Communicating cell-cell junction in plants 
in which a channel of cytoplasm lined by plasma membrane 
connects two adjacent cells through a small pore in their cell 
walls. 


platelet Cell fragment, lacking a nucleus, that breaks off from 
a megakaryocyte in the bone marrow and is found in large 
numbers in the bloodstream. Helps initiate blood clotting when 
blood vessels are injured. (Figure 22-29) 


pleckstrin homology domain (PH domain) Protein 
domain found in some intracellular signaling proteins. 
Some PH domains in intracellular signaling proteins bind 
to phosphatidylinositol 3,4,5-trisphosphate produced by
PI 3-kinase, bringing the signaling protein to the plasma 
membrane when PI 3-kinase is activated. 


pluripotent Describes a cell that has the potential to give rise 
to all or almost all of the cell types of the adult body. 


polarized In epithelia, that the basal end of a cell, adherent to 
the basal lamina below, differs from the apical end, exposed to 
the medium above; thus, all epithelia and their individual cells 
are structurally polarized. 


Polycomb group Set of proteins critical for cell memory for 
some genes. They form complexes as part of the chromatin of 
the Hox complex, where they maintain a repressed state in cells 
where Hox genes have not been activated. 


polymerase chain reaction (PCR) Technique for amplifying 
specific regions of DNA by the use of sequence-specific
primers and multiple cycles of DNA synthesis, each cycle being 
followed by a brief heat treatment to separate complementary 
strands. (Figure 8-36) 
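
The effect of repeated cycles can be sketched numerically. The Python snippet below assumes an idealized reaction in which a fixed fraction of templates (the hypothetical parameter `efficiency`) is copied in every cycle, so that perfect efficiency doubles the number of copies each cycle. Real reactions fall short of this as primers and nucleotides are used up, so the figures are an upper bound, not a prediction.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized number of DNA copies after a given number of PCR cycles.

    efficiency is the assumed fraction of templates copied per cycle:
    1.0 means perfect doubling, 0.0 means no amplification.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles

if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, pcr_copies(1, n))
```

Under these assumptions a single template molecule yields about a thousand copies after 10 cycles and about a billion after 30.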


polymorphisms Describes genome sequences that coexist 
as two or more sequence variants at high frequency in a 
population. 


polypeptide Linear polymer of amino acids. Proteins are large 
polypeptides, and the two terms can be used interchangeably. 
(Panel 3-1, pp. 112-113) 


polypeptide backbone Repeating sequence of atoms along 
the core of the polypeptide chain. 


polyribosome mRNA engaged with multiple ribosomes in the 
act of translation.


polytene chromosome Giant chromosome in which the
DNA has undergone repeated replication and the many copies 
have stayed together in precise alignment. (Figures 4-50 and 
4-51) 


porin Channel-forming proteins of the outer membranes of 
bacteria, mitochondria, and chloroplasts. 


position effect variegation Alteration in gene
expression resulting from change in the position of the
gene in relation to other chromosomal domains, especially
heterochromatic domains. When an active gene is placed 
next to heterochromatin, the inactivating influence of the 
heterochromatin can spread to affect the gene to a variable 
degree, giving rise to position effect variegation. (Figure 4-31) 


positional value A cell’s internal record of its positional 
information in a multicellular organism; an intrinsic character 
that differs according to a cell’s location. 


positive feedback Control mechanism whereby the end 
product of a reaction or pathway stimulates its own production 
or activation. 


positive selection In immunology: process of thymocyte
maturation in which thymocytes expressing a T cell receptor
with appropriate affinity for a self peptide bound to a self MHC
protein are signaled to survive and continue development.


post-transcriptional controls Any control on gene 
expression that is exerted at a stage after transcription has 
begun. (Figure 7-54) 


post-translational Occurring after completion of translation. 


preprophase band Circumferential band of microtubules and 
actin filaments that forms around a plant cell under the plasma 
membrane prior to mitosis and cell division. (Figure 17-49)


prereplicative complex (preRC) Multiprotein complex that 
is assembled at origins of replication during late mitosis and 
early G1 phases of the cell cycle; a prerequisite to license
the assembly of a preinitiation complex, and the subsequent
initiation of DNA replication. (Figures 17-17 and 17-18) 


primary cell wall The first cell wall produced by a developing 
plant cell; it is thin and flexible, allowing room for cell growth. 
(Figure 19-63) 


primary cilium Short, single, nonmotile cilium lacking 
dynein that arises from a centriole and projects from the 
surface of many animal cell types. Some signaling proteins are
concentrated in the primary cilium. (Figure 15-38) 


primary Ig repertoire The billions of IgM and IgD
immunoglobulin molecules made by the B cells of an adaptive 
immune system in the absence of antigen stimulation. 


primary immune response Adaptive immune response to 
an antigen that is made on first encounter with that antigen. 
(Figure 24-16)


primary pathogens Pathogens that can cause overt disease 
in most healthy people. Some cause acute, life-threatening 
epidemic infections and spread rapidly between hosts; other 
potential primary pathogens may persistently infect a single 
individual for years without causing overt disease, the host 
often being unaware that they are infected. 


primary structure Linear sequence of monomer units in a 
polymer, such as the amino acid sequence of a protein. 


primary tumor Tumor at the original site at which a cancer 
first arose. Secondary tumors develop elsewhere by metastasis.


prion disease Transmissible spongiform encephalopathy that is
caused and transmitted by an infectious, abnormally folded protein
(prion). Examples are Kuru and Creutzfeldt-Jakob disease (CJD) in
humans, scrapie in sheep, and bovine spongiform encephalopathy (BSE,
or “mad cow disease”) in cows. (Figure 3-33)


pro-inflammatory cytokine Any cytokine that stimulates an 
inflammatory response. 


programmed cell death A form of cell death in which a cell 
kills itself by activating an intracellular death program. 


prokaryote Single-celled microorganism whose cells lack a 
well-defined, membrane-enclosed nucleus. Either a bacterium 
or an archaeon. (Figure 1-17) 


promoter Nucleotide sequence in DNA to which RNA 
polymerase binds to begin transcription. (Figure 7-17)


proteasome Large protein complex in the cytosol with 
proteolytic activity that is responsible for degrading proteins that 
have been marked for destruction by ubiquitylation or by some 
other means. (Figures 6-83 and 6-84) 


protein The major macromolecular constituent of cells. A 
linear polymer of amino acids linked together by peptide bonds 
in a specific sequence. (Figure 3-1) 


protein activity control The selective activation, inactivation, 
degradation, or compartmentalization of specific proteins
after they have been made. One of the means by which a cell
controls which proteins are active at a given time or location in 
the cell. 


protein domain see domain 


protein glycosylation Process of transferring a single 
saccharide or preformed precursor oligosaccharide to proteins. 


protein kinase Enzyme that transfers the terminal phosphate 
group of ATP to one or more specific amino acids (serine, 
threonine, or tyrosine) of a target protein. 


protein kinase C (PKC) Ca2+-dependent protein kinase
that, when activated by diacylglycerol and an increase in the
concentration of cytosolic Ca2+, phosphorylates target proteins
on specific serine and threonine residues. (Figure 15-29) 


protein phosphatase Enzyme that catalyzes phosphate 
removal from amino acids of a target protein. 


protein subunit An individual protein chain in a protein 
composed of more than one chain. 


protein translocation Process of moving a protein across a 
membrane. 


protein translocator Membrane-bound protein that 
mediates the transport of another protein across a membrane. 
(Figure 12-21) 


protein tyrosine phosphatase Enzyme that removes 
phosphate groups from phosphorylated tyrosine residues on 
proteins. 


proteoglycan Molecule consisting of one or more 
glycosaminoglycan chains attached to a core protein. 
(Figure 19-38) 


proteomics Study of all the proteins, including all the 
covalently modified forms of each, produced by a cell, tissue, or 
organism. Proteomics often investigates changes in this larger 
set of proteins (“the proteome”) caused by changes in the
environment or by extracellular signals. 


proto-oncogene Normal gene, usually concerned with the
regulation of cell proliferation, that can be converted into a
cancer-promoting oncogene by mutation. 


protofilament Linear string of microtubule subunits joined 
end to end; multiple protofilaments associate with one another 
laterally to construct and provide strength and adaptability to 
microtubules. 


proton (H+) Positively charged subatomic particle that forms 
part of an atomic nucleus. Hydrogen has a nucleus composed 
of a single proton (H+).


proton-motive force The force exerted by the 
electrochemical proton gradient that moves protons across a 
membrane. 


protozoan parasite Parasitic, nonphotosynthetic, single-celled,
motile eukaryotic organism, for example Plasmodium.


pseudogene Nucleotide sequence of DNA that has 
accumulated multiple mutations that have rendered an 
ancestral gene inactive and nonfunctional. 


purified cell-free system Fractionated cell homogenate that 
retains a particular biological function of the intact cell, and in 
which biochemical reactions and cell processes can be more 
easily studied. 


purifying selection Natural selection operating in a 
population to slow genome changes and reduce divergence by 
eliminating individuals carrying deleterious mutations. 


quantitative RT-PCR (reverse transcription-polymerase
chain reaction) Technique in which a population of mRNAs is 
converted into cDNAs via reverse transcription, and the cDNAs 
are then amplified by PCR. The quantitative part relies on a 
direct relationship between the rate at which the PCR product 
is generated and the original concentration of the mRNA
species of interest. 


quaternary structure Three-dimensional relationship of the
different polypeptide chains in a multisubunit protein or protein 
complex. 


quinone (Q) Small, lipid-soluble, mobile electron carrier 
molecule found in the respiratory and photosynthetic electron- 
transport chains. (Figure 14-17) 


Rab cascade An ordered recruitment of sequentially acting 
Rab proteins into Rab domains on membranes, which changes 
the identity of an organelle and reassigns membrane dynamics. 


Rab effectors Molecules that bind activated, membrane- 
bound Rab proteins and act as downstream mediators of 
vesicle transport, membrane tethering, and fusion. 


Rab proteins Monomeric GTPase in the Ras superfamily 
present in plasma and organelle membranes in its GTP-bound 
state, and as a soluble cytosolic protein in its GDP-bound state. 
Involved in conferring specificity on vesicle docking.
(Table 15-5, p. 854)


Rac Member of the Rho family of monomeric GTPases that 
regulate the actin and microtubule cytoskeletons, cell-cycle 
progression, gene transcription, and membrane transport. 


Rad51 Eukaryotic protein that catalyzes synapsis of DNA 
strands during genetic recombination. Called RecA in E. coli. 


Ran (Ran protein) Monomeric GTPase of the Ras superfamily 
present in both cytosol and nucleus. Required for the active 
transport of macromolecules into and out of the nucleus 
through nuclear pore complexes. (Table 15-5, p. 854) 


rapidly inactivating K+ channel Neuronal voltage-gated
K+ channel, open when the membrane is depolarized, with
a specific voltage sensitivity and kinetics of inactivation that
induce a reduced rate of action potential firing at levels of 
stimulation only just above the threshold required, thereby 
resulting in a firing rate proportional to the strength of the 
depolarizing stimulus. 


Ras A small family of proto-oncogenes that are frequently 
mutated in cancers, each of which produces a Ras monomeric 
GTPase. 


Ras (Ras protein) Monomeric GTPase of the Ras superfamily 
that helps to relay signals from cell-surface receptor tyrosine 
kinase receptors to the nucleus, frequently in response to 
signals that stimulate cell division. Named for the ras gene, first 
identified in viruses that cause rat sarcomas. (Figure 3-67) 


Ras superfamily Large superfamily of monomeric GTPases 
(also called small GTP-binding proteins) of which Ras is the 
prototypical member. (Table 15-5, p. 854) 


Ras-GAPs Ras GTPase-activating proteins; increase the rate
of hydrolysis of bound GTP by Ras, thereby inactivating Ras. 


Ras-GEFs Ras guanine nucleotide exchange factors; 
stimulate the dissociation of GDP and the subsequent uptake 
of GTP from the cytosol, thereby activating Ras. 


Ras-MAP-kinase signaling pathway Intracellular signaling 
pathway that relays signals from activated receptor tyrosine 
kinases to effector proteins in the cell including transcription 
regulators in the nucleus. 


Rb gene The gene that is defective in both copies in
individuals with retinoblastoma; its protein product plays a 
central role in cell-cycle control. 


reading frame The phase in which nucleotides are read in
sets of three to encode a protein. An mRNA molecule can be
read in any one of three reading frames, only one of which will 
give the required protein. (Figure 6-49) 
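
To make the three possible phases concrete, here is a minimal Python sketch that splits an mRNA sequence into codons for each frame. The function name and the short example sequence are hypothetical and serve only to illustrate the idea.

```python
def codons_in_frame(mrna, frame):
    """Split an mRNA sequence into complete codons, starting at offset
    frame (0, 1, or 2). A trailing partial codon is dropped."""
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1, or 2")
    return [mrna[i:i + 3] for i in range(frame, len(mrna) - 2, 3)]

if __name__ == "__main__":
    message = "AUGGCCUAA"
    for frame in range(3):
        print(frame, codons_in_frame(message, frame))
```

Shifting the starting point by one or two nucleotides produces an entirely different series of codons from the same sequence.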


RecA (RecA protein) Prototype for a class of DNA-binding 
proteins that catalyze synapsis of DNA strands during genetic 
recombination. (Figure 5-49) 


receptor Any protein that binds a specific signal molecule 
(ligand) and initiates a response in the cell. Some are on the cell 
surface, while others are inside the cell. (Figure 15-3) 


receptor editing Process by which a developing B cell that 
recognizes a self molecule changes its antigen receptors so 
that the cell no longer does so. 


receptor serine/threonine kinase Cell-surface receptor
with an extracellular ligand-binding domain and an intracellular 
kinase domain that phosphorylates signaling proteins on serine 
or threonine residues in response to ligand binding. The TGFβ
receptor is an example. (Figure 15-57) 


receptor tyrosine kinase (RTK) Cell-surface receptor with 
an extracellular ligand-binding domain and an intracellular 
kinase domain that phosphorylates signaling proteins on 
tyrosine residues in response to ligand binding. (Figure 15-43
and Table 15-4, p. 850) 


receptor-mediated endocytosis Internalization of
receptor-ligand complexes from the plasma membrane by endocytosis.
(Figure 13-52) 


recombinant DNA technology Collection of techniques
by which DNA segments from different sources are combined
to make a new DNA, often called a recombinant DNA. 
Recombinant DNAs are widely used in the cloning of genes, in 
the genetic modification of organisms, and in the production of 
large amounts of rare proteins.


recycling endosome Organelle that provides an intermediate
stage on the passage of recycled receptors back to the cell 
membrane. Regulates plasma membrane insertion of some 
proteins. (Figure 13-58) 


red blood cell Small hemoglobin-containing blood cell of 
vertebrates that transports oxygen to, and carbon dioxide from, 
tissues. Also called an erythrocyte. 


redox pair Pair of molecules in which one acts as an electron
donor and one as an electron acceptor in an oxidation-reduction
reaction: for example, NADH (electron donor) and
NAD+ (electron acceptor). (Panel 14-1, p. 765)


redox potential The affinity of a redox pair for electrons,
generally measured as the voltage difference between an
equimolar mixture of the pair and a standard reference.
NADH/NAD+ has a low redox potential and O2/H2O has a high redox
potential (high affinity for electrons). (Panel 14-1, p. 765)


redox reaction Reaction in which one component becomes 
oxidized and the other reduced; an oxidation-reduction
reaction. (Panel 14-1, p. 765) 


reduction (verb reduce) Addition of electrons to an atom, as 
occurs during the addition of hydrogen to a biological molecule 
or the removal of oxygen from it. Opposite of oxidation.
(Figure 2-20)


regulated nuclear transport Mechanisms controlling export
of mRNAs from the nucleus to the cytosol that can be used to
regulate gene expression. Also includes the selective import of
proteins and RNA molecules into the nucleus.


regulated secretory pathway A second secretory pathway
found mainly in cells specialized for secreting products
rapidly on demand (such as hormones, neurotransmitters,
or digestive enzymes), in which soluble proteins and other
substances are initially stored in secretory vesicles for later
release. (Figure 13-62)


regulator of G protein signaling (RGS) A GAP protein that 
binds to a trimeric G protein and enhances its GTPase activity, 
thus helping to limit G-protein-mediated signaling. (Figure 15-8) 


regulatory site Region of an enzyme surface to which a 
regulatory molecule binds and thereby influences the catalytic 
events at the separate active site. 


regulatory T cell (Treg) A type of T cell that suppresses the 
development, activation, or function of other immune cells via 
secreted cytokines or cell-surface inhibitory proteins. 


replication fork Y-shaped region of a replicating DNA 
molecule at which the two strands of the DNA are being 
separated and the daughter strands are being formed. (Figures 
5-7 and 5-18) 


replication origin Location on a DNA molecule at which 
duplication of the DNA begins. (Figures 4-19 and 5-23) 


replicative cell senescence Phenomenon observed in 
primary cell cultures in which cell proliferation slows down and 
finally irreversibly halts. 


respiratory chain (electron-transport chain) Electron-
transport chain present in the inner mitochondrial membrane
that generates an electrochemical gradient across the
membrane that is used to drive ATP synthesis.
(Figures 14-4 and 14-10)


resting membrane potential Electrical potential across the
plasma membrane of a cell at rest, i.e., a cell that has not been
stimulated to open any ion channels beyond those that are
normally open.


restriction nuclease One of a large number of nucleases
that can cleave a DNA molecule at any site where a specific 
short sequence of nucleotides occurs. Extensively used in 
recombinant DNA technology. (Figure 8-24) 


restriction point Important transition at the end of G1 in the
eukaryotic cell cycle; commits the cell to enter S phase. The 
term was originally used for this transition in the mammalian cell 
cycle; in this book we use the term Start. (Figure 17-9) 


retinoblastoma A rare type of human cancer arising from 
cells in the retina of the eye that are converted to a cancerous 
state by an unusually small number of mutations. Studies
of retinoblastoma led to the discovery of the first tumor
suppressor gene.


retinoblastoma protein (Rb protein) Tumor suppressor 
protein involved in the regulation of cell division. Mutated in the 
cancer retinoblastoma, as well as in many other tumors. Its 
normal activity is to regulate the eukaryotic cell cycle by binding 
to and inhibiting the E2F proteins, thus blocking progression to 
DNA replication and cell division. (Figure 17-61) 


retroviral-like retrotransposons A large family of 
transposons that move themselves in and out of chromosomes 
by a mechanism similar to that used by retroviruses, being
first transcribed into an RNA copy that is converted to DNA by
reverse transcriptase and then inserted elsewhere in the genome.
(Table 5-4, p. 288) 


retrovirus RNA-containing virus that replicates in a cell by first 
making an RNA-DNA intermediate and then a double-strand 
DNA molecule that becomes integrated into the cell’s DNA. 
(Figure 5-62)


reverse genetics Approach to discovering gene function that 
starts from the DNA (gene) and its protein product and then 
creates mutants to analyze the gene’s function. 


reverse transcriptase Enzyme first discovered in retroviruses 
that makes a double-strand DNA copy from a single-strand 
RNA template molecule. 


RGD sequence Tripeptide sequence of arginine-glycine- 
aspartic acid that forms a binding site for integrins; present 
in fibronectin and some other extracellular proteins. (Figure 
19-47C) 


Rheb A monomeric Ras-related GTPase that in its active form 
(Rheb-GTP) activates mTOR, which promotes cell growth. 


Rho Member of the Rho family of monomeric GTPases that 
regulate the actin and microtubule cytoskeletons, cell-cycle 
progression, gene transcription, and membrane transport. 


Rho family Family of monomeric GTPases within the Ras 
superfamily involved in signaling the rearrangement of the 
cytoskeleton. Includes Rho, Rac, and Cdc42. (Table 15-5, p. 
854) 


rhodopsin Seven-span membrane protein of the GPCR 
family that acts as a light sensor in rod photoreceptor cells in 
the vertebrate retina. Contains the light-sensitive prosthetic 
group retinal. (Figure 15-39)


ribosomal RNA (rRNA) Any one of a number of specific RNA 
molecules that form part of the structure of a ribosome and 
participate in the synthesis of proteins. Often distinguished by 
their sedimentation coefficient (e.g., 28S rRNA, 5S rRNA). 


ribosome Particle composed of rRNAs and ribosomal 
proteins that catalyzes the synthesis of protein using 
information provided by mRNA. (Figure 6-64)


ribozyme An RNA molecule with catalytic activity. 


RNA (ribonucleic acid) Polymer formed from covalently 
linked ribonucleotide monomers. See also messenger RNA, 
ribosomal RNA, transfer RNA. (Figure 6-4)


RNA editing Type of RNA processing that alters the 
nucleotide sequence of an RNA transcript after it is synthesized 
by inserting, deleting, or altering individual nucleotides. 


RNA interference (RNAi) As originally described, mechanism 
by which an experimentally introduced double-stranded RNA 
induces sequence-specific destruction of complementary 
mRNAs. The term RNAi is often used to include the inhibition 
of gene expression by microRNAs (miRNAs) and piwi RNAs 
(piRNAs), which are encoded in the cell’s own genome.


RNA polymerase Enzyme that catalyzes the synthesis of 
an RNA molecule on a DNA template from ribonucleoside 
triphosphate precursors. (Figure 6-9) 


RNA primer Short stretch of RNA synthesized on a DNA 
template. It is required by DNA polymerases to start their DNA 
synthesis. 


RNA processing control Regulation by a cell of gene 
expression by controlling the processing of RNA transcripts, 
which includes their splicing. 


RNA splicing Process in which intron sequences are 
excised from RNA transcripts. A major process in the nucleus 
of eukaryotic cells leading to formation of messenger RNAs 
(mRNAs). 


RNA transport and localization control Regulation by a 
cell of gene expression by selecting which completed mRNAs 
are exported from the nucleus to the cytosol and determining 
where in the cytosol they are localized. 


RNA world Hypothesis that early life on Earth was based 
primarily on RNA molecules that both stored genetic 
information and catalyzed biochemical reactions. 


RNA-seq Sequencing the entire repertoire of RNA from a cell 
or tissue; also known as deep RNA sequencing. 


robustness The ability of biological regulatory systems to 
function normally in the face of perturbations such as exposure 
to frequent and/or extreme variations in external conditions or 
the concentrations or activities of key components. 


rod photoreceptor (rod) Photoreceptor cell in the vertebrate 
retina that is responsible for noncolor vision in dim light. 


rough endoplasmic reticulum (rough ER) Endoplasmic 
reticulum with ribosomes on its cytosolic surface. Involved in 
the synthesis of secreted and membrane-bound proteins. 


rRNA gene Gene that specifies a ribosomal RNA (rRNA). 


ryanodine receptor A regulated Ca2+ channel in the ER
membrane that opens in response to rising Ca2+ levels and
thus amplifies the Ca2+ signal.


SAM complex Protein translocator that helps β-barrel
proteins to fold properly in the outer mitochondrial membrane. 


Sanger sequencing see dideoxy sequencing 


Sar1 protein Monomeric GTPase responsible for regulating
COPII coat assembly at the endoplasmic reticulum membrane. 


sarcoma Cancer of connective tissue. 


scaffold protein Protein that binds groups of intracellular 
signaling proteins into a signaling complex, often anchoring the 
complex at a specific location in the cell. (Figure 15-10)


scanning electron microscope Type of electron microscope 
that produces an image of the surface of an object. 
(Figure 9-50) 


S-Cdk Cyclin-Cdk complex formed in vertebrate cells by an
S-cyclin and the corresponding cyclin-dependent kinase (Cdk). 
(Figure 17-11 and Table 17-1, p. 969) 


SCF Family of ubiquitin ligases formed as a complex of 
several different proteins. One is involved in regulating the 
eukaryotic cell cycle, directing the destruction of inhibitors of 
S-Cdks in late G1 and thus promoting the activation of S-Cdks
and DNA replication. (Figures 3-71 and 17-15) 


Schwann cell Glial cell responsible for forming myelin 
sheaths in the peripheral nervous system. Compare 
oligodendrocyte. (Figure 11-33) 


S-cyclin Member of a class of cyclins that accumulate during 
late G1 phase and bind Cdks soon after progression through
Start; they help stimulate DNA replication and chromosome 
duplication. Levels remain high until late mitosis, after which 
these cyclins are destroyed. (Figure 17-11) 


Sec61 complex Three-subunit core of the protein 
translocator that transfers polypeptide chains across the 
endoplasmic reticulum membrane. 


second messenger (small intracellular mediator) Small 
intracellular signaling molecule that is formed or released for 
action in response to an extracellular signal and helps to relay 
the signal within the cell. Examples include cyclic AMP, cyclic 
GMP, IP3, Ca2+, and diacylglycerol.


secondary cell wall Permanent rigid cell wall that is laid 
down underneath the thin primary cell wall in certain plant cells 
that have completed their growth. 


secondary Ig repertoire Immunoglobulins produced
by B cells after antigen- and helper-T-cell-induced somatic
hypermutation and class switching. Compared to the primary Ig
repertoire, these Igs have a greatly increased diversity of both
Ig classes and antigen-binding sites and have increased affinity
for antigen.


secondary immune response The adaptive immune 
response that occurs in response to a second or subsequent 
exposure to an antigen. The response is more rapid in onset 
and stronger than the primary immune response. (Figure 24-16)


secondary structure Regular local folding pattern of a 
polymeric molecule; in proteins, α helices and
β sheets.


secretion system Specialized bacterial systems that secrete 
effector proteins that interact with host cells. 


secretory vesicle Membrane-enclosed organelle in which 
molecules destined for secretion are stored prior to release. 
Sometimes called secretory granule because darkly staining 
contents make the organelle visible as a small solid object. 
(Figure 13-65)


securin Protein that binds to the protease separase and 
thereby prevents its cleavage of the protein linkages that hold 
sister chromatids together in early mitosis. Securin is destroyed 
at the metaphase-to-anaphase transition. (Figure 17-38) 


segment Divisions of an insect body along its anteroposterior 
axis, each forming highly specialized structures, but all built 
according to a similar fundamental plan. 


segment-polarity gene In Drosophila development, a gene 
involved in specifying the anteroposterior organization of each 
body segment. (Figure 21-19)


segmentation clock The gene-expression oscillator
controlling regular segmentation during vertebrate embryonic
development.


segmentation genes Genes expressed by subsets of cells 
in the embryo that refine the pattern of gene expression so as 
to define the boundaries and ground plan of the individual body 
segments. 


selectin Member of a family of cell-surface carbohydrate- 
binding proteins that mediate transient, Ca2+-dependent
cell-cell adhesion in the bloodstream, for example between
white blood cells and the endothelium of the blood vessel wall. 
(Figure 19-28) 


selectivity filter The part of an ion channel structure that 
determines which ions it can transport. (Figures 11-24 and 
11-25) 


sensory bristles Miniature sense organs present on most 
exposed surfaces of Drosophila, consisting of a sensory neuron 
and supporting cells and responding to chemical or mechanical 
stimuli.


separase Protease that cleaves the cohesin protein linkages 
that hold sister chromatids together. Acts at anaphase, 
enabling chromatid separation and segregation. (Figure 17-38) 


septum Structure formed during bacterial cell division by the 
inward growth of the cell wall and plasma membrane and that 
divides the cell into two. 


sequential induction Development process that generates
a progressively more complicated pattern. A series of
local inductions whereby one of two cell types present in a
developing tissue can produce a signal to induce neighboring
cells to specialize in a third way; the third cell type can then
signal back to the other two cell types nearby to generate a
fourth and a fifth cell type, and so on.


serine protease Type of protease that has a reactive serine in
the active site. (Figures 3-12 and 3-39) 


serine/threonine kinase Enzyme that phosphorylates 
specific proteins on serines or threonines.


SH2 domain Src homology region 2, a protein domain 
present in many signaling proteins. Binds a short amino acid 
sequence containing a phosphotyrosine. (Panel 3-2,
pp. 142-143)


side chain The part of an amino acid that differs between 
amino acid types. The side chains give each type of amino acid 
its unique physical and chemical properties. (Panel 3-1,
pp. 112-113)


signal patch Protein-sorting signal that consists of a specific 
three-dimensional arrangement of atoms on the folded protein’s 
surface. (Figure 13-46) 


signal peptidase Enzyme that removes a terminal signal 
sequence from a protein once the sorting process is complete. 
(Figure 12-35) 


signal sequence Short continuous sequence of amino acids 
that determines the eventual location of a protein in the cell. An 
example is the N-terminal sequence of 20 or so amino acids 
that directs nascent secretory and transmembrane proteins to 
the endoplasmic reticulum. (Table 12-3, p. 648) 


signal-recognition particle (SRP) Ribonucleoprotein particle 
that binds an ER signal sequence on a partially synthesized
polypeptide chain and directs the polypeptide and its attached 
ribosome to the endoplasmic reticulum. (Figure 12-36)


signaling center Cluster of specialized cells in developing
tissues that serves as a source of developmental signals, for
example, the generation of a morphogen gradient.


single-nucleotide polymorphism (SNP) A variation between 
individuals in a population due to a relatively common difference 
in a specific nucleotide at a defined point in the DNA sequence. 


single-particle reconstruction Computational procedure 
in electron microscopy in which images of many identical 
molecules are obtained and digitally combined to produce an 
averaged three-dimensional image, thereby revealing structural 
details that are hidden by noise in the original images.
(Figures 9-54 and 9-55)


single-pass transmembrane protein Membrane protein in 
which the polypeptide chain crosses the lipid bilayer only once. 
(Figure 10-24) 


single-strand DNA-binding (SSB) protein Protein that 
binds to the single strands of the opened-up DNA double helix, 
preventing helical structures from reforming while the DNA is 
being replicated. (Figure 5-15) 


sister chromatids Tightly linked pair of chromosomes that 
arise from chromosome duplication during S phase. They 
separate during M phase and segregate into different daughter 
cells. (Figure 17-21) 


sliding clamp Protein complex that holds the DNA 
polymerase on DNA during DNA replication. (Figure 5-17) 


Slit Signal protein, secreted by cells of the neural tube floor 
plate, responsible for repelling the growth cones of commissural 
axons after they have crossed the midline, thereby ensuring 
these neurons do not re-cross the midline. 


Smad family Latent transcription regulators that are 
phosphorylated and activated by receptor serine/threonine 
kinases and carry the signal from the cell surface to the 
nucleus. (Figure 15-57) 


small interfering RNAs (siRNAs) Short (21-26 nucleotide) 
double-stranded RNAs that inhibit gene expression by directing 
destruction of complementary mRNAs. Production of siRNAs is 
usually triggered by exogenously introduced double-stranded 
RNA. (Figure 7-77) 


small nuclear RNA (snRNA) Small RNA molecules that are 
complexed with proteins to form the ribonucleoprotein particles 
(snRNPs) involved in RNA splicing. (Figures 6-28 and 6-29)


small nucleolar RNA (snoRNA) Small RNAs found in 
the nucleolus, with various functions, including guiding the 
modifications of precursor rRNA. (Table 6-1, p. 305, and 
Figure 6-41) 


smooth endoplasmic reticulum (smooth ER) Region of the
endoplasmic reticulum not associated with ribosomes. Involved
in detoxification reactions, Ca2+ storage, and lipid synthesis.
(Figure 12-33) 


Smoothened Seven-pass transmembrane protein with a 
structure very similar to a GPCR but that does not seem to act
as a Hedgehog receptor or as an activator of G proteins; it is 
controlled by the Patched and iHog proteins. 


SNARE proteins (SNAREs) Members of a large family of 
transmembrane proteins present in organelle membranes 
and the vesicles derived from them. SNAREs catalyze the 
many membrane fusion events in cells. They exist in pairs—a 
v-SNARE in the vesicle membrane that binds specifically to a 
complementary t-SNARE in the target membrane. 


sodium dodecyl sulfate polyacrylamide-gel
electrophoresis (SDS-PAGE) Type of electrophoresis
used to separate proteins by size. The protein mixture
to be separated is first treated with a powerful negatively
charged detergent (SDS) and with a reducing agent
(β-mercaptoethanol), before being run through a polyacrylamide
gel. The detergent and reducing agent unfold the proteins, free 
them from association with other molecules, and separate the 
polypeptide subunits. 


somatic cell Any cell of a plant or animal other than cells of 
the germ line. From Greek soma, body. 


somatic hypermutation In immunology: accumulation
of point mutations in the assembled variable-region-coding
sequences of immunoglobulin genes that occurs when B cells 
are activated to form memory cells. Results in the production 
of antibodies with altered antigen-binding sites, some of which 
bind antigen with increased affinity; it is responsible for affinity 
maturation in antibody responses. 


somatic mutations In cancer, one or more detectable 
abnormalities in the DNA sequence of tumor cells that 
distinguish them from the normal somatic cells surrounding the 
tumor. 


somite One of a series of paired blocks of mesoderm
that form during early development and lie on either side of
the notochord in a vertebrate embryo. They give rise to the 
segments of the body axis, including the vertebrae, muscles, 
and associated connective tissue. (Figure 21-38) 


sorting signal Signal sequence or signal patch that directs 
the delivery of a protein to a specific location, such as a 
particular intracellular compartment. 


spectrin Abundant protein associated with the cytosolic side 
of the plasma membrane in red blood cells, forming a network 
that supports the membrane. Also present in other cells.
(Figure 10-38) 


S phase Period of a eukaryotic cell cycle in which DNA is 
synthesized. (Figure 17-4) 


spinal cord Bundle of neurons and support cells that extends 
from the brain. 


spindle assembly checkpoint Regulatory system that 
operates during mitosis to ensure that all chromosomes
are properly attached to the spindle before sister-chromatid
separation starts. (Figure 17-9 and Panel 17-1, pp. 980-981) 


spliceosome Large assembly of RNA and protein molecules 
that performs pre-mRNA splicing in eukaryotic cells. 


Src (Src protein family) Family of cytoplasmic tyrosine 
kinases (pronounced “sark”) that associate with the cytoplasmic 
domains of some enzyme-linked cell-surface receptors
(for example, the T cell antigen receptor) that lack intrinsic
tyrosine kinase activity. They transmit a signal onward by
phosphorylating the receptor itself and specific intracellular
signaling proteins on tyrosines. (Figure 3-10)


SRP (signal-recognition particle) receptor Component 
in the endoplasmic reticulum (ER) membrane that guides the 
signal recognition particle to the ER membrane. 


starch Polysaccharide composed exclusively of glucose units, 
used as an energy-storage material in plant cells. (Figure 2-51) 


Start (restriction point) Important transition at the end of
G1 in the eukaryotic cell cycle. Passage through Start commits
the cell to enter S phase. The term was originally used for this
point in the yeast cell cycle only; the equivalent point in the
mammalian cell cycle was called the restriction point. In this
book we use Start for both. (Figure 17-9)


start-transfer signal Short amino acid sequence that 
enables a polypeptide chain to start being translocated across 
the endoplasmic reticulum membrane through a protein 
translocator. Multipass membrane proteins sometimes have 
both N-terminal (signal sequence) and internal start-transfer 
signals. (Figure 12-42)


STAT (signal transducer and activator of transcription) 
Latent transcription regulator that is activated by 
phosphorylation by Janus kinases (JAKs) and enters the 
nucleus in response to signaling from receptors of the cytokine 
receptor family. (Figure 15-56) 


stem cell Undifferentiated cell that can continue dividing 
indefinitely, throwing off daughter cells that can either commit 
to differentiation or remain a stem cell (in the process of self- 
renewal). (Figure 22-3) 


stem-cell niche The specialized microenvironment in a tissue
in which self-renewing stem cells can be maintained. 


steroid hormones Hormones, including cortisol, estrogen, 
and testosterone, that are hydrophobic lipid molecules derived 
from cholesterol that activate intracellular nuclear receptors. 


stimulatory G protein (Gs) G protein that, when activated, 
activates the enzyme adenylyl cyclase and thus stimulates 
the production of cyclic AMP. See also G protein.
(Table 15-8, p. 846)


stochastic Random. Involving chance, probability, or random 
variables. 


stop-transfer signal Hydrophobic amino acid sequence 
that halts translocation of a polypeptide chain through the 
endoplasmic reticulum membrane, thus anchoring the protein 
chain in the membrane. (Figure 12-42) 


strand exchange Reaction in which one of the single-strand 
3’ ends from one duplex DNA molecule penetrates another 
duplex and searches it for homologous sequences through 
base-pairing. Also called strand invasion. 


strand-directed mismatch repair A proofreading system 
that removes DNA replication errors missed by the DNA 
polymerase proofreading exonuclease. Detects the potential for 
DNA helix distortion from noncomplementary base pairs then 
recognizes and excises the mismatch in the newly synthesized 
strand and resynthesizes the excised segment using the old 
strand as a template. 


stress fibers Cortical fibers of contractile actin-myosin II 
bundles that connect the cell to the extracellular matrix or 
adjacent cells through focal adhesions or a circumferential belt 
and adherens junctions. 


stroma (1) “Bedding”: the connective tissue in which a 
glandular or other epithelium is embedded. Stromal cells 
provide the environment necessary for the development of 
other cells within the tissue. (2) The large interior space of a 
chloroplast, containing enzymes that incorporate CO2 into
sugars. (Figure 14-38) 


substrate Molecule on which an enzyme acts. 


superresolution Describes several approaches in light 
microscopy that bypass the limit imposed by the diffraction of 
light and successfully allow objects as small as 20 nm to be 
imaged and clearly resolved. 


survival factor Extracellular signal that promotes cell survival 
by inhibiting apoptosis. (Figure 18-12)


symporter Carrier protein that transports two types of solute
across the membrane in the same direction. (Figure 11-8) 


synapse Communicating cell-cell junction that allows signals 
to pass from a nerve cell to another cell. In a chemical synapse, 
the signal is carried by a diffusible neurotransmitter. (Figure 
19-22) In an electrical synapse, a direct connection is made 
between the cytoplasms of the two cells via gap junctions. 
(Figures 11-34 and 19-23)


synapse elimination Process by which each muscle cell 
at first receives synapses from several motor neurons, but is 
ultimately left innervated by only one. 


synaptic plasticity Changes in the strength with which
a chemical synapse transmits a signal. It is thought to be
important in memory formation, where concentrations of 
postsynaptic AMPA receptor are modulated in response to a 
synapse’s activity. 


synaptic signaling Intercellular signaling performed by 
neurons that transmit signals electrically along their axons and 
release neurotransmitters at synapses, which are often located 
far away from the neuronal cell body. 


synaptic vesicle Small neurotransmitter-filled secretory 
vesicle found at the axon terminals of nerve cells. Its contents 
are released into the synaptic cleft by exocytosis when an 
action potential reaches the axon terminal. 


synaptonemal complex Structure that holds paired 
homologous chromosomes tightly together in pachytene of 
prophase I in meiosis and promotes the final steps of crossing-
over. (Figures 17-55 and 17-56)


syncytium Mass of cytoplasm containing many nuclei 
enclosed by a single plasma membrane. Typically the result 
either of cell fusion or of a series of incomplete division cycles in 
which the nuclei divide but the cell does not. 


TATA box Sequence in the promoter region of many 
eukaryotic genes that binds a general transcription factor 
(TFIID) and hence specifies the position at which transcription is 
initiated. (Figure 6-14) 


T cell receptor (TCR) Transmembrane receptor for 
antigen on the surface of T lymphocytes, consisting of an 
immunoglobulin-like heterodimer. (Figure 24-32) 


T-cell-mediated immune response Any adaptive immune 
response mediated by antigen-specific T cells. 


telomerase Enzyme that elongates telomere sequences in 
DNA, which occur at the ends of eukaryotic chromosomes. 


telomere End of a chromosome, associated with a 
characteristic DNA sequence that is replicated in a special way. 
Counteracts the tendency of the chromosome otherwise to 
shorten with each round of replication. From Greek telos, end, 
and meros, portion. 


telophase Final stage of mitosis in which the two sets of 
separated chromosomes decondense and become enclosed 
by nuclear envelopes. (Panel 17-1, pp. 980-981) 


template Single strand of DNA or RNA whose nucleotide 
sequence acts as a guide for the synthesis of a complementary 
strand. (Figure 1-3)


terminal differentiation The limit of cell determination when
a cell forms one of the highly specialized cell types of the adult
body. 


terminally differentiated A cell at the limit of cell 
determination, being one of the highly specialized cell types of 
the adult body. 


terminator Signal in bacterial DNA that halts transcription; 
in eukaryotes, transcription terminates after cleavage and 
polyadenylation of the newly synthesized RNA. 


tertiary structure Complex three-dimensional form of a 
folded polymer chain, especially a protein or RNA molecule. 


TFH cell see follicular helper T cell


TH1 cell A type of effector helper T cell that secretes
interferon-γ to help activate macrophages and induces B cells
to switch the class of antibody they make. (Figure 24-44) 


TH17 cell A type of effector helper T cell that secretes IL17,
which recruits neutrophils and stimulates an inflammatory
response. (Figure 24-44)


TH2 cell A type of effector helper T cell that helps activate 
B cells to produce antibodies, to undergo somatic 
hypermutation, and switch the class of immunoglobulin 
produced. (Figure 24-44)


thylakoid Flattened sac of membrane in a chloroplast that 
contains chlorophyll and other pigments and carries out the 
light-trapping reactions of photosynthesis. Stacks of thylakoids 
form the grana of chloroplasts. (Figures 14-35 and 14-36) 


thylakoid membrane Chloroplast membrane system 
that contains the large membrane protein complexes for 
photosynthesis and photophosphorylation. 


thymocytes Developing T cells in the thymus. 


tight junction Cell-cell junction that seals adjacent epithelial 
cells together, preventing the passage of most dissolved 
molecules from one side of the epithelial sheet to the other. 
(Figures 19-2 and 19-21) 


TIM complexes Protein translocators in the mitochondrial 
inner membrane. The TIM23 complex mediates the transport of 
proteins into the matrix and the insertion of some proteins into 
the inner membrane; the TIM22 complex mediates the insertion 
of a subgroup of proteins into the inner membrane.
(Figure 12-21)


Toll A transmembrane receptor protein. On the ventral side 
of the Drosophila egg membrane, its activation controls the 
distribution of Dorsal, a transcription regulator of the NFκB
family. 


Toll-like receptors (TLRs) Family of pattern recognition 
receptors (PRRs) on or in cells of the innate immune system. 
They recognize pathogen-associated immunostimulants 
(PAMPs) associated with microbes. (Figure 24-4) 


TOM complex Multisubunit protein complex that transports 
proteins across the mitochondrial outer membrane. 
(Figure 12-21) 


TOR Large, serine/threonine protein kinase that is activated 
by the PI-3-kinase–Akt signaling pathway and promotes cell
growth. 


totipotent Describes a cell that is able to give rise to all the 
different cell types in an organism. 


trans face Face on the other (far) side.


trans Golgi network (TGN) Network of interconnected
tubular and cisternal structures closely associated with the
trans face of the Golgi apparatus and the compartment from 
which proteins and lipids exit the Golgi, bound for the cell 
surface or another compartment. 


transcellular transport Transport of solutes, such as 
nutrients, across an epithelium, by means of membrane 
transport proteins in the apical and basal faces of the epithelial 
cells. (Figure 11-11) 


transcription (DNA transcription) Copying of one strand of 
DNA into a complementary RNA sequence by the enzyme RNA 
polymerase. (Figures 6-1 and 6-8) 


transcription regulators General name for any protein that 
binds to a specific DNA sequence (known as a cis-regulatory 
sequence) to influence the transcription of a gene. 


transcriptional control Regulation by a cell of gene 
expression by controlling when and how often a given gene is 
transcribed. 


transcytosis Uptake of material at one face of a cell by 
endocytosis, its transfer across a cell in vesicles, and discharge 
from another face by exocytosis. (Figure 13-58) 


transfer RNA (tRNA) Set of small RNA molecules used in 
protein synthesis as an interface (adaptor) between mRNA and 
amino acids. Each type of tRNA molecule is covalently linked to 
a particular amino acid. (Figure 6-50) 


transferrin receptor Cell-surface receptor for transferrin (a 
soluble protein that carries iron) that delivers iron to the cell 
interior via receptor-mediated endocytosis and recycling of the 
receptor–transferrin complex.


transformed A cell with an altered phenotype that behaves 
in many ways like a cancer cell (i.e., unregulated proliferation, 
anchorage-independent growth in culture). 


transforming growth factor-β superfamily (TGFβ
superfamily) Large family of structurally related secreted
proteins that act as hormones and local mediators to
control a wide range of functions in animals, including
during development. It includes the TGFβ/activin and bone
morphogenetic protein (BMP) subfamilies. (Figure 15-57)


transgene The foreign or modified gene that has been added 
to create a transgenic organism. 


transgenic organism Plant or animal that has stably 
incorporated one or more genes from another cell or organism 
(through insertion, deletion, and/or replacement) and can pass 
them on to successive generations. (Figures 8-53 and 8-70) 


transit amplifying cell Cell derived from a stem cell 
that divides a limited number of times before terminally 
differentiating. 


transition state Structure that forms transiently in the course 
of a chemical reaction and has the highest free energy of any 
reaction intermediate. Its formation is a rate-limiting step in the 
reaction. (Figure 3-47) 


translation (RNA translation) Process by which the 
sequence of nucleotides in an mRNA molecule directs
the incorporation of amino acids into protein. Occurs on a 
ribosome. (Figures 6-1 and 6-64) 


translational control Regulation by a cell of gene expression 
by selecting which mRNAs in the cytoplasm are translated by 
ribosomes. 


translocon The assembly of a translocator associated with 
other membrane complexes, such as enzymes that modify the 
growing polypeptide chain. 


transmembrane adhesion proteins Cytoskeleton-linked
transmembrane molecules with one end linking to the
cytoskeleton inside the cell and the other end linking to other 
structures outside it. 


transmembrane protein Membrane protein that extends 
through the lipid bilayer, with part of its mass on either side of 
the membrane. (Figure 10-17) 


transmitter-gated ion channel (ion-channel-coupled 
receptor, ionotropic receptor) Ion channel found at
chemical synapses in the postsynaptic plasma membranes of 
nerve and muscle cells. Opens only in response to the binding 
of a specific extracellular neurotransmitter. The resulting inflow 
of ions leads to the generation of a local electrical signal in the 
postsynaptic cell. (Figures 11-36 and 15-6) 


transport vesicle Membrane-enclosed transport containers 
that bud from specialized coated regions of donor membrane 
and pass from one cell compartment to another as part of the 
cell’s membrane transport processes; vesicles can be spherical, 
tubular, or irregularly shaped. 


transporter (carrier protein, permease) Membrane 
transport protein that binds to a solute and transports it across 
the membrane by undergoing a series of conformational 
changes. Transporters can transport ions or molecules passively 
down an electrochemical gradient or can link the conformational 
changes to a source of metabolic energy such as ATP 
hydrolysis to drive active transport. Compare channel protein. 
See also membrane transport protein. (Figure 11-3) 


transposable element (transposon) Segment of DNA 
that can move from one genome position to another by 
transposition. (Table 5-4, p. 288) 


transposition (transpositional recombination) Movement 
of a DNA sequence from one genome site to another. 
(Table 5-4, p. 288) 


transposon see transposable element 


treadmilling Process by which a polymeric protein filament is 
maintained at constant length by addition of protein subunits at 
one end and loss of subunits at the other. (Panel 16-2,
pp. 902-903)


trimeric GTP-binding protein see G protein 


Trithorax group Set of proteins critical for cell memory 
that maintains the transcription of Hox genes in cells where 
transcription has already been switched on. 


t-SNAREs Transmembrane SNARE protein, usually
composed of three proteins and found on target membranes 
where it interacts with v-SNAREs on vesicle membranes. 


tubulin The protein subunit of microtubules. (Panel 16-1, 
p. 891, and Figure 16-42) 


γ-tubulin ring complex (γ-TuRC) Protein complex
containing γ-tubulin and other proteins that is an efficient
nucleator of microtubules and caps their minus ends. 


tumor progression Process by which an initial mildly 
disordered cell behavior gradually evolves into a full-blown 
cancer. (Figures 20-8 and 20-9) 


tumor suppressor gene Gene that appears to help prevent 
formation of a cancer. Loss-of-function mutations in such 
genes favor the development of cancer. (Figure 20-17) 


tumor virus Virus that can help make the cell it infects 
cancerous.


turgor pressure Large hydrostatic pressure developed inside
a plant cell as the result of the intake of water by osmosis; it is
the force driving cell expansion in plant growth and it maintains 
the rigidity of plant stems and leaves. 


two-dimensional gel electrophoresis Technique combining 
two different separation procedures: separation by charge
(isoelectric focusing) in the first dimension, then separation by
size in a direction at a right angle to that of the first step. It can
resolve up to 2000 proteins in the form of a two-dimensional
protein map.


type III fibronectin repeat The major repeat domain in
fibronectin, it is about 90 amino acids long and occurs at 
least 15 times in each subunit. The repeat is among the most 
common of all protein domains in vertebrates. 


type III secretion system One of several secretion systems
in Gram-negative bacteria; delivers effector proteins into host
cells in a contact-dependent manner. (Figure 23-7) 


type IV collagen An essential component of mature basal 
laminae consisting of three long protein chains twisted into a 
ropelike superhelix with multiple bends. Separate molecules 
assemble into a flexible, felt-like network that gives the basal 
lamina tensile strength. 


tyrosine kinase Enzyme that phosphorylates specific 
proteins on tyrosines. See also cytoplasmic tyrosine kinase. 


tyrosine-kinase-associated receptor Cell-surface receptor 
that functions similarly to RTKs, except that the kinase domain 
is encoded by a separate gene and is noncovalently associated 
with the receptor polypeptide chain. 


ubiquitin Small, highly conserved protein present in all 
eukaryotic cells that becomes covalently attached to lysines of 
other proteins. Attachment of a short chain of ubiquitins to such 
a lysine can tag a protein for intracellular proteolytic destruction 
by a proteasome. (Figure 3-69) 


ubiquitin ligase Any one of a large number of enzymes that 
attach ubiquitin to a protein, often marking it for destruction in 
a proteasome. The process catalyzed by a ubiquitin ligase is 
called ubiquitylation. (Figure 3-71) 


unfolded protein response Cellular response triggered 
by an accumulation of misfolded proteins in the endoplasmic 
reticulum. Involves expansion of the ER and increased 
transcription of genes that code for endoplasmic reticulum 
chaperones and degradative enzymes. (Figure 12-51) 


uniporter Carrier protein that transports a single solute from 
one side of the membrane to the other. (Figure 11-8) 


V(D)J recombination Somatic recombination process by 
which gene segments are brought together to form a functional 
gene for a polypeptide chain of an immunoglobulin or T cell 
receptor. (Figure 24-28) 


vacuole Large fluid-filled compartment found in most plant 
and fungal cells, typically occupying more than a third of the cell 
volume. (Figure 13-41) 


van der Waals attraction Type of (individually weak) 
noncovalent bond that is formed at close range between 
nonpolar atoms. (Table 2-1, p. 45 and Panel 2-3, pp. 94-95) 


variable region Region of an immunoglobulin or T cell 
receptor polypeptide chain that is the most variable and 
contributes to the antigen-binding site. (Figures 24-25 and 
24-32) 


vascular endothelial growth factor (VEGF) Secreted 
protein that stimulates the growth of blood vessels. 
(Table 15-4, p. 850, and Figure 22-26)


vesicle transport model One hypothesis for how the Golgi
apparatus achieves and maintains its polarized structure and 
how molecules move from one cisterna to another. This model 
holds that Golgi cisternae are long-lived structures that retain 
their characteristic set of Golgi-resident proteins firmly in place, 
and cargo proteins are transported from one cisterna to the 
next by transport vesicles. 


vesicular transport Transport of proteins from one cell 
compartment to another by means of membrane-bounded 
intermediaries such as vesicles or organelle fragments. 


V gene segment A DNA sequence encoding most of
the variable region of an immunoglobulin or T cell receptor
polypeptide chain. There are many different V gene segments,
one of which becomes joined to a D or J gene segment by
somatic recombination when an individual lymphoid progenitor
cell begins to differentiate into a B or T lymphocyte.
(Figure 24-28)


virulence factor Protein, encoded by a virulence gene, that 
contributes to an organism's ability to cause disease. 


virulence gene Gene that contributes to an organism’s ability 
to cause disease. 


virus Particle consisting of nucleic acid (RNA or DNA) 
enclosed in a protein coat and capable of replicating within a 
host cell and spreading from cell to cell. (Figure 23-11) 


virus receptor Molecule on the host cell surface to which 
virus surface proteins bind to enable binding of virus to the cell
surface. 


voltage-gated cation channel Type of ion channel found
in the membranes of electrically excitable cells (such as nerve,
endocrine, egg, and muscle cells). Opens in response to a shift 
in membrane potential past a threshold value. 


voltage-gated K+ channel Ion channel in the membrane of
nerve cells that opens in response to membrane depolarization,
enabling K+ efflux and rapid restoration of the negative
membrane potential. 


voltage-gated Na+ channel Ion channel in the membrane
of nerve and skeletal muscle cells that opens in response to a
stimulus causing sufficient depolarization, allowing Na+ to enter
the cell down its electrochemical gradient.


v-SNAREs Transmembrane SNARE protein, comprising a 
single polypeptide chain, usually found in vesicle membranes 
where it interacts with t-SNAREs in target membranes. 


V-type pumps Turbine-like protein machines constructed 
from multiple different subunits that use the energy of ATP 
hydrolysis to drive transport across a membrane. The V-type 
proton pump transfers H+ into organelles such as lysosomes to
acidify their interior. (Figure 11-12) 


WASp protein Key target of activated Cdc42. Exists in
an inactive folded conformation and an activated open
conformation; association with Cdc42 stabilizes the open form,
enabling binding to the Arp2/3 complex and enhancing
actin-nucleating activity.


Wee1 Protein kinase that inhibits Cdk activity by 
phosphorylating amino acids in the Cdk active site. Important in 
regulating entry into M phase of the cell cycle. 


Western blotting Technique by which proteins are separated 
by electrophoresis and immobilized on a paper sheet and then 
analyzed, usually by means of a labeled antibody. Also called 
immunoblotting. 


white blood cell General name for all the nucleated blood 
cells lacking hemoglobin. Also called leukocytes. Includes 
lymphocytes, granulocytes, and monocytes. (Figure 22-27) 


Wnt protein Member of a family of secreted signal proteins 
that have many different roles in controlling cell differentiation, 
proliferation, and gene expression in animal embryos and adult
tissues. 


Wnt/β-catenin pathway Signaling pathway activated by
binding of a Wnt protein to its cell-surface receptors. The
pathway has several branches. In the major (canonical) branch,
activation causes increased amounts of β-catenin to enter the
nucleus, where it regulates the transcription of genes controlling
cell differentiation and proliferation. Overactivation of the
Wnt/β-catenin pathway can lead to cancer. (Figure 15-60)


X-inactivation Inactivation of one copy of the X chromosome 
in the somatic cells of female mammals. 


X-inactivation center (XIC) Site in an X chromosome at 
which inactivation is initiated and spreads outward. 


x-ray crystallography Technique for determining the three- 
dimensional arrangement of atoms in a molecule based on the 
diffraction pattern of x-rays passing through a crystal of the 
molecule. (Figure 8-21) 


zygote Diploid cell produced by fusion of a male and female 
gamete. A fertilized egg.


A-kinase see cyclic-AMP-dependent
protein kinase 
A-P (anteroposterior) axis 1162-1164, 
1169-1170 
A-to-I editing (adenine to inosine) 418
A-type lamins 948 
A-V (animal-vegetal) axis 1155-1156, 
1167-1168 
AAA proteins 358, 359F, 709 
ABC (ATP-binding cassette) transporters 
18F, 163-164, 606, 609-611, 
1139 
Abcb71 gene 1139 
Abl gene, in Bcr-Abl hybrid 1135
abortive initiation 306 
absorptive cells, intestinal 1218, 1219F, 
1221, 1223-1224, 1225F 
ABT-737 1032F 
Acanthamoeba (A. castellanii) 801F, 805F,
924 
accessory proteins 
actin accessory proteins 899, 
904-906, 908F, 909-911, 913, 
921, 958 
cytoskeletal filament assembly and 
889, 894-896 
intermediate filaments 946 
membrane transport 655, 677 
microtubules 929-930, 941 
motor proteins as 889 
accumulation delays 1176, 1178 
acetyl CoA (acetylcoenzyme A) 
from β-oxidation 667
in the citric acid cycle 106, 758-760 
in glycolysis 75 
from oxidation of fats 81-82, 83F 
pyruvate conversion to 75, 81-82 
structure and biosynthetic role 69 
acetylation see histones; lysine; N-termini 
acetylcholine 
as an excitatory neurotransmitter 629 
effects on different target cells 
816-817, 837T, 843 
effects on nitric oxide synthesis 846
GPCRs activated by 832
structure 817F 
acetylcholine receptors 
as drug targets 632 
as muscarinic or nicotinic 843 
at neuromuscular junctions 630-631, 
633, 1072F, 1210F 
acetylcholinesterase 143, 630 
N-acetylgalactosamine 718, 1058, 1060F 
N-acetylglucosamine (GlcNAc)
in GAGs 1058
GlcNAc phosphotransferase defects
728-729 
in Golgi apparatus and ER 716, 
717-718F, 720 
in mammalian glycoproteins 717F 
Achaete/Scute family 1171-1172 
achondroplastic dwarfism 1196 
acid catalysis 144, 145F 
acidic amino acid side chains 113 
acidification, in secretory vesicles 743 
acidity, of lysosomes 722-723 
acids, defined 46, 93, 763-764 
aconitase 106, 427 
ACTH (adrenocorticotrophic hormone) 
744F, 835T 
actin 
and actin-binding proteins 898-914 
bacterial homologs 896-897 
chemical inhibitors 904T 
cross-linking proteins 911F 
F-actin and G-actin 898 
monomer availability and 906 
myosin and, in contractile rings 
996-997 
myosin and, in muscle contraction 
916-920 
myosin as associated motor protein 
890 
in non-muscle cells 923-925 
polymerization in cell migration 951 
polymerization in vitro 899, 900F 
separation from samples 440 
actin-binding proteins 904-906, 907-909, 
1079 


actin depolymerizing factor (cofilin) 905, 
910, 914, 954, 957, 958F 
actin filaments 
in adherens junctions 1036, 1042 
arrays 911-913 
“arrowhead” conformation 898, 899F, 
911F, 913F 
bacterial effectors and 1281 
bundles 912F 
capping 903, 909, 910F 
in the cell cortex 592 
confocal microscopy 541F 
D- and T-forms 901, 904F, 910, 927, 
954 
half-lives 896-897, 906, 919 
helical structure 124, 899F 
integrin links to 1075 
negative staining 559-561 
neuronal membranes 591 
nucleation in formation of 899-900, 
902, 906-907, 908-909F, 953, 
954F, 1289F 
and plant cell walls 1087 
plus and minus ends 898, 900, 902, 
904F 
treadmilling 901, 903-904, 953-954 
tube and vesicle formation 1192 
underlying the plasma membrane 890 
visualizing with TIRF 548 
actin gene 371F, 923 
actin mRNA 422 
actin polymerization by pathogens 1278, 
1281-1282, 1287-1288, 1289F 
actin-related proteins see Arp2/3 complex 
actin-severing proteins 909-910 
actin subunits 
assembly 898-899 
as asymmetrical 894 
actin webs 908F, 953F, 959 
α-actinin 905, 911-912, 919, 920F, 1079
action potentials 
firing frequency and distance 634, 
636 
firing frequency and PSPs 633 
gap junctions and 1051
in muscle contraction 920
plasma membrane depolarization by 
607, 621-624, 627-629, 632, 
634-636 
propagation 624F, 625 
activated amino acids 338F 
activated carriers 
carboxylated biotin 69T, 70F 
coenzyme A as 68-69
coupling with favorable reactions 
64-65 
energy storage by 63-64 
FADH2 as electron carrier 69, 83
NADH and NADPH as electron carriers
67-68
and their functions 69T 
see also ATP; GTP
activation, enzyme-coupled receptors and
GPCRs 850-851, 852F
activation energies 
enzyme action and 57-58, 141-145 
of glycolysis 74 
activator proteins see transcription 
activators 
active sites 
elastase and chymotrypsin 119F 
in enzyme activity 59, 136, 147 
lysozyme 145, 146F 
and regulatory sites in allostery 151 
see also binding sites 
active transport 
ion-concentration gradients 601-604 
primary and secondary active 
transport 602 
three methods 601 
transporters and 600-611 
activity-dependent synaptic change 
1211-1212 
acute lymphocytic leukemias 1117 
acyl transferases 689 
adaptation 
of neurons to prolonged stimulation 
635 
of system response to signals 825, 
830-831, 848-849 
in the visual system 846 
adaptive immune system 
antigenic variation and 1290-1291 
immunological self-tolerance 
1313-1315 
overview 1307-1315 
pathogen selection through 1290 
two classes of response 1298F 
see also B cells; T cells 
adaptor molecules 334, 338F, 341 
adaptor proteins 
actin-binding 956F 
for cadherins 1044 
catenins as 1042 
clathrin coated vesicles 698-699 
Grb2 824F, 855, 862F 
interaction domains as 823 
nuclear import receptors 652 
SH2 and SH3 domains in 854 
talin as 1075, 1080 
vesicle transport model 721 
ADARs (adenosine deaminases acting on 
RNA) 418 
adducin 592F 
adenine 
1-methyl- 271
deamination to hypoxanthine 271,
272F 
DNA base pairing with thymine 176, 
177F 
structure 100 
adenocarcinomas, as malignant 1092, 
1093F 
adenomas, as benign 1092, 1093F, 1220 
adenomatous polyps 1123 
adenosine deamination 335, 336, 337F 
S-adenosylmethionine 
as an activated carrier 69T 
DNA methylation damage from 267T, 
268 
transcription regulator binding 377 
adenovirus 1273T, 1274F, 1280F, 1281, 
1289 
adenylyl cyclase 834, 836, 843-844,
846T, 848, 1278 
adherens junctions 
actin filaments in 1036, 1042, 1192 
assembly 1043F 
classical cadherins and 1037T 
in epithelial cells 893F, 924, 1036F 
response to force 1042-1043 
tissue remodeling 1043-1045 
adhesins 1277-1278 
adhesion belt 1036F, 1044, 1045F 
adipocytes 573 
adjuvants 1307 
ADP/ATP carrier protein 779, 780F 
adrenaline, GPCR mediated effects 827, 
832, 835T 
adrenoleukodystrophy 300F 
Aequorea victoria 543, 547 
aequorin 547, 839 
aerobic bacteria and mitochondria 25 
aerobic lithotrophs 11 
aerobic respiration 55 
Aeropyrum pernix 21T 
affinity chromatography 448-451, 459, 
484 


affinity maturation 1320-1321 
aflatoxins 1128-1129 
AFM (atomic force microscopy) 307F, 
481, 548-549, 587F, 895, 913 
Africa, human origins 232 
agarose, in gel electrophoresis 465-466 
age 
and cancer incidence 1094-1095, 
1111 
and DNA repair errors 274 
and mitochondrial mutations 808 
see also premature aging 
agent-based simulations 524 
aggrecan 1058F, 1060, 1061F 
agrin signal protein 1072, 1210 
Agrobacterium 508F 
AID (activation-induced deaminase) 
1322-1323 
AIDS (acquired immune deficiency 
syndrome) 
mortality 1263 
see also HIV 
AIRE (autoimmune regulator) 1333 
AKAPs (A-kinase anchoring proteins) 835 


Akt protein kinase 860, 1030F, 1114-1115 


alanine, structure 113 
albinism 490F, 729, 1186 
alcohol detoxification 667 
aldolase 104 


aldoses and ketoses 98 
Alexa dyes 537, 538F 
algae, as eukaryotes 30 
alkalis see bases 
alkylation lesions see methylation 
all-or-none responses 827-829, 830F, 857 
alleles, defined 286, 486, 490 
allergic reactions 1317 
allostery 
activation of Ca2+/calmodulin 841
allosteric enzymes 151-153 
EF-Tu conformational change 
160-161 
in GPCR kinases 848-849 
inducing protein activation 828F 
inducing protein degradation 360 
integrin activation by 1077 
in ion channels 618, 825 
membrane transport proteins 
163-164 
in motor proteins and protein 
machines 162, 164 
in proton pumps 773-774 
and second messengers 848 
tryptophan repressor 381 
α and β subunits, ATP synthase 777
α2 protein 120
alpha satellite DNA 203, 204
α-chains, collagen 1061-1062, 1063T
α-chains, integrins 1075, 1078
α-chains, laminins 1070F
α-helices
in bacteriorhodopsin 586-587 
discovery and description 115-116 
in the histone fold 188 
in intermediate filaments 945 
ion channels 617-618, 631F 
membrane anchoring 682 
membrane-spanning 579-581, 677, 
679, 680 
myosin lever arm 916 
in nascent proteins 349, 354 
in protein structural motifs 376-377 
as secondary structure elements 
117-118 
switch helices 161 
in transporters 603 
Alport syndrome 1071 
ALS (amyotrophic lateral sclerosis) 947 
alternative pathway, complement system 
1302-1303 
alternative splicing 
constitutive 324, 416 
definition of the gene 416-417 
multiple protein forms from 415 
positive and negative control 416F 
RNA sequencing and 482 
tropomyosin gene 319F, 416 
and voltage-gated ion channels 627 
Alu sequences 212F, 223F, 291-292 
Alzheimer’s disease 130, 868 
Amanita mushrooms 904 
see also phalloidins 
Ames test for mutagenicity 1128 
amino acid sequences 
alignments 463 
implying protein function 121, 
462-463 
specifying protein structure 109-114 
see also signal sequences 
amino acid starvation 425 


amino acids 
abbreviations 110F, 112 
activated amino acids 338F 
addition to polypeptide chains 
339-344 
codons specifying 7, 334 
coupling to transfer RNAs 336-338 
essential amino acids 86, 87F 
glutamine synthesis 66 
mitochondrial conversion to acetyl 
CoA 84 
multisite post-translational 
modification 166 
in the nitrogen cycle 85-86 
optical isomers 112 
polarity and ionization state 109-110, 
112-113 
at protein active sites 135-136 
as protein monomers 6 
selenocysteine 350 
side-chain modifications 196-197 
aminoacyl-tRNA synthetases 336-338, 
339F 
amoebae 182, 997F 
AMPA receptors (α-amino 3-hydroxy
5-methyl 4-isoxazolepropionic 
acid) 636-637 
amphibians 
metamorphosis 1182 
newts 1248-1249 
see also frogs 
amphipathicity 
of defensins 1298 
NADH dehydrogenase α-helices 768
amphiphilicity of membrane molecules 
9, 566, 576 
amplification (of DNA) see DNA cloning 
amplification (of signals) 
in the caspase cascade 1023 
in intracellular signaling 824, 848, 857 
in microscopy 539-540 
of nerve impulses 621 
amygdala 935 
amyloid fibrils 130-133 
Anabaena cylindrica 14F 
anabolic reactions, NADPH in 68 
anabolism as biosynthesis 51-52 
anaerobic organisms 
anaerobic lithotrophs 11 
ATP synthase role 778 
glycolysis in 74-75 
proton gradients in 781 
analytical methods 
for DNA 463-466 
for proteins 452-462 
analytical ultracentrifuges 455 
anaphase 
chromatid separation 964, 994 
in meiosis 1005, 1009 
in mitosis 981 
anaphase A and anaphase B 994, 995F 
anaphase chromosomes 988F 
anchorage dependence 1079 
anchoring fibrils 1062, 1063T 
anchoring junctions 1036, 1037T 
AND and AND NOT logic 521-522, 523F, 
825 
androgen receptor signaling 1117 
anemia, spectrin deficiency 591 
Angelman syndrome 407 
angina 847 


angiogenesis 
endothelial tip cells 1236-1237 
involvement of VEGF and HIF1α 1237
and metastasis 1120 
and vasculogenesis 1236 
Angstrom units 531 
animal-vegetal (A-V) axis 1155-1156, 
1167-1168 
animals 
body plan inversion 1169 
body plans as conserved 1147-1148, 
1174 
extracellular matrix in 1057-1074 
model organisms 33 
regenerative ability 1247 
regulatory DNA and morphology 
1149, 1174-1175 
size differences 1010 
ankyrin 592F 
antenna complexes 788-789, 794F 
Antennapedia gene/complex 1161F, 
1162-1163, 1164F, 1169, 1170F 
anteroposterior (A-P) axis 1147, 1155, 
1157-1160, 1161F, 1162-1164, 
1169, 1206 
anthrax 1270 
anti-IAP proteins 1029 
anti-inflammatory drugs 838 
antibiotic resistance 
DNA-only transposons in 288 
horizontal gene transfer and 19, 1269, 
1292-1293 
plasmid segregation and 897 
Staphylococci (MRSA) 1276 
three mechanisms 1293F 
antibiotics 
bacterial targets 1293F 
as inhibitors of protein synthesis 351, 
352T 
lysozyme as 144 
misuse 1293 
penicillin 1267, 1291, 1293 
ribosome response to 800 
selective toxicity 1292 
vancomycin 1292-1293 
antibodies 
in affinity chromatography 449, 450 
anti-BrdU 966 
blotting techniques 454-455 
delivery of poisons 1137 
as immunoglobulins 1315-1316 
immunogold electron microscopy 557 
labeled, in electron microscopy 539 
labeled, in fluorescence microscopy 
537, 538F, 539-540 
major classes in humans 1318T 
number of potential antibodies 1309 
polyadenylation and release 417-418 
secreted by B lymphocytes 1297, 
1307, 1309 
transport in newborns 737 
as triggers of phagocytosis 739 
see also immunoglobulins; monoclonal 
antibodies 
anticancer drugs see cancer treatment 
anticodons, tRNA 7, 335-337, 338F, 339, 
341-342, 343F, 344-345 
antidepressants 632 
antigen presentation 1240, 1305, 1330F 
antigenic determinants 1316F, 1318, 
1337, 1338F 


Page numbers with an F refer to a figure; page numbers with a T refer to a table.
antigenic variation, and pathogen
evolution 1289-1291 
antigens 
adaptive immune system 1307 
dendritic cell processing 1305, 1330F 
Langerhans cells presentation 1240 
in the lymphatic system 1311 
recognition 138-139 
antiporters 602, 604, 607-608, 747F, 
769F, 781 
Antirrhinum 30F, 488F 
AP2 (adaptor protein 2) 699, 700F, 701, 
733-734 
Apaf1 (apoptotic protease activating 
factor-1) 1025, 1026F 
APC/C (anaphase-promoting complex/ 
cyclosome) 
Cdh1-APC/C 1003-1004, 1014 
completion of mitosis 992-993 
M-Cdk inactivation 1002-1003 
metaphase-anaphase transition 
970-971, 972F, 973, 975, 978, 
1009 
as a ubiquitin ligase 360 
Apc protein (adenomatous polyposis coli) 
870-871, 1112F, 1124 
Apc tumor suppressor gene 1123-1124, 
1220 
APCs (antigen-presenting cells) 1307, 
1311, 1328 
see also dendritic cells 
apical meristems see meristems 
apolipoprotein B 418-419 
apoptosis 
Bcl2 family proteins and 860F, 1026, 
1027-1028F 
in C. elegans 1194 
Caspase cascade and 1022-1023 
cytotoxic T cells and 1334 
DNA fragmentation in 1024F 
in embryonic development 1022 
epithelial cells 1219 
extent of 1021-1022 
extracellular signals and 816 
extrinsic and intrinsic activation 
pathways 1023-1028 
Golgi apparatus fragmentation 722 
macrophages and 739 
mitochondrial proteins in 802 
p53 pathway and 1115-1116 
phagocytes and 1030-1031 
phosphatidylserine in 574, 690, 740 
reduction in cancer cells 1099, 1103 
response to irreparable DNA damage 
1015 
response to viral infection 1304 
suppression by survival factors 1011, 
1029-1030 
apoptosome 1025, 1026F 
aquaporins 580F, 599, 612-613 
Arabidopsis (A. thaliana) 
cell and organ size 1195F 
flowering 1183F, 1195F 
generating mutations in 488 
genome 880-881, 1084 
gravitropism 884F 
mitochondrial genome 805F 
as model organism 29T, 32-33 
mutant libraries 498 
totipotent cells 507 
vacuoles 724F
arabinose metabolism 521
arachidonic acid 837 
AraJ gene (E. coli) 521 
archaea 
bacteriorhodopsin from 586 
CRISPR system 434 
cytokinesis in 737 
eukaryote origins and 26, 27F 
Methanococcus jannaschii 16F, 676F 
as prokaryotes 14-15 
Sec61 complex 676F 
thermophilic 572 
Archaeoglobus fulgidus 21T 
Arf proteins 703, 854T, 1016 
arginine 
in ATP synthases 777 
deamination to NO 847 
in nuclear localization signals 650 
in the nucleosome 188 
structure 112 
Argonaute protein 428F, 430-434 
Arp2/3 complex 
actin filament nucleation 905-906, 
908F 
actin networks and 911, 953, 954F 
bacterial recruitment 913-914, 1281, 
1288, 1289F 
Rac activation 957-959 
Arp1 (actin-related protein 1) 939 
arrestins 845, 849 
arthritis, as a multigenic condition 493 
Ascaris (A. lumbricoides) 1265-1266 
asparagine 
in aquaporins 613 
homeodomain arginine contact 376 
N-linked oligosaccharides 683, 1057 
structure 113 
aspartic acid 
in P-type ATPases 607, 608F 
structure 113 
targeting by caspases 1022 
Aspergillus flavus oryzae 1129F 
aspirin 838 
assembly factors 130 
association constant 139F, 140 
astral microtubules 
dyneins and 984-985, 994 
spindle positioning 982, 983F, 987, 
1001-1002 
astral relaxation model 999 
astral stimulation model 998-999 
asymmetric cell division 1001-1002, 
1153, 1173-1174, 1222, 1223F 
ataxia telangiectasia (AT) 266T, 276, 1015 
ATF6 protein 688 
ATG9 protein 726 
atherosclerosis 733, 1265-1266 
ATM protein 266T, 276, 1014-1015
atomic force microscopy (AFM) 307F, 
481, 548-549, 587F, 895, 913 
atomic number and electron density 556 
atoms and cells, scale of 530F 
ATP (adenosine triphosphate) 
as an activated carrier 64, 65 
as an energy carrier 8 
binding in cyclin/Cdk complex 970F 
daily turnover in humans 774 
powering condensation reactions 
65-66, 70-73 
production by fermentations 75 
production by glycolysis 74-78, 85
production by mitochondria and
chloroplasts 753 
production by the citric acid cycle 
84-85 
substrate of Src protein kinase 74-78, 
85 
turnover and metabolic rate 148 
ATP-dependent chromatin remodeling 
complexes 190-193 
ATP-dependent proteases 357, 358F
ATP-driven pumps 601-602, 606-607 
ATP hydrolysis 
actin catalysis of 894, 901, 904, 954F 
in active transport 60, 602, 605-606, 
608F, 838 
aminoacyl-tRNA synthetases 337 
chaperone operation 356 
DNA helicase operation 246F 
DNA ligase operation 246 
by dyneins 938F 
by kinesins 936, 937F 
in lysosomes 723 
macromolecular synthesis 70-73 
by myosin heads 916 
NPC translocation 655 
overall usefulness 65 
in phospholipid translocators 690 
in the proteasome 358, 359F 
protein import into mitochondria 661 
protein import into peroxisomes 667 
protein import into the ER 682 
spliceosome RNA rearrangements 
321 
ATP synthases 586, 590, 606 
aggregation 590 
bacterial 780-781 
c subunits 777-778, 804 
of chloroplasts 787, 793, 794 
dimerization 778, 779F 
in mitochondria and chloroplasts 794, 
795F 
Na+-driven 781 
powered by electrochemical gradients 
586, 606, 774 
as protein machines 164, 754, 
776-778 
related enzymes 778, 781 
reversibility as a proton pump 778 
ATP synthesis 
driven by an electrochemical gradient 
761, 763 
electron-transport chain and 84-85 
in glycolysis 781 
in mitochondria 774-782 
in mitochondria and chloroplasts 658, 
758 
thylakoid membrane as site of 
786-787 
ATPases, F-type see ATP synthases 
ATR protein 1014, 1015F 
attachment to surfaces (cell migration) 
951, 952F, 955 
auditory epithelium 1227 
auditory system 1204 
AUG codons (translation initiation/ 
methionine) 334, 347-348, 
424-425 
Aurora kinases (Aurora-A and -B) 978, 
985-986, 990 
autism 494 
autocatalytic process, life as 7F 


autocrine action, Type I interferons 1304 
autocrine signaling 815 
autofluorescence 544F 
autoimmune diseases 
AIRE regulator 1333 
bullous pemphigoid 1076F 
FoxP3 gene and 1336 
lymphocyte survival and 1031 
myasthenia gravis 1315 
Type 1 diabetes 1315 
autoimmunity, T-cell inhibitory receptors 
1138 
autophagosomes 725F, 726-727 
autophagy 
function 726-727 
as a lysosome delivery pathway 725 
as selective or nonselective 726 
of sperm-derived mitochondria 807 
autophosphorylation 364F, 688, 841, 
842F, 850, 878 
plant photoproteins 884 
transautophosphorylation 851-852F 
autoradiography 452, 454, 466F, 1212F 
autosomes, defined 486 
autostimulation 518 
auxilin 702 
auxin signaling in plants 881-883, 1087 
axin protein 870 
axonal neurofilaments 944T, 947 
axonemal dyneins (ciliary dyneins) 
937-938, 942 
axonemes 927F, 931F, 938F, 941-943, 
950F 
axons 
electrical activity and synapse 
modification 1211-1212 
elongation 844, 1202 
growth cones 858, 943, 951, 
1201-1204, 1206, 1208-1211 
guidance mechanisms 1202-1204, 
1206 
initial segment (axon hillock) 634 
microtubule orientation 940 
in neural development 1198, 1199F, 
1200-1202, 1203F, 1204-1212 
retrograde and anterograde axonal 
transport 938 
role in neurons 620-621, 940 
self-avoidance 1206-1207 
spectrin in 913 
azides 772 
AZT reverse transcriptase inhibitor 1292 


B7 proteins 1337-1338 
B-cell inhibitory receptors 1337 
B cell lymphoma 1031 
B cell receptors (BCRs) 1315-1317, 
1321-1322, 1336-1338 
B cells (B lymphocytes) 
antibody secretion by 1297, 1307 
apoptosis inhibition by Bcl2 
1246-1247 
bone marrow origin 1308 
in Burkitt’s lymphoma 1106 
class switching 1320, 1322-1323, 
1335-1336, 1338F 
control of antibody forms 417-418 
extracellular signal and activation 
1336-1337 


gene segments 1319-1320, 1321F, 
1325, 1332 
immunoglobulins and 1315-1324 
and monoclonal antibodies 444 
rough endoplasmic reticulum in 1309 
B-Raf oncoprotein 1136 
babies see infants 
BAC (bacterial artificial chromosome) 
469, 471, 479 
Bacillus anthracis 1270 
Bacillus subtilis 
actin homologs in 897F 
gene families 17, 18F, 21T 
“backstitching” mechanism 242, 244 
bacteria 
ABC transporters in 163F 
aminoacyl-tRNA synthetases 336 
classification by shape 1267F 
control of translation 422-423 
DNA replication in 253-255 
Gram-positive and Gram-negative 
610F, 1267F 
green sulfur bacteria 796 
ion channels 617-620 
largest 13F 
mutation rates 237-238 
N-formylated peptide markers 958 
phagocytosis by host cells 
1281-1282 
as prokaryotes 14-15, 1266 
structure 13F 
thermophilic 473F, 483 
transcription in 306, 307F 
transposon frequency in 288 
transposon types, characteristic 292 
use in DNA cloning 467-469 
use of snRNAs against viruses 
433-434 
see also Escherichia coli 
bacterial cells, sizes 644 
bacterial cytoskeleton 896-897 
bacterial flagella 942 
bacterial origins 
of chemiosmosis 780-781 
of mitochondria and chloroplasts 
25-28, 644, 798F, 806-807 
bacterial pathogens 
cytoskeleton hijacking by 913-914 
extracellular bacterial pathogens 
1269-1271 
bacteriophage lambda 
Cro repressor protein 123F 
virus receptors and 1279 
bacteriophages 
CTXφ 1269 
T4 bacteriophage 19F, 324 
T7 bacteriophage 243 
virulence genes 1268 
as viruses 18, 19F 
bacteriorhodopsin 
hydropathy plot 579F 
structure and function 586-588, 591F 
Bad proteins 860F, 1030 
Bak protein 1027-1028 
Balbiani Ring genes 326 
BAM complex 662F 
band 3 protein 592F, 605 
band 4.1 protein 592F 
BAR domains 701, 702F 
“barcoding” mutant organisms 498, 499F 
Bardet-Biedl syndrome 943 


barrier DNA sequences 195F, 202, 210, 
391 
basal and basolateral surfaces, epithelial 
cells 590, 605 
basal bodies 943 
basal cell carcinomas 873, 1092 
basal lamina 
and endothelia 1235 
and epithelia 749, 1035, 1062, 
1068-1069 
and epithelial cancers 1093F, 
1096-1097F, 1123 
functions 1070-1072 
organization 1069-1070 
as specialized extracellular matrix 
1068-1069 
in synapses 1209 
base catalysis 144, 145F 
base excision repair 269, 270F 
base-pairing 
bond strengths 255 
and edge recognition 374 
and homologous recombination 
277-278, 280 
limitations of complementary pairing 
345 
role in RNA folding and templating 
363-364 
role in RNA interference 429 
wobble base-pairing 335, 336-337F, 
342, 804 
base-pairing, in DNA 
antiparallel strand arrangement 176 
in DNA synthesis 4 
in replication and repair 239-240, 255 
RNA as complementary 302-303 
basement membrane see basal lamina 
bases (nucleotide) 
in DNA 175-176, 177F 
in RNA 302 
structures of 100 
tautomeric forms 242 
unnatural, from DNA deamination 
271-273 
unusual, in tRNAs 335F, 337F 
bases (proton acceptors) defined 46, 93, 
763-764 
basic amino acid side chains 112 
basophils 1239, 1240F, 1241T, 1245, 
1317 
Bax protein 1027-1028 
Bcl2 family 
ABT-737 and 1032F 
as apoptosis-inhibitory proteins 860F 
in B cell lymphoma 1031, 1047 
pro- and anti-apoptotic 1026, 
1027-1028F 
regulation of the intrinsic pathway 
1025-1028 
survival factors and 1030 
BclXL protein 1026-1027, 1030-1031, 
1032F 
Bcr gene, in Bcr-Abl hybrid 1135 
BCRs (B cell receptors) 1315-1317, 
1321-1322, 1336-1338 
Beggiatoa 14F 
benign tumors 1092, 1093F 
benzo[a]pyrene 270, 1128 
β-adrenergic receptors, structure 
832-833F 
β-lactamases 1293 


Page numbers with an F refer to a figure; page numbers with a T refer to a table. 
β2-microglobulin 1327, 1328F, 1330T, 
1339 
β-barrels 
in porins 758 
transmembrane proteins 579-581, 
659, 662-663 
β-oxidation, fatty acids 667 
β-sheets 
amyloid fibrils as 130-131, 133 
discovery and description 115-116 
in DNA recognition proteins 377 
immunoglobulin (Ig) domains 
1318-1319 
as secondary structure 117-118 
BH domains (Bcl2 homology) 1026, 
1027F 
BH3-only proteins 1027-1028, 
1030-1031, 1032F 
bi-orientation, chromatids 988-990, 
993-994, 1006, 1009 
Bicoid gene 1158-1159 
Bicoid transcription activator 393F, 394, 
395F, 422, 1158-1160 
Bid protein 1028 
bilayers, lipid see lipid bilayers 
bimetallic centers, cytochrome c oxidase 
772 
binding constant (Km) 601-602 
binding interactions 
cooperative binding 516-517 
proteins with promoters 510-512, 
515F 
binding sites 
allostery 151-153 
Ca2+/calmodulin 841F 
cryptic binding sites 1043, 1068 
disordered regions as 126 
equilibrium constants and 138-140 
evolutionary tracing of 136-137 
integrins 1075 
loop regions as 118F, 121, 138 
multienzyme complexes 148 
for nucleotides in G proteins 833F 
for phosphorylated amino acids 154 
polypeptide subunits 123 
RGD sequence 1067-1068, 1075, 
1078 
ribosomes, for antibiotics 351F 
ribosomes, for RNAs 341-342, 347 
specificity 134-135 
transporters 600 
see also active sites; docking sites; 
ligands 
biofilms, and amyloid fibrils 132 
bioluminescence 547 
biotin 
carboxylated biotin 69T, 70F 
as a coenzyme 147 
nucleotide labeling 467 
BiP protein (binding protein) 677, 678F, 
683, 712-713 
BIR domains (baculovirus IAP repeat) 
1029 
1,3-bisphosphoglycerate 55-59F, 77-79F, 
105, 760, 786F 
bistability 
and positive feedback loops 518-520, 
829 
and robustness 520 
Bithorax gene/complex 1162-1163, 
1164F, 1169, 1170F 




bivalents 1006, 1007-1009F 
BLAST program 462, 463F 
blastemas 1249 
blastocyst stage 1253, 1254 
blastoderm stage 1159F, 1165F 
cellular blastoderms 1157 
syncytial blastoderms 1157-1158, 
1157-1161, 1165 
blastomeres 
differentiation 1148F 
maternal-zygotic transition 1181 
mouse 1156 
blastula stage 
fate maps 1159F, 1167 
gastrulation process 1147 
germ layers 1148, 1167 
pluripotency 1148 
blebbing 953-953, 1186 
blood cell formation 
in a hierarchical stem cell system 
1239-1247 
listed 1241T 
myeloid cells 1240 
see also red; white blood cells 
blood clotting 1067-1068, 1077 
blood sinuses 1242 
blood vessels 
arteries 1235F 
capillary growth in bone 1231 
elastin in 1065 
endothelial cells in 1235-1238 
smooth muscle relaxation 847F 
VEGF and 1237 
walls 1238 
bloodstream 
cell-cell adhesion 1054-1055 
metastases use of 1101-1102 
BMP (bone morphogenetic protein) family 
865, 1168-1169, 1191F, 1199, 
1200F 
body patterning see spatial patterning 
body size 
cell numbers or cell sizes 1193 
IGF1 and 1196 
bond angles, in polypeptides 110, 111F, 
112 
bond energies 
in activated carriers 63-64 
in citric acid cycle 759 
of phosphate bonds 78, 79F 
bond strengths 
base-pairing 255 
bond types 44F, 45T, 92 
bone 
remodeling and repair 1230-1232 
resorption 1231 
trabecular and compact 1230-1231 
bone marrow 
B cells migration to 1312 
B cells origin in 1308 
stromal cells 1229 
bone matrix and osteoblasts 1229, 1230F 
Boolean networks 524 
Bordetella pertussis 1270F, 1277-1278 
Borrelia burgdorferi 1264F, 1286F 
BP230 adaptor protein 1037T, 1076 
brain 
CaM-kinase II in 841 
cerebral cortex 1200, 1201F, 1205, 
1211 
human, number of neurons 627, 1198 


monkey brain 1212F 
protein abundances in liver and 372F 
transmitter-gated ion channels 636 
see also amygdala; hippocampus 
brain atlas project 502F 
“brainbow mice” 502F 
branch-migration reactions 284-285 
branching morphogenesis 1190, 1191F 
brassinosteroids 881 
Brca1 and Brca2 genes 267, 1116, 
1133-1135 
Brca1 and Brca2 proteins 281-282, 1134 
BrdU (bromodeoxyuridine) 966, 967F 
breast cancer 
Brca1 and Brca2 proteins and 
281-282, 1134 
chromosome abnormalities 1097F, 
1111 
E-cadherin gene and 1122 
epigenetic changes 1110F 
genomes 1118, 1119F 
Her2 kinase and 1137 
incidences 1093F 
tumor growth 1094F 
variable differentiation in 1122 
Bri1 protein 881 
bright-field microscopy 533-534, 535F, 
940F 


bristles, mechanosensory (Drosophila) 
1172-1173 

bromodomains 388F 

brown fat cells 780 

brush-border cells see absorptive cells 

buffers 46 

bullfrogs 560F 

bullous pemphigoid 1076F 

bundling proteins 911-912 

Burkholderia pseudomallei 1287, 1288F 

Burkitt’s lymphoma 1107, 1128F, 1130T 

burns, treatment 1250 


C 


C. elegans see Caenorhabditis 
c-Cbl protein 853 
C3 complement component (C3a and 
C3b) 1302-1303 
C9 complement component 1303, 1334 
C-H bond reduction and oxidation 56, 
78F 
c-Src gene 1105 
c subunits, ATP synthase 777-778, 804 
C-terminal domain (CTD) 310F, 311-312, 
311T, 316-317 
C-termini 
attachment of GPI anchors 688 
membrane anchoring 682 
polypeptide backbones 110F, 339 
signal sequences at 647, 667 
soluble and membrane-bound 
immunoglobulins 1317 
transcript cleavage site and 417-418 
C-to-U editing (cytosine to uracil) 418, 
419F 
CA repeats, as genetic markers 233 
CAD endonuclease 1024F 
cadherins 
cadherin superfamily members 1039F 
cell-cell junctions mediated by 1037F, 
1038-1046 
in cell sorting 1187-1188 


E-cadherin gene and breast cancer 
1120, 1122 
E-cadherins 1038, 1041-1042, 1045F, 
1281, 1284F 
in embryonic development 
1040-1041, 1190 
and homophilic adhesion 1038-1042 
M-cadherins 1234F 
N-cadherins 1038, 1041, 1056, 1202 
P-cadherins 1038 
structure and function 1040F 
as transmembrane adhesion proteins 
1037 
cadmium selenide 538F 
Caenorhabditis elegans 
adult worm 1180F 
astral relaxation model 999 
asymmetric cell division 1002F 
behavioral changes 488F 
cell numbers 1194 
genes for voltage-gated ion channels 
627 
heterochronic mutants 1180, 1181F 
human Bcl2 and 1025 
kinesins 936 
loss of miRNA genes 1149 
as model organism 29T, 33 
MTOC 931F 
mutant libraries 498 
neurons 913 
RNA interference 499 
sarcomeres 920 
caged molecules 544, 545F 
Cajal bodies 213, 331-332, 544F 
CAK (Cdk-activating kinase) 970, 973T, 
979 
calcineurin 655F 
calcium (Ca2+) ions 
ATPase pump in muscle SR 606-607, 
920 
buffering 761 
Ca2+-activated K+ channels 635, 636F 
Ca2+-release channels 607, 632F, 633 
Ca2+ spikes 839-840 
in cell adhesion 440, 1038-1039, 
1054 
in cell wall cross-linking 1084 
fertilization and 547F, 839F 
integrin binding and 1075 
IP3 receptors and 837 
LTP and LTD and 637 
monitoring with indicators 546-547 
in muscle contraction 920-923 
PKC and 837, 838F, 852 
in the regulated secretory pathway 
744-745 
release from ER 574 
as a second messenger 819-820 
as a signal mediator 838, 920 
storage by ER 669, 670-671 
triggering membrane repair 748 
calcium (Ca2+) pumps 607-608, 671, 838, 
840, 920 
calcium oscillators 
delayed negative feedback 516, 
839-840 
frequency decoding 842-843 
calcium phosphate (hydroxylapatite) 1229 
calculus 512 
calluses 442, 507 
calmodulin 840-841, 842F, 843, 846, 
921-922 


calnexin 685, 712 
calreticulin 685 
Calvin cycle (carbon fixation cycle) 755F, 
785-786, 787F 
CaM-kinases (Ca2+/calmodulin-dependent 
kinases) 841-843 
CaM-kinase II 841, 842F, 843 
cancer cells 
aberrant properties required 1103 
abnormal behavior in culture 1098 
abnormal glucose metabolism 1098 
abnormal proliferation 1092, 1098, 
1099-1100 
abnormal surface MHC proteins 
1304, 1305F 
cross-presentation and 1329 
defining behaviors 1092 
dendritic cell cross-presentation 1329 
epigenetic changes 1094, 1096-1097, 
1109-1111, 1125-1126 
genetic analysis 1102 
genetic instability 1097, 1103, 
1111-1112, 1116, 1125, 
1133-1134 
genome sequencing 1095, 
1109-1111, 1119, 1137, 1141 
immune defenses 1137-1138 
immunity and T-cell inhibitory 
receptors 1138 
loss of anchorage dependence 1079 
NK cells/T cells attacking 1304F, 
1334F 
oncogene dependence 1135 
somatic mutations in 1094, 1104, 
1112 
transformed cell lines from 443 
variable differentiation in 1121, 1122 
cancer-critical genes 
cancer genome changes and 1111, 
1141 
discovery and effects of 1104-1126 
mutational landscapes 1112F 
proportion of human genome 
1112-1113 
Ras-MAP kinase pathway 1137F 
studies in mice 1117-1118 
see also oncogenes; tumor suppressor 
genes 
cancer (generally) 
Apc gene mutations and 871, 1124, 
1125 
apoptosis and B cell lymphoma 1031 
environmental and lifestyle factors 
contributing to 1127-1129 
epithelial-mesenchymal transitions 
1042 
five-year survival rates 1128 
Hedgehog hyperactivity and 873 
incidence and mortality 1091-1092, 
1093F, 1095F, 1127, 1128-1129F 
keratins in diagnosis 946 
as a microevolutionary process 
1091-1103 
Myc gene mutations and 1016 
need for multiple mutations 1118, 
1125-1126 
p53 gene mutations and 871, 1016, 
1031, 1115, 1125-1126 
pathogens contributing to 
1265-1266, 1289 
pathways commonly disrupted 
1113-1116 


PTEN phosphatase mutations and 
859 
Ras hyperactivity and 854-855, 1016, 
1106, 1123, 1125 
Rb gene mutations and 1108 
RNA splicing errors and 324 
RTK regulation breakdown and 853 
T cell inhibition 1337 
telomerase production 1016 
see also cancers; carcinogenesis 
cancer stem cells 1120-1122, 1124 
cancer treatment 
cancer stem cells and 1121-1122 
combination therapies 1139-1140 
curable cancers 1132 
cytotoxic drugs 1132-1133, 1140 
drug resistance 1135-1136, 1139 
drug targets and drug nomenclature 
1137F 
drugs targeting Bcl2 family proteins 
1032F 
drugs targeting topoisomerases 253F 
immune response enhancement 
1137-1139 
personalized 1139, 1140 
present and future 1132-1141 
radiation 1132 
sequencing tumor genomes and 506 
targeted, synthetic-lethal treatments 
1133 
targeting oncogenic proteins 
1135-1137 
targeting rapid division 1122 
targeting the Ras-MAP-kinase 
pathway 1137F 
Taxol® usefulness 929 
cancers (specific instances) 
arising in self-renewing tissues 1120 
caused by infectious agents 
1105-1106, 1129-1132 
cell motion in metastasis 951, 952 
classification by appearance 1097 
classification by causative mutation 
1092 
classification by cell type 1092 
derived from one abnormal cell 
1093-1094 
genetic diversity 1096, 1118-1119 
hereditary 250, 281-282, 1107-1108, 
1124-1125 
incubation period 1095 
matrix degradation 1072 
multidrug resistant 610 
origins in mutant clones 1091-1092 
preventable 1127-1132 
see also breast; colorectal; tumors 
Candida albicans 349, 1286F 
canonical Wnt pathway 869-870 
CAP (catabolite activator) protein 
382-383, 522-523 
cap snatching, mRNAs 1288 
capillaries 1231, 1235-1237F 
capping 
actin filaments 903, 909, 910F, 918 
decapping 426-428 
microtubules 927 
of mRNA 5’ ends 315-317, 422 
CapZ capping protein 909, 914, 919, 
920F 
carbohydrates 
Golgi apparatus as site of synthesis 
711 
plasma membrane protective layer 
582, 583F 
carbon cycle 55, 785 
carbon fixation 12 
in chloroplasts 783-786 
by cyanobacteria 782 
carbon fixation cycle 755F, 785-786, 
787F 
carbonyl oxygen, in ion channels 612, 
618, 619F 
carboxylated biotin 69T, 70F 
carboxylation stage, carbon cycle 785 
carcinogenesis 
chemical carcinogens 270, 
1094-1095, 1127-1129 
link to mutagenesis 1094 
radiation and 1094, 1128F, 
1132-1133 
viral carcinogens 1130-1132 
carcinomas, defined 1092 
cardiac muscle see heart 
cardiac myopathy 948 
cardiolipin 760, 772 
CARDs (caspase recruitment domains) 
1026F 
cargo receptors 698, 711 
cargoes 
cytoskeletal filaments 896, 924-925 
of motor proteins 896 
and transport vesicles 695 
β-carotene 508 
carotenoids 789, 791F 
carrier proteins 816F, 876 
carriers see transporters 
cartilage 1057-1058, 1060, 1061F, 1063T, 
1064, 1229 
cartoon summaries 509-510 
Cas9 497-498 
Cas (CRISPR-associated) proteins 434 
Cas transcription regulator 1179 
casein kinase 1 (CK1) 870 
caspase cascades 1023-1025, 
1028-1029, 1301, 1334 
caspases 
in apoptosis 1022-1023, 1334 
caspase-8 1024-1025, 1026F, 1028 
caspase-9 1025, 1026F 
cytotoxic T cells and 1334 
executioner caspases 1022-1025, 
1026F 
IAPs and 1029 
in inflammatory response 1301 
initiator caspases 1022-1025, 1028 
Caspr protein 625F 
catabolism 
as food breakdown 51, 52F 
NADH in 68 
of sugars by glycolysis 74-78 
catalase, in peroxisomes 666-667 
catalysis 
and energy use 51-73 
enzymes as catalysts 48, 51, 57, 
140-146 
regulation of enzyme activity 149-151 
ribosomes as ribozymes 346-347 
ribozymes as catalysts 51, 69, 
363-364 
rotary catalysis in ATP synthases 
776-778 
simultaneous acid and base 144, 
145F 
speed of protein and ribozyme 363 
catalytic cascades (enzyme cascades) 
848, 873, 881 
catastrophe, microtubules 927-928, 
932-935, 986 
catastrophe factors 934-935, 986 
catenins 
as adaptor proteins 1042 
α-catenin 1042-1043, 1044F 
β-catenin 868-871, 1037T, 1042, 
1045, 1124 
p120-catenin 1042, 1046F 
γ-catenin (plakoglobin) 1037T, 1042, 
1046F 
Caulobacter crescentus 897, 898F 
caveolae 
in endocytosis 572 
as pinocytic vesicles 731-732 
TGFβ inactivation 866 
CBC (cap-binding complex) 317, 323F, 
325, 327F 
CBP (CREB-binding protein) 836 
CCR5 receptor 1279 
CD4 co-receptors 1330T, 1331-1333, 
1335-1338 
CD4 receptor 1279 
CD8 co-receptors 1330T, 1331-1333, 
1335-1336, 1337F, 1338 
CD31 protein 1312 
CD40 receptors 1337 
Cdc2 protein (now Cdk1) 463F, 969T 
Cdc3 protein 950F 
Cdc6 protein 975, 976F 
Cdc12 protein 950F 
Cdc14 phosphatase 995 
Cdc20 protein 971, 972F, 973, 993-994, 
1003 
Cdc25 phosphatase 970, 973T, 979, 1014 
Cdc28 protein (now Cdk1) 969T 
Cdc42, as Rho family member 854T, 858, 
956-957 
Cdh1 protein 971, 973T, 1003-1004, 
1013-1014 
Cdks (cyclin-dependent kinases) 
and cyclins in vertebrates and yeast 
969T 
inactivation 1002, 1116 
role in cell cycle control system 968 
cDNA (complementary DNA) 
cloning 470-471 
DNA microarrays and 503, 504F 
RNA sequencing using 477, 503-504 
CDP-choline (cytidine-diphosphocholine) 
689F 
Cdt1 protein 975, 976F 
cell adhesion 
at cell-matrix junctions 1074 
cell-substratum adhesion 954-955, 
956F 
loss of, in apoptosis 1023 
protein CD31 1312 
see also cell-cell adhesion 
cell cannibalism 1248 
cell-cell adhesion 
β-catenin and 870 
in the bloodstream 1054-1055 
control of 582-583, 870 
desmosomes in 946 
embryonic cell sorting 1187-1188 
immunoglobulins in 1055-1056 
T cell binding 1325-1326 


tissue remodeling 1043-1045 
vertebrate embryo patterning 1178 
cell-cell contact 
contact-dependent signaling 815F 
desmosomes 946 
hematopoietic stem cells 1244 
immunological synapses 1344F 
lateral inhibition dependent on 1152, 
1173F 
Rac and Rho in actin organization at 
957, 958F 
signaling dependent on 1150 
cell-cell junctions 
anchoring junctions 1036, 1037T 
cadherin mediation 1037F, 1188 
focal adhesions 863 
intermediate filaments 891 
major forms 1038-1056 
microtubules 931-932 
planar cell polarity 1190 
repulsive interactions 1188, 1206 
see also cell-matrix junctions; gap 
junctions; tight junctions 
cell coats 582, 583F 
cell cortex 
actin filaments beneath 592, 891, 907 
cortical cytoskeleton 591-593 
in mitosis 913 
cell cycle 
accessory proteins and 895 
Caenorhabditis elegans 33 
changes in the nucleolus 330F 
eukaryotic 185, 258F, 964-966 
homologous recombination use 275 
length and cell type 1012 
M and S phase 963 
mitosis 978-995 
model organisms 966 
overview 963-967 
permanent arrest 1016 
regulatory transitions 967 
response to DNA damage 276, 
1014-1015 
Saccharomyces cerevisiae 31-32 
subnuclear structures and 331 
suggested timekeeping role 1180 
temperature-sensitive mutations and 
489 
withdrawal of sympathetic neurons 
1018 
see also cell division 
cell-cycle control system 
overview 967-974 
regulatory proteins 973T, 1108, 1115 
resetting 975, 1003 
transcriptional regulation in 971 
cell-cycle network, feedback 516 
cell death, programmed 
apoptosis and necroptosis 1021 
contrasted with necrosis 1021, 1022F, 
1099, 1115-1116 
see also apoptosis 
cell determination 1148 
cell division 
asymmetric 1001-1002, 1153, 
1173-1174, 1222 
control of cell growth and 1010-1018 
cytoskeleton in 890, 892F 
Lgr5 cell cycle times 1221 
limits, for human cells 1016, 
1099-1100 


mitogens in 1011-1012 
rate in hematopoietic stem cells 1243 
see also meiosis; mitosis 
the cell doctrine 529 
cell fate determinants 1002, 1243F, 1246 
cell-free cloning 474 
cell-free systems 451, 673, 1224F 
cell fusion prevention, by membrane lipids 
570 
cell growth 
control and the PI-3-kinase-Akt 
pathway 861 
control of, in plants 1085 
control of cell division and 1010-1018 
distinguished from cell proliferation 
1011 
key regulatory pathways 1113-1114 
see also growth factors 
cell lines 
eukaryotic 442-444 
mouse fibroblast 1106 
RNA interference 499 
cell mass regulation 1194-1196 
cell-matrix junctions 
actin-linked 1036 
anchoring junctions 1037T 
epithelial tissue 1035-1036 
integrin mediation 1037F, 1079 
response to mechanical forces 
1080-1081 
transmembrane receptors 1074-1081 
cell membranes 
composition 571T 
proportion of protein 576 
three classes of lipids 566-568 
see also plasma membranes 
cell memory 
and chromatin structure 194, 197, 
206, 387 
and differentiation 392, 397F, 401-402 
in embryonic development 1148, 
1150, 1162, 1164 
Hox complexes and 1164 
intracellular signaling pathways 825, 
829, 843F 
overriding with cell reprogramming 
1251 
and positive feedback 520, 829 
reinforcement mechanisms 404-413 
see also epigenetic inheritance 
cell migration 
cell sorting in 1188 
chemotaxis and 958-960 
cytoskeletal coordination in 959-960 
endothelial cells 1235 
environmental cues 1185-1186 
neurons 1200, 1201F 
polarization and 951-960 
survival factors and 1186-1187 
three components 951 
cell numbers 
body size correlation with 1193 
regulation as cell mass 1194-1196 
cell plates 1000-1001, 1082 
cell polarization 748-750 
in epithelia 749, 1047 
microtubules 927, 940F 
and migration 951-960 
planar cell polarity 1189-1190 
and Rho family proteins 955-959 
role of the cytoskeleton 892-893 


cell proliferation 
accompanied by cell growth 
1016-1017 
in cancers 1092, 1098, 1099-1100, 
1104 
by clonal expansion 1310 
dependent on mitogens 1017 
distinguished from cell growth 1011 
integrins in the control of 1079 
in intestinal crypts 1218-1219 
cell recognition 
glycolipids in 575 
surface oligosaccharides in 582, 720 
cell reprogramming 1251-1253 
cell signaling 
alternative routes in gene regulation 
867-880 
coordination of spatial patterning 
1150 
corralling and 593 
disordered regions in 126F 
extracellular matrix role 1073 
by germ layer 1187 
in plants 880-885 
principles of 813-831 
through enzyme-coupled receptors 
850-867 
through G protein-coupled receptors 
832-849 
see also inductive signaling 
cell size 
body size correlation with 1193 
and ploidy 1194 
regulation as cell mass 1194-1196 
regulation by growth factors and 
mitogens 1018F 
regulation by vacuoles 725F 
cell sorting 1041, 1122, 1188-1189 
cell stress see stress 
cell-surface immunofluorescence staining 
591F 
cell-surface proteins 
displayed by dendritic cells 1306F, 
1326, 1327F, 1330 
Ig superfamily 1338-1339 
oligosaccharides on 720 
and phagocytosis 740 
cell-surface protrusions 
bacterial replication 1287, 1288F 
blebbing 953-953, 1186 
in cell migration 951-953, 954F, 955, 
956-957F, 959, 1185-1186 
pedestals 1278 
role of the cytoskeleton 892-893 
stereocilia 924 
cell-surface receptors 
death receptors 1024-1025 
immunoglobulins as 1315-1316 
inhibitory receptors 1245, 1304, 
1305F 
intracellular signaling molecules 
819-820 
lymphocyte differentiation 1309 
survival factor binding 1030 
use by phagocytic cells 1301 
virus receptors 1279 
see also TCRs 
cell survival 
integrins in the control of 1079 
mouse epithelial cell lifetimes 1219 
PI 3-kinase and 860 
regulation of 1246-1247 


cell types 

in blood 1239 

and cDNA libraries 470 

and cell cycle length 1012 

and cell reprogramming 1251 

and cell wall composition 1082 

characteristics preserved in cancers 
1092 

different effects of acetylcholine 
816-817 

embryonic development 1047 

from induced pluripotent stem cells 
1257 

interconvertibility in connective tissues 
1228 

number in the human body 1217 

RNA splicing and protein variants 416 

segregation in the gut 1224 

specialized for contraction 1232 

from stem cells in culture 1169 

transdifferentiation 1258 

see also differentiation; tissues 

cell walls 

in bacteria 896, 1292 

in plants 26, 1000, 1053, 1081-1087 

primary cell walls 1082-1085 

in prokaryotes 13 

secondary cell walls 1082-1083, 
1085-1086 
cells 
appearance at different magnifications 
529, 530F 


biochemical similarity 8 
catalysis and energy use 51-73 
centering, by microtubules 931, 932F 
chemical components of 43-51 
comparative sizes 29F, 529 
composition by weight 48F 
eukaryotic and bacterial, sizes 644, 
645F 
extracts 447-448 
growing in culture 440-445 
introducing altered genes 495 
isolation from tissues 440 
locomotion by crawling and swimming 
951 
mechanical interaction with the ECM 
1064 
number in human body 2 
number of proteins in eukaryotic 641 
regeneration and repair 1247-1251 
subcellular fractionation 445-447 
universality of 1-2 
water content of 535 
cellular blastoderms 1157 
cellularization 748, 1002, 1003F, 1160, 
1165 
cellulase 461 
cellulose 1083-1086 
cellulose synthase 1085 
CENP-A (centromere protein-A) variant of 
histone H3 198F, 203-204 
central dogma 299 
central nervous system 
asymmetric segregation in 1173-1174 
Drosophila 1179 
neural stem cell treatments 1250 
origins 1199 
synapse formation 1210 
see also brain; spinal cord 
central spindle 997, 999, 1000F 
central spindle stimulation model 999 
centrifugation and the ultracentrifuge 
445-447, 455 
centrioles 930, 931F, 943, 982, 985F 
centromeres 
centromeric chromatin 203-204 
creation of human centromeres 204 
fluorescence microscopy 538F 
heterochromatin maintenance 
432-433 
positions in human chromosomes 
181F 
role and size 186 
centrosome maturation 985 
centrosome positioning 949, 959-960 
centrosomes 891, 930-932, 933F, 
935-939, 943, 981-982, 983F 
duplication 984-985 
ceramide 690 
cerebral cortex 1200, 1201F, 1205, 1211 
cervical cancers (uterine cervix) 1093F, 
1096F, 1129, 1131, 1265 
CESA (cellulose synthase) genes 1085 
cesium chloride 447 
CFTR (cystic fibrosis transmembrane 
conductance regulator) gene/ 
protein 225F, 611 
CG islands 406-407 
CG sequences (CpG sequences) 
loss of, in vertebrates 406-407 
methylation and its inheritance 404 
TLR9 recognition 1300, 1304 
Chagas disease (Trypanosoma cruzi) 
1283-1284, 1290 
channelrhodopsins 588, 623-624 
channels 
distinguished from transporters 599 
membrane electrical properties and 
611-637 
as water channels or ion-channels 
611 
see also ion channels 
chaperones, histone 190, 192F, 198, 262, 
313 
and transcription activators 386-387 
chaperones, molecular 
BiP (binding protein) 677, 678F, 683, 
712-713 
calnexin and calreticulin 685, 712 
hsp60 355-357, 662 
hsp70 family 355-357, 659-662, 683, 
702 
mitochondrial precursor proteins and 
660-662, 664 
preventing folding in the ER and 
cytosol 677, 686, 711F 
protecting protein N-termini 361 
and protein folding 114, 354-357, 
1329F 
recognizing unfolded proteins 683 
chaperonins 356 
charge, on phosphatidylserines 567F, 574 
charge-separation in photosynthesis 
788-789, 792-793 
checkpoints 
cancer therapy and 1132-1133 
cell size 1174 
spindle assembly 993-994 
chemical biology 459 
chemical bonds 
and chemical groups 47, 90-91 
double bonds 90, 98, 566, 571 
chemical carcinogens 270, 1094-1095, 
1127-1129 
chemical components of cells 12, 43-51 
see also elements 
chemical synapses 627-630 
chemiosmotic processes 
bacterial origins 780-781 
in the citric acid cycle 759, 761-763 
defined 753-754 
evolution 794-796 
use by chloroplasts 782-783, 786 
chemokine receptors 1279 
chemokines 1241 
chemotaxis 958-960, 1185, 1202, 1237F, 
1241F 
chiasmata 1006F, 1007, 1008F 
chick embryos 
fibroblasts and collagen 1062F, 1064F, 
1069F 
neural tube 1041F, 1192F 
somite formation 1177F 
spinal cord 1200F 
chimeric animals 1254 
chimeric proteins 1135 
chimpanzees 17, 217-219, 221, 224, 
225F, 226, 228, 231 
ChIP (chromatin immunoprecipitation) 
analysis 210 
Chk1 protein kinase 1014, 1015F 
Chk2 protein kinase 1014, 1015F 
Chlamydia pneumoniae 1266 
Chlamydomonas (C. reinhardtii) 938F, 
941F, 943 
chlorophyll 754-755 
energy/electron transfer 787-788 
ionization, initiating photosynthetic 
electron-transfer 783 
P680 chlorophyll 790, 791F, 794F 
special pairs 788-790, 791-792F, 793, 
794F 
structure 787F 
chlorophyll A0 792 
chlorophyll-protein complexes 787-788 
photosystems as 788 
reaction centers and antenna 
complexes 788-789 
chloroplast genome 782, 802F, 806-807 
chloroplast precursor proteins 665F 
chloroplasts 
among intracellular compartments 
642 
collaboration with mitochondria 787F 
compared with mitochondria 782-783 
in cytokinesis 1001 
electron transport in mitochondria and 
755F 
energy conversion in 784F 
energy storage in 80-81 
GFP-tagged mitochondria and 544F 
origins and features 26-28, 644, 798F, 
806-807 
in photosynthesis 782-799 
as plastids 642 
protein transport into 658, 664-666 
self-splicing RNAs 324 
structure 658F 
thylakoid membranes 606, 658, 664, 
686 
chloroquine 610 
cholecystokinin 1219F 


cholera 576, 732, 834, 1265, 1266F, 
1269-1270 
cholera toxin 1270, 1278 
cholesterol 
in cell membranes 568, 571-572 
latent gene regulation 655, 656F 
NADPH in biosynthesis 68F 
receptor-mediated endocytosis of 733 
and the steroid hormones 875 
structure 99, 568F 
in synaptic vesicles 747F 
chondroblasts 1057 
chondrocytes 1229 
chondroitin sulfate 1058, 1060-1061F, 
1185, 1202 
chondromas, as benign 1092 
chondrosarcomas, as malignant 1092 
Chordin protein 1168-1169 
chromatids see sister chromatids 
chromatin 
activating and repressive 205, 210 
changes following nuclear 
transplantation 1252-1253 
DNA packaging in 179-193, 259 
heterochromatin and euchromatin 
194, 976 
insertion of histone variants 198 
position and gene expression 212, 
213F 
propagation of changes 199-201 
structure and function 194-207 
types of protein in 187 
zigzag model 192 
chromatin assembly factors see 
chaperones, histone 
chromatin domains 
and barrier sequences 202, 210, 391 
histone variants and 198 
reader-writer complexes 199-201, 
205 
chromatin immunoprecipitation technique 
505, 506F 
chromatin modification 
by reader-writer complexes 205, 406F 
timing of plant flowering 1183-1184 
by transcription activators 386-388 
chromatin remodeling 
fibroblast reprogramming and 1257F 
timescale of 1177, 1179 
chromatin remodeling complexes 
for DNA replication 261 
nucleosome changes and 190-193 
required by RNA polymerase II 
312-313 
and transcription regulators 380, 386, 
388, 390 
and transcription repressors 390 
chromatin structure 
changes in packing 206, 214-215 
and chromosome duplication 
975-977 
and epigenetic inheritance 194, 
204-206, 409-411 
and induced pluripotency 1255-1256 
loop structures 207-208, 211-212, 
391 
multiple forms 210-211 
and replication initiation 259 
and RNA splicing 323 
chromatography 
affinity chromatography 448-450, 
459, 484 


column chromatography of proteins 
448-449 
gel-filtration chromatography 
448-449, 450F, 455 
HPLC (high-performance liquid 
chromatography) 449, 457 
hydrophobic chromatography 448, 
452 
ion-exchange chromatography 
448-449, 452 
chromokinesins 984 
chromophores 
plant photoproteins 884 
retinal, in bacteriorhodopsin 587 
chromosomal translocation, reciprocal 
182F 
chromosome abnormalities 
aberrant human chromosomes 182 
breast cancer cells 1097F, 1111F 
in CML 1093 
nonhomologous end joining and 
274-275 
ovarian cancer cells 1116 
Philadelphia chromosome 1093, 
1094F, 1095, 1135 
chromosome banding 181-182, 209, 
210F, 211, 391F 
chromosome condensation 978-979 
chromosome conformation capture (3C) 
method 209F, 212 
chromosome cross-overs 
crossover control 285 
frequency in humans 492 
in meiosis 1006, 1007F, 1009-1010 
as a result of homologous 
recombination 282-283, 486 
chromosome deletions, specific to 
humans 226 
chromosome duplication 
chromatin structures and 975-977 
matrix proteins 1067 
chromosome painting 180-182 
chromosome puffs 211 
chromosome segregation 
errors and cancer 1097, 1109, 1132 
in meiosis 1004-1006, 1008, 1010 
in mitosis 978, 994, 1097 
as universal 963 
chromosome size, Drosophila 
melanogaster 33-34 
chromosome translocations 
breast cancer 1111F 
globin gene family 230 
leading to CML 1093-1094 
chromosomes 
control of duplication 974, 975F 
DNA packaging at mitosis 214 
DNA packaging in chromatin 187-193 
DNA replication within 254-266 
early discoveries 173-174 
electrophoresis of whole 
chromosomes 466 
essential components of 185-186 
forces on, in the mitotic spindles 
990-992 
global structures 207-216 
Ig chain loci 1320 
lampbrush chromosomes 207-209, 
211 
number of nucleotide pairs, human 
Zor 


polytene chromosomes 208-211 
rRNA genes 181F, 330 
chromosomes, homologous see homologs 
chromosomes, numbered human see 
human chromosome 
chronic myelogenous leukemia (CML) 
1093-1095, 1135-1136 
chymotrypsin, compared with elastase 
119F 
Ci (Cubitus interruptus) transcription 
regulator 871-873 
CICR (Ca2+-induced calcium release) 838 
cilia 
built from microtubules 941-942 
microtubules and 890-891 
olfactory receptors 843, 844F 
planar cell polarity and 1189 
primary cilia 824F, 845F, 873 
stereocilia 890, 892, 924, 1189 
see also axonemes 
ciliary dyneins (axonemal dyneins) 
937-938, 942 
circadian clocks 876-879, 1183 
circular DNA 
conservative site-specific 
recombination 293F 
in mitochondria 804 
in prokaryotes 23F 
replication 242F, 255F 
circumferential belts 924 
cis-acting lncRNAs 435F, 436 
cis-double bonds, in phospholipid tails 
566, 571 
cis-Golgi network (CGN) 716 
cis-regulatory sequences 402, 406-407, 
409 
CRE as 836 
as enhancers 386 
Eve gene 393, 395 
in gene control regions 384-385 
and genome annotation 477 
immunoglobulin chains 1320 
insulators and 391 
master regulators and 399-401 
nucleosomes and 379-380 
occupation by transcription regulators 
505 
reporter genes and 501 
as sequence logos 375, 378F 
transcription regulator recognition 
and binding 373-375, 379-381, 
383-384 
cisternae, Golgi apparatus 642, 715-716 
cisternae, rough ER 671F 
cisternal maturation model 720-721 
cisternal space 669 
citrate synthase 106 
citric acid cycle 
excess citrate from 760 
macromolecule precursors from 85F 
mitochondria and 664, 758-759 
overview 106-107 
in plants 786 
CK1 (casein kinase 1) 870 
CKIs (Cdk inhibitor proteins) 970-971, 
972F, 973T, 1003-1004, 
1013-1014, 1015F 
Cl- (chloride ion) channels 612-613, 629 
clamp loader complex 247-249 
class switch recombination 1323 
class switching, B lymphocytes 1320, 
1322-1323, 1335-1336, 1338F 


classical cadherins 1037T, 1038, 
1039-1040F, 1042 
classical pathway, complement system 
1302-1303, 1318T 
clathrin-coated pits, plasma-membrane 
553F, 731, 734, 1280 
clathrin-coated vesicles 
adaptor proteins 698-699 
assembly 697-698, 700F, 703, 727 
coat structure 699F 
protein delivery to 736, 853 
claudins 1048-1051 
cleavage furrows 996-1000 
clonal analysis 1221F, 1222 
clonal deletion 
absence of co-stimulation 1337 
immunological self-tolerance 
1313-1314, 1321 
negative selection 1332 
clonal expansion 808, 1309-1311 
clonal inactivation/clonal suppression 
immunological self-tolerance 1314 
clones 
the body as a clone of the egg 1091 
mutant, and cancers 1091-1094, 
1097F 
subclones in cancer 1096, 1097F, 
1118-1119, 1123 
Clostridium difficile 1264 
cloverleaf structures 335-336 
CLRs (C-type lectin receptors) 1300 
cluster analysis approach 504 
CMC (critical micelle concentration) 583, 
584F 
CNS see central nervous system 
CNVs (copy number variations) 232, 492 
co-immunoprecipitation 457, 505, 506F 
co-repressors 385, 386F, 390, 392, 394, 
395F 
co-stimulatory proteins/signals 
B7 proteins 1337-1338 
dendritic cell expression 1305, 1306F, 
1314, 1326, 1327F, 1337, 1338F 
helper T cells 1314, 1335 
naive lymphocyte response 
1310-1311F, 1316F, 1332 
co-translational processes 
contrasted with post-translational 
670, 678F 
protein import into ER 670, 674 
size limits 744 
co-transporters see symporters 
coactivators 385-386, 388-389, 392, 
394, 395F 
coat-recruitment GTPases 703 
coated vesicles 
coat assembly switching, Golgi 
apparatus 713 
three types 697 
coatomer 713 
“coccus” ending 1267 
Cockayne syndrome 271 
codon-anticodon matches 345, 804 
codon bias 482 
codons (nucleic acids) 
amino acid equivalents 7, 334 
mitochondrial usage 804-805 
stop and start codons 347-349 
synonymous codons 219-220 
coenzyme A 68-69 
see also acetyl CoA 
coenzyme Q (ubiquinone) 765-766, 767F, 
768-770, 771F, 772-773 
coenzymes 
nucleotides as 101 
and vitamins 146-148 
see also activated carriers 
cofactors, electron-carrying 764 
cofilin (actin depolymerizing factor) 905, 
910, 914, 954, 957, 958F 
cohesins 215, 550, 977-979, 982, 992, 
993F, 1004, 1007-1008F, 1009 
coiled coils 
cytoskeletal filaments 915F, 931F, 
936, 937-938F, 945, 950F 
derived from α-helices 116-117, 124, 
137 
in intermediate filaments 894, 
945-946, 950F 
in kinesin tails 936 
leucine zipper motif 376 
in myosin II 915 
see also DNA supercoiling 
coincidence detectors 699, 733, 825 
colchicine 459, 904T, 929, 935, 939, 993 
collaboration, in multicellular organisms 
1091 
collagen 
assembly 130 
triple helix 124-125, 1061-1062 
Type IV collagen 1058F, 1062, 
1069-1073 
Type XVII collagen 1037T, 1062, 1070, 
1076F 
vesicular transport of procollagen 704 
collagen family proteins 
as extracellular matrix macromolecules 
1057, 1061-1064 
fibril-associated collagens 1062-1064 
fibrillar collagens 1058F, 1062-1064 
nonfibrillar collagens 1063T 
and their properties 1063T 
collagen fibers 1058, 1062, 1064 
collagen fibrils 1057F, 1058, 1060, 
1062-1065, 1069F 
color blindness 300F 
colorectal cancers 
common genetic abnormalities 
1123-1124 
driver mutations 1112F, 1117 
epigenetic changes 1110F 
evidence on stem cells from 1220 
tumor progression example 
1122-1125 
combination therapies, for cancer and 
AIDS 1139-1140 
combinatorial controls 
Eve gene 394-395 
gene expression and cell type 
396-398, 399, 1150, 1160 
logic functions governing 430 
miRNAs 430 
of transcription regulators 396, 397F, 
399, 520-521 
combinatorial diversification 1320 
combinatorial regulatory codes 166 
commensalism 1264, 1277, 1298 
commissural neurons 1202-1204 
committed precursors see transit 
amplifying cells 
committed progenitor cells 
fate of 1245-1246 
lymphoid and myeloid 1243 
compact bone 1230-1231 
comparative genomics 218, 482, 1269F 
compartmentalization see organelles 
compartments, maintenance of diversity 
697-710 
complement system 
alternative pathway 1302-1303 
classical pathway 1302-1303, 1318T 
early complement components 
1302-1303 
immunoglobulin activation 1317 
inactivation 1303 
and the inflammatory response 1241, 
1303 
late complement components 1303 
lectin pathway 1302-1303 
and phagocytosis 740, 1302-1303 
complementary DNA see cDNA 
complementation tests 487, 490 
Complex I (NADH dehydrogenase 
complex) 767-770, 769F 
Complex II (Succinate dehydrogenase) 
767-768F, 772-773, 775 
complex oligosaccharides 717-718 
complexins 745 
compression resistance 1036F, 1082, 
1084F 
computer analogy, cell signaling 831 
computer models see models 
(simulations) 
computer techniques 
need for quantitative analysis 38 
protein structure displays 117 
in proteomics 167 
in structured illumination microscopy 
550 
concentration gradients 
driving passive transport 599 
germ cell migration 1186F 
morphogens 1158, 1165-1166F, 1200 
signal molecules in development 
1151 
concentration thresholds, signal- 
processing 827-831 
condensation reactions 
as energetically unfavorable 66 
formation of macromolecules by 49, 
TAP 
hydrolysis as the reverse 49F 
powered by ATP 65-66, 70-73 
condensins 209F, 215, 979-982 
conditional mutations 487, 489 
confluence 1098 
confocal microscopy 540-542, 544F, 546, 
724F, 800F, 875F 
conformational changes 
actin filament elongation 902 
in allosteric enzymes 151 
in AP2 adaptor protein 700 
in calmodulin 840 
complement system 1303F 
in dyneins 938F 
generating movement 160-163 
in GPCRs 833 
in integrin activation 1077-1078 
in ion channels 618, 630 
in kinases 835 
in myosin 916 
retinoic acid activation 877F 
in RTK activation 851 


in SecA ATPase 677 
in transcription elongation 312 
in transcription initiation 307 
in transporters 599, 601, 607 
virus entry 1281 
conformations 
and energy 114 
of macromolecules generally 49, 50F 
of proteins 110 
see also protein structure 
conjoined twins 1167 
conjugation, and horizontal gene transfer 
1268 
connective tissues 
collagens in 1061, 1063-1064 
derivation from extracellular matrix 
1035, 1228 
fibroblasts in 1057F, 1228 
myoblast patterning 1185 
connexins 1051, 1052F 
connexons (hemichannels) 1051-1052 
conoids 1282, 1283F 
consensus nucleotide sequences 
marking introns 319F 
for RNA splicing 319 
in transcription 308, 311F, 325F 
in translation 348 
consensus recognition sequences 348 
conservative site-specific recombination 
292-295 
conserved DNA 
coding 1149 
multispecies conserved sequences 
225-226 
noncoding 224-225, 1149 
X-chromosome sample 300F 
conserved genes 15-16, 216-217 
biochemical importance 21-22 
common to all domains 20, 21T 
eukaryotic cell cycle 32 
conserved proteins 
actin as 898 
in apoptosis 1025 
eukaryotic cell cycle 966 
histones as 190 
Sec61 complex 676 
structure 120 
conserved RNA motifs 363 
conserved systems 
in cell polarization 956 
in cell signaling 814, 852, 855, 1150, 
1154 
cytochrome c oxidase 771F 
cytochrome c reductase 769 
in early development 1166-1167 
Hox complex serial gene expression 
1164 
N-linked glycosylation 719 
signal-recognition particles 673-674, 
675F, 677-680 
constant region, Ig light and heavy chains 
1318-1319 
constitutive secretory pathway 741, 742F, 
746 
contact-dependent signaling 814, 815F, 
867 
contact inhibition 1098F 
contractile rings 
in cell division 890, 892F 
in cytokinesis 924, 996-1001 
septins and 949 


in telophase 981 
contrast enhancement 534-535 
convergent extension 1188, 1189F 
Coolair RNA 1183 
cooperative allosteric transition 152-153 
cooperative binding 
by repressors and activators 
516-517, 519 
transcription regulators 378-380 
coordination 
of cell growth and cell division 1018 
of multiple cell signaling responses 
825 
COPI-coated vesicles 
budding from the Golgi apparatus 
698, 713F 
COPII and 697 
GTPase control 703 
interaction with ER retrieval signals 
713-714 
vesicle transport model 721 
COPII-coated vesicles 
budding from the ER 698, 711 
COPI and 697 
formation 704F 
GTPase control 703, 705-706 
procollagen packaging 704, 705F 
protein packaging 711 
uncoating coupled to delivery 
706-707 
copper 
in ethylene receptors 881 
ions, in cytochrome c oxidase 771F 
corrals 593 
cortical cytoskeleton 591-593 
cortical rotation 1156, 1167 
cortisol 400, 835T, 875-876 
Costal2 scaffold protein 871-873 
coupled reactions 
energetically unfavorable reactions 
60-61, 63, 76-78, 102 
favorable reactions with activated 
carriers 64-65 
mechanical model 64F 
coupled transporters 601-604 
coupling 
between binding sites 151-152 
heat production to increasing order 
53-54 
transcription and excision repair 271 
covalent bonds 43, 45T, 90 
disulfide bonds 127 
DNA-topoisomerase 252-253 
double bonds 90 
polar covalent bonds 56 
covalent modifications 
in the Golgi apparatus 716 
histone amino acid side chains 
196-197 
post-translational protein modification 
165-166 
tRNAs 336 
Coxiella burnetii 1284F 
CpG sequences see CG sequences 
CPSF (cleavage and polyadenylation 
specificity factor) 324-325 
CRE (cyclic AMP response element) 836 
Cre recombinase 496-497 
CREB (CRE-binding protein) 836, 841 
CreERT2 gene 1221-1222F 
crescentin 897, 898F 


crime see forensic science 
CRISPR (clustered regularly interspersed 
short palindromic repeat) system 
434, 497-498 
crista junctions 757-758 
crista space 757, 762, 767-770, 772, 
777-779, 802 
cristae 658, 757-758, 759-761F, 778-779 
critical concentration, Cc 900-903, 904F, 
906, 927-928 
Crm1 receptor 420 
Cro repressor protein 123 
Crohn’s disease 1300 
cross-beta filaments 130-131 
cross-linking 
of cellulose by glycans 1083 
with glutaraldehyde 555 
cross-presentation, dendritic cells 1329 
cross-species transmission 1279, 1291 
cross-strand exchanges (Holliday 
junctions) 283F, 284-285 
cross-talk 202, 391, 821-822, 857 
crossover control 285 
crossover interference 1010 
crossovers see chromosome cross-overs 
crRNAs (CRISPR RNAs) 434 
cryoelectron microscopy 559-561, 562F 
actin helices 910F 
clathrin coats 699F 
respiratory chain supercomplex 772, 
773F 
cryoelectron tomography 559F, 705F, 778 
cryptic binding sites 1043, 1068 
“cryptic” splice sites/signals 321, 322F, 
323, 324F 
cryptochromes 885 
crypts, intestinal 1122, 1124, 1218-1219 
CSFs (colony-stimulating factors) 
1244-1246, 1257F 
CstF protein (cleavage stimulation factor) 
324-325, 417-418 
CT (computed x-ray tomography) 1092F 
CTCF protein 409F 
CTCs (circulating tumor cells) 1102 
CTD (C-terminal domain) 310F, 311-312, 
311T, 316-317 
CTLA4 (cytotoxic T-lymphocyte- 
associated protein 4) 
1138-1139, 1337 
CTR1 protein 881-882 
cullins 160, 164 
“culture shock” 443 
cultured cells 
abnormal behavior of cancer cells 
1098, 1099-1100 
eukaryotic cell lines 442-444 
fibroblast proliferation 966F 
hematopoiesis regulating factors 
1244 
light microscopy 440F, 442F 
limitations of laboratory culture 14 
myoblast fusion 1234 
need for support 441 
organoids from stem cells 1223 
population homogeneity 440, 442 
curare 632 
cut-and-paste transposition 289-291 
CXCL12 ligand 1185-1186 
CXCR4 receptor 1279 
CXCR4 receptors 1185 
Cy3 and Cy5 dyes 537 


cyanides 772 
cyanobacteria 
advanced photosynthesis in 782, 790 
ATP synthases 778F 
circadian clocks 878-879 
and the origins of aerobic life 796-797 
and the origins of chloroplasts 28F, 
798F 
cyclic AMP (cAMP) 
AraC/AraJ genes and 521 
binding site example 135F 
CAP protein and 382, 522 
chemotaxis by Dictyostelium 958 
in cholera 576, 834 
fish melanocytes and 940 
G protein regulation 833-834 
heart muscle effects 835 
as a second messenger 819-820, 
827, 833 
structure 101, 834 
cyclic-AMP-dependent protein kinase 
(PKA) 827, 834-837, 841, 843, 
845T, 848 
cyclic-AMP-gated cation channels 844 
cyclic AMP phosphodiesterases 834 
cyclic GMP (cGMP) 844-848, 880 
cyclic-GMP-dependent kinase 155F 
cyclic GMP phosphodiesterases 
844-845, 846T, 847-848 
cyclic-nucleotide-gated ion channels 
843-844 
cyclin-Cdk complexes 968, 969F 
cyclin-dependent kinases (Cdk) 
phosphorylation of nucleoporins and 
lamins 656 
in protein kinase evolutionary tree 
155F 
Cyclin E and growth inhibition 1197 
cyclins 
as conditionally short-lived 359 
four classes 968-969 
cycloheximide 351, 352T, 546F 
cyclopamine 873 
cyclosome see APC/C 
cyclosporin A 655F 
Cyk4 protein 1000F 
cysteine 
selenocysteine 350 
structure 113 
tetracysteine tags 1053F 
see also disulfide bonds 
cystic fibrosis 324, 611, 712 
cystinuria 598-599 
cytochalasins 904 
cytochrome b562 118F, 354F 
cytochrome b6-f complex 787, 789, 
791-794, 797 
cytochrome c 
in intrinsic apoptosis 1025-1027 
structure of heme group 766 
cytochrome c oxidase 767, 768F, 
770-773, 774F 
cytochrome c reductase 767-769, 
770-771F, 773, 797 
cytochrome oxidases 659F, 797 
cytochrome P450 family, in detoxification 
670, 1128 
cytokine receptors 863-864 
cytokines 
class switching 1323 
co-stimulatory signals from 1337 
helper T cell fate 1335 
IL1 and TNFα as 873 
in the inflammatory response 1301 
mammalian response to dsRNA 1304 
see also chemokines; interferon-γ; 
interleukins 
cytokinesis 981 
in cell cycle M phase 964, 965F, 
966-1004 
contractile ring and 949 
ESCRT mechanism 737 
membrane-enclosed organelles 1001 
in plants 1000-1001 
plasma membrane enlargement 748 
cytokinins 881-882 
cytoplasm 
defined 642 
macromolecules in 60F 
nuclear reprogramming 1252 
cytoplasmic dyneins 937-939, 943 
cytoplasmic inheritance 807 
cytoplasmic proteins in vesicle formation 
701-702 
cytoplasmic tyrosine kinases 852, 853F, 
858, 862-863 
Janus kinases (JAKs) as 863 
cytosine 
3-methyl- 271 
5-methyl- 404, 406 
deamination of 5-methyl- 405 
deamination to uracil 267T, 268, 269F, 
1322-1323 
DNA base pairing with guanine 176, 
177F 
DNA methylation at 404 
structure 100 
cytoskeleton 
accessory proteins 889, 894-896 
actin and actin-binding proteins 891, 
898-914 
in axon growth 1201 
bacterial and eukaryotic 24, 896-897 
bacterial and viral hijacking 913-914, 
1286-1288 
cell migration and 951-960 
coordination among elements 
959-960 
dynamic behavior 890-892, 895F 
extracellular matrix connections 1035 
filament assembly 893 
function and origin 889-898 
integrin links to 1075 
intermediate filaments 891, 944-950 
linker proteins 948-949 
and membrane protein diffusion 
591-593 
microtubules in 891, 925-944 
mitochondria association with 755 
motor proteins 896 
myosin and actin 914-925 
receptor coupling by Rho family 858 
thermal stability 895F 
three elements of 889 
see also microtubules 
cytosol 
pH regulation 604-605, 608 
RNA virus replication 1278 
as site of protein synthesis 641 
transport between nucleus and 
649-658 
cytosolic surfaces 
coated vesicles and 697-698 
RAB proteins on 705 
cytotoxic T lymphocytes 611 
CD8 expression 1331-1332 
class I MHC presentation 1326, 
1329F, 1330T 
dendritic cell activation 1329, 1333, 
1335 
foreign protein recognition 1327F, 
1328 
killing target cells 1333-1334 
NK cells resembling 1304-1305, 1333 
as a T cell class 1325 


D 


D gene segments 1320 
D-V (dorsoventral) axis 
animal body plan 1147 
cortical rotation, Xenopus 1167 
Dorsal transcription regulator and 
1164-1165, 1166F 
nervous system development 1199, 
1200F 
polarization of the embryo 1155-1158 
retinotopic map 1206 
vertebrate equivalent 1169 
Dally and Dally-like proteoglycans 1073 
Danio rerio see zebrafish 
DAPI dye 537-538F, 800F 
dark-field microscopes 533-534, 535F 
databases, protein 
mass spectrometry matching 
456-457 
and protein structure prediction 462 
DDK kinase 975, 976F 
ddNTP (dideoxy nucleoside triphosphates) 
478 
de-differentiation 398, 1249 
de novo methyl transferases 404-405, 
406F 
de novo mutations 494 
deamination 
of 5-methylcytosine 405 
adenine to inosine 335, 336-337F, 
418 
arginine to NO 847 
cytosine to uracil 267T, 268, 269F, 
272F, 418, 1322-1323 
of DNA yielding unnatural bases 
271-273 
RNA editing 418-419 
death receptors 1024-1025 
decapping mRNA 426-428 
decorin 1058F, 1060 
default secretory pathway 741 
defensins 1298, 1302 
degenerative disease treatment 
1258-1259 
degradation complex 870 
degrons 158 
degrowth 1248 
dehydrogenation and hydrogenation 56 
Deinococcus radiodurans 483 
delayed K+ channels 634-635 
delayed negative feedback 516, 517F 
delayed response genes 1013F 
deletion mutations 487 
Delta protein 867, 868F, 1172, 1173F, 
1178-1179, 1224 
denaturing of DNA 472 
denaturing of proteins 114, 453, 584 


dendrites 
in neural development 1198, 1199F, 
1200-1201, 1206-1207, 1208F 
role in neurons 621, 940 
self-avoidance 1206-1207 
dendritic cells 
antigen processing 1330F 
co-stimulatory protein expression 
1306F, 1314, 1326, 1337, 1338F 
cytotoxic T cells and 1333, 1335 
derived from monocytes 1240, 1243F 
Langerhans cells as 1226F, 1240 
linking innate and adaptive immune 
systems 1305, 1306F 
negative selection in thymocytes 
1332 
as professional APCs 1307, 1328 
role as antigen peptide presenters 
1240, 1305 
surface proteins displayed 1306F, 
1326, 1327F, 1330 
T cell binding and activation 
1324-1326, 1331 
use of cross-presentation against 
viruses and tumors 1329 
dense-core secretory granules see 
secretory vesicles 
deoxyribose 
stability conferred by 366 
structure 100 
dephosphorylation, M-Cdk and 970-971, 
978, 993-995, 998 
depth of field 533F, 558-559 
depurination 267T, 268-269, 270F 
dermatan sulfate 1058 
desensitization see adaptation 
desmin 944T, 948-949, 1046 
desmocollins 1037T, 1038, 1039F, 1046F 
desmogleins 1037T, 1038, 1039F, 1046F 
desmoplakin 1037T, 1046F 
desmosomes 893F, 946, 1036, 1226F 
desmotubules 1053, 1054F 
detergent solubilization, membrane 
proteins 583-586, 766 
deterministic models 524 
detoxification 
carcinogen activation and 1128 
of oxygen 796 
by peroxisomes 667 
in smooth ER 670 
developing tissues see embryonic 
development 
developmental complexity and gene 
repression 390 
diabetes 
β-cell renewal and 1226 
insulin levels and cancer risk 1115 
as a multigenic condition 493 
Type 1 as an autoimmune disease 
1315, 1332-1333 
diacylglycerol 819, 837, 838F, 859, 862F 
Diap gene 1197F 
Dicer enzyme 430F, 431-433 
Dictyostelium 559F, 958 
dideoxy sequencing 466F, 477-478 
Didinium 25F 
differential equations 
coupled differential equations 515 
and deterministic models 524 
for positive feedback 518 
protein concentrations and 513-514 
and transient behavior 512-513 


differential-interference-contrast 
microscopes 533-535, 560F, 
1002F 
differentiation 
from the blastomere 1148F 
embryoid bodies from iPSs 1257F 
epigenetic inheritance and 205 
four general statements 371 
genetic mechanisms maintaining 
392-404 
reprogramming cell types 396, 398 
retention in culture 441-442 
stepwise commitment in 
hematopoiesis 1243-1244 
terminally differentiated cells 400, 
816, 1012, 1121 
transcription regulator activation 
1170-1171 
variations in cancer cells 1121, 1122 
without changes in the genome 369 
diffraction effects, in microscopy 531, 
539F 
diffraction limit to resolution 532, 549, 
551, 554 
diffusion 
within lipid layers 569-570, 601 
passive diffusion into nuclei 650 
and reaction rates 59-60 
and signal molecule gradients 
1151-1153, 1158F, 1166 
of small molecules across membranes 
598 
diffusion coefficients 570, 589 
diffusion-limited reaction rates 143, 148 
digestion, enzymatic 74 
digital cameras in microscopy 555 
digital image enhancement 534 
digoxigenin 467 
dihydroxyacetone 96 
dihydroxyacetone phosphate 104 
dilated cardiomyopathy 923 
dilution effects 513-514 
dimerization/dimers 
α- and β-tubulin 925 
ATP synthases 778, 779F 
caspases 1023-1025, 1026F 
fibronectin 1067 
helix-loop-helix proteins 377 
integrins 1075 
of MHC proteins 1327 
polypeptide 123 
of RTKs on ligand binding 851 
single-pass transmembrane proteins 
580 
T cell receptors 1325 
TGFβ superfamily 865 
transcription regulators 375-378 
tyrosine-kinase-associated receptors 
862 
dimorphism, pathogenic fungi 1271-1272 
diplotene 207F, 1006, 1007F, 1010 
DISC (death-inducing signaling complex) 
1024-1025 
discontinuous responses 827, 829 
see also switches 
diseases 
analysis using stem cells 1258-1259 
linked to DNA repair 266T 
linked to integrin defects 1076-1077 
linked to mitochondrial mutations 
807-808 


linked to mutations 479, 493-494, 
627, 668 
linked to X-chromosome changes 
300F 
lysosomal storage diseases 728-729 
membrane transport protein mutations 
598-599 
microbiotic imbalance and 1264 
potential of RNA interference 433 
RNA splicing errors and 323-324 
spectrin deficiency 591 
Dishevelled protein 870-871 
disorder (thermodynamic) see entropy 
disordered regions (proteins) 125-126, 
149 
disulfide bonds 127, 452 
in fibronectin 1067-1068 
immunoglobulin (Ig) domains 1319 
in keratins 946 
in laminin 1070 
mitochondrial protein import 664 
proteins in lumen and cytosol 682 
single-pass transmembrane proteins 
582 
diurnal rhythms (circadian clocks) 
876-879, 1183 
Dlg (discs-large protein) 165 
DNA 
analytical methods 463-466 
as carrier of genetic information 174, 
177-178 
as constituent of chromosomes 
173-174, 179-182 
content measurement 966, 967F 
distinctions from RNA 4-5, 302, 366 
fragmentation in apoptosis 1024F 
localization in eukaryotes 178-179 
manipulation 467-485 
in mitochondria and chloroplasts 753 
packaging within chromosomes 
179-193 
polarity 175 
structure and function 3, 175-179 
synthesis 240-241F 
see also double helix; genes; 
mitochondrial DNA; recombinant 
DNA 
DNA catenation 977, 979 
DNA cleavage by restriction nucleases 
464-465 
DNA cloning 
cDNA cloning 470-471, 475F 
DNA libraries from 469-470 
genomic DNA cloning 470-471, 475F 
proteins in quantity from 483-484 
in recombinant DNA technology 464 
two meanings 467 
using PCR 473-474 
DNA damage 
apoptosis response 1022, 1028 
cell cycle response 276, 1014-1015 
by chemical carcinogens 1127-1128 
p53 regulatory pathway 1113, 
1115-1116 
response in cancer cells 1099, 1113, 
1115-1116, 1132 
susceptible sites and typical lesions 
267 
telomeres and response 264 
see also DNA repair 
DNA demethylases 404, 406 


DNA duplication, collagen α-chains 1062 
DNA fingerprinting 475-477 
DNA glycosylases 269, 270F, 271, 273 
DNA gyrases 315 
DNA helicases 
nucleotide excision repair 270 
production 484F 
replication role 246, 249, 255 
in S-phase 974-975, 976F 
structure 247F 
in TFIIH 311 
Xpd knockout mice 497F 
DNA injection by bacteriophages 19F 
DNA inversion 294 
DNA labeling 466 
DNA libraries 469-471 
DNA ligases 
in disease 266T 
in recombinant DNA technology 464 
repair function 269-270, 280, 289F 
replication function 245, 246F, 250 
use in DNA cloning 468, 469F 
DNA looping 383-384, 385F, 386, 391 
DNA-only transposons 218F, 288-290 
DNA polymerase α-primase 253 
DNA polymerases 
attachment 246-247 
discovery 240 
and PCR 473 
Polδ 246, 261 
Polε 246 
RNA polymerases compared 304-305 
as self-correcting 243 
synthetic action 241 
translesion 273, 274F 
viral use of host 1289 
DNA primases 245, 249, 253, 255, 256F, 
263F 
DNA primers, in PCR 473 
DNA probes 472-473, 502 
DNA repair 266-276 
base excision and nucleotide excision 
269-271 
contrasted with RNA 271-273 
defects in cancers 1097, 1124-1125, 
1132-1133 
diseases linked to 266T 
double strand breaks 273-275, 
1133-1134 
in eukaryotes and bacteria 271 
by homologous recombination 
1133-1134 
see also DNA damage 
DNA replication 4-5 
and the cell cycle 258-260, 974-977 
end-replication problem 262 
errors in cancer 1106F, 1107 
initiation and completion 254-266 
mechanisms 239-254 
proofreading 242-244, 250-251, 257F 
as semiconservative 240, 242F, 447 
temperature-sensitive mutations and 
489 
topoisomerase and the winding 
problem 251-253 
two stages 974, 976F 
see also replication forks; replication 
origins 
DNA segment shuffling 16, 17F 
DNA sequences 
Alu sequences 212F, 223F 
barcoding mutants 498, 499F 
barrier DNA sequences 195F, 202 
centromeric 203 
changes in cancer cells 1094, 1097, 
1106, 1109-1110, 1111-1112F 
deducing gene function 20, 216-217 
human β-globin gene 179F 
inferring for ancient genomes 
223-224 
maintenance of 237-239 
rate of change 218 
see also genome sequencing; human 
genome 
DNA sequencing methods 
dideoxy sequencing 466F, 477-478 
Illumina® sequencing 480 
ion torrent™ sequencing 481 
second-generation sequencing 
methods 479-480 
shotgun sequencing 479 
third-generation sequencing methods 
481 
DNA supercoiling 
created by RNA polymerases 
314-315 
positive and negative 315 
the replication winding problem 251, 
253 
DNA transfection for oncogene 
identification 1104 
DNA transferases 
de novo methyl transferases 404-405, 
406F 
maintenance methyl transferases 404 
docking sites 
PI(3,4,5)P3 as 859 
plasma membrane 859, 860F 
protein tyrosine kinases 1080 
receptor tyrosine kinases 849-850, 
851F, 852, 853F 
specificity of 821-823 
STAT proteins and cytokines 863 
dogs 
olfactory bulb neurons 1199F 
size differences 1193, 1196 
dolichol/dolichol phosphate 99, 684 
“Dolly the sheep” 1252 
domain shuffling 121-123 
domains (lipid bilayers) 572-573, 
590-591, 593, 749 
domains (protein) see protein domains 
domains (taxonomic) 
bacteria, archaea, and eukaryotes 15 
common gene families 20, 21T 
dominant mutations 
cancer-critical genes 1005F, 1104 
gain-of-function as typically 489 
dominant negative mutations 487, 1116 
Dorsal transcription regulator 655F, 873, 
1164-1165, 1166F 
dorsoventral (D-V) axis 
animal body plan 1147 
cortical rotation, Xenopus 1167 
Dorsal transcription regulator and 
1164-1165, 1166F 
nervous system development 1199, 
1200F 
polarization of the embryo 1155-1158 
retinotopic map 1206 
vertebrate equivalent 1169 
dorsoventral inversion 1169 
double bonds 
alternating 90 
cis-double bonds 566, 571 
unsaturated fatty acids 98 
double helix 
DNA 3F, 174, 176 
hydrogen bonding 175-176 
as key to DNA repair 268 
superhelical tension, DNA 314-315 
double-negative activation 820, 821F 
double-pass transmembrane proteins 
679, 681F 
double-reciprocal plots 143 
double-strand breaks (DNA) 
in cancers 1111, 1116 
homologous recombination 278-279, 
1116 
in meiosis 282-284, 1009-1010 
repair mechanisms 273-275, 497, 
1323 
segmental duplications from 228 
topoisomerase II and 252-253
double-stranded RNA (dsRNA) 
RNA interference and 431, 499-500 
TLR3 recognition 1299, 1304 
as viral characteristic 1304 
Down syndrome 1010 
doxycycline 495F 
Dpp gene (Decapentaplegic) 1165-1166, 
1169
Dpp signaling protein 1073 
driver mutations 1104, 1112-1113, 1117, 
1119F 
Drosophila (D. melanogaster) 
airways 1191F 
alternative splicing in 415 
body segmentation 1159-1163 
Branchless gene 1191F 
cell cycle 966 
central nervous system 1179 
characteristic transposon types 292 
chromosome and genome sizes 
33-34 
circadian clock 877-878 
Dlg protein discovery 165 
Engrailed protein 120, 536F 
Eve (even-skipped) gene 393-394, 
395F
Eyeless transcription regulator
397-398F
fungal infection 1299F
gene expression by in situ
hybridization 536F
genetic switches 392-395
Hedgehog proteins in 871
homeodomain motif 376
imaginal discs 1195
insulator-binding protein 391F
mitochondrial DNA 805
as model organism 29T, 33-34, 417
mRNA localization in 422
mutant libraries 498
neurogenesis 1173-1174
Notch receptor in 867-868
P element 416, 486
photosensing neurons 1208F
polytene chromosomes 208-211,
391F, 540F
position effects in 194, 195F
RNA interference 499
Sev, Sos and Grb2 Ras-GEFs 855
shibire mutant 702F
Drosophila embryos 
control of development 1157 
discovery of signaling pathways 1154 
germ-band extension 1045 
mitosis without cytokinesis 1002, 
1003F 
syncytium development 1157 
transcription regulators 392-395 
drug discovery 
computer-based 463 
stem cells and 1258-1259 
drug resistance 
cancer treatment 1135-1136, 1139 
malaria 610 
multidrug resistant cancers 610 
pathogens 1291-1294 
see also antibiotic resistance 
drug targets 
antibiotics 1293F 
transmitter-gated ion channels 
631-632 
drug treatment, predicting individual 
responses 506 
Dscam gene 415F, 1208F 
DSCAM proteins 415, 1207 
duplication in tissue renewal 1226 
dwarfism, pituitary and achondroplastic 
1196 
dynactin 939 
dynamic instability, actin 903 
dynamic instability, microtubules 
927-929, 935 
dynamic range, intracellular signaling 
824-825 
dynamic systems and differential 
equations 512 
dynamin and dynamin-related proteins 
701-702, 803, 804F, 806 
dyneins 936-943, 959 
axonemal dyneins 937-938, 942 
cytoplasmic dyneins 937-939, 943 
and the mitotic spindle 983-985 
viruses and 1288 
dysbiosis 1264 
dyskeratosis congenita 265 
dystroglycan 1070, 1071F 
dystrophin 1234 


E 


E1 (ubiquitin-activating enzyme) 158, 
159F 
E2 (ubiquitin-conjugating enzyme) 
158-159, 160F 
E3 see ubiquitin ligases 
early endosomes 
delivery to 732-733, 738 
and late endosomes 696F, 707 
maturation 707, 730, 735-736 
retrieval from 734-735 
transcytosis and 738 
EB1 proteins 935 
EC (extracellular cadherin) domains 1039, 
1040F 
ecdysone 875 
ECM see extracellular matrix 
EcoRI nuclease 465-466F, 468F 
ectoderm 1147-1148, 1167-1168 
neurogenic ectoderm 1166, 
1173-1174, 1179, 1186 
eczema 947 


editing pockets 339 
EF-G elongation factor 343, 344F 
E2F proteins 1012-1014 
EF-Tu elongation factor 160-161, 163, 
343-344 
effector B and T cells 
compared 1309 
effector B cells 1309-1310, 
1312-1313, 1316, 1322, 1324 
effector T cells 1309-1310, 1312-1313,
1324, 1328, 1333-1335, 1337
and immunological memory 
1310-1311 
in lymphocyte differentiation 1309 
effector Bcl2 family proteins 1027, 1028F 
effector proteins 
extracellular bacterial pathogens 
1269-1271, 1278, 1281-1282, 
1283F, 1285-1286 
in intracellular signaling 814 
efflux pumps 883, 884F, 1293 
EGF (epidermal growth factor) 
action via receptor tyrosine kinases 
850T 
broad specificity 1011 
production of embryoid bodies 1257F 
resemblances to matrix proteins 
1066F 
vertebrate embryo segmentation 1178 
EGF receptors 
activation 851 
lysosomal degradation 735 
mutation in glioblastoma 1107 
targeting in lung cancers 1136 
egg-polarity genes 1157-1159, 
1160-1161F, 1163, 1165 
eggs 
growth without cell division 1018 
similarity of egg cells 2F 
eggs, fertilized see zygotes 
eicosanoids 837 
eIFs (eukaryotic initiation factors) 347,
348F, 423-428, 1304 
eIF4E transcription initiation factor 347,
425-426, 1017 
EIN3 protein 882
EJCs (exon junction complexes) 320F, 
321, 352, 353F 
elastase, compared with chymotrypsin 
119F 
elastic fibers 1065-1066 
elastin 
as disordered 125 
in the extracellular matrix 1058, 
1065-1066 
electrochemical gradients 
in chemiosmotic coupling 754 
composed of membrane potentials 
and concentration gradients 
599, 662, 762-763 
membrane transport and 599-600, 
602, 612, 662 
mitochondrial membranes 758 
powering ATP synthase 774, 776 
thylakoid membrane 784F 
electrogenic effects 608, 615 
electron carriers 
NADH and NADPH as 67-68, 762 
plastoquinone, plastocyanin, and 
ferredoxin as 793 


electron crystallography 580, 586 
electron microscopy 
cryoelectron microscopy 559-561, 
562F 
EM tomography 558, 779, 988F 
freeze-etch 937F, 942F, 948F 
freeze-fracture electron microscopy 
1047, 1049F, 1051F 
immunogold electron microscopy 
556-557 
negative staining 559-561 
resolution 554, 559, 560F, 562 
single-particle reconstruction 
561-562 
staining 555-556 
three-dimensional imaging 557-558 
transmission electron microscopy 
305F, 554, 555-556F, 558, 560F 
see also SEM 
electron transfers 
chlorophyll 787-788, 792 
NADH dehydrogenase complex 768 
oxidation and reduction as 55-56 
photosynthetic reactions 783-784 
see also light reactions 
electron-transport chain 
in ATP synthesis 84-85, 658 
in chemiosmotic coupling 754 
in chloroplasts 789 
the citric acid cycle and 83 
in early living cells 795-797 
location in mitochondrial cristae 
757-758 
in mitochondria and chloroplasts 
755F 
mitochondrial protein imports 659, 
662, 663F, 664 
proton gradients and 86F 
proton pumps of 763-774 
electrons 
damage to proteins 561 
productivity in terms of ATP 775 
wavelength 554 
electrophoresis see gel electrophoresis 
electroporation 495 
electrospray ion source 456-457 
electrostatic attractions 
atomic force microscopy and 548 
binding site example 135F 
as noncovalent 44, 95 
elements 
periodic table 43F 
required for living cells 12, 43 
elongation factors 
EF-G 343, 344F 
EF-Tu 160-161, 163, 343-344 
loading onto RNA polymerase 388 
role 313-314, 343-344 
elongation phase, DNA replication 974 
EM (electron microscope) tomography 
558 
embryogenesis, cell migration in 951 
embryoid bodies, from iPSs 1257F
embryonic development 
apoptosis in 1022 
asymmetries 1151-1152 
blastomere to differentiation 1148F
cadherin-dependent cell-cell adhesion
in 1040-1041
cell memory in 1148, 1150, 1162,
1164
control of timing 1176-1184 
epithelial-mesenchymal transitions 
1042 
fundamental processes in animals 
1145 
germ cells and somatic cells 1158 
intracellular control programs 1179 
morphogenesis 1184-1193 
neural development 1198-1213 
overview 1147-1155 
retina 1236 
spatial patterning in 1150-1154 
specific genes in animals 1149 
tissue morphogenesis 1059 
embryonic stem cells 
Nanog regulator 378F 
transgenic mice 496 
embryos 
cell-cycle control system 967, 971 
genome activation 1147 
parthenogenetic 987 
see also blastula 
embryos, Drosophila 
control of nuclear transport 655F 
discovery of signaling pathways 1154 
germ-band extension 1045 
mitosis without cytokinesis 1002, 
1003F 
syncytium development 1157 
transcription regulators 392-395 
embryos, frog 
blastula stage components 1147F 
epigenetic inheritance and 205-206 
evidence for differentiation without 
gene loss 369-370 
reprogramming by donor nuclei 
205-206 
embryos, vertebrate 
inductive signaling 1166, 1167F, 1177, 
1184, 1198 
nervous systems 1041F 
spatial patterning 1167-1169 
EMT (epithelial-mesenchymal transitions) 
1042, 1101 
endocrine cells, in signaling 815 
endocytic-exocytic cycle 731 
endocytic vesicles 
endosome delivery 696F, 707 
endosome formation and 730-732, 
735 
phagosomes as 738 
synaptic vesicles from 746 
endocytosis 
of activated TGFβ receptors 865
defined 695 
influenza virus infection by 709, 1280 
as a lysosome delivery pathway 725 
pathways 730-741 
phagocytosis as 738 
and receptor down-regulation 830 
receptor-mediated 709, 727, 
732-735, 849, 1281 
endoderm 1147-1148, 1156, 1158, 
1167-1168 
endolysosomes 723, 724F, 730, 734F, 
736, 1299, 1300T 
endonucleases 
AP endonuclease 269 
mRNA destruction 427 


Page numbers with an F refer to a figure; page numbers with a T refer to a table. 
tRNA splicing 336F
endoplasmic reticulum (ER) 
among intracellular compartments 
642 
area compared with plasma 
membrane 643 
ER resident proteins 647, 682-683, 
685F, 711, 714 
ER retention signals 682 
ER retrieval signals 713 
ER tail-anchored proteins 682 
functions 669-691 
junction complexes with mitochondria 
691 
membrane asymmetry 681 
microtubules in organization 939 
mitochondrial contacts with 755, 
757F
mRNA localization and 421 
nuclear envelope connection to 179, 
180F 
protein retrotranslocation 358, 686 
protein transport to the Golgi 
apparatus 710-722 
rough and smooth ER 642, 670, 671F 
rough ER in B cells 1309 
source of lipid droplets 573 
source of microsomes 445, 671-673 
source of single pass membrane 
proteins 577 
see also sarcoplasmic reticulum 
endoreplication 1194 
endosome maturation 707, 730, 735-736 
endosomes 
among intracellular compartments 
642 
early and late 696F, 707, 730, 735, 
1280-1281, 1285 
Rab5 domain formation 707 
recycling endosomes 696F, 706T, 730, 
737-738, 739F 
Toll-like receptors 1299F 
tubule formation 705 
endosymbiont hypothesis 800 
endothelial cells, in blood vessels 
1235-1238, 1311-1312
endothelin-3 1186 
energetically favorable and unfavorable 
reactions 
carbon fixation as unfavorable 784 
condensation reactions as unfavorable 
66 
coupling 60-61, 63-65, 76-78, 102 
defined 57 
DNA supercoiling as favorable 314 
electron transfer as favorable 764 
membrane transport of glucose as 
favorable 603F 
energetics 
of active transport 600 
of biological reactions 102 
bond strengths 44F 
catalysis and energy use 51-73 
extraction of energy from food 73-88 
geochemical energy 12 
of glycolysis and oxidative 
phosphorylation 756 
hydrogen burning and oxidative 
phosphorylation 761 
membrane transport proteins 163 
mitochondrial protein import 661F 
oxidation of organic molecules 54-55
of protein folding 114-115, 549 
see also free energy 
energy conservation 52-54 
energy storage 
as fat 78-79 
as starch 80-81 
through coupled reactions 76-78 
Engrailed gene (Drosophila) 1160-1161, 
1162F 
engrailed protein (Drosophila) 120 
enhancers 312, 1175 
see also cis-regulatory sequences 
enkephalins 744 
enolase 105 
enterocytes see absorptive cells 
enteroendocrine cells 1218, 1219F, 1221, 
1223-1224 
entrainment 877-878 
entropy 52-53, 60, 103 
enveloped viruses 1274, 1275F, 1280, 
1286, 1287F, 1288 
viral envelopes 1275F, 1280-1281, 
1287F 
environmental cues 
cell migration 1185-1186 
metamorphosis of amphibians 1182 
plant flowering times 1182-1184 
environmental factors 
identical twin studies 412 
stem cell fates 1222 
enzyme amplification in microscopy 540 
enzyme cascades 848, 873, 881 
enzyme-coupled receptors 
as cell-surface receptor class 818F, 
819 
insulin receptor as 824F 
receptor serine/threonine kinases as 
865 
RTKs as 837F 
enzyme kinetics 141-144 
enzyme-substrate complexes 
formation by collision 59-60
lysozyme example 146F 
enzymes 
activation energy effects 57-58 
activation sequence for Src kinases 
156 
allostery 151-153 
cellular compartmentalization 
148-149 
choice of reaction pathways 58, 59F 
classes and nomenclature of 140T 
concentrations and metabolic rate 
148 
DNA repair function 266 
electrostatic attractions in 95 
multienzyme complexes 148 
“perfect enzymes” 143 
positive and negative regulation 
149-151, 152F 
as protein catalysts 6, 51 
as proteins 48 
specificity as catalysts 140-141 
speed of molecular motions and 
59-60 
zymogens as proenzymes 736 
eosinophils 1239, 1240F, 1241, 1245, 
1302F, 1317 
EPEC (enteropathogenic E. coli) 1278 
ephrin-Eph binding/EphrinB 1224 


ephrin proteins 850T, 858, 1188, 1206, 
1207F 
epidermal stem cells 1225-1226 
epidermolysis bullosa 947, 949, 1069 
epigenetic changes 
in cancer cells 1094, 1096-1097, 
1109-1111, 1125-1126 
nuclear reprogramming and 
1252-1253 
tumor suppressor gene inactivation
1109 
epigenetic inheritance 
and chromatin structure 194, 
204-206, 409-411
of gene expression 411-412 
mechanisms acting in cis and trans 
412, 413F 
epilepsy 627, 913F 
epinephrine see adrenaline 
epistasis analysis 490 
epithelia 
basal lamina and 1035, 1062, 
1068-1069 
cell-cell junctions 1044-1049 
cell-matrix attachments 1036F 
hemidesmosomes in 1076 
mesenchymal interactions 1190 
planar cell polarity 1189-1190 
protection by mucus 719-720, 749, 
1276-1277, 1298 
simple columnar epithelia 1036, 1047 
stem cells in 1217-1227 
epithelial barrier to infection 1265, 1276- 
1277, 1298 
epithelial cells 
adherens junctions 1044 
apical and basolateral domains 749, 
893, 1036, 1047 
aquaporins 612 
carcinomas as cancers of 1092 
keratin filaments in 946 
keratocytes as 953 
polarization 749, 1047 
renewal, in colon and rectum 1122 
transport of solutes 605 
epithelial tubes 1045 
epithelial-mesenchymal transitions 1042, 
1101 
epitope tagging 450-451 
Epulopiscium fishelsoni 13F 
equilibrium binding experiments 458, 
459F 
equilibrium centrifugation 672 
equilibrium constants 
actin polymerization 900, 902 
deriving from standard free energy 
changes ΔG° 62-63, 63T, 139F
protein binding strength and 138-140, 
458 
protein-promoter complexes 511 
equilibrium potential, Nernst equation 616 
equilibrium reactions 
ATP synthesis and hydrolysis 
775-776 
energetics of 61-63, 103 
enzymes and 58 
see also reversible reactions 
equilibrium sedimentation 447 
ER see endoplasmic reticulum 
eraser enzymes 201 
Erk (MAP kinases) 856-857, 861F


ERM protein family (ezrin, radixin, and 
moesin) 905, 913 
error correction 
in DNA synthesis 243-245, 250-251 
proofreading 339, 344-345 
by tRNA synthetases 338-339 
see also quality control 
error rates 
DNA replication and RNA synthesis 
244, 244T 
meiosis 1010 
and viral evolution 1291 
errors in data 525 
erythrocytes see red blood cells 
erythropoietin 864T, 1011, 1244-1245 
ES cells see stem cells, embryonic 
Escherichia coli (E. coli) 13F, 16F 
ABC transporters 609 
arabinose metabolism 521-523 
DNA replication 255, 256F 
enteropathogenic (EPEC) 1278 
F plasmid 469 
gene transcription 380-381 
genome 22, 23F, 1269 
as Gram-negative 1267F 
historical importance 22 
homologous recombination 279 
horizontal gene transfer in 19 
Lac operon 382-383 
mutation rates 237 
promoters 308F 
rRNA genes 327 
strand-directed mismatch repair 250 
universal gene families and 21T 
ESCRT complexes (Endosome Sorting 
Complex Required for Transport) 
intraluminal vesicle formation 
736-737 
ubiquitin-binding proteins 735 
ESR (electron spin resonance) 
spectroscopy 570
essential amino acids 86, 87F 
estradiol 876F
ethidium bromide 466, 1024 
ethylene, plant growth regulator 881-882, 
1087 
eubacteria 15F 
euchromatin and heterochromatin 194 
eukaryotes 
ABC transporters in 163F, 609F, 610 
cell cycle 185, 258, 964-966 
cell features 24-25 
cell lines 442-444 
chromatin structure and chromosome 
function 206 
distinction from prokaryotes 12-15 
DNA localization and packaging 
178-182 
DNA repair in 271, 281 
DNA replication in 253-254 
genomes 23-39 
Homo sapiens as 16F
last common ancestor 880 
mRNA compared to prokaryotic 316F 
mutation rates 234 
numbers of ribosomes 340 
organelles common to all 641-643 
plasma membrane composition 568, 
572, 575-576 
probable origins 24-26 
protein kinase numbers 154-155
regulation of protein synthesis 361F, 
405 
RNA interference in 429F 
RNA polymerases in 390 
single-celled 24, 25F, 30 
transcription control 384-392 
translation initiation 422-425 
eukaryotic cell sizes 644 
eukaryotic cytoskeleton 896-897 
eukaryotic initiation factors (eIFs) 347,
348F 
eukaryotic parasites 1277, 1282-1284, 
1290 
eukaryotic pathogens 1266, 1271-1273, 
1282-1283, 1286F, 1290 
Eve (even-skipped) gene 392-394, 395F 
evolution 
cancer as a microevolutionary process 
1092, 1119 
of chemiosmotic processes 794-796 
convergent evolution 665 
critical stages in human evolution 226 
of energy-conversion 753 
of eukaryotic cell and its membranes 
643-645 
of genomes 216-234, 804 
infection as a driver 1331 
of ion channels 626-627, 630 
molecular clock 220 
nonsense-mediated decay in 352 
of NPCs and vesicle coats 650 
of photosynthetic reaction centers 
793, 794F, 796 
in plants compared with animals 
880-881 
of protein kinases 154-155 
of protein synthesis 365-366 
and RNA splicing patterns 323 
sources of genetic variation 16-17, 
217-218, 221, 227-232 
of vertebrates 227 
of viruses 1291, 1292F 
see also natural selection 
evolutionary tracing 136-137 
exchangers see antiporters 
excitatory neurotransmitters 629-631 
excitatory postsynaptic potential 633 
executioner caspases 1022-1025, 1026F 
exocrine cells 
aquaporins 612 
pancreatic 671F 
exocytosis 
defined 695 
in endocytic-exocytic cycle 731 
of lysosomes and multivesicular 
bodies 729 
of residual bodies 739 
synaptic vesicles 744-746 
exon definition 322, 323F 
exon skipping 321-323, 324F 
exons (expressed sequences) 
evolution rate 220F 
in human genome 183F, 184, 318F
recombination 230 
size range 322 
exonucleases 
exonucleolytic proofreading 243-244, 
250 
mRNA destruction 426 
exosomes 326, 729
explants 441 


expression vectors 483 
see also plasmids 
extracellular bacterial pathogens 
1269-1271 
extracellular cadherin (EC) domains 1039, 
1040F 
extracellular matrix 
in animals 1057-1074 
basal lamina as 1068-1069 
cell-matrix interactions 1064 
cell-matrix junctions 1035-1038, 
1044F, 1074-1080, 1081F 
degradation 1072-1073 
fibroblast differentiation and 1229 
fibrous proteins 124 
general features 1035-1038 
isolating and culturing cells 440, 441 
migrating cells and 955 
modification by cancers 1101 
neuronal growth and 1202 
plant cell walls as 1053 
three classes of macromolecules 
1057-1058 
variety of forms 1057-1058 
extracellular signal molecules 
in intercellular communication 813 
responses and concentration 828 
transcription regulation response 
867-880 
extracellular signals 
range 814-815 
response to concentration changes 
830-831 
responses as programmed 816-817 
responses speeds 826F 
transcription regulator activation by 
395, 399-400 
extremophiles 10 
extrinsic activation pathway, apoptosis 
1023-1025, 1028-1029 
Eyeless transcription regulator (Pax6) 
397-398F, 1146F, 1171 
eyes 
cornea 1057, 1063-1064, 1069F 
lens 1038, 1065-1066 
segregation of inputs 1212 
see also retina 


F 


F-box proteins 159-160, 167, 168F, 971, 
972F 
F-type ATPases see ATP synthases 
FACS (fluorescence-activated cell sorters) 
440, 441F, 967F 
Factors V and VIII, lectin binding 712 
Factor VIII gene 301F, 318F 
FADD (Fas-associated death domain) 
1025F 
FADH2 (reduced flavin adenine
dinucleotide) 
as an activated carrier 69 
in the citric acid cycle 83 
in the respiratory chain 768F, 772, 
774, 775T 
FAK (focal adhesion kinase) 863, 1037T, 
1079-1080 
familial hypertrophic cardiomyopathy 923 
FAP (familial adenomatous polyposis coli)
1123-1124 
Fas death receptors 1024, 1025F, 1031 
Fas ligand 1024, 1025F, 1029, 1031, 1334 
Fat cadherin 1039F
fate maps 1159F, 1167 
fats 
energy storage as 78-79, 81, 82F 
energy storage in plants 785-786 
oxidation to acetyl CoA 81-82, 83F 
fatty acids 
breakdown in peroxisomes 667 
as a mitochondrial fuel 758, 759F 
and other lipids 98-99 
in phospholipid tails 566 
product yields from oxidation 775T 
unsaturated fatty acids 98 
favorable and unfavorable reactions see 
energetically favorable 
Fbx15 gene 1255F 
Fc receptor, and phagocytosis 738F, 739, 
1317 


FDG (fluorodeoxyglucose) 1092F 
feed-forward loops/motifs 401-402, 403F 
incoherent and coherent 522-523 
feedback loops 
and cell memory 401-402 
mitogen-activated transcription 
1012-1013 
need for quantitative analysis 38 
in signaling systems 825, 828, 829F, 
830-831, 856-857 
feedback regulation 
of cell size 1193 
cholesterol 655, 656F 
enzyme inhibition 150-151 
generating Ca2+ waves 838-840
in photosynthesis 784 
in stem-cell division 1121 
see also negative; positive feedback 
FepA protein 581 
fermentations 75-76, 780 
ferredoxin 790F, 792-793 
ferritin 427 
fertilization 
Ca2+ changes 547F, 839F
centrosome 987 
membrane fusion in 709 
Xenopus egg rotation 1156, 1167 
fetal hemoglobin 229 
FG-repeats (phenylalanine glycine) 652, 
653F, 654 
FGF (fibroblast growth factor) 
action via receptor tyrosine kinases 
850T 
in branching morphogenesis 
1190-1191 
interaction with heparan sulfate 1073 
production of embryoid bodies 1257F 
FGF receptor and achondroplasia 1196 
fibril-associated collagens 1062-1064 
fibrillar collagens 1058F, 1062-1064 
fibrillin 1065-1066 
fibrinogen 1076T, 1077
fibroblasts 
clathrin coated vesicles 699F 
cytoskeleton in cell division 890, 892F 
endocytosis 731, 732F 
extracellular matrix organization 1057, 
1062F, 1064, 1228 
Golgi apparatus 715F, 720 
inducing cell-cell adhesion 1041 
intermediate filaments 946 
irradiation damage to DNA 282F 
mitosis 966F 
motion 951, 957F
nucleus of human 180F, 329-330F 
in rat cornea 1057F 
replicative cell senescence 265 
reprogramming 396-397, 398F, 
1254-1256, 1258 
and their transformations 1228-1232 
tumor-associated 1101 
viewed in different microscope types 
535F 
fibronectin 
in basal lamina 1070 
fibronectin type 3 domains 122F, 1073 
integrin binding 1067-1068 
matrix organization 1066-1067 
RGD sequence in 1067, 1075, 1078 
secretion and assembly 1064, 1068 
size and shape 1058F 
syndecan interaction 1061 
fibronectin-coated substrata 1079-1080F 
fibronectin fibrils 1068 
fibronectin receptors 1076T 
fibronectin repeats 1066F, 1067-1068, 
1073 
fibrous proteins 124-125 
field emission guns 554, 559, 561F 
filaggrin 946-947 
filamin 905, 911F, 912, 913F, 957, 958F 
filopodia 
actin filaments in 890, 951 
axon growth cones 952, 1201, 1202F 
Cdc42 and 956, 957F 
endothelial tip cells 1236 
of fibroblasts 892F, 911F 
profilin and 906 
fimbrin 905, 911-912 
first law of thermodynamics 53, 102-103 
fish 
medaka fish 547F 
melanocytes in 939, 940F 
sticklebacks 1174-1175 
see also zebrafish 
FISH (fluorescence in situ hybridization) 
472F 
5’ end, DNA 175, 177F 
fixatives 
electron microscopy 555-556, 559 
light microscopy 535-536 
fixed anions 612, 615 
FK506 655F 
flagella
bacterial 942, 1267, 1271 
built from microtubules 941-942 
flagellar axoneme 756F 
flagellar motors 778, 781 
flagellins 294, 1290 
Flamingo cadherin 1039F, 1190 
flavin adenine dinucleotide see FADH2
Flc (flowering locus C) gene 1183-1184
fleas 1276F 
flip-flops between lipid layers 570, 588, 
689-690 
FLIP protein 1025 
flippases (phospholipid translocators) 
570, 574, 690 
flow cytometry 524, 966, 967F 
flower spike, wheat 559F 
flowering times, plants 1182-1184 
fluorescein 537, 544, 545F 
fluorescence-activated cell sorters (FACS) 
440, 441F, 967F, 1242
fluorescence anisotropy, protein 
interactions 458, 459F 
fluorescence microscopy 503, 524, 547 
asymmetric cell division 1002F 
confocal microscopy 541 
contractile ring 997F 
endoplasmic reticulum 669F, 757F 
kinetochores 988F 
Listeria-infected cell 914F 
localization of specific molecules 
536-537 
migrating fibroblast 1068F 
mitochondrial reticulum 803F 
mitotic spindles and chromosomes 
980-981, 986F, 991F, 994F 
myelinated axons 625F 
neurons 634F 
optic cup derived from ES cells 1258F 
pre-cellularization Drosophila embryo 
1003F 
superresolution techniques 549-551 
time-lapse 757F, 803F, 875F, 935F,
991F, 1178 
TIRF 547-548 
use of multiple probes 538F 
vesicle budding events 705 
zebrafish neural map 1204F 
fluorescence recovery after 
photobleaching (FRAP) 545-546, 
588-589 
fluorescence resonance energy transfer 
(FRET) 459, 543-544, 545F, 
855F 
fluorescent antibodies 
against axon terminals 634F 
against cofilin 954F 
DNA probes and 472F 
Golgi apparatus 715F 
against microtubules 756F 
against mitochondrial membranes 
800F 
against myosin 1234F 
against phosphotyrosine and actin 
1080F 
in structured illumination microscopy 
590F 
visualizing cell components 538-539F 
fluorescent dyes 
antibody blotting techniques 454 
DNA labeling 466, 467F, 478, 503-504 
examples 537 
excitation and emission wavelengths 
537F 
Illumina® sequencing 480
labeling cancer cells 1120 
photoactivation 544 
protein arrays 458 
quantitative RT-PCR 502-503 
in single-particle tracking 589 
fluorescent indicators 546-547 
fluorescent labeling and microscope 
resolution 532 
fluorescent molecules, imaging using 
PALM 553F 
fluorescent proteins 
gene fusion and 495, 501F, 502, 504F 
in microscopy 537F, 542-546 
see also GFP 
fluorescent reporters 1178 
fluorochromes, inorganic 538 
FNR (ferredoxin-NADP+ reductase) 790F,
792-793 


foam cells 1266 
focal adhesions 
actin network and 955, 956F 
fibronectin and 1068F, 1079 
integrins and 863, 952F, 957, 959, 
1075, 1079 
stress fibers and 891, 924, 957 
vimentins and 959 
folding see histone; immunoglobulin; 
protein structure; RNA 
follicle cells 1158, 1165 
food 
authenticating, using PCR 475 
extraction of energy from 73-88 
storage as fats, starch and glycogen 
78-81 
forensic science 233, 473-477 
formins 905, 906-907, 909F, 911, 957, 
996, 997 
N-formylmethionine 347, 800, 958, 1031 
Förster resonance energy transfer see
FRET 
fossil record 
phylogenetic trees and 219 
sequence information 223 
founder effects 231-232 
4E-BP protein 1017F 
FoxP3 transcription regulator 1336 
fractal globules 212 
FRAP (fluorescence recovery after 
photobleaching) 545-546, 
588-589 
free energy 
of accurate translation 345-346 
definition 103 
and living things 8, 102-103 
protein folding and 114-115, 354 
sources 11 
see also activation energies 
free energy changes ΔG
actin filament elongation 901-902 
ATP formation and hydrolysis 65, 71, 
775 
concentration of reactants 61, 
775-776 
electron transport chain 763-764 
favorable and unfavorable reactions 
57, 60-61 
microtubule polymerization 927 
Nernst equation derivation 616, 762 
oxidative phosphorylation 761 
redox potential difference and 
763-765, 767 
RNA polymerase binding 382 
transcriptional synergy 388-389 
see also standard free energy changes 
free radicals 1115 
freeze-etch electron microscopy 937F, 
942F, 948F, 1085F 
freeze-fracture electron microscopy 1047, 
1049F, 1051F 
freeze substitution 1061F 
frequency see oscillations 
FRET (fluorescence/Förster resonance
energy transfer) 459, 543-544, 
545F, 855F 
Fringe family transferases 868 
Frizzled receptors 868-871 
frog eggs 996F, 1156F 
frog embryos 
blastula stage components 1147F
epigenetic inheritance and 205-206 
evidence for differentiation without 
gene loss 369-370 
reprogramming by donor nuclei 
205-206 
frogs 
myosin thick filaments 915, 919 
neuromuscular junctions 630-631 
Rana pipiens 36F 
tadpole skin collagen 1063F 
see also Xenopus 
frontotemporal dementia 324 
fructose 
1,6-bisphosphate in glycolysis 75F,
104 
6-phosphate in glycolysis 85F, 104 
structure 96 
fruit fly see Drosophila 
FT (flowering locus T) gene 1183 
FtsZ proteins 806, 896-897 
Fugu rubripes (puffer fish) 29, 223
fumarase as a “perfect enzyme” 143 
fumarate 107 
functional reconstitution, membrane 
proteins 585, 586F 
fungi 
bacterial inhibitors from 351 
as eukaryotes 26-27 
genetic code variation 349 
Neurospora and defining the gene 
416-417 
pathogenic 1271-1272, 1299F 
see also yeasts 
fura-2 indicator 547, 839 
FUS protein 133 
Fused protein kinase 871-872 
fusion proteins (made by fusion) 451 
Cas9 with activators and repressors 
497 
in FRET 545 
with GFP 501F, 502, 504F, 543, 589, 
715F, 802-803, 884F 
with red fluorescent protein 875F 
fusion proteins (promoting membrane 
fusion) 571, 708 


G 


G0 phase
initiation factors and 424
as a resting state 965, 1012
G1-Cdk 972, 1012-1014
G1 cyclins 969
G1 phase
Cdk inactivity in 1002-1004
cell cycle position 964-965
chromosomes 486
G2 phase, cell cycle 964
G2/M transition 968-970, 973, 978, 1014 
G418 antibiotic 1255F 
G-protein coupled receptors (GPCRs) 
adrenaline effects 827, 832, 835T 
bacteriorhodopsin and 588 
CXCR4 as 1185 
desensitization by phosphorylation 
848-849 
effects 818, 832 
signaling overlaps with RTKs 861-862 
in smell and vision 843-846 
G proteins (trimeric GTP-binding proteins) 
588, 820, 832-834
four major families 846T
G12 form 843, 846T, 959F
G13 form 846T, 959F
Gi form 833F, 834, 836, 843, 846T,
848, 862, 959F, 1278
Go form 846T
Golf form 844, 846T
GPCR activation of 833F
Gq form 836, 838F, 846T, 847F, 859
Gs form 833F, 834, 836, 843, 846T,
848, 862
Gt (transducin) 845, 846T, 848
regulation of cyclic AMP 833-834
regulation of ion channels 843 
signaling via phospholipids 836-838 
subunits 832-834 
G1/S-Cdk 971, 984, 1004, 1012-1014
G1/S cyclins 969, 1013
γ-TuRC (γ-tubulin ring complex) 929,
930-931F, 933
GABA (γ-aminobutyric acid)
as an inhibitory neurotransmitter 629 
receptors as drug targets 632 
Gag gene, HIV 420F 
GAGs (glycosaminoglycans) 
as extracellular matrix macromolecules
1057-1058 
glycosaminoglycan chains 711, 719, 


pectins as resembling 1084 
in proteoglycans 1059-1060, 1061F, 
1070-1071 
gain-of-function mutations 
in cancer 1104, 1105F 
defined 487 
Lin14 gene 1181F
as typically dominant 489 
Ubx genes 1162F 
galactocerebroside 575F 
galactose 96 
β-galactosidase (β-gal) 282F, 394F, 501F
galactosyltransferase 546F 
galacturonic acid 1083-1084 
GalS repressor protein 522 
gametes 
abnormal 286, 1010 
defined 1004 
genetic recombination with formation 
231 
mitochondria in 807 
in plants 1183 
see also eggs; meiosis; spermatozoa 
ganglion mother cells 1174, 1179 
gangliosides 575 
gap genes 1159-1160, 1162 
gap junctions 
distinguished from channels 611 
in epithelial cells 1036F, 1037 
structure and function 1050-1053 
gap phases, cell cycle 964 
GAPs (GTPase-activating proteins) 157, 
158F 
coat recruitment 703 
control of GTP-binding proteins 820, 
821F 
Ran-GAP 653-654, 656 
Ras-GAPs 855 
Rho-GAPs 858 
gastrula-stage embryos 36F, 205, 541, 
1045, 1147-1148 
gastrulation
cell adhesion and migration 1185, 
1187-1188 
pluripotency loss 1148 
preservation of gene expression 
patterns 1161, 1187 
variability 1155 
zebrafish 1189F 
Gata4 transcription factor 1259 
GATC residues, methylation 250, 251F, 
257F
gated transport 646 
gating 
gap junctions 1052 
ion channels 604, 614, 618-620, 636F 
lateral gating of protein translocators 
678-679 
Gcn4 protein 425 
GDI (Rab-GDP dissociation inhibitor) 705, 
706F, 707 
GDIs (guanine nucleotide dissociation
inhibitors) 858
GDP (guanosine diphosphate) eIF2
binding 424 
GEFs (guanine nucleotide exchange 
factors) 157, 158F 
chromosome stabilization of 
microtubules 986 
coat recruitment 703 
control of GTP-binding proteins 820, 
821F 
Rab-GEFs 706 
Rac-GEFs 960 
Ran-GEF 653, 656 
Ras-GEFs 855, 860, 862F 
Rho-GEFs 858, 958, 997 
gel electrophoresis 
in DNA cloning 469 
DNA fragmentation in apoptosis 
1024F 
of DNA molecules 465-466 
pulsed-field gel electrophoresis 466 
SDS-PAGE 452, 453F, 454, 584 
two-dimensional 452-454 
gel-filtration chromatography 448-449, 
450F, 455 
gel-forming polysaccharides 1058 
gel-forming proteins 911, 957 
gelsolin 905, 909-910 
geminin 975 
gene amplification and cancer 1107, 1111 
gene control regions 384-385, 392-394 
gene conversion 
homologous recombination and 282, 
285-286 
tumor suppressor genes and 1109 
gene duplication 
diversification in signaling 1150 
in Drosophila melanogaster 34 
evolution of globins 229-230 
frequency in vertebrates 34-35 
as mechanism for innovation 16-18, 
227-228 
transporters and ion channels 
603-604, 622 
whole-genome duplications 35, 228 
gene expression 
chromatin loops and 211-212 
chromatin position and 212, 213F 
chromosome-wide changes 409-411 
circadian clocks and 876-878 
combinatorial controls 394-398, 400F,
520-521 
conservative site-specific 
recombination and 294 
control of 369-373, 405 
efficiency of 301F 
enzyme regulation through 150 
epigenetic inheritance of 411-412 
and gene function 485-509 
heterochromatin limitation of 194 
localization with in situ hybridization 
502, 536 
localizing using reporter genes 501 
monitoring in Saccharomyces 
cerevisiae 32 
monoallelic 411 
oscillations 1177 
patterns in development 1149F 
post-transcriptional controls 413-428 
quantitative measurements 502-503 
Ras-MAP kinases signaling and 856 
Ras proteins and 854 
regulation at six points 372, 373F 
regulation by ncRNAs 429-436 
representing a low response 825, 
826F 
response to external signals 372, 522, 
867-874 
riboswitches and 414-415 
serial organization 1163-1164 
time required for 1176 
transcription and translation in 7, 306 
variability between cells 523, 524F 
and vertebrate evolution 227 
visualizing 536 
gene families 
arising from gene duplication 17-18, 
228 
common to all biological domains 20 
evolution of globins 229-230 
gene function 
classical genetics studies 485-488 
cluster analysis and 504 
deducing from DNA sequences 20, 
216-217 
deducing from mutations 21-22, 496, 
498-499 
testing with RNA interference
499-501
gene fusion 495 
gene inactivation by RNAi 433, 499 
gene knockouts 494-496, 1117 
gene products, epistasis analysis and 490 
gene regulation see transcription 
regulators 
gene segments, B cells 1319-1320, 
1321F, 1325, 1332 
gene silencing 
Polycomb proteins and 206 
position effects 194 
gene transcription 
Ca²⁺ spikes and 840
cyclic AMP effects 836F 
JAK-STAT signaling pathway and 863 
response to phytochromes 884 
gene transfer from organelles 801-802 
see also horizontal gene transfer 
general recombination see homologous 
recombination 
general transcription factors 309-312, 
384-387, 388F, 390F 


post-transcriptional control 405, 425 
TFII numbering 310
TFIID and poliovirus 1288 
general translation factors 425 
genes 
coding for multiple proteins 318 
definition of 182, 416-417 
distribution in the nucleus 211-212 
essential genes of unknown function 
499 
horizontal and vertical transfer 16 
identifying new genes by ribosome 
profiling 506, 507F 
isolation and over-expression 464 
mechanisms for innovation in 16-17 
mitochondrial, in different species 
801F 
multiple proteins from 415 
nature of hereditary information 
173-174 
number and density in different 
species 182F, 415-416 
numbers coding proteins 184T, 185
numbers in bacteria and archaea 16 
proteins encoded by 7 
rapidly evolving and conserved 15-16 
specific deletion 294-295 
specific to animal development 1149 
see also cancer-critical genes 
genetic analysis, of tumor cells 1102 
genetic code 
amino acid equivalents of codons 334 
history 7, 178
possible origins 365 
universality 2-3 
variants 349-351, 805 
genetic engineering 
conservative site-specific 
recombination 294-295 
conservative site-specific 
recombination in 294-295 
embryonic stem cells and 1253 
epitope tagging 450-451 
of fluorescent proteins 545 
transgenic organisms 495-497 
transgenic plants 507-508 
using the CRISPR system 497-498 
genetic innovation 16-17, 217-218, 221, 
227-232 
genetic instability, cancer cells 1097, 
1103, 1111-1112, 1116, 1125, 
1133-1134, 1139 
genetic marker proteins 1221 
genetic recombination 175, 231, 318, 
486, 1005-1006 
see also homologous recombination 
genetic screening 
for mutant phenotypes 488-490 
RNA interference and 500 
timing of embryonic development 
1180 
genome annotation 477-483 
genome editing/genome engineering 494 
genome maintenance genes 1104, 1110F 
genome map distance 486 
genome sequencing 
bacteria 1268-1269 
cancer cells 1095, 1109-1111, 1119, 
1137 
and conserved regions 217, 224-225 
cost of 481 


and evolutionary tracing 136-137, 
292 
exome sequencing 1109 
progress in 439, 477 
resequencing 479 
speed of 464, 477 
tumor biopsies 1141 
viruses 1273 
genome sizes 
in amoebae 182 
in bacteria 1267 
chromosome numbers and 183 
compared 28-29, 33, 182, 223-224
in Danio rerio 35 
Drosophila melanogaster 34 
in E. coli 22, 23F 
in Fugu rubripes 223
in humans 178, 179 
in mammals 221-222 
minimum 9 
in vertebrates 222-223 
genome-wide association studies 
493-494 
genome-wide screening, by RNAi 500F 
genomes 
aggregated, human microbiome 1264 
ancient 223-224 
of chloroplasts 800-809 
diversity of 10-23 
eukaryotes 23-39 
evolution 216-234 
genes important for multicellularity 
1149 
hybrid 27-28 
mitochondrial 27, 800-809 
proportion encoding ABC transporters 
609 
see also chloroplast genome; human 
genome; mitochondrial DNA 
genomic analysis, and the tree of life 14, 
218-219 
genomic imprinting 407-409 
genomic libraries 469, 471, 479 
genomic plasticity, viruses 1291 
genotypes defined 485-486 
germ-band extension 1045 
germ cells 
embryonic development 1158 
migration in zebrafish 1185F 
germ layers 1148, 1167-1168, 1187 
see also ectoderm; endoderm; 
mesoderm 
germ line 
introducing altered genes 495-497 
mutation rates 238 
RNAi protection 433 
germinal centers 1313F, 1322-1323
Get3 ATPase 682F 
Get1-Get2 receptor complex 682F 
GFP (green fluorescent protein) 459F, 
501F, 502, 504F, 543, 546F 
fusion proteins 501F, 502, 504F, 543, 
589, 715F, 802-803 
structure 543F 
GGGTTA repeats 262 
gibberellins 881 
Gibbs free energy, G see free energy 
gigantism, pituitary 1196 
GK domains 1050 
Glanzmann’s disease 1076T, 1077 
GlcNAc see acetylglucosamine


Gleevec® 1135, 1136F 
Gli1, Gli2 and Gli3 regulator proteins 873 
glial cells 625, 1173-1174, 1179, 1186, 
1198-1202, 1210F 
from neural stem cells 1250 
radial glial cells 1200, 1201F 
glioblastoma 
EGF receptor mutation in 1107, 1117 
Rb pathway disruption 1113 
β-globin
DNA sequence of human gene 179F, 
223F, 318F 
DNA sequence of mouse gene 223F 
HS4 barrier sequences and 202 
β-thalassemia and 324F
globins, evolution 229-230 
globular proteins, filaments from 123-124 
glucagon 132F, 835T 
glucocorticoid receptor 400 
glucocorticoids, liver response 372 
glucose 
abnormal metabolism in cancer cells 
1098 
ATP yield from oxidation 775 
structure 96 
transcellular transport 605F 
see also glycolysis 
glucose 6-phosphate, in glycolysis 79, 
80F, 85F, 87F, 104 
glucose residues in cellulose 1083 
glucose transport, and Na⁺ gradients
603F 
glucose trimming 683F, 684-685 
glucosidases 685 
glucosyl transferases 685 
glucuronic acid 1058, 1060F 
glutamate-gated ion channels 631 
glutamate neurotransmitter 629, 636 
glutamic acid structure 113 
glutamine 
structure 113 
synthesis 66, 70 
glutamine synthetase 66 
glutaraldehyde fixative 535, 555 
glutathione S-transferase (GST) 451 
glycan cross-linking 1083-1084 
glyceraldehyde 96 
glyceraldehyde 3-phosphate 75F, 76, 77F, 
78, 104-105, 760 
in photosynthesis 784-786, 792 
glyceraldehyde 3-phosphate 
dehydrogenase 105 
glycine 
as an inhibitory neurotransmitter 629 
in collagens 1062 
in elastin 1065 
structure 113 
glycobiology 718 
glycocalyx 582 
glycogen 
breakdown 827-828, 835T, 837T 
polysaccharides 71F, 79-81, 87, 97 
glycogen synthase kinase 3 (GSK3) 870 
glycolipids 
as membrane constituents 568, 
575-576 
structure 99 
glycolysis 
ATP production by 74-78, 781 
in cancer cells 1098-1099, 1103, 
1115 


in plants 786 
stages of 104-105 
superiority of oxidative 
phosphorylation 756 
glycophorin 579F, 592F 
glycophosphatidylinositol (GPI) 573, 577F, 
582 


GPI-anchors 573F, 577, 582, 688-689,
749, 1039, 1060
glycoproteins, noncollagen 
distinguishing proteoglycans 1059 
as extracellular matrix macromolecules
1057, 1073-1074 
laminin 1058F, 1069-1073, 
1075-1077 
matrix organization by 1066-1067 
nidogen 1058F, 1070, 1071F 
variant-specific glycoprotein (VSG) 
1290 
see also fibronectin 
glycosaminoglycans see GAGs 
glycosidases 716, 718, 722 
glycosphingolipids 690, 749 
glycosyl transferases 716, 718-720, 868, 
1059 
glycosylation 
membrane proteins 582-583, 
683-684, 723 
of Notch protein 868 
in protein folding 685 
purpose of 719-720 
glyoxylate cycle 667 
glyoxysomes 667 
glypicans 1073 
Gm1 ganglioside 575F, 576 
GM (granulocyte/macrophage) progenitor 
cells 1245 
GMCSF (granulocyte-macrophage-
colony-stimulating factor) 864T
goblet cells, intestinal 1218, 1219F, 1221, 
1223-1225 
gold, colloidal 539, 652F 
gold particles 
anti-plectin antibodies 949T 
gold nanoparticles 589 
nuclear pore complex investigations 
651 
“golden rice” 508 
Golgi apparatus 
among intracellular compartments 
642 
cis- and trans-faces 715F, 716 
localization 715F 
matrix proteins 721-722 
microscopy and 546F, 556F, 558, 
715F 
microtubules in organization 939 
models of transport 720-721 
oligosaccharide processing 716-718 
protein transport from the ER to 
710-722 
proteoglycan assembly 718-719, 
1059 
transport to lysosomes 722-730 
see also TGN 
golgins 721-722 
gonorrhea 19, 288, 1303 
gout 1301 
GPCR kinases (GRKs) 848, 849F 
GPCRs see G-protein coupled receptors 
GPI-anchors (glycophosphatidylinositol)
573F, 577, 582, 688-689, 749, 
1039, 1060 
Gram-positive and Gram-negative 
bacteria 610F, 1267 
grana, thylakoid membrane 783, 789 
granulocytes 864T, 1239, 1241T, 1245 
granzymes 1334 
graphene 554F 
gravity and plant growth 881F, 883, 884F 
Grb2 adaptor protein 824F, 855, 862F 
great apes 218, 219F, 220 
green algae 588, 623 
green fluorescent protein (GFP) 459F, 
501F, 502, 504F 
Grim protein 1029 
GRKs (GPCR kinases) 848, 849F 
Groucho co-repressor 870F, 871 
growth 
of animals and organs 1193-1198 
and degrowth 1248 
growth cones, axons 858, 943, 951, 
1201-1204, 1206, 1208-1211 
endothelial tip cells resembling 1237 
growth factors 
control of cell growth 1011, 1017, 
1114F 
definition 1011 
mTOR and 861 
PI 3-kinase and 1017 
growth hormone 
gigantism and dwarfism 1196 
JAK-STAT signaling pathway and 
864T 
see also plant growth regulators 
growth inhibition by TGFβ 1012
GSK3 (glycogen synthase kinase 3) 
870-871, 872F, 881 
GST (glutathione S-transferase) 451 
GTP-binding proteins see GTPases 
GTP cap, microtubules 927-929 
GTP (guanosine triphosphate) 
in the citric acid cycle 83, 84F, 
106-107 
structure 85F 
GTP hydrolysis 
microtubules 928 
in nuclear import 654 
in protein synthesis 343-344, 
345-346 
in Rab5 domains 707 
tubulin catalysis of 894 
GTPases 
as cell regulators 156-157 
EF-Tu 160-161, 163 
G protein α-subunit as 820, 832
monomeric 703, 820, 854 
Rab and Ras families 278, 854 
Rho family as 997 
septins as 949 
guanine 
deamination to xanthine 272F, 273 
DNA base pairing with cytosine 176, 
177F 
O⁶-methyl- 271
structure 100 
synthesis and riboswitches 414F 
guanosine 
7-methyl- 316-317F 
N,N-dimethyl- 337F 
guanyl transferases 316 
guanylyl cyclase 844, 846-847
guide RNAs 328, 329F, 429, 435, 497-498 
gut lining see small intestine 


H 


H⁺ see proton
HaeIII nuclease 465, 468F
Haemanthus 992F 
hair cells, auditory 560F, 619, 890, 924, 
1171, 1189 
as non-renewable 1227 
hairpin helices, DNA 246, 247F 
hairpin helices, RNA 307, 326 
Hairy protein (Drosophila) 422F 
half-lives 
actin filaments 896-897, 906, 919 
connexins 1052, 1053F 
extracellular, of nitric oxide (NO) 847 
and time to steady state 826, 827F 
Halobacterium salinarum 587, 588F 
handedness of helices 124-125 
haploid-diploid cycle 486F
haplotype blocks 492-494 
HARs (human accelerated regions) 226 
HATs (histone acetyl transferases) 196 
HDACs (histone deacetylase complexes) 
196, 201, 390, 1257 
HDL (high-density lipoproteins) in 
nanodiscs 586 
head and tail polymerization 72-73, 340 
heart attacks 
apoptosis and 1031-1032 
muscle replacement by scar tissue 
1247 
muscle replacement by 
transdifferentiation 1258 
heart muscle 
actin and myosin isoforms 923 
actin and myosin mutations 948 
effects of acetylcholine 843 
effects of cyclic AMP 835 
localization of mitochondria 755, 756F 
myosin II in 916-918
transdifferentiation of fibroblasts 1258 
heart muscle cells 1233F, 1247, 
1258-1259 
heart tissue morphogenesis 1059 
heat 
generation by brown fat 780 
release by biological reactions 61, 65, 
76, 102 
heat shock proteins 
hsp60 355-357, 662 
hsp70 355-357, 659-662, 683, 702 
as molecular chaperones 355 
heavy chains 
dynein 938F, 942 
immunoglobulin 118F, 1316-1318, 
1320-1322 
kinesin 936
myosin (MHC) 915, 923, 958F 
Hedgehog genes 1160-1161F 
Hedgehog pathway 1150, 1154, 1160 
Hedgehog proteins 871-873, 1160 
see also Sonic hedgehog 
helicase loaders 255, 256F 
helices 
closed ring alternative 124 
collagen triple helix 124-125, 1061, 
1070 


cytoskeletal filaments 893-894 
DNA double helix 3F 
handedness 124-125 
microtubules 926 
reasons for abundance 124 
S4 helix 622 
see also α-helices; double helix
Helicobacter pylori 1129F, 1132, 1263, 
1265, 1277 
helix-loop-helix motifs 377, 1171 
helix-turn-helix motifs 376 
helper T cells (TH cells)
B cell activation 1316-1317, 1320, 
1322-1323, 1335 
CD4 expression on 1331-1332, 1335 
co-stimulatory proteins/signals 1314 
cytotoxic T cells and 1333 
effector helper T cells 1326, 1335, 
1338F 
follicular helper T cells 1336 
HIV invasion of 1332 
macrophage activation 1335 
possible differentiation 1335-1336 
presentation of peptide-MHC
complexes 1130, 1327F, 1338F 
T cell classes 1325 
hematopoiesis 1241, 1243F 
hematopoietic cell survival 1246-1247 
hematopoietic progenitor cells 1243, 
1245-1246 
hematopoietic stem cells 1239, 
1242-1243, 1308 
hematoxylin 535, 536F 
heme group 
biosynthesis 760 
in cytochrome c 766 
and iron, in hemoglobin 147F, 148 
hemichannels (connexons) 1051-1052 
hemidesmosomes 893F, 946, 947F, 1036, 
1076 
hemoglobin 
expression in blood cells 371 
fetal 229 
heme groups in 147F, 148 
interspecies comparisons 37F 
as a multisubunit protein 123 
see also globins 
hemophilia 291, 300F 
heparan sulfate 871, 1058, 1060F, 1073,
1151, 1279
hepatitis-B and -C viruses 1129-1130, 
1132 


hepatitis delta virus 5F 
hepatocytes, I-cell disease 728-729 
Her2 protein 1137 
Herceptin® 1137 
hereditary cancers 282-282, 1107-1108 
hereditary nonpolyposis colorectal cancer 
(HNPCC) 250, 1124-1125 
heredity 
as characteristic of life 2
DNA and the mechanism of 174, 
177-178 
epigenetic inheritance 194 
hermaphroditism, C. elegans 33, 1180F, 
1194 
herpes viruses/herpes simplex virus 
1273T, 1279, 1286, 1288 
Hes genes 1178 
heterochromatin 
around centromeres 432-433 


in cancer cells 1097, 1109 
and euchromatin 194 
and gene silencing 206-207 
localization 211-212 
pericentric heterochromatin 204F 
RNA interference and 432-433 
in X-inactivation 410 
see also HP1 
heteroduplex DNA 278, 280F, 284-286 
heterokaryons 444F, 588 
heterophilic binding 1039F, 1055, 1056F 
heterotypic membrane fusion 712 
heterozygosity, loss of 281 
hexokinase specificity 141 
HGF (hepatocyte growth factor) 1073 
HHV-8 human herpesvirus 1132 
Hid protein 1029-1030 
HIFs (hypoxia-inducible factors) 1191, 
1237 
high-mannose oligosaccharides 717, 
718F 
Hill coefficients 517-518 
HindIII nuclease 465-466F
Hippo pathway 1197 
hippocampus 
LTP (long-term potentiation) 636-637 
microtubule-associated proteins 932F 
neuron turnover rate 1250 
histamine 742, 1239, 1241T, 1317 
histidine 
chlorophyll-protein complex 787-788 
metal-ion binding 450 
structure 112 
histone acetylases 202 
“histone code” 198 
histone demethylases 196, 201, 1257F 
histone folds 188, 189F 
histone H1 (linker histone) 192-193, 1252 
histone methyl transferases 196, 390, 
1257F 
histone modification factors 1257F 
histone-modifications, chromatin 
immunoprecipitation 505 
histone-modifying enzymes 
and transcription activators 386 
in transcription initiation 313 
in X-inactivation 411 
histone octamers 188, 189F, 261, 976 
histone proteins, in nucleosome 187-188 
histone tail side chain modifications 
196-197 
histone tails, N-terminal 189F, 190 
in chromatin compaction 193F 
inheritance of specific modifications 
205F 
meaning of specific modifications 
200F 
in nucleosome stacking 192 
histone variants 
fibroblast reprogramming and 1257F 
histone variant H3.3 205-206 
incompatibility of methylation and 
acetylation 207 
site-specific insertion 198 
histones 
and non-histone proteins 187 
nuclear reprogramming 1252-1253 
separation from samples 440 
side-chain modifications 196-197 
synthesis and the cell cycle 261 
Histoplasma capsulatum 1272 


HIV (human immunodeficiency virus) 
and cancer 1130T, 1132 
capsid structure 562F 
combination therapies 1140 
detecting in samples 475F 
diversification from SIV 1291F 
genome 420F 
infection and membrane fusion 709, 
710F 
invasion of helper T cells 1332 
regulated nuclear transport and 
419-421 
replication errors and evolution 1291 
transcription attenuation 414 
HIV-1 1273T, 1286, 1289, 1291 
receptors required 1279-1280 
HLA proteins (human leukocyte 
associated) 1327 
HNPCC (hereditary nonpolyposis 
colorectal cancer) 250, 
1124-1125 
hnRNPs (heterogeneous nuclear 
ribonucleoproteins) 323F, 326 
Holliday junctions 283F, 284-285 
homeodomain family 120, 122, 378F 
homeodomain motif 376, 1163 
homeostatic mechanisms 1195, 1217, 
1226, 1247 
homeotic selector see Hox 
Homo sapiens 
as a eukaryote 16F 
as a model organism 29T, 33,
491-492 
homogenates 
cell-free systems 451 
from subcellular fractionation 445 
homolog pairing 1005F, 1006-1007 
homologous chromosomes 
and imprinting 408F 
segregation in meiosis I 1008
homologous genes 216 
homologous protein species 
interchangeability 1146 
homologous recombination 276-287 
BRCA1 and BRCA2 defects 266T, 
267 
cellular regulation of 280-282 
double strand break repair 275, 
in meiosis 282-285
in transgenic organisms 495, 497, 
498F 
use by cancer cells 1100 
in human cells 180
axon growth 1202
contrasted with heterophilic 1039F
DSCAM proteins 1207 
immunoglobulin superfamily and 
1055, 1056F, 1202 
sealing strands 1048 
homotypic membrane fusion 712, 721, 
748 
homunculi 1155, 1205 
Hooke, Robert 439, 1081 
horizontal gene transfer 
antibiotic resistance 19, 289, 1269, 
1292-1293 


in prokaryotes 18-20 
sexual reproduction and 19-20 
as source of innovation 16, 17F 
three mechanisms 1268 
hormone-secreting pituitary cells 840 
hormones 
in extracellular signaling 815 
gene expression response 372, 400 
in hematopoiesis 1244 
melatonin 877 
moderated by cyclic AMP 835T 
production through DNA cloning 484 
somatostatin 835-836 
steroid hormones 875-876 
timing of developmental transitions 
1182 
horseradish peroxidase 735F 
horseshoe crab 223 
housekeeping genes 406, 407F 
Hox (homeotic selector) complex 
cell memory and 1164 
in humans 1169
serial gene expression 1163-1164, 
1169 
Hox genes 1162-1164 
1169
rhombomeres and 1188 
HP1 (heterochromatin-specific protein) 
197, 200F, 202F, 204F, 206, 
210-211 
HPLC (high-performance liquid 
chromatography) 449, 457 
HPV (human papillomaviruses) 1131, 
1265, 1273T 
HS4 barrier sequences 202 
hsp60 (heat shock protein) 355-357, 662 
hsp70 protein family 
BiP 683 
clathrin removal 702 
hsp60 and 355-357 
mitochondrial 659, 660-661F, 662 
human body 
daily ATP turnover in 774 
1263-1264
mutation rate 1091, 1094, 1095F 
number of cells and cell types 2, 
1091, 1217, 1264 
number of lymphocytes 1308 
proteins unique to humans 122 
human brain, number of neurons 627, 
1198 
human cells 
tRNA and anticodon numbers 336
human chromosome 5 472F 
human chromosome 6 132F, 1327 
human chromosome 9 472F, 1093, 1094F, 
1135F 
human chromosome 13 1108 
human chromosome 22 
compaction in mitosis 187 
sample section 183F 
translocation with chromosome 9 
1093, 1094F, 1135F 
human genome
analysis, and medical treatment 506 
conserved regions as functional 217 
evidence of migrations 232 
as genetic model 36-37
haplotype blocks 492 
human mitochondrial genome 
MHC polymorphisms 1331
mobile genetic elements in 218F, 287,
291-292 
mouse genome compared with 220F, 
221-222 
mutation rates 238 
expressed 371
1062
number of genes coding for
intermediate filaments 944, 946 
number of genes coding for ion 
channels 635 
number of genes coding for orphan 
receptors 875 
number of genes coding for receptor
proteins 814, 832 
number of ncRNAs produced 429,
435 
183F, 184T
other statistics 184T 
1112-1113
protein-coding genes 120, 122-123 
replication origins 260-261 
183F
sample section of X-chromosome 
300F 
size 178, 179, 184T 
variability 38, 232-234 
human genome project 
genomic DNA cloning 471 
repetitive DNA problem 479 
human microbiota 1263-1275, 1277 
as a genetic model 491-492
genome-wide association studies 493 
population growth and genetic 
variation 231 
Hunchback transcription activator 393F, 
394, 395F, 1161F, 1179 
huntingtin gene 224F 
hyaluronan 1057F, 1058-1060, 1061F 
hybridization 
nucleotide sequences from 472 
in recombinant DNA technology 464, 
473 
in situ hybridization 502, 536 
hybridization (DNA) 
and homologous recombination 
277-278 
in microarrays 257, 258F 
hybridomas 444 
hydride ions, NADH and NADPH 67-68 
hydrocarbon tails, fats 83, 98-99 
hydrocarbons 
structures 90
hydrodynamic measurements 455 
hydrogen bonding 
in the α-helix and β-sheet 116,
579-580
in β-barrels 582
codon-anticodon matches 345F
in the DNA double helix 175, 255 
edge recognition of base pairs 374 
glycolipids 575 
in hybridization 472 
transmembrane proteins 579-580 
hydrogen bonds
binding site example 135F 
as noncovalent attractions 44, 94 
in the nucleosome 188 
polar amino acids 114 
structure of 45, 92 
in water 44, 94, 613
hydrogen ion release 481 
hydrogen nuclei in NMR 461 
hydrogen peroxide in peroxisomes 
666-667 
hydrolases
activity in endosomes 736 
in lysosomes 722-723, 727 
transport to endosomes 728-729F 
hydrolysis 
DNA damage from 267T 
see also ATP hydrolysis; GTP
hydrolysis 
hydrolytic editing 339F 
hydronium ions 46, 613, 773 
hydropathy plots 579-580, 681F 
hydrophilicity 
contrasted with hydrophobicity 44, 92 
soluble immunoglobulins 1317 
hydrophobic chromatography 448, 452 
hydrophobic forces 
in lipid bilayers 99 
as noncovalent attractions 44-45, 95 
and protein structure 111, 114, 355 
hydrophobicity 
contrasted with hydrophilicity 44, 92 
ligand-modulated transcription 
regulators 874 
as low in disordered regions 126 
membrane-bound immunoglobulins 
1317 
nonribosomal proteins 808 
and quality control mechanisms 357, 
359 
transmembrane proteins 577, 579F, 
593 
hydrothermal vents 11-12 
hydroxyl groups, α- and β-links 97
hydroxyl ions 46, 93 
hydroxylapatite (calcium phosphate) 1229 
hydroxyproline 1062F, 1084 
hygromycin 351F 
hyperpolarization 844-845 
hypervariable immunoglobulin regions/ 
loops 1318-1319, 1321 
hypochlorite 1239 
hypothalamus 877 
hypoxia 1115, 1116F, 1191, 1238F

hypoxia-inducible factors (HIFs) 1191, 
1237 
I-cell disease (inclusion-cell disease)
728-729 
IAPs (inhibitors of apoptosis) 1029-1030 
iCAD inhibitor (of CAD) 1024F 
ICAM1 1055F, 1076T, 1077 
ICAMs (intercellular cell adhesion 
molecules) 1055, 1056F 
identical twins 412, 477 
iduronic acid 1058, 1060F 
IFN see interferon 
IFT (intraflagellar transport) 943 
Ig see immunoglobulins 
IGF family (insulin-like growth factors) 
860, 1114 
IGF1 (insulin-like growth factor-1) 850T, 
852, 1196 
Igf2 (insulin-like growth factor-2) 407, 409 
IGFBP (insulin-like growth factor-binding 
protein) 1066F 
iHog protein 872 
IκB kinase kinase (IKK) 875F
IκB proteins 874
IL1, 5, 6, 10, 12, 13, 17 and 21 see 
interleukins 
ILK (integrin-linked kinase) 1079 
Illumina® sequencing 480
image enhancement 534, 561 
imaginal discs (Drosophila) 1195 
immediate early genes 856, 1013F 
immortalized cell lines 442-443 
immune rejection, transdifferentiation and 
iPS cells 1259 
immune response 
cancer treatment using 1137-1139 
phase variation in bacteria 294 
see also adaptive; innate immune 
system 
immune system 
nonsense-mediated decay in 352-353 
see also CRISPR system 
immunization 
as the basis of vaccination 1307 
primary and secondary immune 
responses 1310, 1311F 
“self” molecules 1313 
tumor cells and 1324 
immunoblotting (Western blotting) 455 
immunodeficiency 957 
see also HIV 
immunofluorescence 539F 
immunofluorescence microscopy 
cytotoxic T cells 1334F 
dendritic cell 1326F 
embryonic nervous systems 1041F 
epithelial cells 946F, 967F 
mitochondria 756F 
neurons 932F 
nuclear localization signals 652F 
Sordaria prophase I 1007F
spermatozoa 590 
synaptonemal complexes 1009F 
immunoglobulin fold 121 
immunoglobulin heavy chains 118F, 
1316-1318, 1320-1322 


immunoglobulin (Ig) domains 1318-1319,
1321F, 1338 
Ig-like domains 1325, 1338 
immunoglobulin (Ig) superfamily
in antigen recognition 1338-1339 
IgA 1316-1317, 1318T, 1320, 1322, 
1323F 
IgE 1316-1317, 1318T, 1319F, 1320, 
1322, 1323F, 1336 
IgG 1316-1317, 1318T, 1320, 1322, 
1323F 
IgM 1316-1317, 1318T, 1319-1320, 
1322-1323 
immunoglobulin light chains 1316, 1317F, 
1318, 1319F, 1320-1322 
V region 1320, 1321F, 1323F 
immunoglobulins (Ig)
B and T cells distinguished 
1308-1309, 1324 
and B lymphocytes 1315-1324 
binding site versatility 138, 1320 
as bivalent molecules 1316 
in cell-cell adhesion 1055-1056, 1077 
diversification mechanisms 1323F 
evolution 1338-1339 
exon recombination in 230 
five classes of mammalian 1316-1317 
Ig-like domains 1325, 1338 
primary human repertoire 1319 
secondary classes 1322-1323 
secondary human repertoire 1320 
soluble and membrane-bound 
1315-1316, 1317 
V(D)J recombination in 290
see also antibodies 
immunogold electron microscopy 
556-557 
immunological memory 1309-1311, 1326 
immunological self-tolerance 1307, 
1336-1337 
AIRE gene in 1333 
central and peripheral tolerance 1314 
FoxP3 gene in 1336 
mechanisms of 1313-1315 
MHC-peptide complexes in 
1328-1329 
immunological synapses 1333, 1334F 
immunoprecipitation 449-450 
co-immunoprecipitation 457, 505, 
506F 
immunostaining 1202F 
immunosuppression 
immunosuppressive drugs 655F 
tumor microenvironments 1137, 
1138F 
importins see nuclear import 
imprinting, genomic 407-409 
in silico, meaning 472 
in vitro, meanings 440-441 
in vivo, meanings 440-441 
inactivation, and signal response speeds 
825-826, 848 
incontinentia pigmenti 300F 
indels 492 
independent-choice theory (stem cells) 
1222, 1223F 
indirect immunocytochemistry 539 
indole-3-acetic acid (auxin) 882, 883F 
induced-fit models 153, 345 
induced proximity 822 
inductive signaling 1150-1151 


morphogens in 1151, 1153 
sequential induction 1153-1154, 1160 
in vertebrate embryos 1167F, 1177, 
1184, 1198 
infants 
antibody transport in newborns 737 
birth defects 1154 
infection strategies 1276-1294 
infectious diseases 
immunity and the adaptive immune 
system 1297 
mortality 1263 
see also pathogens 
inflammasomes 1301 
inflammatory response 
and cancer 1132 
and pathogens 1300-1301 
and white blood cells 1240 
influenza virus 
as a disease-causing virus 1265, 
1273T 
effect on host-cell transcription 1288 
genome reassortment 1291, 1292F 
infection by endocytosis 709, 
1280-1281 
influenza A 1273T, 1280 
negative strand genome 1289 
nomenclature 1292F 
pandemics 1291, 1292F 
structures 1274 
inheritance 
chromosome-wide gene expression 
changes 409-411 
of organelles 648, 807 
inhibitors of protein function 459-460
inhibitory neurotransmitters 629 
inhibitory postsynaptic potential 633 
inhibitory signals, sequential 820, 821F 
initiation factors 
eIFs (eukaryotic initiation factors) 347,
348F, 423-428 
phosphorylation 423-424
initiator caspases 1022-1025, 1028 
initiator proteins 254-255, 256-257F, 259
speed of 1298
Toll and Toll-like receptors 1165 
innate immune system 
as first line of defense 1298-1306 
natural killer cells in 1304 
innate lymphoid cells 1326 
innexins 1051 
inosine, in tRNA 336, 337F 
inositol 1,4,5-trisphosphate (IP3) 837, 847
inositol phospholipid signaling pathways 
836, 848, 852, 853F, 861 
inositol phospholipids (phosphoinositides) 
572, 574, 836 
profilin binding 906
insect vectors 1276
insertional mutagenesis 488, 491 
insulin
action via receptor tyrosine kinases 
850T 
assembly 130 
1332
PI 3-kinase-Akt signaling pathway and 
860, 1114-1115 
recycling endosomes and 738 
secretion by pancreatic β-cells 1226
insulin receptor 823, 850T, 851-852 
Int1 gene 869
integrases 291 
integration, intracellular signaling systems 
825 
integrin clusters and 1079 
integrins 
1077-1078
binding sites 1075 
in cell adhesion and motility 955-956, 
959 
cytoplasmic tyrosine kinases and 
862-863 
fibronectin binding 1067-1068 
intracellular signaling proteins and 
1079-1080 
αLβ2 integrin (LFA1) 1055F, 1076T,
1077 
as matrix receptors 1074 
mediation of cell-cell adhesion 1038, 
1054, 1077 
mediation of cell-matrix junctions 
1037F, 1038 
α- and β-subunits 1075
as transmembrane adhesion proteins 
1037 
types of 1076T 
interacting proteins, identifying 457-458 
interaction domains 822-824, 852-853, 
860 
interference effects, in microscopy 531,
532F, 550 
interferon-α (IFNα)
as a cytokine 1304
JAK-STAT signaling pathway and 
864T, 1304 
interferon-β (IFNβ) 1304
interferon-γ (IFNγ)
1335-1336
JAK-STAT signaling pathway and 
864T
interleukins
IL1 (interleukin-1) 873, 1257F, 1301 
IL3 1257F 
IL4 1336 
IL5 1335F 
IL6 1301, 1335F, 1336 
IL10 1335F, 1336
IL21 1335F, 1336
and the nuclear envelope 890
and the nuclear lamina 891, 944
and septins 944-950
intermembrane space, mitochondria 658, 
663-664 
internal membranes 
chloroplasts 27F 
eukaryotic cells 24 
mitochondria 26F 
interphase 
as cell cycle stage 185, 964 
chromosome state in 185-188, 198, 
207-208, 209F 
components 964, 965F 
interpolar microtubules 982 
intestinal crypts 1122, 1124, 1218-1219 
intracellular compartments see organelles 
intracellular development programs 1179, 
1200 
intracellular pathogens 1278-1279, 1282, 
1283F, 1284-1285 
intracellular signaling cascades see 
enzyme cascades 
intracellular signaling complexes 822, 
823F, 851F, 859 
intracellular signaling pathways 814 
combating noise 820-822 
interaction domains 822-824, 
852-853, 860 
in plants 880-885 
response behaviors 824-825 
response speeds and turnover rate 
825-826 
see also signaling pathways 
intracellular signaling proteins 578, 700, 
814, 1079-1080 
intraepithelial neoplasia 1131 
intragenic mutations 16, 17F 
intralumenal vesicles 724F, 729-730, 
735-737, 738F 
intrinsic activation pathway, apoptosis 
1023-1029 
anti-cancer drugs binding to Bcl2 
1031 
Bcl2 regulation 1025-1028 
mitochondrial involvement 1025 
intron sequence ambiguities 416 
introns (intervening sequences) 
consensus nucleotide sequences 
319F 
discovery 417 
evolution rate 220F 
in human genome 183F, 184, 224F,
318F 
self-splicing 324 
size range 319, 321 
transcribed, removal by RNA splicing 
315-316, 317-318, 320F, 336 
unique removal by IRE1 688 
invadopodia 952 
invariant membrane proteins 1316-1317, 
1326, 1336 
invariant polypeptide chains, BCR/TCR 
1327F, 1328-1329, 1330F, 1337F, 
1338 
invariant sections, MHC protein 1331 
invasin 1281 
inversion mutations 487 
inverted repeats 289, 603, 604F, 676F 
ion-channel-coupled receptors 817F, 818, 
843 
ion channels 
all-or-nothing opening 626
cyclic-nucleotide-gated 843-844
depolarizing and hyperpolarizing 627 
G protein direct regulation 843 
gating 604, 614, 618-620, 636F 
ion selectivity 613-614, 617-618 
mechanosensitive 619-620 
paracellular pores as 1049 
photosensitive 623 
sequential activation in neuromuscular 
transmission 632-633 
ion-concentration gradients 601-604 
ion-exchange chromatography 448-449, 
452 
ion-gated channels 614 
ion pumps 164 
ion-sensitive indicators 546-547 
ion sources, mass spectrometry 456-457 
ion torrent™ sequencing 481 
ionotropic receptors see ion-channel- 
coupled receptors; transmitter- 
gated ion channels 
ions, inorganic, inside and outside cells 
598T 
IP3 see inositol 1,4,5-trisphosphate
IP3 receptors (IP3-gated Ca2+-release
channels) 837-840 
ipilimumab 1138 
iPS cells (induced pluripotent stem cells) 
398, 401, 1254-1259 
IRE1 protein kinase 687-688 
IRES (internal ribosome entry sites) 
425-426, 1288 
iron 
and atmospheric oxygen levels 797 
in cytochrome c and hemoglobin 766 
iron-sulfur clusters 
chloroplasts 787, 792-794F
mitochondria 760, 766-768, 
769-771F, 773F 
irradiation see radiation 
IRS1 (insulin receptor substrate 1) 824F, 
852 
isocitrate 106-107 
isoelectric focusing 453-454 
isoforms 
of actin 898 
of collagen 1070, 1071F, 1072 
of connexins 1051 
and defining the gene 417 
of DSCAM proteins 415, 1207 
of fibronectin 1067 
of histone H3 1253 
of laminin 1070, 1072F 
of tubulin 925 
isoleucine, structure 113 
isomers/isomerizations 
amino acids 112 
citric acid cycle 106 
in glycolysis 104 
monosaccharides 96 
isopeptide bonds 157, 159F
isotopic labeling, equilibrium 
sedimentation 447 
radioisotope labeling 452, 454, 
466-467, 1219 
Ixodes scapularis 1264F
J gene segments 1320, 1321F, 1325
JAK-STAT signaling pathway 863-864,
1304
Janus kinases (JAKs) 863
jellyfish 543, 547
“jumping genes” see mobile genetic
elements
junction complexes/junctional complexes
ER with mitochondria 691
integrin clusters and 1079
tight junctions 1049-1050
junctional diversification 1321
junctional epidermolysis bullosa 1069
K+ channel types 634-636
K+ gradients 607, 615, 617
K+ leak channels 614-615, 617-618,
619F 
K-Ras proto-oncogene 1123, 1125 
Kai proteins 878 
Kartagener’s syndrome 942 
karyopherins (nuclear transport receptors) 
326, 653 
karyotypes 
abnormalities and cancer 1097, 1108, 
1111, 1125F 
human chromosome set 181F, 182 
KASH proteins 949, 959 
katanin 933, 935-936 
Kcnq1 gene 409 
KDEL receptors/sequences 713-714 
keratan sulfate 1058, 1060-1061F 
keratins 944T, 946-947, 949, 953, 1046 
α-keratin 115, 117, 124
keratocytes 951, 953, 954-955F 
α-ketoglutarate 84, 106-107
ketoses 98 
kidney cells 536F, 558F 
kidney disease 1071 
kidney glomeruli 1063T, 1069-1071
see also cytotoxic T lymphocytes
kinase cascades 820 
kindlin 1037T, 1075F, 1078F 
kinesin-1 936, 937F, 939 
kinesin-2 943 
kinesin-4 983, 991, 992F 
kinesin-5 983, 985, 987F, 994 
kinesin-10 983, 991, 992F 
kinesin-13 933-934, 936 
kinesin-14 983, 985, 987F 
kinesins 163, 459, 460F, 936-937, 939, 
1288 
kinesin-related proteins 983
kinetic binding experiments 458 
kinetic energy 54F 
kinetic proofreading 345 
kinetic rate constants 458, 516F 
kinetics of enzyme 141-144 
kinetochore fibers 989F, 990 
kinetochore microtubules 982 
kinetochores 
bi-orientation 988-989 
in meiosis 1008 
microtubule attachment 990F 
in mitosis 980-981, 988F 
as protein complexes 186 
sister-chromatid attachment 987 
Skp1 protein in 167-168 
species differences 203 
Kit gene/kinase
mutation effects in human and mouse
37F, 1187
stem cell factor and 1244
KKXX sequences 713
Klf4 transcription regulators 398-399F,
1254-1255
see also OSKM factors
Km (half-maximal reaction rate) 141, 143 
Krebs cycle see citric acid cycle 
kringle domains 122 
Kruppel gene 1159 
Kruppel protein 393F, 394, 395F, 1179 
Ku protein 275F
L1 element see LINEs
L15 protein 347F 
labeling see isotopic labeling 
laboratory culture see cultured cells 
Lac operon 381F, 382-383 
β-lactamases 1293
lactate production in cancer cells 1098 
lactic dehydrogenase 118F 
LacZ gene/protein 382F, 384F, 1162F, 
1221-1222F 
lag phase, actin filament growth 899-900, 
902 
lagging strand 
defined 242 
RNA primer synthesis 245 
lakritz mutant 488 
lamellae 1083-1086 
lamellipodia 890, 892F, 906, 912, 
951-957, 958F, 959 
axon growth cones 1201, 1202F 
laminins 1058F, 1069-1073, 1075-1077 
cultured cells and 1223 
structure 1070F 
laminopathies 948 
lampbrush chromosomes 207-209, 211 
Langerhans, islets/cells 1226, 1240 
“lariat” formation, RNA splicing 318-319, 
320F, 321, 324 
lasers 
confocal microscopy 541 
in FRAP 545, 546F, 589 
multiphoton imaging 542F 
photoactivation/photoswitching of 
fluorescent tags 544, 546F, 
551-552 
in TIRF microscopy 548 
last common ancestor 
cancer diversification from 1119F 
eukaryotes 880 
of humans and chimpanzees 218, 
219F 
of humans and mice 221 
species divergence and 16F, 17, 37F, 
226 
late endosomes 
delivery to lysosomes 696F, 723, 727, 
733 
early endosome maturation 730, 
735-736 
Rab proteins 706T, 707 
latent transcription regulators 
Hedgehog as 871 
NFκB proteins as 873
Notch protein as 868 
plant photoproteins and 884 
in regulated proteolysis 867 
Smad family proteins as 865 
STAT proteins as 863, 865
lateral diffusion in bilayers 570, 588, 589F
lateral inhibition 867, 868F, 1151-1152, 
1174F, 1178-1179 
Notch mediated 1171-1173, 
1224-1225 
latrunculin 904, 906 
Lck protein kinase 862, 1332, 1337F 
LDLs (low-density lipoproteins) 733-734 
LDV sequence (Leu-Asp-Val) 1075 
leading strand, defined 242 
“leaky scanning” 348, 424 
lectin pathway, complement system 
1302-1303 
lectins 
carbohydrate layer affinity 582 
in cell recognition 575-576 
ER chaperones as 685-686 
export from ER 711-712, 720 
mannose-binding lectin (MBL) 711, 
1301, 1302F 
receptors as PRRs 1300 
selectins as 1054, 1312 
LEF1/TCF transcription regulator 870F, 
871 
Lefty protein 1168 
Legionella (L. pneumophila) 739, 1284F, 
1285-1286 
Legionnaire’s disease 1263, 1285 
Leishmania (L. tarentolae) 804, 1284F, 
1286F 
leptin genes 219-220F 
leptotene 1006, 1007F 
Let7 gene 1180-1181 
lethal mutations 487, 488, 491, 496 
leucine-rich repeats 1299-1300 
leucine structure 113 
leukemias 
acute lymphocytic leukemias 1117 
Bcl2 overproduction and 1246-1247 
as cancers of white blood cells 1092 
chronic myelogenous leukemia (CML) 
1093-1095, 1135-1136 
leukocyte adhesion deficiency 1076T, 
1077 
leukocytes see white blood cells 
LeuT protein 604F 
lever arm, myosin 916, 917F, 925F 
LFA1 (αLβ2 integrin) 1055F, 1076T, 1077
Lgr5 gene 1221-1223, 1224F 
LHC (light-harvesting complexes/antenna 
complexes) 788-789, 794F 
licensing, of replication origins 974-975 
life, origins of 362-366 
ligand-gated channels 614, 629, 636 
ligand-modulated transcription regulators 
874-876, 877F 
ligands 
binding and allostery 152-153 
binding and linkage 151 
binding and RTK dimerization 851
defined 134 
in extracellular signaling 815-816 
see also binding sites 
light chains 
dyneins 937 
immunoglobulin 1316, 1317F, 1318, 
1319F, 1320-1322, 1323F 
kinesin 936, 939 
myosin (MLC) 915-916, 922, 957, 
958F, 997 
light-driven cation channels 623
light-driven proton pumps 586-588, 602F,
606 
light microscopy 
asymmetric cell division 1002F 
bright- and dark-field 533-534, 535F 
confocal microscopes 540-542, 544F 
continuing utility 529 
differential-interference-contrast 
microscopy 533-535, 560F, 
1002F, 1202F 
image deconvolution 540, 542 
image enhancement 534-535 
of living cells 533-534 
of mitochondria 756, 802 
multiphoton imaging 542 
plant cell in telophase 1000F 
resolution 530-532 
sister chromatid separation 992F 
specimen preparation 535-536 
stains for 529, 535, 536F 
superresolution techniques 549-554 
three-dimensional imaging 540, 541F, 
542, 550F, 553F 
typical design 531F 
see also fluorescence microscopy 
light reactions 783-785, 787F, 793 
light-sensitive phytochromes 883-885 
lignin 1082-1083 
Lin4 gene 1180-1181 
Lin14 gene 1181F
Lincoln, Abraham 1066 
LINEs (long interspersed nuclear 
elements) 218F, 291 
linkage and ligand binding 151 
linkage tetrasaccharides 1059 
linker DNA in nucleosomes 187-188 
linker histones (histone H1) 192-193, 
1252 
linker proteins 
in cartilage 1061F 
cytoskeleton 948-949, 959 
lipid anchors 577-578, 593 
see also GPI-anchors
lipid bilayers 99, 566-576 
assembly in the ER 689-691 
association of membrane proteins 
with 576-577 
asymmetry 573-574, 590, 681, 690 
composition and its effects 571T 
deformation by membrane-bending 
proteins 573-574 
domains 572-573 
electrochemical gradients 612 
flip-flops between monolayers 
570-588 
as fluids/solvents 569-573, 588 
fusion 708F 
lateral diffusion in 570, 588, 589F 
membrane transport proteins in 597 
osmium tetroxide binding 555 
overview 99, 566-576 
as self-sealing 568, 569F 
small molecule diffusion across 598 
spontaneous assembly 566 
spontaneous formation 9, 568-569 
synthetic 569-571, 597-598 
lipid droplets 573 
lipid kinases 574 
lipid rafts/raft domains 572, 573F, 575, 
590, 689, 749-750 
caveolae from 731, 750
lipids
as amphiphilic 566 
density in cell membranes 566 
dolichol as 684 
fatty acids and 98 
lipoprotein production in smooth ER 670 
liposomes 569-570, 572F, 585 
lissencephaly 939 
Listeria (L. monocytogenes) 423F, 
914-915, 953, 1281, 1284, 1287, 
1288-1289F 
listeriolysin O 1284 
lithotrophic organisms 11, 12F, 13, 14F 
liver 
carcinogen activation by 1128 
conversion to nerve cells 396, 397F 
glucocorticoid response 372 
hepatocyte renewal 1226-1227 
hepatocytes and I-cell disease 
728-729 
membrane enclosed organelles 643T, 
644F 
peroxisomes 667F 
protein abundances in brain and 372F 
size regulation 1022 
urea cycle 760 
vasopressin-induced Ca2+ oscillations
840F 
liver cancer and hepatitis viruses 1132 
liverworts 806F 
living cells 
bilayer formation and 569 
light microscopy of 533-534, 
538-539, 541F, 542-546 
monitoring ion concentrations 
546-547 
lncRNAs see long noncoding RNAs
local mediators, cell signaling 815 
localization 
auxin transporters 883 
protein machines 164 
loci, defined 486 
logic functions 
AND logic 521-522, 523F 
AND NOT logic 521-522 
logic operations see switches 
long noncoding RNAs (lncRNAs)
conserved sequences producing 225
function 305T, 435-436
in imprinting 409
Xist as 411
loop structures 
chromatin 207-208, 211-212 
DNA looping 383-384, 385F, 386, 391 
ion channels 618 
see also feedback loops 
loss-of-function mutations 
defined 487 
disease predisposition and 494 
following gene duplication 228-229 
Lin14 gene 1181F 
tumor suppressor genes and 1104 
as typically recessive 489 
Ubx genes 1162F 
loss of heterozygosity 281 
Lou Gehrig’s disease 947 
low complexity domains 132-133 
low-light photography 534 
lox sites 496-497 
LPSs (lipopolysaccharides) 1267, 1269 
LRP (LDL-receptor-related protein) 
870-871
LRP4 receptors 1210
LRR (leucine-rich repeat) receptor kinases 
881 
LTD (long-term depression) 637 
LTP (long-term potentiation) 636-637 
lumens 
defined 644 
endoplasmic reticulum 669 
small intestine 1037 
luminescent proteins 547 
luminescent reporters 1117F 
lung cancer 1095, 1110F, 1129F, 1136 
lungs, formation and structure 1190 
luteinizing hormone 835T 
Lyme disease 1264F, 1286F 
lymph nodes (lymph glands) 1313F 
lymphatic system 
antigen presentation 1305, 1306F 
endothelial cells in 1236 
lymphocyte recirculation 1311-1313 
metastases use of 1101-1102, 1123, 
1236 
see also lymphoid organs 
lymphocytes 
autoimmune diseases and 1031 
B and T lymphocytes 1240, 1297 
differentiation in immunological 
memory 1309 
elimination by apoptosis 1022 
killer (cytotoxic) lymphocytes 1024, 
1025F 
L-selectin and 1055 
maturation 1310 
migration 1185-1186 
number in the human body 1308 
tyrosine-kinase-associated receptors 
and 862 
see also B cells; T cells 
lymphoid follicles 1312, 1313F, 1322, 
1336 
lymphoid organs 
human lymphoid organs 1308F, 1309 
peripheral, and antigen display 1324, 
1326, 1332 
peripheral, and self-tolerance 1314 
peripheral, antigen binding in 1308- 
1312, 1313F, 1317, 1322, 1323F 
lymphomas 1092 
lysine 
acetylation during transcription 387, 
388F 
elastin cross-linking 1065 
histone modifications 196-197, 200F 
incompatibility of methylation and 
acetylation 207 
KKXX sequences 713 
in nuclear localization signals 650 
in the nucleosome 188, 196-197 
regulatory use of methylation 165T 
structure 112 
lysosomal storage diseases 728-729 
lysosome-dependent/independent 
pathways 1283 
lysosomes 
among intracellular compartments 
642 
delivery pathways 725, 727-728 
exocytosis 729 
fusion with the autophagosome 726 
heterogeneity 723-724 
maturation 724F
plant and fungal vacuoles 724-725
structure and function 722-724 
transport from Golgi apparatus to 
722-730 
lysozyme 
catalysis illustrated 6F, 144-146 
disulfide bonds in 127 


M 


M-Cdk 973, 978-979, 982, 985-986, 
1003 
M-cyclins 969, 971, 978, 993, 1004 
M phase 
chromosome changes 215 
position in cell cycle 963, 964F 
macromolecules 
complex formation 50F 
conformations 49 
in the cytoplasm 60F 
extracellular matrix 1057-1058 
formation by condensation reactions 
49 
head and tail polymerization 72-73 
importance in biology 43, 47-49 
localization using immunogold 
electron microscopy 556-557 
precursors from the citric acid cycle 
85F 
relative sizes 1059-1060F 
self-assembly 128-129 
and small molecule monomers 47 
synthesis driven by ATP hydrolysis 
70-73 
visualization using AFM 548-549 
visualization using negative staining 
or cryoelectron microscopy 
559-561 
visualization using TIRF 547-548 
macrophages 
CSFs and production of 1245 
derived from monocytes 1239 
inhibitory receptors 1245 
Legionella pneumophila and 1286 
as professional phagocytes 739, 1301 
selectivity 1031 
macropinocytosis 725, 732, 733F, 738 
as a lysosome delivery pathway 725 
virus entry 1280 
Mad2 protein 994 
magnesium in chlorophyll 766, 787 
magnesium (Mg2+) ions
integrin binding 1075 
NMDA receptors and 636 
magnification and appearance of cells 
529, 530F 
maintenance methyl transferases 
404-405 
maize 
pachytene chromosomes 550F 
regulatory DNA influence 1175 
major groove, DNA double helix 177F, 
373-374F 
location of methylated cytosines 405 
transcription regulator binding 
373-377, 405 
malaria 610, 804, 1272-1273, 1276, 
1282, 1331 
see also Plasmodium 
malate 107 
MALDI (matrix-assisted laser desorption 
ionization) 456-457
malignancy, defined 1092, 1093F
maltoporin 581 
mammals 
conserved body plan 227 
hematopoietic stem cells 1244 
imprinting restricted to 409 
ion concentrations inside and outside 
cells 598T 
limited regenerative abilities 1249 
LTP in the hippocampus 636-637 
membrane lipids 567 
the mouse as a model organism 
35-36 
phylogenetic tree 221F 
manganese clusters 790, 791F, 793-794F
mannose 96 
mannose-6-phosphate (M6P) 727, 
728-729F, 742F 
MAP kinases (mitogen-activated protein 
kinases) 
activation as all-or-none 830F 
mammalian Raf, Mek and Erk kinases 
856 
MAP kinase module 856 
in protein kinase evolutionary tree 
155F 
Ras activation 855-856, 1012 
see also Ras-MAP kinase 
MAPKK (MAP kinase kinase) 856-857, 
1271 
MAPKKK (MAP kinase kinase kinase) 
856-857
MAPs (microtubule-associated proteins) 
932-934, 986 
MAP2 932-934 
see also centrosomes 
Marchantia 801F, 805F 
Marfan’s syndrome 1066 
mass spectrometry (MS) 455-457 
mast cells 1239, 1240, 1317 
master transcription regulators 398-399, 
1254-1256 
maternal-effect genes 1158-1159 
maternal-zygotic transition 1181 
mathematics in biology 509-525 
see also models 
mating factors/receptors/pheromones 
813-814, 832, 857 
matinib 1135-1136 
matrix metalloproteases 1072-1073 
matrix receptors and co-receptors 1074 
matrix space, mitochondria 658, 660, 
661-662, 757-759 
MBL (mannose-binding lectin) 1301, 
1302F, 1303 
Mbl protein 896-897 
MCSF (macrophage-colony-stimulating 
factor) 850T 
Mdm2 protein 1014, 1015F, 1016, 1017F 
Mdr1 gene 1139
MDR (multidrug resistance) protein 610, 
1293 
mean lifetimes 513F, 514 
mechanical stability 
compression resistance 1036F, 1082, 
1084F 
intermediate filaments 944, 946-948 
tensile strength 1046, 1057, 1063, 
1070, 1082-1083, 1084-1085F
mechanically gated channels 614, 619 
mechanotransduction 1043, 1044F, 1074, 
1080-1081
medaka fish 547F
Mediator protein
as a coactivator 386
in transcription initiation 312, 313F, 
385F 
Mef2c transcription factor 1259 
megacolon 1186 
megakaryocytes 180, 1002, 1239, 1241T 
platelet derivation from 1239 
polyploid nucleus 1242 
meiosis 
chromosome behavior in 1004-1010 
comparison with mitosis 1005F, 
1008F 
genetic recombination in 486 
Holliday junctions in 284 
homologous recombination and 277, 
282-285 
pachytene chromosomes 550F 
in Saccharomyces cerevisiae 31 
meiosis I 1004-1006, 1008-1010
meiosis II 1005, 1008F, 1009-1010 
Mek (MAP kinase kinase) 856
melanocytes 729, 939, 940F 
melanomas 1092, 1093F, 1136, 1138 
melatonin 877 
membrane-associated proteins 577 
membrane attack complexes 1303 
membrane-bending proteins 573-574, 
699, 701-702, 732, 737, 896 
membrane-bound antibodies 417 
membrane disruption, by viruses 1280F 
membrane fusion 
endolysosomes 723 
homotypic and heterotypic 712 
SNARE proteins in 708-709 
membrane markers 697 
membrane potentials 
changes mediating fast responses 
825, 826F 
contribution to electrochemical 
gradients 599, 662, 762-763 
ionic basis 617F 
protein import into mitochondria 
661-662 
resting membrane potentials 615, 
617, 629-630 
role of K+ leak channels 614-615 
see also voltage-gated channels 
membrane proteins 576-594 
association with lipid bilayers 
576-577 
asymmetry 590, 681 
ATP synthases as 776-778 
attachment of GPI anchors 688 
BCRs and TCRs 1317, 1325-1326, 
1337 
cellulose synthase 1085 
complement system 1302 
densities 758-759 
detergent solubilization 583-586, 766 
diffusion 588-593 
functional reconstitution 585, 586F 
glycosylation 582-583, 723 
in large complexes 588-589 
lysosomal 723 
peripheral membrane proteins 577 
phospholipase Cβ as 836
phospholipid translocators as 570 
proportion of all proteins 576, 580 
regulation by transcytosis 738 
see also transmembrane proteins
membrane-spanning α-helices 677, 679,
680, 777
membrane traffic disruption 1284-1286 
membrane transport proteins 
active and passive transport 599-600 
energetics 163 
for membrane-bound organelles 641 
as multipass transmembrane proteins 
599 
as proportion of membrane proteins 
597 
role 9, 597-600 
transporters and channels 597-599 
membranes, biological 
double membranes of mitochondria, 
chloroplasts and bacteria 753F 
lipid bilayers and membrane proteins 
in 565-566 
mitochondria in biosynthesis 760-761 
phospholipids in 98-99 
recycling 743 
types, in liver and pancreas 643T 
see also cell membranes 
memory cells 1311-1313, 1314F
memory B cells 1312, 1316, 1322 
memory T cells 1312, 1322, 1326, 
1332 
and transmembrane polypeptides 
1336 
memory formation 836, 841 
see also cell memory; molecular 
memory 
Mendelian inheritance 408F, 493 
contrasted with cytoplasmic 
inheritance 807-808 
β-mercaptoethanol 452-453
meristems 442, 443T, 1082, 1085, 1183 
mesenchymal cells 
epithelial-mesenchymal transitions 
1042, 1101 
mesenchymal stem cells 1229 
mesenchyme-epithelial interactions 1190 
mesoderm 1147-1148, 1156, 1158, 
1167-1169, 1187F, 1189 
connective tissue derived from 1228 
presomitic mesoderm 1177-1178 
Twist specificity 1165 
messenger RNAs (mRNAs) 
5'-end cap snatching 1288 
3’-end processing 324 
accumulation delays 1176 
analysis with microarrays and 
RNA-seq 503-504 
bacterial as polycistronic 348 
capping and polyadenylation 315-316
cloning 469 
degradation and gene expression 
372, 373F 
eukaryotic and prokaryotic 316F 
export from the nucleus 325-327, 655 
“factories” 332 
in gene expression monitoring 32 
localization in eggs 1156 
localization in the cytosol 421-422 
localization with in situ hybridization 
502 
polyribosome formation 675 
quantifying with RT-PCR 503 
regulation of transport from the 
nucleus 419-421 
ribosome profiling and 505-506
RNA editing 418-419
signal initiated translation 506 
stability and gene expression 
426-428 
in translation 343-344, 351-352
untranslated regions (UTRs) 184T, 
421-423, 430 
see also pre-mRNA; UTRs 
metabolic pathways 
enzyme catalysis in 51, 52F 
epistasis analysis and 490 
glycolysis and the citric acid cycle 
among 87F 
metabolic processes 
carcinogen activation by 1128 
catabolism and anabolism 51-52 
rate and ATP utilization 148 
metabotropic receptors 629-630 
metamorphosis 1182 
metaphase 
chromatid alignment 964 
in mitosis 980 
metaphase chromosomes 214, 486, 988F 
metaphase plates 990-991, 992F 
metaphase-to-anaphase transition 965, 
968, 969F, 973T, 977 
proteolysis in 970 
sister-chromatid separation 992-994 
spindle assembly checkpoints 
993-994 
metastases 
cross-section 1100F 
defined 1092 
invasiveness resembling EMT 1101, 
1120 
micrometastases 1102-1103, 1120 
monitoring in mice 1117-1118 
and natural selection 1096 
progress of 1102F, 1119-1120 
uncertainties over 1119-1120 
Methanococcus jannaschii 16F, 676F 
methionine 
as initiator of protein synthesis 347, 
800 
N-formyl- 347, 800, 958, 1031 
SRP protein 673 
structure 113 
methionine, S-adenosyl- see 
adenosylmethionine 
methylation 
of CG sequences and differentiation 
404 
changes in nuclear reprogramming 
1252-1253 
DNA damage from nonenzymatic 
267T, 271
of GATC residues in mismatch repair 
290, 250, 257F
and genomic imprinting 407-409 
and induced pluripotency 1256 
inheritance of DNA methylation 
patterns 404-413 
of ribose in mRNA 316F
of ribose in rRNA 328
of tumor suppressor genes 1109 
see also cytosine; lysine 
Mg2+ ions
integrin binding 1075 
NMDA receptors and 636 
MHC (major histocompatibility complex) 
122
class I, recognition by NK cells
1304-1305
class I and class II 1326-1327, 1328F,
1330T
polymorphisms 1328, 1330-1331
in T-cell-mediated immune responses
1307, 1324-1339
Mia40 protein 663F, 664 
micelles 99 
bilayers and 568, 569F 
critical micelle concentration 583, 
584F 
Michaelis-Menten equation 142-143 
microarrays, DNA 
mRNA analysis 503-504 
replication studies 257, 258F 
microbiota 1263-1275, 1277 
microelectrodes 546 
microfibrils 1065 
microfilaments see actin filaments 
β2-microglobulin 1327, 1328F, 1330T,
1339 
microinjection of altered genes 495 
micrometastases 1102-1103, 1120 
microorganisms, preponderance 10 
“microprocessor” activity of proteins 
155-156 
microRNAs (miRNAs) 
regulatory role 429-431 
timing of embryonic development 
1180-1182 
microscopy see AFM; electron 
microscopy; light microscopy 
microsomes 445, 448, 671-673 
microtomes 535, 555 
microtubule-associated proteins (MAPs) 
932-934 
see also centrosomes 
microtubule depolymerization 990-991, 
994 
microtubule flux 991, 992F, 994 
microtubule-organizing centers (MTOCs) 
891, 929-930, 931-932F, 936 
microtubules 
astral microtubules 982-985, 987, 
992, 994, 999, 1000F, 1001-1002 
building cilia and flagella 941-942 
in cell migration 959-960 
in cell wall orientation 1086-1087 
chemical inhibitors 459, 904T 
cortical arrays 1086 
in the cytoskeleton 891, 925-944 
drug effects 929 
dynamic instability 927-929, 935, 986 
fluorescence microscopy 538F, 553F 
Golgi apparatus 713, 715 
mitochondria and 755, 756F 
mitotic spindle formation 889-890, 
940-941, 982-984 
motor proteins and 936-938 
nucleation 929, 930F
plus and minus ends 927 
severing and tubulin sequestering 
proteins 935-936 
subunits as asymmetrical 894 
see also tubulins 
microvilli 890-892, 893F, 924 
midbodies 997, 998F 
minor groove, DNA double helix 177F, 
190, 373-374F 
minus-end-binding proteins 931, 932, 934
minus-end-directed transport 937-938
miRNAs (microRNAs), function 305T, 
1149 
misfolded proteins see protein misfolding 
mismatch proofreading 250-251, 257F 
mismatch repair system 
colorectal cancer, defects in 1116, 
1124-1125 
gene conversion from 286F 
MutS in 549F 
strand-directed mismatch repair 244T, 
245, 250-251 
mitochondria 
acetyl CoA production 81-82 
among intracellular compartments 
642 
ATP production in 774-783 
ATP production in plants 81 
biochemical fractionation 758 
chloroplasts compared with 782-783, 
797 
contacts with ER 755, 757F 
in cytokinesis 1001 
cytoskeletal microtubule association 
755, 756F 
electron transport in chloroplasts and 
755F 
features and origins 25, 26-27F, 644, 
663 
heat-shock proteins 355 
human mitochondrial genome 
801-802F, 804-805 
intrinsic activation pathway, apoptosis 
1025, 1026F 
junction complexes with ER 691 
lipid imports 691 
overview 755-763 
peroxisome function and 666 
protein transport into 658-664 
pyruvate conversion 75, 81-82 
roles in cellular metabolism 759-761 
self-splicing RNAs 324 
separation by centrifugation 445 
site of the citric acid cycle 83 
structure 658F, 757F 
tobacco 544F 
urea cycle 760 
yeasts 451F 
mitochondrial activation pathway, 
apoptosis see intrinsic activation 
mitochondrial carrier family 779-780 
mitochondrial DNA 
in different cells and tissues 802 
genetic code variants 334, 349-350 
location 759 
mitochondrial genomes 27, 800-809 
mutation effects 807-808 
mutation rate 220 
mitochondrial fission 755-756, 802-803 
mitochondrial fusion 802-803, 804F, 808 
mitochondrial membranes 
inner and outer membranes 658, 
663-664, 757-758 
inner membrane transport proteins 
779-780 
proton gradient across 103 
small molecule transport across 664 
mitochondrial precursor proteins 659-662 
mitochondrial reticulum 757F, 803, 808 
mitogens 
in cell division 1011-1012, 1013F
effects on G1-Cdk and G1/S-Cdk
1012-1014 
response to excessive stimulation 
1017F, 1114F 
mitophagy 727 
mitosis 
cell cortex in 913 
in the cell cycle 978-995 
chromosome compaction in 185, 
186F, 187, 215-215 
chromosome painting in 180-181 
comparison with meiosis 1005F 
five stages of 978 
M-Cdk and 978-979, 982, 985-986 
nuclear envelope in 656-657 
in Saccharomyces cerevisiae 31 
without cytokinesis 1002, 1003F 
mitotic chromosomes 979F 
mitotic index 966 
mitotic recombination 1109 
mitotic segregation in mitochondrial 
inheritance 807-808 
mitotic spindles 
acentrosomal 987 
Apc protein and 1124 
assembly 984-985 
in asymmetric cell division 1001-1002 
bipolarity 986-987 
centrosome role 930 
chromosome attachment 988, 989F 
disassembly 995 
fluorescence microscopy 538F, 545F 
forces on chromosomes 990-992 
kinesins and 936 
metaphase 983F 
microtubules and 889-890, 940-941, 
982-984
monastrol effects 460F 
positioning role in cytokinesis 
997-999, 1001 
MLCK (myosin light-chain kinase) 922, 
923F, 958F 
mobile genetic elements 
conservative site-specific 
recombination 292-297 
human genome 218F, 287, 292 
transposition and 287-292 
viruses as 290 
model organisms 
cell cycle 966 
in embryonic development studies 
1148 
examples 29T, 33 
genome sizes 28F, 29T 
mutation libraries 498-499 
protein-coding genes 122-123 
models (protein structure) 115F, 460, 
461F 
models (simulations) 524, 1178 
modulated repetition 1162 
moiré patterns 550 
mold-yeast transitions 1271 
molecular clocks 220-221 
molecular dynamics calculations 584F 
molecular memory devices 840-841 
molecular motions, speed of 59-60 
molecular oxygen 
in chemiosmotic coupling 754, 
761-762 
end of electron transport chain 767, 
771
in evolution of large organisms 771
from photosynthesis 782, 790, 791F 
production by cyanobacteria 782 
use by peroxisomes 666-667
molecular switches see switches 
molecular weights 
analytical ultracentrifugation 455 
gap junctions and 1051F 
plasmodesmata and 1054 
SDS-PAGE separation 452 
see also macromolecules 
molecule localization with fluorescence 
microscopy 536-537 
molten globules 354 
monastrol 460F 
monoallelic gene expression 411 
monoclonal antibodies 
fractionating tumor cells 1122 
from hybridomas 444-445 
ipilimumab 1138 
in microscopy 539, 591F 
against T cell inhibitory proteins 1337 
trastuzumab 1137 
monocytes 1231F, 1239-1240, 1241T, 
1243F 
monomeric GTPases 820, 821F 
Rho family 843, 956 
monosaccharides, as aldoses and ketoses 
98 
monoubiquitylation 735 
morphogenesis 
branching morphogenesis 1190, 
1191F 
defined 1146 
in embryonic development 
1184-1193 
morphogens 
Bicoid as 1159-1160 
Dorsal as 1165 
graded effects 866, 1151 
neural development 1200 
range 1153 
response timing 824 
Wnt and Hedgehog as 868, 871 
morphological diversity, sticklebacks 
1174-1175 
mosaicism 410-411 
motor neurons 630, 633, 634F, 858, 1199, 
1208-1209, 1211 
motor proteins 161-163 
cytoskeleton 896 
and microtubules 936-938, 1288 
and mitotic spindle 982, 983-984 
Rab effectors as 706 
Toxoplasma gondii 1282 
viruses and 1288 
visualizing with TIRF 548 
see also myosin 
mouse (Mus musculus) 
brain 502F, 542F, 624, 1250F 
cancer-critical gene function 
1118-1118 
embryonic development of paw 1022 
epidermis 1226F 
epithelial cell lifetimes 1219 
evolution rate 220F 
genome comparison with human 
220F, 221-222, 292 
Hox gene effects 1171F 
imprinting in 408F 
iPS cell incorporation 1254
liver 1197
macrophage 739F 
as model organism 29T, 33, 35-36 
myosin mutation effects 923F 
transposons in 292 
X-inactivation in 411F 
mouse, transgenic 
“brainbow mice” 502F 
knockout mice and 495-496, 1117 
oncogene collaboration 1117-1118 
RNA interference in 500 
telomere studies 265 
mouse-human hybrid cells 589F 
movement, and protein conformation 
changes 160-163 
M6P receptor proteins 727, 728F 
Mre11 complex 282-283F 
MreB protein 896-897 
mRNAs see messenger RNAs 
mRNPs (messenger ribonucleoproteins) 
655 
MRSA (methicillin-resistant 
Staphylococcus aureus) 1276 
MS/MS (tandem mass spectrometry) 457 
MscL and MscS channels 620 
MTOCs (microtubule-organizing centers) 
891, 929-930, 931-932F, 936 
see also centrosomes 
mTOR (mammalian target of rapamycin) 
860F, 861, 864T, 1114-1115 
mucins 718 
mucus 719-720, 749, 1276-1277, 1298 
multicellularity 
defined by the genome 29-30 
evolution in plants and animals 
880-881 
noncoding DNA in 182 
SH2 domains in 122 
multidrug resistant cancers 1139 
multidrug treatments, for cancer and AIDS 
1139-1140 
multienzyme complexes 148-149 
multinucleate cells 1249 
see also syncytia 
multipass transmembrane proteins 578, 
581-582 
combinations of start- and stop- 
transfer 679-681 
double-pass transmembrane proteins 
679, 681F 
ion-channel-coupled receptors as 818 
KDEL receptor as 714 
membrane transport proteins as 599, 
664 
Patched receptor as 872 
rhodopsin/bacteriorhodopsin as 
586-588, 681F 
seven-pass transmembrane proteins 
870, 872 
multiphoton imaging 542 
multiple sclerosis 625 
multispecies conserved sequences 
225-226 
multiubiquitylation 735 
multivesicular bodies 
ESCRT complexes and 736-737 
exocytosis 729 
fusion 736 
muntjac deer 183 
Mus musculus see mouse 
muscarinic acetylcholine receptors 843, 
847F 
muscle cells 
Ca2+ pump in SR 606-607 
insect flight muscle 919F 
muscle types 917, 1232, 1233F 
myoblast migration 1185-1186 
myotendinous junctions 1075, 1076T 
neuromuscular junctions 630 
muscle contraction 
actin and myosin in 916-920 
calcium (Ca2+) ions in 920-923 
smooth muscle 921-923 
speed of 919 
muscular dystrophy 948-949, 1072, 
1077, 1234 
MuSK receptor tyrosine kinase 1210 
mutagenesis 
Ames test for 1128 
carcinogenesis link to 1094 
mutational landscapes 1112F 
mutations 
characteristic order in colorectal 
cancer 1125-1126 
classical genetics approaches 
485-488 
complementation tests 487, 490 
deducing gene function from 21-22, 
496, 498-499 
disease risk and 479, 493-494 
effects in mitochondrial genomes 
807-808 
in germ cells and somatic cells 
238-239 
heterochronic mutants 1180, 1181F 
identification through DNA analysis 
491 
intragenic mutations 16, 17F 
libraries of 498-499 
missense, in cancer-critical genes 
1110F 
mobile genetic elements and 287, 
292 
neutral mutations and population size 
231-232 
rates of in eukaryotes 234, 237 
rates of in mitochondrial genomes 
803-804 
and RNA splicing patterns 323 
S. pombe mutant phenotype 21F 
somatic, in cancer cells 1094, 1104 
types of 487, 1104 
see also gain-of-function mutations; 
loss-of-function mutations 
mutator genes 250 
MutH, MutL and MutS proteins 251F 
MutL and MutS genes 1124 
MutS protein 251F, 549F 
mutualism, microbes with hosts 1264 
myasthenia gravis 1315 
Myc regulating protein 
in cancers 1107, 1118 
fibroblast reprogramming 1254-1255 
p53 response 1115 
see also OSKM factors 
Mycobacterium tuberculosis 1265, 1281, 
1284F, 1285 
Mycoplasma genitalium 9, 10F, 168, 499 
myelination of neurons 625, 667 
myeloid cells 1240 
myoblasts 1185-1186, 1196, 1233-1234 
MyoD gene 206F 
MyoD transcription regulator 396, 399, 
1170-1171, 1254, 1258 
myoepithelial cells 1232, 1233F 
myofibrils 918-921 
myosin 
access to actin bundles 912 
actin and, in muscle contraction 
916-920 
in actin arrowheads 898 
ATP hydrolysis by 916 
in cell migration 954-955 
in the contractile ring 890 
fusion with GFP 548F 
as a motor protein 162 
in non-muscle cells 923-925 
myosin gene mutations 923 
myosin protein superfamily 924-925 
myosin I 924 
myosin II 911, 915-918, 923-924, 936, 
948, 953, 955-957 
adherens junction links 1042, 1044 
in cell motility 955-957 
light and heavy chains 915, 922-923, 
923F, 957, 958F 
myosin XIV 924F, 1282 
myostatin 1196-1197 
myotendinous junctions 1075, 1076T 
myotonia 627 
myotonic dystrophy 324 
myristic acid 578 


N 


N-formylated peptides 958 
N-linked oligosaccharides 683-684, 
685-686, 716-720 
complex and high-mannose 717 
N-terminal histone tails see histone tails 
N-termini 
acetylation and degradation 360-361 
anchoring Src kinases 155 
cadherin linking 1040F 
polypeptide backbone 110F 
pro-peptides, protein precursors 744 
signal sequences at 647, 663, 679 
Na+-Ca2+ exchanger 607 
Na+-driven Cl−-HCO3− exchanger 604 
Na+ gradient 602, 603F, 604, 605F, 608, 
616 
Na+-H+ antiporters 781 
Na+-independent Cl−-HCO3− exchanger 
604-605 
Na+-K+ pump proteins 585F, 607-608, 
615 
Na+-linked symporters 603, 604F, 605, 
632 
NAD+ (nicotinamide adenine dinucleotide) 
as an electron carrier 754 
regeneration 760 
NADH dehydrogenase complex 
(Complex I) 767-770, 769F 
NADH (reduced nicotinamide adenine 
dinucleotide) 
as an activated carrier 64 
as an electron donor 67-68, 764 
in catabolic reactions 68, 74 
production by glycolysis 74-78 
production in the citric acid cycle 
82-84, 759 
respiratory-chain complexes and 
766-768 
NADPH oxidase complex 1301 
NADPH (reduced nicotinamide adenine 
dinucleotide phosphate) 


as an activated carrier 64 
as an electron carrier 67-68, 760 
in anabolic reactions 68 
in photosynthesis 755, 784 
as a reducing agent 68F 
naive B and T cells 
co-stimulatory signal requirement 
1314 
locating APCs 1311 
in lymphocyte differentiation 1310 
naive B cells 1316-1317, 1322 
naive T cells 1314, 1326, 1328, 
1332-1333, 1335-1337, 1338F 
NANA (N-acetylneuraminic acid) 575F, 
717F 
nanodiscs 586 
Nanog gene 1254F 
Nanog protein 378F, 506F 
nanomachines see protein machines 
nanoparticles 
immunogold electron microscopy 557 
quantum dots as 538 
Nanos protein 1158F 
natural killer (NK) cells 1240, 1241T, 1304, 
1333-1334 
natural selection 
mutation and 15-16 
and pathogens 1289 
protein structure and 119-120 
purifying selection 219-220, 221F, 
223, 225, 231 
small selective advantages 226 
and tumor progression 1091-1092, 
1096, 1104, 1118, 1119F, 1125 
NCAMs (neural cell adhesion molecules) 
1055-1056, 1338F 
Ndc80 complex 987, 988F, 990F, 991 
Neanderthals 220, 223-224, 232F, 300F, 
479 
nebulin 919, 920F 
necroptosis 1021 
necrosis 
contrasted with programmed cell 
death 1021, 1022F 
as typical of cancer cells 1099, 1110F, 
1116 
negative feedback 
Cdc20-APC/C 993 
in cell regulation 515-516 
in cell signaling 829-830, 853 
circadian clocks and 876-878 
delayed negative feedback 516, 517F, 
1179F 
Hedgehog pathway 873 
Hes and Her genes 1178, 1179F 
JAK-STAT signaling pathway 863 
NFκB pathway 874, 875F 
rate constant fluctuations 516F 
Smad signaling pathway 866 
Wnt signaling pathway 871 
negative feedback loops 402 
common to all cells 402 
rod cells 845 
negative regulation of enzymes 151, 152F 
negative selection 1332, 1333F 
negative staining 559-561 
Neisseria gonorrhoeae 
antibiotic resistance 19 
protection against the complement 
system 1303 
Neisseria meningitidis 1291 


Neisseria spp. 
phase variation 1290 
nematodes see Ascaris; Caenorhabditis 
elegans 
neoblasts 1248-1249 
neoplasms, defined 1092 
see also cancers; tumors 
Nernst equation 615-616 
nerve cells see neurons 
nerve impulses see action potentials 
nerve terminals 1209 
nesprins (KASH proteins) 949, 959 
netrins 1202-1204 
network motifs 402 
see also feedback 
neural crest cells 951, 959, 1041-1042, 
1076T, 1186 
neural development 1198-1213 
four phases 1199F 
vertebrates 1199-1200 
neural map formation 1204-1206 
neural stem cells 1201F, 1250-1251 
neural tube 1040F, 1041, 1045F, 1186, 
1192, 1199 
neuraminic acid, N-acetyl- (NANA) 575F 
neurites 940, 1202 
neuroblasts 1173-1174, 1179-1180, 1200 
neurodegenerative diseases 
amyloid fibrils in 130-131 
neurofilaments in 947 
plectin gene mutations and 949 
stem cell potential 1250-1251 
neurofilament proteins (NF-L, NF-M and 
NF-H) 944T, 947 
neurogenesis, in Drosophila 1173-1174 
neurogenic ectoderm 1166, 1173-1174, 
1179, 1186 
neurological diseases 939 
neuromuscular junctions 
acetylcholine receptors 630-631 
regeneration 1071-1072 
synapse formation 1209, 1210F, 1211 
neuromuscular transmission and ion 
channels 632-633 
neuronal doctrine 441 
neuronal migration 913F 
neuronal specificity 1205 
neurons 
absence of energy storage 87 
activity-dependent synaptic change 
1211-1212 
basket cell 1198F 
commissural neurons 1202-1204 
computation by single neurons 
633-636 
as examples of long-distance 
signaling 815 
firing rate 623, 635 
firing rule 1212 
fluorescence microscopy 543F 
growth cone 858, 943, 951 
intermediate filaments 947, 948F 
lateral inhibition 867, 868F 
liver cell conversion to 396, 397F 
microtubule-associated proteins 932, 
940F 
migration 1200, 1201F 
myelin sheathing 625 
neurotrophic factors and 1208-1209 
normal neuronal death 1208 
olfactory receptor 844F 


optogenetic control in mice 624 
presynaptic and postsynaptic 
628-629 
programmed production 1201F 
RNA localization 421 
secretory vesicles 744-746 
stathmin role in amygdala 935 
structure and function 620-621, 940 
survival factors 1030F 
sympathetic 1018 
turnover rate 1250 
types and firing properties 627 
types of ion channel 614, 631 
visualizing in mouse brain 502F 
see also axons; central nervous 
system; dendrites; glial cells; 
synapses 
“neurospheres” 1250, 1251F 
Neurospora 416-417 
neurotransmitter receptors 
ionotropic and metabotropic 630 
neurotransmitter transporters 603, 604F, 
628 
neurotransmitters 
binding to transmitter-gated channels 
629 
excitatory and inhibitory 629-630 
as extracellular signaling molecules 
815-818, 825 
in synaptic vesicles 745-746 
see also transmitter-gated ion 
channels 
neurotrophic factors 1208-1209 
neurotropic alpha herpesviruses 1288 
neutrophils 1239, 1240F, 1241, 
1244-1246 
chemotaxis 958, 959F 
CSFs and production of 1245 
cytoskeletal rearrangements 890, 892 
as professional phagocytes 739, 
740F, 1301 
newts 
limb regeneration 1248-1249 
lung cells 991 
NFκB signaling pathway 873-874, 1301 
NGF (nerve growth factor) 
action via MAP kinases 856 
action via receptor tyrosine kinases 
850T, 853-854 
as a neurotrophic factor 1208-1209 
sympathetic neuron growth and 1018 
NHEJ see nonhomologous end joining 
NHL (non-Hodgkin’s lymphoma) 1092F 
nicotinamide derivatives see NADH; 
NADPH 
nidogen 1058F, 1070, 1071F 
nitric oxide (NO) 846-847 
nitrogen cycle 85-86 
nitrogen fixation 12, 14F, 86 
nitrogen in C-N chemical groups 91 
nitroglycerine 847 
nitrosated polyamines and peptides 267T 
nitroxide spin labels 569-570 
NLRs (NOD-like receptors) 1300-1301 
NMDA receptors (N-methyl-D-aspartate) 
636-637 
NMR (nuclear magnetic resonance) 
studies 
of human brain 913F 
of protein structure 461-462 
of transcription regulators 375 


nocodazole 904T 
NOD-like receptors (NLRs) 1300 
Nodal protein 1168-1169 
nodes of Ranvier 625 
Noggin protein 1168-1169 
noise 
in data points 525 
in genetic control systems 1246 
in intracellular signaling 820-822, 831 
in microscope images 532-533, 561 
non-crossovers 284-286 
non-genetic variability 524 
non-histone proteins 187, 209-210 
nonclassical cadherins 1037T, 1038, 
1039F, 1046F 
noncoding DNA 
conserved 224-225 
in humans and other species 28-29, 
182 
in the nucleolus 330 
regulatory DNA as 7, 29 
see also introns 
noncoding RNAs 305, 327 
localization with in situ hybridization 
502 
regulation of gene expression 
429-436 
RNA sequencing and 482 
small ncRNAs and RNA interference 
429-431 
noncovalent interactions/noncovalent 
bonds 
in biological membranes 565 
cytoskeletal filaments 893 
four types 44-45, 45T, 94-95 
ligand binding 134, 139F, 140 
in macromolecules 49-50 
membrane-associated proteins 577 
in protein folding 110-111, 114 
nondisjunction 1010, 1109F 
nonenveloped viruses 1274, 1280-1281 
nonfibrillar collagens 1063T 
nonhomologous end joining (NHEJ) 
274-275, 282, 289 
B cell switch sequences 1323 
homologous recombination and 275F, 
278 
nonretroviral retrotransposons 291-292 
nonsense-mediated mRNA decay 327F, 
352-353 
normal flora 1264-1265, 1276, 1293, 
1298 
NOS (NO synthases) 847 
Notch protein 
activation by proteolysis 869F 
O-glycosylation 720, 868 
Notch receptor 867-868, 869F, 1117 
Notch signaling pathway 
diversity 1150 
Hes genes 1178 
lateral inhibition 1152, 1171-1178, 
1224-1225 
stem cell maintenance 1224-1225 
notochords 1147F, 1167, 1185F, 1189F, 
1200F 
Noxa protein 1028 
NPCs see nuclear pore complexes 
NPV (human papillomavirus) 1131-1132 
NSF protein 709, 712F 
NtrC protein 383F 
nuclear envelope 
DNA localization and 178-179 
as feature of eukaryotes 24 
inner and outer membranes 649 
intermediate filaments and 890 
linker proteins 948-949 
in mitosis 656-657, 978, 980-981, 
985-986, 995 
repair factory tethering 213-214 
nuclear export signals/receptors 652-655 
nuclear import signals/receptors 650, 
652, 653F, 654-656 
nuclear lamina 
heterochromatin and 211-212 
intermediate filaments and 179, 180F, 
891, 944 
in mitosis 656, 986, 995 
role 649, 650F 
nuclear lamins 
A-type lamins 948 
caspase cleavage of 1023 
nuclear localization signals 650-651, 
652-653F, 654-655, 657 
amino acid phosphorylation 655 
import receptor binding 652 
nuclear matrix or scaffold 214 
nuclear pore complexes (NPCs) 
arrangement 651F 
export of mRNA-protein complexes 
326-327 
gated transport 646 
import receptor binding 652 
phosphorylation in mitosis 985 
Ran-GTPase and 653-654 
virus entry 1281 
nuclear pores 179, 180F, 202F, 213, 327, 
559F, 561F 
nuclear receptor superfamily 875-876, 
877F 
nuclear reprogramming 1252-1253 
nuclear-to-cytoplasmic ratio 1182 
nuclear transplantation 1252 
nuclear transport receptors 326, 653 
nucleation, actin filament formation 
899-900, 902, 906-907, 
908-909F, 953, 954F, 1289F 
nuclei 
as characteristic of eukaryotes 13 
DNA virus replication 1278 
human chromosome localization 
211-212, 213F 
as intracellular compartments 642 
loss from red blood cells 1245F 
noncoding RNAs 327-330 
protein import 649 
RNA export 325-327, 419-421, 649, 
655 
subcompartments and structures 
213-214, 331-333 
transplanted into enucleated eggs 
205, 206F, 369-370 
transport between cytosol and 
649-658 
nucleic acid structures 1101 
nucleic acid synthesis 71-72 
nucleoids 759, 800F 
nucleolus 213, 329-330, 331F, 340 
snoRNAs 305T, 328 
nucleoporins 126, 649-650, 651F, 656 
nucleoside triphosphate hydrolysis 894, 
896, 902-903 
see also ATP hydrolysis; GTP 
hydrolysis 
nucleosome assembly factors 976 
nucleosome core particles 
DNA packaging in 188-190 
histone proteins in 187-188, 189F 
susceptible sites to DNA damage 
267F 
nucleosome mark recognition 199F 
nucleosome sliding 190, 191F 
nucleosomes 
assembly behind replication forks 
261-265 
as basic chromosome structures 
187-188 
chromatin remodeling and 190-193, 
380 
cooperative binding and 379-380 
DNA replication and 254 
fibroblast reprogramming as iPS 1256 
histone H1 binding 193F 
nucleotide excision repair 2661, 270-271 
nucleotide-gated channels 614 
nucleotide sequences see DNA 
sequences 
nucleotide-sugar intermediates, in 
glycosylation 684 
nucleotides 
in activated carriers 69 
biosynthesis 86 
complementarity 3F 
functions 101 
in the nitrogen cycle 85-86 
nomenclature 101 
as nucleic acid monomers 3 
number in human genome 28 
number in the nucleosome 188 
as phosphorylated nucleosides 101 
structures 3-4, 100-101 
see also bases 
nucleus see nuclei 
nullcline analyses 518, 519F 
numerical aperture and resolution 532, 
533F 
numerical integration 512 


O 


O-linked glycosylation 719 
O-linked oligosaccharides 684, 718, 868 
O6-methylguanine 271 
obesity 1115, 1128F, 1129, 1264 
occludin 1049-1050 
Oct4 transcription regulators 398-399F, 
506F, 1254-1255 
see also OSKM factors 
β-octylglucoside 583, 584F 
ocular dominance columns 1212 
Okazaki fragments 241-242, 243F, 
245-250, 253-254, 255F, 261 
oleic acid 98 
olfactory neurons 943, 1227 
olfactory receptors 832, 843-844, 846T, 
1250F 
oligodendrocytes 625 
oligonucleotides, affinity chromatography 
449 
oligosaccharides 97 
glucose trimming 683F, 684-685 
mannose trimming 686 
N-linked 683 
processing in the ER and Golgi 
apparatus 718F 


processing in the Golgi apparatus 
716-718 
oligosaccharyltransferases 677, 683F, 
684, 717F 
Omi protein 1029 
OMPLA protein 581F 
oncogene dependence 1135 
oncogenes 
collaborative action 1118 
discovery 1105-1107 
DNA sequence changes 1110F 
gain-of-function mutations and 1104 
v-Ras 1106 
from viruses 1131-1132 
see also proto-oncogenes 
oocytes see eggs 
open reading frames (ORFs) 424, 457, 
482 
ribosome profiling technique 505-506 
operons 380-385, 385 
Lac operon 381F, 382-383 
opportunistic pathogens 1265, 1268, 
1276 
opsin 845 
opsonization 1301F 
optic cup 1258F 
optic tectum 1204-1206, 1207F 
optical isomers, amino acids 112 
optical microscopy see light microscopy 
optical sections 540 
optical techniques, protein interactions 
458 
optogenetics 624-625 
OR logic 521F 
ORC (origin recognition complex) 
259-261, 974-975, 976F 
order 
in biological structures 51F 
disorder as entropy 52-53, 60, 103 
thermodynamics of 52-54 
organelles 
biparental and maternal inheritance 
807 
endosymbiont hypothesis 800 
energy-converting 753 
four families of 645 
gene transfer to the nucleus 801-802 
growth and proliferation 648, 658, 
800, 806 
lipid droplets 573 
membrane-enclosed 565, 641-649, 
889, 896, 938-939, 1001 
not constructed de novo 648, 658 
protein movements between 645-647 
Rab family GTPases in 706T 
secretory and endocytic pathways 
646F 
from subcellular fractionation 445 
volumes in a liver cell 643T 
organic chemistry 
biological importance 47 
chemical bonds and groups 47, 
90-91 
organisms 
shared proteins 482 
size differences 1010 
see also species 
Organizer signaling center 1167-1168 
organoids 1223 
organotrophic organisms 11 


organs 
generation from stem cells 
1256-1257, 1258F, 1266-1267 
grafted/transplanted 1329, 1331 
growth of animals and 1193-1198 
regeneration of 1247-1249 
size regulation 1226 
sizes of transplanted 1193 
transcription regulators and creation of 
1170-1171 
see also lymphoid organs 
origins of replication see replication origins 
orphan receptors 832, 875 
orthologs 17-18, 36 
oscillations 
Ca2+ waves 838-840, 842, 843F 
circadian clocks 876-878 
metaphase chromosomes 991 
negative feedback effects 516, 517F, 
829F, 830, 875F, 876-878 
NFκB activation 874, 875F 
and robustness 520 
vertebrate segmentation 1177 
OSKM factors (Oct4, Sox2, Klf4, and Myc) 
1254-1255, 1256F 
osmium tetroxide 555-556, 716F 
osmotic equilibrium 1083 
osmotic gradients 612 
osmotic pressure 612, 619-620, 724 
osmotic stress 857, 1115 
osteoblasts 1057, 1229-1232 
osteoclasts 1230-1232, 1239, 1243F 
osteocytes 1229-1230 
osteoporosis 1232 
ovarian cancer 1113, 1116 
OXA complex (cytochrome oxidase 
activity) 659, 660F, 663F, 664 
oxaloacetate 70F, 83-84, 85F, 87, 
106-107 
oxidation 
as an electron transfer 55-56 
DNA damage from 267T 
of organic molecules 54-55 
oxidative phosphorylation 84-85, 86F, 
753, 759F, 761-763 
in cancer cells 1098 
in plants 786 
oxidative stress 1115 
oxygen 
C-O chemical groups 91 
carbonyl oxygen atoms 612, 618, 
619F 
detoxification 796 
origins of atmospheric 11, 26, 
796-797 
utilization in peroxisomes 666 
see also molecular oxygen 
oxygen-carrying molecules, evolution 
229-230 

P 

P-bodies (processing bodies) 427-428, 
430 


P680 chlorophyll 790, 791F, 794F 
P element, Drosophila 416, 486 

P-glycoprotein (MDR) 610, 1293 
P-granules 1002F 

32P labeling 466-467 

p21 protein 1014, 1116 

p27 protein 971F, 973T, 1004 


p53 protein 
apoptosis-promoting function 1015, 
1028, 1115-1116 
cell-cycle arrest 1016 
loss of function in cancers 1126, 
1132-1133 
post-translational modification 166 
target of Chk1 and Chk2 kinases 
1014 
as transcription regulator 1116 
p53 regulatory pathway in cancers 
1113-1116, 1123 
P-type pumps 606-608, 690 
pachytene 550F, 1006, 1007F 
packaging proteins 704 
PAGE (polyacrylamide-gel electrophoresis) 
see polyacrylamide-gel; 
SDS-PAGE 
pair-rule genes 1159-1160, 1162 
pairing, homologs 1005F, 1006-1007 
PALM (photoactivated localization 
microscopy) 552, 553F 
palmitic acid 98 
PAMPs (pathogen-associated molecular 
patterns) 
dendritic cell activation 1306F, 1326 
peptidoglycans and LPSs as 1267 
recognition by PRRs 1298, 1300 
PAMs (protospacer adjacent motifs) 434F, 
497 
pancreas 
acinar cells 748 
β-cell renewal 1226-1227 
exocrine cells 671F 
membrane types in liver and 643T 
negative selection to protect 1332- 
1333 
pancreatic cancers 1117 
pandoraviruses 1274 
Paneth cells 1218-1219, 1221-1225 
PAP (poly-A polymerase) 325 
papillomaviruses 1129, 1130T, 
1131-1132, 1276 
PAPS (3'-phosphoadenosine- 
5'-phosphosulfate) 719 
paracellular pores 1049 
paracellular transport 1047 
paracrine action, Type I interferons 1304 
paracrine signaling 815 
paralogs 
distinguished from homologs and 
orthologs 17-18 
in vertebrate genomes 34, 120 
Paramecium 614, 941 
parasites, eukaryotic 1277, 1282-1284, 
1290 
parasitism 1264 
parathormone 835T 
Parkin ubiquitin ligase 727 
Parkinson’s disease 130, 324, 727, 1251 
ParM protein 897 
PARP (PolyADP-ribose polymerase) 
inhibitors 1133-1135, 1139 
parthenogenicity 987 
parvin 1079 
parvoviruses 1274 
passenger mutations 1104, 1111-1112, 
1119F, 1137 
patch-clamp recording 626, 627F 
Patched receptors 871-873 
paternity testing 476 


pathogenicity islands 1268-1269, 1282F 
pathogens 
cross-species transmission 1279, 
1291 
drug resistance 1291-1294 
engulfment by phagocytic cells 
1301-1302 
epithelial barrier to infection 1265, 
1276-1277 
evolution by antigenic variation 
1289-1291 
extracellular 1269, 1277-1278 
facultative and obligate 1268 
fungal and protozoan 1271-1273 
host interactions 1264-1265 
host specificity 1265 
infection strategies 1276-1294 
insect vectors 1276 
intracellular 1278-1279, 1282, 1283F, 
1284-1285 
opportunistic pathogens 1265, 1268, 
1276 
primary pathogens 1265, 1276 
use of the host’s cytoskeleton 
913-914, 1286-1288 
viral, bacterial and eukaryotic 1266 
viruses causing human diseases 
1273T 
see also bacteria; parasites; viruses 
patterns, biological see order 
patterns, in development see spatial 
patterning 
Pax6 gene (Eyeless) 397-398F, 1146F, 
1171 
paxillin 1037T, 1080 
PCNA 247, 262 
PCR (polymerase chain reaction) 
chromosome structure analysis 209F 
DNA cloning using 473-477 
ion torrent™ sequencing 481 
quantitative RT-PCR 502-503 
PD1 receptors 1138F, 1139, 1337 
PDGF (platelet-derived growth factor) 
action via receptor tyrosine kinases 
850T, 853F 
as a mitogen 1011-1012 
pericyte and smooth muscle 
recruitment 1238 
production of embryoid bodies 1257F 
PDI (protein disulfide isomerase) 682, 686 
PDK1 (phosphoinositide-dependent 
protein kinase 1) 860 
Pdm transcription regulator 1179 
PDZ domains 1050 
pectins 1083-1084 
pemphigus 1046 
penicillin 1267, 1291, 1293 
penis, cyclic GMP in 847 
pentose phosphate pathway 760 
peptide-binding grooves, MHC proteins 
1327 
peptide bonds 
in the α-helix and β-sheet 116F 
as C-N bonds 91 
energetics of 61, 346 
hydrogen bonding among 94, 
110-112 
in peptides and proteins 110-112 
in protein synthesis 339-344 
see also polypeptides 
peptide synthetase 365 
peptidoglycans 1267, 1292 
peptidyl-transferases 343, 348, 352T 
peptidyl-tRNAs 339, 340F, 342F, 348, 
351F 
Per protein 878 
“perfect enzymes” 143 
perforin 1334 
pericentric heterochromatin 204F 
pericentriolar material 930, 931F, 943F 
pericentriolar matrix 982, 983F, 984, 985F 
pericytes 1235, 1236F, 1238 
perinuclear space 649, 650F 
periodic table of the elements 43F 
peripheral tolerance 1314 
periplasmic substrate-binding protein 
610F 
periplasms 1267F 
periventricular heterotopia 912, 913F 
PERK protein kinase 688 
perlecan 1058F, 1070, 1071F 
permeases see transporters 
peroxidases 735F 
peroxins 667 
peroxisome proliferation-activated 
receptors (PPARs) 875 
peroxisomes 
Abcd1 gene and 301F 
among intracellular compartments 
642 
as organelles 666-669 
persistence length 898, 926, 945 
persistence of responses, intracellular 
signaling 825 
persistent inputs, and feed-forward motifs 
522-523 
pertussis (whooping cough) 834, 1277 
PET (positron emission tomography) 
1092F 
Pex1 and Pex6 ATPase 668 
Pex5 import receptor 667-668 
Peyer’s patches 1308F, 1311 
pH 
acidity of lysosomes 722-723 
KDEL receptor affinity and 714 
pH scale 46, 93 
regulation by vacuoles 724 
regulation of cytosolic 604-605 
pH changes, ion torrent™ sequencing 
481 
pH gradients 
contribution to electrochemical 
gradients 662, 762-763 
isoelectric focusing 453 
see also proton gradients 
PH (pleckstrin homology) domains 822, 
824F, 859F, 860 
phagocytosis 
of bacteria by host cells 1281-1282 
coiling phagocytes 1286 
defined 730 
as feature of eukaryotes 24, 25F 
as a lysosome delivery pathway 725 
plasma membrane enlargement 748 
professional phagocytes 739 
see also macrophages; neutrophils 
phagolysosomes 1284, 1301, 1306F, 
1329 
phagosomes 725, 738-740 
autophagosomes 725F, 726-727 
Listeria monocytogenes and 1284F 
PRRs and 1298 
Salmonella enterica and 1285F 
phalloidins 904, 916F, 953-954F, 957F 
pharmaceutical production through DNA 
cloning 484, 506 
see also drug discovery 
phase, light waves 531, 532F 
phase-contrast microscopes 533-535, 
591F 
phase transitions, lipid bilayers 571, 572F 
phase variation, bacteria 294, 1290 
phencyclidine 636 
phenobarbital 1022 
phenotypes 
behavioral changes 488 
defined 485-486 
stochastic effects on 523-524 
synthetic 491 
phenylalanine 
structure 113 
tRNA for 335F 
pheromones 844, 857 
Philadelphia chromosome 1093, 1094F, 
1095, 1135 
phosphate bonds 
bond energies 78, 79F 
phosphates and phosphoanhydrides 
65-66F, 78, 79F, 91, 101 
phosphates, position in nucleotides 100 
phosphatidylcholine 567, 570F, 571T, 
572F, 573 
synthesis 689 
phosphatidylethanolamine 566-567, 
571T, 574, 689-690 
phosphatidylinositol (PI) 574, 577F 
interconversion with PIPs 700, 701F, 
737, 859 
phosphatidylserine 566-567, 571T, 574, 
689-690 
and phagocytosis 740, 1030-1031 
and PKC 837 
phosphodiester bonds 
reformation by DNA ligase 246F, 
269-270 
reformation by DNA topoisomerase 
251-252 
in RNA 302F 
phosphodiesterases see cyclic AMP; 
cyclic GMP 
phosphoenolpyruvate 79F, 85F, 105 
phosphofructokinase 104 
phosphoglucose isomerase 104 
3-phosphoglycerate 76, 77-78F, 85F, 105, 
785 
phosphoglycerate kinase 77F, 78, 105 
phosphoglycerate mutase 105 
phosphoglycerides, cell membrane 
566-567 
phosphoinositide phosphatases 859 
phosphoinositides 
AP2 binding to 699 
marking organelles and membrane 
domains 700 
membrane bending and 594 
in signaling complex formation 822, 
823F 
phospholipases 574 
phospholipase C 574 
phospholipase C-β (PLCβ) 836, 837T, 
838F, 840F, 859, 862F 
phospholipase C-γ (PLCγ) 852, 853F, 
859, 862F 
phospholipase C-ζ (PLCζ) 839F 
phospholipid exchange/transfer proteins 
691 
phospholipid translocators 570, 574, 690 
phospholipids 
G protein signaling through 836-838 
mobility 569-571 
in plasma membranes 9, 98 
sites of synthesis 760 
spontaneous bilayer formation 
568-569 
structures 566-567 
phosphorylation 
autophosphorylation 364F, 688, 841, 
842F 
in the cell-cycle control system 968, 
970F 
initiation factors 423-425 
nuclear localization signals 655 
regulating protein degradation 360 
regulating protein function 153-156, 
165T 
regulating signal responses 826 
self-phosphorylation of P-type pumps 
607 
of serine in RNA polymerase tails 312, 
316-317 
of serine in the nucleosome 196-197 
phosphorylation cycles 156 
phosphotyrosine docking/binding sites 
824F, 850, 852-853, 863 
photoactivation, fluorescent dyes 544, 
546F 
photochemical reaction centers 783-784, 
788-790, 793, 796 
photolyases 885 
photoproteins 884 
photoreceptors, rod and cone 844-846, 
848 
photorespiration, peroxisomes in 667, 
668F 
photoswitchable probes 551 
photosynthesis 
and atmospheric oxygen 796-797 
ATP production by 753 
categories of reaction 783-784 
charge-separation in 788-789, 
792-793 
as complementary to respiration 55 
electron-transfer process 788 
energetics of 54F 
green sulfur bacteria 796 
role of chloroplasts in 782-799 
thylakoid membrane as site of 
786-787 
see also chloroplasts 
photosynthetic reaction centers 588 
photosystems 
in photosynthetic organisms 754-755, 
789F 
of the thylakoid membrane 786 
photosystem I 755F, 789, 790F, 792-793, 
794F 
photosystem II 755F, 789-791, 792F, 793, 
794F 
photosystem II complex 588, 791F 
phototrophic organisms 11, 13, 14F 
phototropin 885 
phragmoplasts 1000, 1001F 
phylogenetic classification of bacteria 
1267 


phylogenetic trees (tree of life) 218-220, 
221F 
construction 10 
genomic analysis and 14 
primary branches 14-15, 220 
phytochromes 883-885 
PI 3-kinase/Akt/mTOR signaling pathway 
1114-1115 
PI 3-kinase-Akt signaling pathway 
860-861, 1030F 
TOR and 861, 1017 
see also RTK/Ras/PI3K pathway 
PI 3-kinase (phosphoinositide 3-kinase) 
and cell survival 860-861 
in chemotaxis 958-959 
classes 1a and 1b 859 
as a growth factor 1017 
signaling protein recruitment 574, 852 
pili 1267, 1277 
pilin gene/protein 1290 
pinch 1079 
pineal gland 877 
Pink1 protein kinase 727 
pinocytosis 730-732 
macropinocytosis 725, 732, 733F, 738 
PIPs (phosphatidylinositol phosphates) 
700, 701F, 703 
PI(3)P 701F, 707, 737 
PI(3,4)P2 701F 
PI(3,4,5)P3 701F, 739, 740F, 859-860, 
958, 1017F, 1115, 1117F 
PI(4)P 701F 
PI(4,5)P2 701-702, 733, 739, 
836-837, 859-860, 959, 
1017F, 1078F 
PI(5)P 701F 
piRNAs (Piwi-interacting RNAs) 305T, 
429, 433 
pituitary glands 1196 
Pitx1 gene expression 1174-1175 
PKA (protein kinase A) see cyclic-AMP- 
dependent protein kinase 
PKB (protein kinase B) see Akt 
PKC see protein kinase C 
plague 1276, 1281 
plakins 948-949, 959 
plakoglobin (y-catenin) 1037T, 1042, 
1046F 
plakophilin 1037T, 1046F 
planar cell polarity 1189-1190 
planar polarity pathway 869 
planarian worms 1247-1249 
plant growth regulators 881, 1087 
plants 
cell growth and cell wall orientation 
1085-1087 
cell size and ploidy 1194 
cell walls 26, 1000, 1053, 1081-1087 
cytokinesis in 1000-1001 
energy storage as starch 80-81 
flowering times 1182-1184 
intracellular signaling pathways 880- 
885 
model organism 32 
and nitrogen-fixing bacteria 12 
peroxisomes in 667, 668F 
receptor serine/threonine kinases in 
881 
regeneration in culture 442 
RNAi and viruses 431 
transgenic plants 507-508 


plasma, distinguished from serum 1011 
plasma cells, effector B cells as 1309 
plasma membrane repair 748 
plasma membranes 
Ca2+ gradient/pumps 607, 838 
carbohydrate layers protecting 582 
cell polarization and 748-750 
clathrin-coated pits 553F 
composition in eukaryotes 572 
depolarization by action potentials 
607, 621-624, 627-629, 632, 
634-636 
depolarization by neurotransmitters 
629, 631, 633 
endocytosis 730-741 
endosome recycling 737-738 
hyperpolarization 844-845 
pinocytic vesicle formation 731 
potential enlargement by secretory 
vesicles 746-748 
recruitment of intracellular signaling 
proteins 700 
red blood cell 565F 
soluble and membrane-bound 
immunoglobulins 1317 
synaptic vesicles from 746 
universality of 8-9 
virus fusion 1280 
plasmalogens 667 
plasmid vectors 468-469, 508F 
plasmids 
origins 18 
segregation 897 
tumor viruses 1130-1131 
virulence plasmids 1268 
plasmodesmata 1053-1054, 1082F 
Plasmodium (P. falciparum) 610, 801F, 
804, 805F, 1272-1273, 1276 
life cycle 1272F 
plastids see chloroplasts 
plastocyanin 789, 792-793, 794F 
plastoquinol 791 
plastoquinones 767F, 789, 791-792F, 793 
platelets 1002, 1011, 1054-1055, 1076T, 
1077-1078, 1241T 
derivation from megakaryocytes 
1002, 1239 
PLCs see phospholipase C 
plectin 933, 948-949, 1076 
Plk (Polo-like kinase) 978-979, 985 
ploidy, and cell size 1194 
pluripotency 
blastula stage 1148 
fertilized egg 1253 
neoblasts 1248 
see also stem cells 
plus-end-depolymerization, kinetochores 
991 
plus-end-directed transport 938 
plus-end-polymerization, kinetochores 
987-988 
+TIPs (plus-end tracking proteins) 933, 
935, 958, 960 
podosomes 952 
point mutations 218, 487 
cancer genomes 1111, 1116 
immunoglobulins 1322 
Ras oncogenes 1106 
RNA splicing errors and 324 
tumor suppressor genes 1109 
point spread functions 540, 551-552, 
553F 


polar amino acids 109-110 
polar covalent bonds 56 
polar ejection force 991-992 
polarity 
actin filaments 892F, 894, 896, 898, 
899F 
spindle bipolarity 986-987 
polarization of the embryo 1155-1157 
polarized light 458, 459F 
Polδ and Polε 246 
poliomyelitis 1266F, 1273T, 1275, 1281 
poliovirus 1266F, 1273T, 1274, 1281, 
1288-1289 
polyacrylamide-gel electrophoresis 
for DNA 465, 466F 
protein phosphorylation 879F 
SDS-PAGE 452, 453F, 454 
polyacrylamide-gel electrophoresis (PAGE) 
372F, 454, 455F, 466F, 879F 
see also SDS-PAGE 
polyadenylation 
effects on resultant protein 417-418 
of mRNA 3′ ends 315-316, 324-326, 
327F 
poly-A shortening 426-427 
polycistronic transcripts 348, 806-807 
polycomb chromosomes 208-211 
polycomb group proteins 1164, 1165F, 
1183 
polycomb protein group (PcG) 206, 211 
polyethylene glycol 444F 
polyisoprenoids 99 
polymers see macromolecules 
polymorphisms 
MHC genes/proteins 1328, 
1330-1331 
SNPs 232-233, 492, 493F 
types 492 
polymorphonuclear leukocytes see 
neutrophils 
polypeptides 
amino acid addition 339-344 
disordered 125 
loops as binding sites 118F, 121, 138 
mode of ER membrane transport 
675-677 
multiple chains in large proteins 123 
number of possible variations 
118-119 
proteins as 6, 109 
side chains of 109-110 
synthesis outside ribosomes 365 
polyploidy, megakaryocyte nucleus 1242 
polyprotein precursors 744 
polyps, adenomatous 1123 
polyribosomes 349, 675 
polysaccharides 97 
glycogen as 79 
Golgi apparatus synthesis 711 
lysozyme action on 144 
pectins 1083-1084 
synthesis by condensation reactions 
71F 
see also cellulose 
polytene chromosomes 208-211, 391F, 
540F 
polyubiquitin chains 157-158 
polyubiquitylation 
caspases 1029 
distinguished from monoubiquitylation 
735 
population size 
and neutral mutations 231 
and response smoothing 829 
porins 
aquaporins 580F, 599, 612-613 
in bacteria and mitochondria 662, 758 
barrel formation 580F, 581 
distinguished from channels 611 
porphyrin rings 766, 787 
Porphyromonas gingivalis 1266 
position effect variegation 195-197, 
201-202, 205 
position effects 
embryonic transcription regulators 
393 
and gene silencing 194 
positional labels 1161, 1206 
positional values 1163, 1169, 1205 
positive feedback 
cell memory and 520 
CICR (Ca2+-induced calcium release) 
838 
cooperative binding and 517, 519F 
heterochromatin formation 195 
iPS conversion 1255 
in M-Cdk activation 979 
pattern generation 1152 
self-amplification of cell differences 
1152 
self-amplification of nerve impulses 
621 
vesicle recruitment to membranes 
707 
positive feedback loops 
cell memory and 401-402, 412, 413F, 
432 
helper T cells 1337 
M-Cdk in 1004 
switches and bistability 517, 518-520, 
827-829 
positive regulation of enzymes 151, 152F 
positive selection 1332 
post-transcriptional controls 413-428 
post-translational and co-translational 
processes 670, 677 
post-translational import 
endoplasmic reticulum 677, 683, 685 
mitochondria and chloroplasts 659, 
664, 670 
post-translational modifications 
covalent modification 165-166 
of GFP 543 
mapping using tandem mass 
spectrometry 457 
multisite 166 
post-translational translocation 677, 678F 
postcapillary venules 1311-1312 
potassium see K+ 
poxviruses 1274, 1280, 1289 
PPARs (peroxisome proliferation-activated 
receptors) 875 
pre-mRNA modification and RNA splicing 
316F, 317-321, 323F 
pre-synaptic terminals 747F 
precursor oligosaccharides 683-684 
premature aging 265 
RNA splicing errors and 324 
Xpd knockout mice 497F 
prenyl chains 572, 577F, 578 
preprophase 999, 1001F 
prereplicative complexes (preRCs) 
259-260, 974-975, 976F, 1002 
presenilin 868 
presomitic mesoderm 1177-1178 
prickle cells 1226F 
primary axes, embryo polarization 
1155-1157 
primary cell walls 1082-1085 
primary cilia 824F, 845F, 873, 942-943, 
949, 950F 
primary ciliary dyskinesia 942 
primary cultures 441 
primary pathogens 1265, 1276 
primary tumors 
derived from one abnormal cell 
1093-1094 
genomes of metastases and 1119 
primer strand, DNA 240-241F, 243, 255, 
262 
primers, RNA 245, 247, 249, 253 
prion diseases 130-131, 132F 
probability 
statistical methods 524-525 
thermodynamics and 52, 60, 102-103 
procollagen 704, 705F, 720 
professional antigen-presenting cells 
(APCs) 1307, 1328 
professional phagocytic cells 
macrophages and neutrophils as 739, 
1298 
PRRs on 1298 
profilins 905-907, 909F 
progenitor cell commitment 1243 
proinsulin 130 
prokaryotes 
as bacteria and archaea 14-15 
circular DNA in 23F 
distinction from eukaryotes 12-13 
diversity 12-14 
horizontal gene transfer 18-19 
mRNA compared to eukaryotic 316F 
prolactin 864T 
proline 
in collagens 1062 
in elastin 1065 
hydroxyproline 1062F, 1084 
structure 113 
prometaphase 978, 980, 986, 990-991, 
992F, 994F 
promoter complexes 510-512 
promoters 
in CG islands 406-407 
fraction bound by activators and 
repressors 511-512, 517F 
gene clusters from 380 
in gene control regions 384 
orientation and DNA inversion 294 
protein concentrations and 513 
repressors and activators 381-388, 
386 
in transcription 306, 308-309, 388 
proneural genes/cells 1172 
proofreading 
aminoacyl-tRNAs 339, 344-345 
kinetic proofreading 345 
proofreading, DNA replication 242-244, 
250-251, 257F 
mismatch proofreading 250-251, 
257F 
translesion polymerases 273 
see also error correction; quality 
control 
proopiomelanocortin 744F 


prophase 
formation of sister chromatids 964 
M-Cdk activity 978, 985-986 
in meiosis 1006-1007 
in mitosis 978-980 
preprophase 999, 1001F 
spindle assembly 985 
prostaglandins 837 
prostate cancer 1117, 1128-1129F 
proteases, serine, and matrix 
metalloproteases 1072 
proteasomes 
function 357-359, 685 
ubiquitylation and 157, 358F 
protein activity control 
gene expression 372, 373F 
hyperactivity in cancer 1106-1107 
protein arrays 458 
protein assemblies 
cooperative allosteric transition 
152-153 
as molecular machines 164 
tetranucleosomes 192 
for transcription initiation 313 
protein-coding genes 120, 122-123 
protein complexes 
aggregation to achieve retention 714 
cellulose synthase rosettes 
1085-1086 
intracellular signaling complexes 822, 
823F, 851F, 859 
membrane proteins 588-589 
see also Arp2/3 complex 
protein concentrations 513-514 
protein degradation 
effects on mean lifetimes 514 
N-terminus acetylation and 360-361 
phosphorylation regulating 360 
protein domains 
ABC transporters 609F 
domain shuffling 121-122 
and genetic recombination 318 
in immunoglobulins 230 
as modules 117-118, 121-122 
as secondary structure 117-119 
SH2 example 115 
transmembrane proteins 579-580 
Type III fibronectin repeats 1066F, 
1067-1068, 1073 
protein families 
evolution of kinases 154-155 
examples 119 
ligand binding sites 136-137 
mitochondrial carrier family 779-780 
trimeric GTP-binding proteins 846T 
protein folding 
co-translational 353-354, 355F 
as conserved 460 
energetics of 114-115, 354 
formation of binding sites 6, 135F 
molecular chaperones and 114 
multipass transmembrane proteins 
581F 
NMR spectroscopy and 462 
noncovalent bonds in 50, 110-114 
oligosaccharide tagging 685 
see also α-helices; β-sheets; protein 
misfolding; protein unfolding 
protein function 
binding ability and 134-169 


FRET monitoring of dynamics 
543-546 
inferring from interaction maps 168 
investigation using site-specific 
recombination 294-295 
phosphorylation in regulating 153-156 
protein structure and 462-463, 483 
proteins with unknown function 123 
protein kinase A (PKA) see cyclic-AMP- 
dependent protein kinase 
protein kinase B (PKB) see Akt 
protein kinase C (PKC) 
as calcium dependent 837, 852 
plasma membrane binding 574 
protein kinases 
cyclic-AMP-dependent (PKA) 827, 
834-837, 841, 843, 845T, 848 
as enzyme-coupled receptors 819 
evolutionary tree in eukaryotes 
154-155 
illustrating feedback mechanisms 
829F 
initiating autophagy 726 
regulation of molecular switches 819 
and replication origins 260 
as serine/threonine or tyrosine kinases 
819 
see also cyclin-dependent kinases; 
MAP kinases; serine kinases; 
tyrosine kinases 
protein machines 
ATP synthase as 754, 776-778 
coordination in 164 
recombination complexes 1006 
replication machine 249-250 
V-type pumps as 608 
protein misfolding 
BiP chaperones and 683, 712 
ER export and degradation 685-686, 
712 
prion diseases 130-131 
retrotranslocation 358, 686 
protein-only inheritance 131 
protein phosphatases and molecular 
switches 819 
protein phosphorylation, enzymes 
154-156 
protein-protein interactions 
aquaporins 580F 
biochemical and optical methods 
457-459 
cytoskeletal filaments 894 
identifying interacting proteins 
457-458 
interaction mapping 166-168 
interface types 137 
involving transcription regulators 385 
mediated by interaction domains 852 
regulation by 157 
transmembrane proteins 580, 590 
use of FRET 459 
see also protein subunits 
protein sequences, interspecies 
comparisons 36, 37F 
protein sorting 
pathways in the TGN 742 
in polarized epithelial cells 749F 
see also signal sequences; sorting 
signals 
protein structure 
coiled coils 116-117 


immunoglobulin fold 121 
models and representations of 115F, 
460, 461F 
primary, secondary, tertiary and 
quaternary 117, 841 
and protein function 462-463, 483 
size limits of analytical techniques 
462 
specified by amino acid sequence 
109-114 
three-dimensional 462-463, 579, 586, 
587F, 607 
using NMR spectroscopy 461-462 
using x-ray diffraction 460, 461F 
see also α-helices; β-sheets 
protein subunits 123, 127-128 
actin and tubulin 893-894 
activating subunits in mitosis 971 
symmetrical assemblies 152-153 
trimeric GTP-binding proteins 
832-834 
protein synthesis 
by condensation reactions 71F 
cytosol as location of 641 
global regulation 423-424 
inhibitors 351, 352T 
on polyribosomes 349 
possible evolution 365-366 
quality control and regulation 
351-353, 357, 361F 
speed of 341 
as transcription and translation 299 
transfer RNA role 334-340 
protein translocation 646 
in mitochondria and chloroplasts 
658-666 
roadmap 646 
three routes 678F 
protein translocators 580 
aqueous channels in 675-677 
in the endoplasmic reticulum 677F 
lateral gating 678-679 
in mitochondria 659, 800, 802F 
see also phospholipid translocators 
protein tyrosine kinases 1080 
see also Src protein kinases 
protein tyrosine phosphatases 864-865 
protein unfolding and AFM 549F 
proteins 
abundance 785 
accumulation delays 1176 
analytical methods 452-462 
availability in quantity 483-484 
conditionally short-lived 359 
conformation and chemistry 135-136 
crystallizing 561 
encoded by genes 7, 178 
fluorescent tagging in living cells 
542-546 
functions 5-6 
generating movement 160-163 
identifying 455-458 
multiplicity of functions 48-49, 109 
number in a eukaryotic cell 641 
numbers in human cells 168 
in plant primary cell walls 1084 
purifying 445-451 
role as catalysts 5-6 
transcription regulators as 373 
see also enzymes; membrane proteins 


α-proteobacteria 798F 
proteoglycans 582 
aggrecan 1058F, 1060, 1061F 
decorin 1058F, 1060 
in the extracellular matrix 1057-1061, 
1073-1074 
GAGs occurring as 1057, 1059-1061 
Golgi apparatus assembly 718-719, 
1059 
as matrix receptors 1074 
perlecan 1058F, 1070, 1071F 
proteolysis 
the caspase cascade 1022-1023 
in cell cycle control 970, 972F 
enzyme regulation through 150 
in insulin and collagen assembly 130 
isolating cell from tissues 440 
proteolytic cascades 1022-1023, 1025, 
1302 
see also caspase cascades 
proteomics, defined 167 
proto-oncogenes 
conversion into oncogenes 
1106-1107 
gain-of-function mutations in 1104 
protocadherins 1038, 1207 
protofilaments 
actin 899F 
cytoskeletal 894, 895F 
intermediate filaments 945 
microtubules 894, 926, 928-929, 
930F, 934-935, 937 
protofilament curvature 928-929, 934 
proton gradients (H+ gradients) 
active transport in bacteria, yeasts 
and plants 586 
bacteriorhodopsin and 587 
electron-transport chain and 86F 
mitochondrial membrane 103, 662, 
727 
thylakoid membrane 791, 793-794 
use by ATP synthases 586, 606, 779, 
793 
proton-motive force 754, 762F, 767F, 768, 
803 
chemiosmosis predating 780-781 
in mitochondria and chloroplasts 794, 
795F 
proton pumps (H+ pumps) 
ATP synthases reversibility as 778 
bacteriorhodopsin as 586-588 
cytochrome c oxidase 770-773 
cytochrome c reductase 768-770 
of the electron-transport chain 
763-774, 767, 774F 
endosomes 736 
in the first living cells 795 
lysosomes 723 
mitochondria 754, 762 
NADH dehydrogenase complex 768 
thylakoid membrane 783-784 
proton-translocating proteins 773 
“proton wires” 773 
protons 
behavior in water 45-46, 613, 773 
membrane transport 164 
protozoa 
as eukaryotes 24, 25F, 30 
pathogenic 1272-1273, 1282 
variety of 31F 
protrusions see cell-surface protrusions 
PRRs (pattern recognition receptors) 
classes 1299-1300 
dendritic cell expression 1305 
inflammatory response 1300 
recognition of PAMPs 1298, 1326 
recognition of viruses 1303-1304 
see also Toll; Toll-like receptors 
pseudogenes 
globin family 230 
in the human genome 184T 
loss-of-function mutations 229 
and purifying selection 221 
Pseudomonas aeruginosa 1268 
pseudopods 739, 740F 
pseudosymmetry 603, 604F 
pseudouridine 328, 329F, 335F 
PSP (postsynaptic potential) 633-634, 
635F, 636 
PTB (phosphotyrosine-binding) domains 
822, 824F, 852-853 
PTEN phosphatases 859, 1115, 1117 
puffer fish (Fugu rubripes) 29, 223 
pulsed-field gel electrophoresis 466 
pulses of gene activation 522 
Puma protein 1028 
pump proteins see membrane transport 
proteins 
“purifying” DNA 473 
purifying selection 219-220, 221F, 223, 
225, 231 
purines 
in complementary base-pairing 176, 
177F 
structures of adenine and guanine 
100 
purines, hydrolysis see depurination 
Purkinje cells 547F, 650 
puromycin 351, 352T 
purple bacteria 794F, 797, 798F 
“purple membrane” 586, 587F, 590, 591F 
pus 1301 
pyrene fluorescent probe 900F 
pyrimidines 
in complementary base-pairing 176, 
177F 
structures of thymine, cytosine and 
uracil 100 
pyrophosphate hydrolysis in biosynthesis 
71-72 
pyruvate 
anaerobic breakdown 76F 
conversion to acetyl CoA 75, 81-82 
oxidation in the citric acid cycle 
82-84, 758, 759F 
produced in glycolysis 75, 105 
as a substrate of several enzymes 87 
pyruvate carboxylase 70F 
pyruvate dehydrogenase complex 82, 
107, 149F 
pyruvate kinase 105 
Q cycle 770, 771F, 791, 792F 

quality control 
aspect of apoptosis 1022 
protein exit from the ER 711-712 
RNA splicing 323-324, 336 
RNA transport from the nucleus 
419-421 
translation 351-353, 357 
quantitative analysis 38 
quantitative approaches 
importance for biology 509 
transcription activators 510-514 
transcription repressors 514 
quantitative measurements 
of gene expression 502-503 
quantitative RT-PCR 502-503 
quantum dots 538 
quantum mechanics 532 
quaternary structures 117, 841, 842F 
quinones 
as electron carriers 764-766, 767F, 
768, 788 
plastoquinones 767F, 789, 791-792F, 
793 
ubiquinone 765-766, 767F, 768-770, 
771F, 772-773 
quorum sensing 813 
R-spondin protein 1222 
Rab cascades 707, 708F, 721 
Rab effectors 706-708 
Rab family GTPases 278 
as Ras superfamily members 854T 
subcellular location 706T 
in vesicular transport 705-709 
Rab proteins, Legionella modification 739 
rabies 1273T, 1274F, 1279 
Rac protein 
adherens junction assembly and 
1043F 
as Rho family members 854T, 858, 
956-957 
Rad51 protein 279, 281F, 282 
Rad52 protein 281 
radial glial cells 1200, 1201F 
radiation 
and cancer treatment 1132 
and carcinogenesis 1094, 1128F, 
1132-1133 
Schmidtea mediterranea response 
1248 
ultraviolet light 267-268, 269F, 1094 
x-rays and hematopoietic stem cells 
1242 
radioisotope labeling 452, 454, 466-467, 
1219 
radioresistant bacteria 483 
Raf (MAP kinase kinase kinase) 856-857, 
882 
Ramachandran plots 111F 
Ran-GAP protein 653-654, 656 
Ran-GEF protein 653, 656 
Ran-GTPases 
as chromatin positional marker 656, 
986 
and nuclear pore complexes 653-654 
as Ras superfamily members 854T 
Rana pipiens 36F 
“random coils” 1065, 1066F 
random mutagenesis 485 
random walks 59, 60F, 231, 652 
Rap1 GTPase 1078F 
rapid freezing 556, 561F 
rapidly inactivating K+ channels 635 
Ras family GTPases 278, 854T 
Ras oncogenes 1106, 1118, 1123, 
1125 


Ras-MAP kinase signaling pathway 
anticancer drugs targeting 1137F 
integrins and 1079 
Myc transcription and 1012 
Raf, Mek and Erk 856 
see also RTK/Ras/PI3K pathway 

Ras-PI3K pathway 1114 

Ras proteins 
activation mechanisms 855F 
as GTPases 157, 158F, 161 
signaling via MAP kinases 855-857 
three human types 854 

rate constants 
actin polymerization 900, 902 
association and dissociation 510F, 
511 
transcription rate constants 513 

Rb gene mutations 1108-1109 

Rb regulatory pathway 1113-1114 

Rb (retinoblastoma) protein family 
1012-1014 
papillomavirus inactivation 1132 
reaction centers, photochemical 783-784, 
788-790, 796 
evolution 793, 794F 
reaction-diffusion systems 1152, 1153F 
reaction rates 
diffusion-limited rates 143 
enzymatic acceleration 144 
Km (half-maximal rate) 141, 143 
and speed of molecular motions 
59-60, 143 

reader-writer complexes 
chromatin domains and 202 
nucleosome modifications 199-201, 
406F 

reading frames 
and genome translation in silico 
477, 482 
Ig diversification and 1321 
translation and 334, 342, 347 
see also open reading frames 

Reaper protein 1029 

RecA protein 279-281 

receptor-activated Smads (R-Smads) 
865-866 

receptor down-regulation 830, 848, 853 

receptor editing 1313-1314, 1321 

receptor inactivation 848 

receptor-mediated endocytosis 709, 727, 
732-735, 849 

receptor sequestration 848 

receptor serine/threonine kinases 
865-866, 881 
receptor tyrosine kinases (RTKs) 
dimerization on ligand binding 851 
extracellular signaling and 850T 
IGF1 action 850T, 852 
insulin receptor as 824F 
signaling overlaps with GPCRs 
861-862 

subfamilies 851F 

see also RTK/Ras/PI3K pathway 

receptors 

cell-surface and intracellular 816F 

in extracellular signaling 815-816, 
831F 

matrix receptors and co-receptors 
1074 

recessive mutations 
cancer-critical genes 1005F, 1104 


complementation tests 490 
loss-of-function, as typically 489 
Reclinomonas (R. americana) 801F, 805 

recombinant DNA technology 
elements of 464, 484F 
and reverse genetics 494 
recombination events 1291, 1292F 
recombination mechanisms 287 
conservative site-specific 
recombination 292-295 
transposons 288-292 
recycling endosomes 696F, 706T, 730, 
737-738, 739F 
red blood cells 
asymmetry of lipid bilayers 573, 574F 
band 3 protein 605 
cytoskeleton 591, 592F, 912 
life-span 1244 
plasma membrane 565F 
see also erythropoietin 
redox-driven pumps 601 
redox pairs 764-765 
redox potentials 
along the respiratory chain 767, 768F, 
879 
chlorophyll Ag 792 
defined 764 
as measure of electron affinity 
763-764 
measurement and calculation of ΔG° 
765 
in photosynthesis 790F 
reducing agents, in photosynthesis 796 
reduction, as an electron transfer 55-56 
refractive indices 532 
refractory period, ion channels 622-623 
regions of attraction 519F, 520 
regulated exocytosis 748 
regulated nuclear transport 419 
regulated proteolysis 396F, 867, 970 
regulated secretory pathway 
distinguished from constitutive 
secretory pathway 741, 742F 
signaling in 744 
regulatory DNA 
as conserved 217 
differences between animal species 
1149, 1174-1175 
in the human genome 185 
as noncoding 7, 29 
regulatory genes 
multicellularity and 29-30 
in vertebrate evolution 227F 
regulatory networks, mathematics of 
509-512 
regulatory RNA, miRNAs as 1180 
regulatory sites, allosteric enzymes 151 
regulatory T cells 1314, 1325-1326, 
1327F, 1328, 1331-1332, 1333F, 
1335F 
in immunological self-tolerance 1336 
induced regulatory T cells 1336 
release factors 344, 348, 562F 
renaturation see hybridization 
“repair factories” 213-214 
repeated sequences 289 
see also tandem repeats 
replication bubbles 254F, 257-258 
“replication factories” 1287 
replication forks 
asymmetry 240-244, 246 


bacterial 249F 
eukaryotic 253-255, 257-259, 261 
failed, cell cycle response 1014 
nucleosome assembly behind 
261-265 
repair of stalled or broken 277, 280, 
281F, 1134 
in S-phase 974-977 
replication machines 249-250, 259 
replication origins 
in bacteria 254-255 
in eukaryotes 186, 254-255, 258-260 
in human cell division 257 
methylation 257 
S-phase 974-975, 976F 
replicative cell senescence 
avoidance by ES cells 1254 
cultured cells 442 
macrophages and 739 
and mitochondrial DNA 808 
p53 pathway and 1116 
stem cells and 1243 
telomere function and 264-265, 1016, 
1100 
reporter genes 393, 394F, 501-502, 884F 
reporter proteins 543 
repressor proteins see transcription 
repressors 
repulsive interactions, cell-cell junctions 
1188, 1206 
rescue, microtubules 927-928, 932-935, 
986 
residual bodies 739 
resolution 
distinguished from detection 532 
electron microscopy 554, 559, 560F, 
562 
light microscopy 530-532 
superresolution techniques 549-551 
resolution limits 
light microscopy 532, 549 
and wavelength 529, 533F 
resonance energy transfer 787-788, 789F 
see also FRET 
respiratory burst 1301-1302 
respiratory chain 
and electron donation 764 
malfunctions 808 
and oxidative phosphorylation 761 
protein biogenesis 802F, 804 
respiratory chain complexes 758, 762, 
764, 766, 779 
redox potentials 767, 768F 
supercomplex 772-773 
the three complexes 767 
three constituents 767 
response timing, intracellular signaling 
824 
resting membrane potentials 615, 617, 
629-630 
restriction factors 1297 
restriction nucleases 
in DNA cloning 468-469 
interphase chromosome structures 
209F 
in recombinant DNA technology 
464-466 
restriction points see Start 
Ret cadherin 1039F 
retina 
embryonic, tip cells and 1236 


ocular dominance columns 1212 
photoreceptive epithelium 1227 
RGCs (retinal ganglion cells) 
1204-1206 
retinal 147 
as (bacterio)rhodopsin chromophore 
587, 845 
in channelrhodopsin 623 
retinitis pigmentosa 324 
retinoblastomas 1013, 1107-1108, 1111 
see also Rb proteins 
retinoids/retinoic acid 876, 877F, 1257F 
retinotopic maps 1205, 1206, 1207F, 
1211 
retrieval pathways 695-696, 713-714, 
726, 743 
retromers 727, 728F 
retrotranslocation, damaged proteins 
358, 686 
retroviral-like elements 218F 
retroviral-like transposons 291 
retroviruses 
and cancers 1130T 
defined 290 
as oncogene vectors 1105-1106 
RNA editing and 419 
as transdifferentiation vectors 1258 
see also HIV 
Rev protein 420-421 
reverse genetics 494-495, 500 
reverse transcriptases 
AZT inhibition 1292 
in cDNA cloning 470 
quantitative RT-PCR 502-503 
in retrovirus infection 290, 1105, 1289 
telomerases as resembling 262 
reversible amyloids 132-133 
reversible reactions 61 
see also equilibrium reactions 
RGCs (retinal ganglion cells) 1204-1206 
RGD sequence (Arg-Gly-Asp) 1067-1068, 
1075, 1078, 1281 
RGSs (regulators of G protein signaling) 
832, 845 
Rheb GTPase 854T, 861 
Rho family GTPases 
activation by extracellular signals 958 
adherens junction assembly and 
1043F 
and bacterial entry to hosts 1281 
in cell polarization 955-959 
cytoskeleton and 858, 861 
neuron growth cones and 858 
pseudopod shaping 739 
as Ras superfamily members 854T, 
858 
RhoA 
activation in cytokinesis 997, 998F, 
999 
localization in cytokinesis 1000F 
rhodamine 537, 545F, 756F, 936F 
rhodopsin 
as a GPCR 844-845 
retinal and 147, 845 
signal amplification 848-849 
rhodopsins 
ER membrane insertion 681 
as GPCRs 588, 832, 845 
rhombomeres 1169, 1188 
RIAM protein 1078F 
ribbon models, introduced 115 
ribonucleases, ribosome profiling 
505-506, 507F 
ribose 
methylation in mRNA 316F 
methylation in rRNA 328 
structure 96, 100, 302 
synthesis 86, 366 
ribosomal RNA (rRNA) 
abundance 327 
assembly 331F, 347 
codon-anticodon matches 345F 
evolution of genes for 220 
four types 328 
interspecies comparisons 15-16 
“S” values 309T, 328 
transcription process 305F 
ribosome profiling technique 505-506, 
507F 
ribosomes 
among intracellular compartments 
642 
antibiotic binding sites 351F 
bacterial and eukaryotic compared 
341 
characteristic of rough ER 670 
free and membrane-bound 674-675 
large subunit 331F, 341-343, 
346-347F, 349, 423, 673 
as macromolecular complexes 50F 
negative staining 560 
with a release factor 562F 
response to antibiotics 800 
as ribozymes 346-347 
RNA binding sites 341-342, 347 
role in translation 7, 340-343 
self-assembly 128, 649 
separation by centrifugation 445 
small subunit 331F, 341-344, 345F, 
346, 347F, 348, 423 
structure 346F 
subunit assembly 329, 340 
TOR and S6K effects 1017 
riboswitches 414-415, 423F 
ribozymes 
as catalysts 51, 69 
ribosomes as 346-347 
structure 363F 
in vitro selected 363-364 
ribulose, structure 96 
ribulose 1,5-bisphosphate 785, 786F 
ribulose bisphosphate carboxylase 
(Rubisco) 
in carbon fixation 785 
as example enzyme 48 
x-ray crystallography 461F 
Rickettsia rickettsii 1287, 1288-1289F 
Rickettsia spp. 798F, 805F, 1287-1288, 
1289F 
Rieske protein 770F 
Riftia pachyptila 12F 
RISC (RNA-induced silencing complex) 
429-431, 432F 
RITS (RNA-induced transcriptional 
silencing) complex 432 
RK (rhodopsin kinase) 845, 848 
RLRs (RIG-like receptors) 1300 
RNA catalysis 
and the RNA world hypothesis 
362-363 
in the spliceosome 321 
see also ribozymes 
RNA cleavage 417-418 
RNA editing 418-419, 806 
RNA folding 302, 303F, 335F, 363 
RNA genes 
in the human genome 185 
for rRNA 327, 330 
in S. cerevisiae 305 
RNA helicases 1300 
RNA interference (RNAi) 
as an experimental tool 433 
as a defense mechanism 431-432 
and heterochromatin formation 
432-433 
limitations 501 
small ncRNAs and 429-431 
testing gene function 499-501 
RNA ligases 688 
RNA localization 421-422 
RNA polymerase holoenzyme 306 
RNA polymerase I 327 
RNA polymerase II 
as an “RNA factory” 317F 
gene control regions and 384-386 
general transcription factors and 
309-312, 384-385 
modifying proteins 312-313 
similarity with bacterial polymerase 
309, 310F 
snoRNAs and 328-329 
RNA polymerase III 328 
RNA polymerases 303-305 
DNA polymerases compared 304-305 
in eukaryotes 309T 
paused polymerases 388 
RNA-dependent 431 
RNA primers 245, 247, 249, 253 
RNA processing 
in chloroplasts 806 
control of gene expression 372, 373F 
RNA-seq method 371F 
RNA sequencing (RNA-seq) 
alternative splicing and 482 
deep RNA sequencing 477 
mRNA analysis 503-504 
RNA splicing 
and chromatin structure 323 
consensus nucleotide sequence for 
319 
coordination with transcription 322 
“cryptic” splice sites/signals 321, 
322F, 323, 324F 
errors and disease 323-325 
as a function of snRNAs 305T 
intron sequence removal by 315-316, 
317-318, 320F, 336 
regulation 416 
spliceosome role 319-320 
see also alternative splicing 
RNA transport and localization 372, 373F 
RNA tumor viruses 1105 
“RNA world” 69, 362-366, 415 
RNAs 
antisense 423F, 435 
categories and proportions of 305T, 
306 
conformations 5F 
distinctions from DNA 4-5, 302, 366 
double-stranded, as viral 
characteristic 1304 
rearrangements in the spliceosome 
321 


regulation of transport from the 
nucleus 419-421 
self-replicating potential 364, 365F 
as single-stranded 302, 363 
size of molecule 303 
in telomerase 263T, 429, 435 
in transcription and translation 4-5 
see also long noncoding; messenger; 
ribosomal; small noncoding; 
snRNAs; transfer 
RNPs (ribonucleoproteins) see hnRNPs; 
mRNPs; snRNPs 
Robo1, Robo2 and Robo receptors 
1203F, 1204 
robustness, in biological networks 520, 
822 
Rock kinase 958F, 997, 998F 
rod photoreceptors 844-846, 848, 943 
ROS (reactive oxygen species) 808 
see also superoxides 
rosettes, cellulose synthase 1085-1086 
rotary catalysis, ATP synthases 776-778 
rough ER, protein glycosylation 683-684 
roundworms see Caenorhabditis elegans 
Rous sarcoma virus 1105, 1265 
RRE (Rev responsive element) 420 
RTK/Ras/PI3K pathway 1113 
RTKs see receptor tyrosine kinases 
Rubisco see ribulose bisphosphate 
carboxylase 
ruffles, in macropinocytosis 732, 733F, 
1280-1281, 1282F 
ruthenium red dye 582, 583F 
ryanodine receptors 838-840 


S 


S. cerevisiae see Saccharomyces 
S-Cdks 974, 979, 1014 
S-cyclins 969, 971, 993, 1013-1014 
S4 helix 622 
S phase, cell cycle 
centrosome duplication 984-985 
DNA replication in 258-260, 963, 
964F, 974-977 
visualization 966 
“S” values (sedimentation coefficients) 
rRNA 309T, 328 
ultracentrifugation 446, 455 
Saccharomyces cerevisiae 21T, 31F 
cell cycle in 966 
centromeres 203 
DNA replication in 253-254, 259 
gene density 182F 
genes essential to growth 499 
genes for voltage-gated ion channels 
627 
intercellular communication 813 
kinesins 936 
mitochondrial inheritance 807 
as model organism 29T, 966 
mutant libraries 498 
myosin V in 925 
RNA coding genes 305 
RNA rearrangements at the 
spliceosome 321F 
separated chromosomes 466F 
see also yeasts, budding 
salamanders 1194-1195 
Salmonella enterica 1268, 1281, 1282F, 
1284F, 1285, 1290 


Salmonella spp. 
as Gram-negative 1267F 
use of phase variation 294 
SAM complex (sorting and assembly 
machinery) 659, 660F, 662 
Sanger sequencing see dideoxy 
sequencing 
Sar1 proteins 703, 704-705F 
sarcomas 1092 
sarcomeres 918-920, 948 
sarcoplasmic reticulum (SR) 
Ca2+ pump 606-607, 608F, 632F, 633, 
920 
in muscle cells 671, 920, 921F 
satellite cells, skeletal muscle 1234 
scaffold proteins 126 
cullins as 160, 164 
Dishevelled as 870 
in intracellular signaling complexes 
822, 823F, 857 
membrane bending 594 
in protein machines and biochemical 
factories 164, 332F 
septins as 949, 999 
in tight junctions 1049-1050 
scaffold RNA molecules 165, 435 
scanning electron microscopy see SEM 
SCAP (SREBP cleavage activation protein) 
656F 
SCF (stem cell factor)/Steel protein 1187, 
1244 
SCF ubiquitin ligase 159-160, 164, 
167-168, 971, 972F, 973, 1004 
schistosomes 1302F 
schizophrenia 494 
Schizosaccharomyces (S. pombe) 
microtubule +TIP proteins 935F 
mitochondrial genome 801F, 805F 
as a model organism 966, 1271 
mutant phenotype 21F 
Schmidtea mediterranea 1247-1249 
Schwann cells 625 
Sciara 987F 
SCN cells (suprachiasmatic nucleus) 877 
“scramblases” 574, 690 
SDS-PAGE (sodium dodecyl 
sulfate-polyacrylamide-gel 
electrophoresis) 452, 453F, 454F, 
584 
SDS (sodium dodecyl sulfate) 583, 584F 
sea urchin embryo 403F, 524 
sealing strands 1047-1048, 1049F, 1050 
Sec61 complex 676-678 
Sec pathway 665F 
Sec23/Sec24 and Sec13/31 proteins 
705F 
SecA ATPase 677, 678F 
second-generation sequencing methods 
479-480 

second law of thermodynamics 52, 53F, 
60, 102-103 

second messengers 819-820, 824, 827, 
833 
in intracellular signal amplification 848 
IP3 and diacylglycerol as 837, 838F 
secondary cell walls 1082-1083, 
1085-1086 
secretion systems, bacterial 1271, 1278, 
1281, 1282F, 1285-1286 
secretory and endocytic pathways 
constitutive and regulated secretory 
pathways 741, 742F 


introduced 695, 696F 
secretory granules 132 
secretory proteins 
active precursors 743-744 
aggregation in TGN 742-743 
secretory vesicles 
budding from TGN 716, 742 
endocytosis 744F 
exocytosis 741-744 
localization 744 
maturation and acidification 743 
membrane removal by endocytosis 
746-748 
phosphoinositides in 701F 
Rab3a in 706T 
synaptic vesicles as 745 
securin 971, 973, 978, 992-994, 1009 
sedimentation velocity and equilibrium 
446-447 
segment-polarity genes 1159-1162 
segmental duplications, double strand 
breaks 228 
segmentation clock 1178-1179 
segmentation genes 
insect bodies 1159, 1160F 
regulatory hierarchy 1161F 
in vertebrates 1177 
selectins 720 
cell-cell adhesion 1054-1055 
E-, L- and P-selectin 1054-1055 
homing receptors 1312 
structure and function 1055F 
selective advantage see natural selection 
selectivity filters, ion channels 613, 618, 
619F, 622F 
selenocysteine 350 
self-amplification 
caspase cascade in apoptosis 1023 
complement cascade 1303 
of embryonic asymmetry 1152 
of nerve impulses 621, 623 
of Rab domains 707 
self-assembly 
in cells 9, 128-129 
glycolipids 575 
lipid bilayers 566 
self-association, cytoskeletal filaments 
893, 897 
self-organization, embryonic development 
1145 
self-phosphorylation, P-type pumps 607 
self-renewing tissues 1120, 1217 
self-sealing, lipid bilayers 568, 569F 
self-tolerance see immunological 
self-tolerance 
“selfish DNA” see mobile genetic elements 
SEM (scanning electron microscopy) 
chick embryo neural tube 1192F 
frog neuromuscular junction 630F 
fungal infection on Drosophila 1299F 
NK cell attacking a cancer cell 1304F 
olfactory neuron cilia 844F 
osteoclast on bone matrix 1231F 
pericytes 1236F 
principles 558-561 
semiconservative replication 
of centrosomes 984 
of DNA 240, 242F, 447 
Semliki forest virus 1275F 
senescence see replicative cell 
senescence 


sensitive period, visual system 1212 
sensitivity of target cells 824 
sensory mother cells 1172, 1173F 
sensory neurons 1172, 1208-1209 
separases 993, 1009 
septins 949-950, 959, 999 
septum, Z-ring and 896 
sequence logos 308, 375, 378F 
sequential induction 1153-1154, 1160 
serine 
phosphatidyl- 566-567, 571T, 574 
phosphorylation during transcription 
388F 
phosphorylation in RNA polymerase 
tails 312, 316-317 
phosphorylation in the nucleosome 
196-197 
structure 113 
serine kinases in protein kinase evolution 
155F 
serine proteases 1072 
Asp-His-Ser catalytic triad in 136F 
complement system 1302 
domain shuffling in 121F, 123 
elastase and chymotrypsin compared 
119F 
as a protein family 119, 121F 
serine/threonine kinases 
AKT as 860 
distinguished from tyrosine kinases 
819-820 
PKA as 834-835 
plant photoproteins 884 
serotonin 
as an excitatory neurotransmitter 629 
cyclic AMP response 834F 
GPCRs activated by 832 
serotonin secretion 1218, 1219F, 1239, 
1241T 
serum, distinguished from plasma 1011 
seven-pass transmembrane proteins 
Frizzled as 870 
Smoothened as 872 
see also G-protein coupled receptors 
Sevenless (Sev) RTK 855 
sex chromosomes 
in different organisms 486 
dosage compensation and 410 
as nonhomologous 180 
X-chromosome sample sequence 
300F 
X-inactivation 410-411, 412F, 1252 
sex hormones 99, 875-876 
sexual reproduction 
haploid-diploid cycle 486 
homologous recombination and 277, 
282 
and horizontal gene transfer 19-20 
see also meiosis 
shadowing (electron microscopy) 560 
SH2 (Src homology 2) domains 
as an interaction domain 822, 824F 
in cytoplasmic tyrosine kinases 862 
evolutionary tracing 136-137 
multicellularity and 122 
phosphorylation and 154, 156 
phosphotyrosine binding via 852-855, 
859, 863, 864F 
positioning role 135 
structure 115-116, 118F 
versatility 121-122 
SH3 (Src homology 3) domains 
as an interaction domain 822, 824F 
binding to proline-rich domains 853 
inhibitory action 118F, 122, 156, 157F 
tyrosine kinases and 854-855, 863, 
864F 
in ZO scaffold proteins 1050F 
shelterins 263 
shibire mutant, Drosophila 702F 
Shigella flexneri 1268, 1284F, 1287, 
1288-1289F 
Shine-Dalgarno sequence 348, 349F, 
422, 423F 
shotgun sequencing 479 
shugoshin 1009 
shuttle system, NADH 760 
shuttling proteins 654 
sialic acids 575, 717, 1056, 1279, 1303 
Sic1 protein 973T, 1003-1004 
sigma (σ) factors 306, 307-308F, 
309-310, 384 
sigmoidal relationships 517 
sigmoidal responses 827-828, 841 
signal hypothesis 672, 673F 
signal patches 647, 728, 729F 
signal peptidases 647, 664, 673, 677 
signal processing, intracellular 825 
signal-processing proteins 
Src kinase example 155-156, 157F 
types of response to signal 
concentrations 827-831 
signal-recognition particles see SRPs 
signal sequences 
amino acid sequences 647, 648T 
discovery 672 
internal 659, 663F, 664, 677-679, 
680-681F 
within sorting signals 647 
for successful gene transfer 801 
translocation into chloroplasts and 
664-666 
translocation into endoplasmic 
reticulum 672-675, 677 
translocation into mitochondria and 
659, 661F, 663 
translocation into peroxisomes and 
667-668 
signal transduction, cell-surface receptors 
818 
signal variability 
among cell populations 829 
as noise 822 
signaling centers 1153, 1160, 1167-1168, 
1178 
signaling hubs, Ras and Rho as 854 
signaling mechanisms 
compared 158F 
in nerve cells 621 
see also cell signaling 
signaling molecules 
response and turnover 825-826 
see also hormones 
signaling pathways 
discovery in Drosophila embryos 
1154 
Hedgehog pathway 1150, 1154, 1160 
JAK-STAT signaling pathway 
863-864, 1304 
NFκB signaling pathway 873-874, 
1301 
PI 3-kinase-Akt signaling pathway 
860-861, 1017, 1030F 
TGFβ signaling pathway 1117 
Wnt signaling pathway 1124, 1157, 
1160, 1179, 1190, 1220-1221 
see also intracellular signaling; Notch; 
Ras-MAP kinase; RTK/Ras/PI3K 
pathway 
signaling proteins 
inducing differentiation in culture 1169 
lysosomal degradation 735 
membrane localization 577-579 
vertebrate embryo patterning 1168 
see also intracellular signaling proteins 
“signatures” of transposition 289 
silent mutations 238 
SIM (structured illumination microscopy) 
550 
simulations 524, 1178 
SINEs (short interspersed nuclear 
elements) 218F, 291 
see also Alu sequences 
single-molecule localization methods 
551-553 
single-particle reconstruction/tracking 
559F, 561-562, 586, 589, 593, 
713F 
single-pass transmembrane proteins 
CD4 and CD8 co-receptors 1331 
enzyme-coupled receptors as 819 
Golgi apparatus 716 
Notch and Delta as 867-868 
structure 579, 582F, 677, 679F, 716 
single-strand binding (SSB) proteins 246, 
247-248F, 253, 256F 
single-strand breaks, mismatch repair 
250 
single-stranded DNAs as probes 472 
siRNAs (small interfering RNAs) 305T, 
429, 431 
sister-chromatid cohesion 964, 977, 992, 
994, 1009 
sister-chromatid resolution 979, 982 
sister chromatids 
bi-orientation 988-989 
formation in interphase 185 
formation in prophase 964 
in homologous recombination 275 
kinetochore attachment 987 
at metaphase 214-215 
nonhomologous end joining and 275F, 
278 
separation 992, 993F, 994 
site-directed mutagenesis 543 
site-specific recombination 
conservative site-specific 
recombination 292-295 
TCR assembly 1325 
transgenic mice 496 
SIV (simian immunodeficiency virus) 
1291F 
S6K (S6 kinase) 1017 
skeletal muscle 
cells as syncytia 1233 
genesis and regeneration of 
1232-1235 
myosin II in 915-916 
organization of 917-919 
from reprogrammed fibroblasts 
396-397, 398F 
see also muscle contraction 
skin see epidermal 
Skp1 protein (adaptor protein 2) 167-168 


sliding clamps 247, 248-249F, 262, 274F 
Slit protein 1203 
Slug protein 1042 
Smac/Diablo protein 1029 
Smad family proteins 865-866, 1126F 
Smad pathway 1197 
small intestine 
fastest self-renewal 1217, 1218F 
stem cell location 1219 
tight junctions 1037, 1049F 
small molecules 
in cancer treatment 1135-1137 
coenzymes 146-148 
diffusion 59, 598 
inhibitors of protein function 459-460 
intracellular signaling by 874-876 
major families 47, 48F 
passive diffusion into nuclei 650 
second messengers as 819 
shedding by stressed bacteria 620 
transport by ABC transporters 609F 
see also neurotransmitters 
small noncoding RNAs 429-431 
bacterial defense against viruses 
433-434 
miRNAs (microRNAs) 305T, 429-431, 
1149 
piRNAs (Piwi-interacting RNAs) 305T, 
429, 433 
regulatory role of three types 429 
and RNA interference 429-431 
siRNAs (small interfering RNAs) 305T, 
429, 431 
smallpox (variola) 1265-1266, 1273T, 
1275, 1278, 1287F 
SMC proteins (Structural Maintenance of 
Chromosomes) 977, 982 
smell (olfactory receptors) 824, 832, 
843-846, 1250F 
smoking see tobacco 
smooth muscle 916-918, 921-923, 948 
smooth muscle cells 1232, 1233F, 
1235-1236, 1238 
Smoothened protein 871-873 
Smurf ubiquitin ligase 866 
Snail protein 1042 
snake venom 1068 
SNARE complexes 709, 745 
SNARE proteins 705, 707 
in autophagy 726 
in membrane fusion 708-709, 1280 
subunit anchoring to ER 682 
t-SNAREs and v-SNAREs 708-709, 
712, 714, 745F, 747F, 751 
snoRNAs (small nucleolar RNAs) 305T, 
328 
SNPs (single-nucleotide polymorphisms) 
disease predisposition and 493F 
in population studies 232-233, 492 
snRNAs (small nuclear RNAs) 
function 305T 
snoRNAs (small nucleolar RNAs) 
305T, 328 
as spliceosome components 319-320 
snRNPs (small nuclear ribonucleoproteins) 
and Cajal bodies 331-332 
and the nucleolus 330 
as spliceosome components 
319-322, 323F, 325-326, 329F 
sodium see Na⁺ 
sodium dodecyl sulfate (SDS) 583, 584F 


see also SDS-PAGE 
Sog gene (Short gastrulation) 1165-1166, 
1169 
somatic cells 
embryonic development 1158 
mutation frequency 238-239 
regeneration from 1247-1248, 1249F 
self-sacrifice of 1091 
somatic hypermutations 1321-1323, 
1325, 1335-1336, 1338F 
somatic mutations 
in cancer cells 1094, 1104, 1112 
see also driver; passenger mutations 
somatosensory system 1204-1205 
somatostatin 835-836 
somites 1177-1179, 1185, 1187, 1189F, 
1233 
Sonic hedgehog (Shh) protein 1191, 
1199, 1200F, 1202 
Sordaria 1007F 
sorting receptors 647-648 
sorting signals 
basolateral proteins 750 
nuclear localization signals 651 
and protein movements between 
compartments 645-647 
retrieval pathway to ER 713-714 
secretory proteins 743 
see also signal sequences 
Sos protein (Son-of-sevenless) 824F, 855, 
860 
Sox2 transcription regulators 398-399F, 
506F, 1254-1255 
see also OSKM factors 
spatial patterning 
in embryonic development 
1150-1154 
lateral inhibition and 1171-1173 
mechanisms 1155-1175 
polarization as first step 1155 
transient patterns 1160-1162 
vertebrate embryos 1167-1169, 
1177-1178 
special pairs, chlorophyll 788-790, 
791-792F, 793, 794F 
speciation 
conserved DNA sequences and 226 
transposed DNA sequences and 292 
species 
antiviral defenses 432 
characteristic transposon types 292 
cross-species transmission 1279, 
1291 
different sex chromosomes in 486 
exporting the CRISPR system to 
497-498 
extinct, genomes 479 
gene number and complexity 
415-416 
interchangeability of homologous 
proteins 1146 
mitochondrial genomes compared 
801F, 805F 
number living today 2 
protein sequences compared 36, 37F 
regulatory DNA as distinguishing 
1149, 1174-1175 
size differences 1010, 1193-1197 
tissue regenerative abilities 1247 
variation of exon and intron lengths 
322F 


specimen preparation 
electron microscopy 555-556, 559 
light microscopy 535-536 
speckles, nuclear (interchromatin granule 
clusters) 213, 331-333 
speckles, tubulin 991F 
spectinomycin 351F 
spectral karyotyping 181F 
spectrin 591, 592F, 905, 911F, 912-913 
spermatozoa 
as centrosome source 987 
elimination of mitochondria from 807 
localization of mitochondria 755, 756F 
membrane domains 590, 591F 
see also fertilization 
sphingolipids 567-568, 573, 575 
sphingomyelin 567, 571T, 572F, 574-575, 
690 
sphingosine 690 
spin labels 569-570 
spinal cord 
development 1170F, 1186, 
1199-1200, 1202, 1203F, 1209F 
injury 1251 
spinal muscular atrophy 324 
spindle see central spindle; mitotic 
spindles 
spindle assembly checkpoints 993-994 
spindle pole body 557F, 930 
Spirulina platensis 778F 
spleen 
antigen removal 1311 
cells, transplanted in mice 1193 
erythrocyte removal 1244 
hematopoietic stem cells in 1244 
lymphocyte accumulation 1031, 1055 
lymphocyte activation 1308-1309, 
1311 
spliceosomes 
catalysis 321, 324 
GFP-tagged 544F 
intron sequence ambiguities 416 
role 319-320 
splicing enhancers 322 
Spo11 protein 282-285 
SR proteins 322, 323F, 326 
Src protein kinases 117, 118F, 155-156, 
157F, 578 
binding via SH2 domains 852 
as cytoplasmic tyrosine kinases 862, 
1080, 1336 
Lck protein as 1331-1332 
regulation 155-156 
viral cancers and 1265 
SREBP (sterol response element binding 
protein) 656F 
SRP-like pathway 665F 
SRP receptor 673-674, 678F 
SRPs (signal-recognition particles) 
673-674, 675F, 677-680 
and the nucleolus 330 
SSB (single-strand binding) proteins 246, 
247-248F, 253, 256F 
staining 
cell cycle stages 966 
electron microscopy 555-556 
Gram-positive and Gram-negative 
610F, 1267F 
light microscopy 529, 535, 536F 
white blood cells 1240F 
see also fluorescent dyes 


standard free energy changes, ΔG° 
comparing reactions 61-62, 765 
deriving equilibrium constants 62-63, 
63T, 139F 
phosphate bond breakage 78, 79F 
viability of reactions 71-72 
standard redox potentials 765 
Staphylococcus (S. aureus) 
antibiotic-resistant (MRSA) 1276 
as opportunists 1276 
Staphylococcus spp. as Gram-positive 
1267F 
starch 80-81, 785-786 
starfish eggs 839F 
start codons 347-348 
start-transfer signals 678-680, 681F 
Start transition 965, 966F, 968-970, 972, 
1014 
STAT proteins (Signal transducers and 
activators of transcription) 863 
stathmin (Op18) 933, 935, 959 
statins 733 
statistical methods in biology 524-525 
steady-state concentrations 513F, 514 
steady-state kinetics 142 
steady states 
actin filament growth 901, 902, 904 
bistability 518-520 
negative feedback effects 516 
protein-promoter complexes 511 
stable and unstable 519 
time to attain 512, 514, 516 
stearic acid 98 
STED (stimulated emission depletion 
microscopy) 551-552 
Steel/SCF protein 1187, 1244 
stem-cell niche 1222 
stem cells 
in artificial replacement 1249-1250 
asymmetric cell division idea 1222, 
1223F 
blood cell formation from 1239-1247 
cancer stem-cells 1120-1122, 1124 
defining properties 1219, 1220F 
epidermal stem cells 1225-1226 
in epithelial tissues 1217-1227 
guided generation of cells and organs 
1266-1267 
hematopoietic 1239, 1242-1243, 
1308 
independent-choice theory 1222, 
1223F 
in intestinal crypts 1122, 1124, 
1219-1220, 1224 
mesenchymal 1229 
multipotent 1219F, 1220-1221, 1222F, 
1229, 1239, 1242-1244 
mutation risk 1243 
myoblasts persisting as 1234 
neoblasts resembling 1248-1249 
neural stem cells 1201F, 1250, 1251F 
organoids from 1223 
restricted possibilities 1251 
therapeutic use 1249-1251 
types 1220 
stem cells, embryonic 
in drug discovery 1258-1259 
guided generation of cells and organs 
1266-1267 
production and pluripotency 1253 
transcription network 399F, 1254 


Page numbers with an F refer to a figure; page numbers with a T refer to a table. 




stem cells, pluripotent 
induced pluripotency 398, 401, 
1254-1259 
through reprogramming 398 
stem-loop structures, RNA 414, 423F, 
427F, 435 
stereocilia 890, 892, 924, 1189-1190 
“steric zipper” 131F 
steroids 99, 875-876 
brassinosteroids 881 
sterols, cholesterol as 568 
sticklebacks 1174-1175 
“sticky ends” 465F 
stochastic effects 523-524 
stomach cancer 1093, 1110F, 1120, 
1129F, 1132, 1265, 1277 
stop codons 348-349, 350 
nonsense-mediated mRNA decay 
327F, 352-353 
premature 418 
relative scarcity in ORFs 482 
tumor suppressor gene mutations 
1110F 
stop-transfer sequences 663, 679-680, 
681F 
STORM (stochastic optical reconstruction 
microscopy) 552, 553F 
strand-directed mismatch repair 244T, 
245, 250-251 
strand exchange (strand invasion), in 
homologous recombination 
264F, 278-282, 283F, 284 
streptavidin 538F, 549F 
Streptococcus (S. pneumoniae) 
as an opportunist 1276 
as Gram-positive 1267F 
transformation with alien DNA 174F 
streptogramin 351F 
Streptomyces spp. 1293 
stress 
MAP kinase response 857 
NFκB signaling and 873-874 
in plants 881 
survival by cancer cells 1099, 1103, 
1115-1116, 1133 
stress fibers 891, 911, 923, 955-957, 
958F, 996 
stress granules 427-428 
striated muscle 891, 918 
stripes, Drosophila embryo 393-395 
strokes, apoptosis and 1031-1032 
stroma, basal lamina 1069, 1100-1101, 
1228 
stroma, chloroplasts 658, 665, 782 
stromal cell contact, stem cells 1244 
STRs (short tandem repeats) 476 
strychnine 629 
subclones 
in adenomatous polyps 1123 
in cancer 1096, 1097F, 1118-1119, 
1123 
substrate affinity, Km as measuring 143 
substrates 
direction into reaction pathways 58, 
59F 
in enzyme catalysis 51, 57, 140-141 
see also enzyme-substrate complexes 
succinate dehydrogenase (Complex II) 
767-768F, 772-773, 775 
succinyl-CoA 107 
sucrose 784, 786 
sugars 
in GAGs 1058 
glycolysis 74-78 
product yields from oxidation 775T 
ring and complex formation 96-97 
ring numbering 100 
storage as starch and fat 785-786 
see also glucose; monosaccharides; 
oligosaccharides; 
polysaccharides 
sulfur 
sulfate assimilation pathway 86 
sulfhydryl groups 91 
see also disulfide bonds 
SUMO (small ubiquitin-related modifier) 
158 
SUN proteins 948-949 
superhelical tension, DNA 314-315 
superhelices, collagen 1061, 1070 
superoxides 771, 808, 1239, 1301 
superresolution techniques (microscopy) 
549-553 
suppressor mutations 487 
surface imaging, by SEM 558 
survival factors 
apoptosis suppression by 1011, 
1029-1030 
and cell migration 1186-1187 
SV40 simian virus 128F, 732, 1279 
switch helices 161 
switches 
allosteric proteins as 152 
cell-cycle control system 967, 972-973 
cooperative binding and 517 
GTP-binding proteins as 157F, 703, 
820, 854 
GTPases (Ran, Ras, Rho) as 653, 854, 
956 
kinases and phosphatases and 819 
Lac operons as 382-383 
M-Cdk activation 979 
microtubule catastrophes and rescues 
934 
operation in Drosophila 392-395 
positive feedback loops and 517, 
518-520 
protein kinases as 829F, 841, 843F, 
846, 852, 857 
riboswitches 414-415, 423F 
Smads in 866 
transcription activation and repression 
871, 873, 878 
transcription control in eukaryotes 
384-392 
transcription regulators as 384-392, 
402-403 
transcription repressors, to activators 
868, 871 
switching times, gene expression 
1176-1177 
symbiosis 


endosymbiont hypothesis 800 

in eukaryotes 25 

plants and nitrogen-fixing bacteria 12 

in tube worms 12F 
sympathetic neurons 

NGF and 1208 

withdrawal from the cell cycle 1018 
symporters 602-603, 604F, 605, 628, 632 


synapses 
adjustment depending on activity 
1198 
chemical synapses 627-630 
described 627 
electrical and chemical 628 
elimination 1211 
formation 1209-1212 
number per neuron 633 
psychoactive drugs at 631-632 
scaffold proteins 1050F 
synapsis and desynapsis 1006-1007 
synaptic clefts 628, 632-633, 1209 
synaptic plasticity 636-637, 1212-1213 
synaptic signaling 815 
synaptic vesicles 628, 744-746, 747F 
synaptobrevin 745F, 747F 
synaptonemal complexes 1006-1007, 
1008-1009F, 1010 
synaptotagmin 745 
syncytia 393F, 655F, 748, 1002, 1003F, 
1157 
skeletal muscle cells as 1233 
syncytial blastoderms 1157-1158, 
1157-1161, 1165 
syndecans 1060-1061 
Synechococcus elongatus 778F, 878 
synonymous codons 219-220 
syntaxin 745F 
synteny 221-222, 224-225 
“synthetic-lethal” treatments 491, 1133, 
1140 
synthetic ribozymes 363-364 


T 


T-antigen, SV40 virus 652F 
T cell activation 
control of nuclear import 655F 
dendritic cells and 1324-1326, 1331 
extracellular signals and 1336-1337 
T-cell-mediated immune responses 
1307-1308 
T-cell receptors 
accessory receptors and co-receptors 
1304, 1325, 1330T, 1331-1333, 
1335-1337 
inhibitory receptors 1138, 1337 
misfolding 712 
T cells 
antigen presentation by dendritic cells 
1305 
co-stimulatory proteins 1305, 1306F, 
1326, 1327F, 1335, 1337-1338 
effector T cells 1309-1311, 1313, 
1324, 1328, 1333-1335, 1337 
γ/δ T cells 1326 
main classes of 1325 
MHC proteins and 1324-1339 
origin in the thymus 1308 
recognition of foreign peptides 1327F 
thymocytes as developing T cells 
1308F, 1332-1333, 1336 
t-loops 263 
T tubules (transverse tubules) 
in muscle contraction 920, 921F 
voltage-gated Ca²⁺ channels in 633 
T4 bacteriophage 19F, 324 
T7 bacteriophage 243 
tail polymerization 72-73 
talin 1037T, 1075, 1078-1081 


tamoxifen 1221-1222F 
tandem duplication 
and genetic variation 228 
linking protein domains 122 
tandem repeats 
RNA polymerase tails 317F 
STRs (short tandem repeats) 476 
telomeres 262 
TAP-tagging (tandem affinity purification 
tagging) 451 
TAP (transporter associated with antigen 
processing) 611 
TAR RNA 414 
TAT pathway (twin arginine translocation) 
665F 
TAT protein 414, 420-421 
TATA boxes 310-311, 312F, 385F 
tau protein 932-933, 934F 
Taxol® (paclitaxel) 904T, 929, 936F 
TBP (TATA-binding protein) 310, 
311-312F, 311T 
TBSV (tobacco bushy stunt virus) 129F 
Tbx5 transcription factor 1259 
TCRs (T-cell receptors) 1304, 1325-1326, 
1328, 1331-1333, 1336, 1337F, 
1338 
telomerases 
cancer cells 1100 
expression by ES cells 1254 
human cells not producing 1016, 
1100 
lncRNA in 429, 435 
and the nucleolus 330 
resembling reverse transcriptases 
262, 263F 
telomeres 
as chromosomal structures 186, 263 
end-replication problem 262 
length and its regulation 264-265 
replicative cell senescence 442, 1016, 
1100 
telophase 964, 981, 995 
plant cell 1000F 
TEM (transmission electron microscopy) 
305F, 554, 555-556F, 558, 560F 
temperature effects, lipid bilayers 571 
temperature-sensitive mutations 489 
templated polymerization 3-4 
templating 
active sites 145 
assembly factors 130 
in DNA repair 275 
in DNA replication 177-178, 239-240 
in PCR 474 
RNA 364 
tensile strength 1046, 1057, 1063, 1070, 
1082-1083, 1084-1085F 
tension sensing 1043, 1044F, 1068F, 
1080, 1081F, 1229 
teratomas 1257 
terminally differentiated cells 400, 816, 
1012, 1021, 1148 
skeletal muscle 1233 
small intestinal villi 1218-1219 
termination (of transcription) 306 
testes 
Leydig cells 671F 
piRNA in 433 
testosterone, structure 99, 876F 
Tet repressor 495F 


tethering 
by disordered regions 126 
by dynamin 701 
Rab effectors in 706 
repair factories 213-214 
by RNA polymerase II CTDs 316 
TetR efflux pump 1293F 
tetracysteine tags 1053F 
Tetrahymena 263T, 324, 805F 
tetranucleosomes 192 
TFII see general transcription factors 
TGFβ (transforming growth factor-β) 
BMPs as family members 1168 
growth inhibition by 1012 
induced regulatory T cells 1336 
morphogen gradients 1166 
myostatin as family member 1196 
TGFβ/activin family 865 
TGFβ signaling pathway 
in cancer 1117, 1123F 
in development 1150 
TGFβ superfamily 865-866, 1168 
TGN (trans-Golgi network) 716, 722, 
727-729, 730F, 731, 736, 
749-750 
exocytosis and 741-744 
β-thalassemia 323-324 
Thale cress see Arabidopsis 
thermodynamics 
laws of 52-54 
of living cells 53-54, 102-103 
thermophilic bacteria 473F, 483, 572 
thick filaments 915-916, 918-920, 923F, 
936F 
thin filaments 916, 918-921, 922F 
third-generation sequencing methods 481 
thoracic duct, lymphatic system 1236, 
1312 
3C (chromosome conformation capture) 
method 209F, 212 
three-dimensional imaging 
electron microscopy 557-558, 560F 
light microscopy 540, 541F, 542, 
550F, 553F 
3’ end, DNA 175, 177F, 480 
threonine, structure 113 
thrombin 141, 837T, 1078 
thylakoid membranes 606, 658, 664-665, 
783 
reaction centers 783-784, 788-790, 
793, 796 
site of photosynthesis and ATP 
generation 786-787 
site of photosystems I and II 789-790 
thylakoid space 783, 784F, 791-792, 
793F, 794, 795F 
thymidine, radioactive 1219 
thymine 
from deamination of 5-methylcytosine 
405 
DNA base pairing with adenine 176, 
177F 
structure 100 
thymine dimers 268, 269F 
thymocytes 1308F, 1332-1333, 1336 
thymosin 905-907 
thymus, origins of T cells 1308, 1325, 
1336 
thyroid hormones 835T, 874, 876, 1182 
thyroxine 876F 


tight junctions 
in epithelial cells 1036F, 1037, 
1048-1049F 
permeability 1047 
polarized cells 749 
preventing lipid diffusion at 590 
structure and function 1047-1049 
tiling 1207 
TIM complex (translocator of the inner 
mitochondrial membrane) 659, 
660F, 661-662 
TIM22 659, 660F, 663F, 664 
TIM23 659, 660F, 661-662, 663F, 664 
Tim protein (timeless) 878 
time delays, kinetic proofreading 345 
time-lapse imaging 1249 
fluorescence microscopy 757F, 803F, 
875F, 935F, 991F, 1178 
timing in embryonic development 
1176-1184 
Timothy syndrome 1258 
tip cells, endothelial 1236-1237 
TIRF (total internal reflection fluorescence 
microscopy) 547-548, 553F 
tissue maintenance and repair 1154 
independent of stem cells 1226-1227 
tissue morphogenesis 1059 
tissue remodeling/regeneration 
1043-1045, 1071 
regeneration of organs 1247 
self-renewal and cancer 1120-1121 
see also wound healing 
tissue segregation and homophilic binding 
1041 
tissue specialization 
duplicated genes 229 
E. coli adhesins 1277 
tissues 
derived from ectoderm, endoderm and 
mesoderm 1147, 1167 
isolating cells from 440 
neurotrophic factor release 
1208-1209 
non-renewable 1227, 1247 
protein abundances in brain and liver 
372F 
RNAi resistant 501 
self-renewing, and cancer 1120-1121 
specimen preparation for microscopy 
535-536 
titin protein 549F, 920 
TLRs (Toll-like receptors) 873, 1165, 
1299-1300, 1304 
TMV (tobacco mosaic virus) 128-129 
TNF (tumor necrosis factor) receptor 
family 1024 
TNFα (tumor necrosis factor α) 873, 1301 
tobacco cells 
GFP-tagged organelles 544F 
vacuoles 724F 
Western blotting 455F 
tobacco smoking 1095, 1127-1128, 
1129F 
TOF (time of flight) analyzers 456F, 457 
Toll gene 1158F, 1165 
Toll-like receptors (TLRs) 873, 1165, 
1299, 1300T, 1304 
Toll receptors 873, 1158F, 1165, 1299 
TOM complex (translocator of the outer 
membrane) 659-662, 663F, 664 
tomograms 558, 559F 
topoisomerases 
recombinases resembling 293 
superhelical tension removal 315 
topoisomerase I 251-253 
topoisomerase II 195F, 252-253, 977 
topological relationships/equivalents 
644-647 
TOR (target of rapamycin) 861, 1017 
mTOR 860F, 861, 864T, 1114-1115 
torque, ATP synthase 777F 
Torso receptors 1158F 
total internal reflection fluorescence 
microscopy (TIRF) 547-548 
totipotency 
fertilized egg 1253 
totipotent cells 507, 1148, 1248 
toxins 
antibody response 1307 
of bacterial pathogens 1270, 1278 
entry points 576 
G protein targets 834 
and neurotransmitters 629 
Toxoplasma gondii 1282, 1283F 
trabecular bone 1230 
trace elements 43F 
tracheae and tracheoles 1191 
traction, in cell migration 951, 952F, 955, 
956F 
trajectories of points 519-520 
trans-acting epigenetic mechanisms 412, 
413F 
trans-acting lncRNAs 435F, 436 
trans-Golgi network (TGN) 716, 727-729, 
730F, 731, 736, 741-744, 
749-750 
trans-SNARE complexes 708 
transcellular transport 605, 1047, 1048F 
transcription 301-317 
in bacteria and eukaryotes 306 
coordination with splicing 322 
direction of 309 
“factories” 332 
histone acetylation 202 
nucleotide excision repair coupled to 
271 
process described 302-305, 315F, 
510F 
speed of 304 
as templated polymerization 4 
transcription activators (activator proteins) 
chromatin modification 386-388 
coactivators 385-386, 388-389, 392, 
394, 395F 
combinatorial controls 394-395 
fractional promoter occupation 
511-512 
fusion with bacterial Cas9 497 
in positive feedback 518 
repressors and 381-383, 394 
RNA polymerase II and 312-313 
switch from repressors 868, 871 
transcription attenuation 414 
transcription bubbles 306 
transcription circuits, as switches 
402-403 
transcription control 
in the cell-cycle control system 971 
in eukaryotes 384-392, 405 
of gene expression 372, 373F 
transcription elongation 
accessory proteins 313-314 
and riboswitches 415 
and RNA processing 315-316 
transcription factors see transcription 
regulators 
transcription initiation 


chromatin modification using 387-388 


in eukaryotes 310-313, 387 
and termination 306-309 
transcription initiation complexes 311 
transcription rate constants 513 
transcription regulators (transcription 
factors) 
ability to read DNA sequences 
374-375 
activating differentiation 1170-1171 
in bacteria 380-383 
chromatin immunoprecipitation and 
505 
in chromatin modification 196-197, 
199-201, 206 


combinatorial gene control 396, 397F, 


399, 520-521 

controlled activation 395, 396F 

cooperative binding 378-380, 517F 

dimerization 375-378 

in eukaryotes 384-392 

extracellular signaling 867-880 

Hox modulated 1163 

ligand-modulated 874-876, 877F 

maintaining embryonic stem cells 
1254-1255 

master transcription regulators 
398-399, 1171 

neurons expressing 1200F 

p53 protein as 1116 

as proteins 30 

role and mode of operation 373-380, 
389-390 

sequential expression 1179 

in specialized cells 399-400 

as switches 384-392, 402-403 

in the syncytial blastoderm 1157 


see also latent transcription regulators; 


transcription activators; 
transcription repressors 
transcription repressors (repressor 
proteins) 
activators and 381-383, 394 
binding site occupancy 514 
co-repressors 385, 386F, 392, 394, 
395F 
combinatorial controls 394-395 
DNA inversion and 294F 
DNA methylation and 405 
fusion with bacterial Cas9 497 
negative feedback 402, 515 
switch to activators 868, 871 
tryptophan repressor 376, 380-383, 
392 
transcription units 306 
transcriptional activators, regulators, 
repressors see transcription 
activators etc. 
transcriptional synergy 388-389 
transcytosis 732, 737, 738F, 749F 
transdifferentiation 1258-1259 
transducin (Gt) 845, 846T, 848 
transduction and horizontal gene transfer 
1268 
transesterifications in RNA splicing 318, 
321 


transfer RNAs (tRNAs) 
function 7, 305, 334 
initiator tRNA 347-348 
and the nucleolus 330 
ribosomal binding sites 347 
structure 335F 
unusual bases 335F, 337F 
transferrin 427, 553F 
transferrin receptor 734-735 
transformation, horizontal gene transfer 
by 1268 
transformed phenotypes 1098, 1106 
transforming growth factor-β see TGFβ 
transgenic organisms 495-497, 500, 506, 
543, 1221F 
transgenic plants 507-508 
transit amplifying cells 1121-1122, 1219, 
1220F, 1223, 1225, 1226F 
transition-metal ions 764-766 
transition states, enzyme stabilization of 
141-146 
translation 
accuracy and free energy 345-346 
co-translational protein folding 
353-354, 355F 
conserved gene families 20, 21T 
control with 3' and 5' UTRs 422-423 
control with IRES 425-426 
converting RNA into proteins 6-7, 
315F, 333-362 
initiation and termination 347-349 
messenger RNA in 343-344 
quality control 351-353 
ribosome profiling technique 505-506 
speed of 346 
stepwise 343-344F 
translation initiation mechanisms 425F, 
1288 
translation recoding 350 
translational control, gene expression 
372, 373F 
translational repressors 623 
translesion polymerases 273, 274F 
translocation steps, ribosomal 342 
translocations 487 
translocator proteins 1270-1271F 
the translocon 677 
transmembrane proteins 
adhesion proteins 1037 
association with the lipid bilayer 577F 
ATG9 as 726 
β-barrels in 579-581, 659, 662-663 
cadherins and integrins as 1037 
death receptors as 1024 
enzyme-coupled receptors as 849 
ER capture 672 
fates of endocytosed 738F 
hydrophobicity 577, 579F, 593 
integration process 678 
IRE1 as a transmembrane protein 
kinase 687 
most receptors as 816 
Notch as 720 
PERK as a transmembrane protein 
kinase 688 
single-pass and multipass 578-579 
SRP receptor as 674 
see also multipass; single-pass 
transmission electron microscopy (TEM) 
305F, 554, 555-556F, 558, 1210F 
transmitter-gated ion channels 614, 
627-632, 636 
A-to-I editing 418 
acetylcholine receptors as 630-631 
allostery in 618, 825 
ion-channel-coupled receptors 817F, 
818, 843 
types in neurons 631, 636 
transport ATPases (ATP-driven pumps) 
601-602, 606-607 
transport proteins, inner mitochondrial 
membrane 779-780 
transport vesicles 644-648, 650, 656F, 
670, 691 
coated vesicles 697 
formation 701 
scope 695 
see also vesicular transport 
transporters (protein) 599 
in epithelial cells 605 
in plants 883 
regulation of cytosolic pH 604-605 
transposable elements (transposons) 
conservative site-specific 
recombination and 293 
DNA-only transposons 288-290 
double-stranded RNA in 421 
eukaryotic genomes 28 
and genome changes 217-218, 222 
nonretroviral retrotransposons 291 
retroviral-like transposons 291 
RNA interference and 429, 432-433 
three classes 288T, 292 
virulence genes 1268 
transposition 287-292 
trastuzumab 1137 
treadmilling, actin filaments 901, 
903-904, 953-954 
tree of life see phylogenetic trees 
triacylglycerols 78, 81, 83F, 98-99 
tricarboxylic acid cycle see citric acid 
cycle 
tricellulin 1049 
trichothiodystrophy 497F 
trigger mechanism of invasion 1281, 
1282F, 1285F 
triskelions, clathrin 698, 699F 
Trithorax group proteins 1164 
Triton X-100 583, 584F 
TrkA kinase 853-854, 1209 
tropoelastin 1065 
tropomodulin 905, 909, 919, 920F 
tropomyosin 416, 592F, 905, 907-910, 
920F, 921, 922F, 923 
α-tropomyosin gene 319F 
troponins 921-923 
trypanosomes 689, 805, 1290 
Trypanosoma brucei 1266, 1290 
Trypanosoma cruzi 1283-1284, 1290 
trypsin 119, 132T, 440, 456-457, 942F 
tryptophan, structure 113 
tryptophan repressor 376, 380-383, 392 
TSH (thyroid-stimulating hormone) 835T 
tube and vesicle formation 1192 
tube worms 12F 
tubules 
branching tubular structures 1190 
vesicle budding events 705 
see also microtubules 
tubulin subunits 893-894, 904T, 927, 
928-929, 934-935 
α-tubulin 925, 926 
β-tubulin 926F, 927 
γ-tubulin 929, 930F, 982 
tubulins 459 
consequences of mutations 925 
fluorescence-tagging 544, 545F 
FtsZ as homolog 896 
immunofluorescence 539F 
in microtubules 891, 925 
separation from samples 440 
sequestration 935 
tumor biopsies, genome sequencing 
1141 
tumor cells see cancer cells 
tumor microenvironments 1100-1101, 
1137, 1138F 
tumor progression 
colorectal cancers 1122-1126 
correlation with mutation sequence 
1125-1126 
immune system and 1137, 1138F 
loss of p53 function as essential 1126 
metastatic and primary genomes 
1119 
natural selection in 1091-1092, 1096, 
1104, 1118, 1119F, 1125 
process 1095-1097, 1109, 1114F, 
1115-1116, 1119 
tumor suppressor genes 
discovery 1107-1109 
DNA sequence changes 1110F 
examples 1110F, 1115 
genetic and epigenetic inactivation 
1108-1109, 1110F 
loss-of-function mutation and 1104 
tumor viruses 1105-1106, 1129-1132 
tumors 
benign and malignant 1092, 1093F 
cell types in 1093, 1101F 
defined 1092 
growth rate 1094F 
TUNEL technique (TdT-mediated dUTP 
nick end labeling) 1024F 
turbine analogy, ATP synthase 776-778 
γ-TuRC (γ-tubulin ring complex) 929, 
930-931F, 933 
turgor pressure 
extracellular matrix and 1058 
plant cell walls and 1083, 1085 
vacuoles and 724-725 
turnover numbers 141, 142-143 
turnover rates 
connexins 1052, 1053F 
Myc regulating protein 1107 
and signal response speeds 825-826, 
827F 
Twist genes 1165, 1166F 
Twist protein 1042 
two-dimensional gel electrophoresis 
452-454 
two-photon effect 542 
Tyk2 as a JAK kinase 863, 864T 
Type III fibronectin repeats 1066F, 
1067-1068, 1073 
Type IV collagen 1058F, 1062, 1069-1073 
Type XVII collagen 1037T, 1062, 1070, 
1076F 
tyrosine 
structure 113 
sulfation 719 
thyroid hormones from 876 
see also phosphotyrosine 
tyrosine aminotransferases 371-372, 400 
tyrosine-kinase-associated receptors 862 
tyrosine kinase domains 849 
tyrosine kinases 
distinguished from serine/threonine 
kinases 819-820 
in protein kinase evolutionary tree 
155F 
regulation of Src kinase 155-156 
see also cytoplasmic; receptor 
tyrosine kinases 


U 


U1, U2, U4, U5 and U6 see snRNAs 
ubiquinone 765-766, 767F, 768-770, 
771F, 772-773 
ubiquitin 
APC/C ubiquitylation in mitosis 992 
histone side chain modifications 197F 
monoubiquitylation, multiubiquitylation 
and polyubiquitylation 735 
and the proteasome 358F, 359 
regulatory function 157-159, 165T 
structure 159F 
tagging proteins for degradation 736, 
882, 992 
see also E1; E2; SUMO 
ubiquitin ligases (E3) 158, 359-360 
APC/C as 360, 970 
Parkin 727 
SCF ubiquitin ligase 159-160, 164, 
167-168, 971, 972F, 973, 1004 
Smurf 866 
Ubx (Ultrabithorax) gene/complex 1162F, 
1164F 
UIMs (ubiquitin-interaction motifs) 853 
ultracentrifuges 
analytical 455 
preparative 445-447, 455 
ultraviolet radiation 267-268, 269F, 1094 
uncoupling protein 780 
“unfoldases” 358, 359F 
unfolded protein response 686-687 
unidirectional ratchets 162 
uniporters 602, 605 
units of length 531 
universal gene families 20, 21T 
unsaturated fatty acids 98 
unsaturated lipids 571F 
unstructured regions (proteins) see 
disordered regions 
untranslated regions, mRNA see UTRs 
uORFs (upstream open reading frames) 
424 
uracil 
from cytosine deamination 267T, 268, 
269F, 1322-1323 
in RNA 302, 303F 
structure 100 
urate oxidase in peroxisomes 666 
urea as denaturing agent 453 
urea cycle 760 
uric acid 1301 
uridine 
dihydro- 335F, 337F 
isomerization to pseudouridine 328 
4-thio- 337F 
uridine diphosphate glucose 69T 
uronic acids see galacturonic; glucuronic; 
iduronic 
uterine cervix, cancers of 1093F, 1096F, 
1129, 1131, 1265 
UTRs (untranslated regions, mRNA) 184T 
3' and 5', control of translation 
422-423 
3'-UTRs (untranslated regions, mRNA) 
421-422, 430 


V 


V gene segments 1320-1321 
v-Ras oncogene 1106 
V region, Ig light chains 1320, 1321F, 
1323F 
v-Src gene 1105 
V-type ATPases 723, 736, 743, 747F, 778 
V-type pumps 606 
vaccination 
and the adaptive immune system 
1297 
immunization as the basis 1307 
against Neisseria 1291 
vaccine production through DNA 
cloning 484 
against virus diseases 1131, 1274, 
1275F, 1291 
Vaccinia virus 939, 1278, 1287F 
vacuolar H+ ATPases 723 
vacuoles 
as lysosomes 724-725 
parasite containing 1283F, 1285, 
1286F 
products from 725 
valine, structure 113 
van der Waals attractions 
atomic force microscopy and 548 
lipid bilayers 572, 575 
as noncovalent 44, 94 
van der Waals radii 45F, 94, 110 
van Leeuwenhoek, Antonie 440 
vancomycin 1292-1293 
variable regions, Ig light and heavy chains 
1318-1319 
variegation 807 
position effect 195-197, 199-202, 
205 
variola (smallpox) virus 1265, 1273T 
vasculogenesis and angiogenesis 1236 
vasopressin 835T, 837T, 840T 
VCAMs (vascular cell adhesion molecules) 
1055 
V(D)J recombination 275F, 290, 
1320-1321, 1323, 1325, 1332 
VEGF (vascular endothelial growth factor) 
action via receptor tyrosine kinases 
850T 
angiogenesis and 1120, 1237 
interaction with matrix proteins 1073 
VegT transcription regulator 1156, 1173 
“Velcro principle” 1039, 1079 
velocity sedimentation 446, 447F 
vernalization 1183-1184 
vertebrate embryos 
inductive signaling 1166, 1167F, 1177, 
1184, 1198 
nervous systems 1041F 
spatial patterning 1167-1169 
vertebrates 
adaptive immune system 1307-1315 
body plan inversion 1169 
cyclins and Cdks 969T 
desmosomes in 1046 
Drosophila melanogaster as model 33 
evolution of 227 
extracellular matrix in 1057-1074 
gene duplication frequency 34-35 
innate and adaptive immune response 
1297 
intermediate filament types 944T 
loss of CG sequences 406-407 
mitochondrial genomes 803-804 
neural development 1199-1200 
organ regeneration in 1248-1249 
times of divergence 37F 
vertical gene transfer 1268 
vesicle coat proteins 650 
vesicle transport model 720-721 
vesicles 9, 585, 594 
budding and fusion 647, 648 
caveolae as pinocytic vesicles 
731-732 
formation of tubes and 1192 
intralumenal vesicles 724F, 729-730, 
735-737, 738F 
synaptic vesicles 628, 744-746, 747F 
vesicles, endocytic see endocytic vesicles 
vesicles, secretory see secretory vesicles 
vesicular traffic, disruption by pathogens 
1284-1286 
vesicular transport 
bidirectionality 730-731 
Rab family GTPases in 705 
retrieval pathways 695-696, 713-714, 
726 
sorting signals and 646 
steps in 696F, 698F 
see also transport vesicles 
vesicular tubular clusters 712-713 
Viagra® 847 
Vibrio cholerae 13F, 1265, 1266F, 
1269-1270 
vimentin-like filaments 944T, 946, 948, 
959 
vinblastine 993 
vinculin 1037T, 1042F, 1043, 1044F, 
1075F, 1078F, 1079-1081 
viral capsids 128, 129F, 290, 562F, 1266F, 
1274, 1280F 
nucleocapsids 1274, 1275F, 1286, 
1287F, 1288 
viral carcinogens 1130-1132 
viral envelopes 1275F, 1280-1281, 1287F 
virions 1274, 1281, 1287F, 1288 
viroids 363F 
virulence genes/virulence factors 609, 
610F, 1268-1270 
virulence plasmids 1268 
virus receptors 1279 
virus types and strains 
adenovirus 1273T, 1274F, 1280F, 
1281, 1289 
baculoviruses 1029, 1287-1288 
DNA viruses 1130, 1273, 1274F, 1278, 
1289 
enveloped viruses 1274, 1275F, 1280, 
1286, 1287F, 1288 
hepatitis-B and -C 1129-1130, 1132 
herpes simplex virus 1273T, 1279 
measles 1273T, 1274, 1289 
nonenveloped viruses 1274, 
1280-1281 
not using double-stranded DNA 268 
pandoraviruses 1274 
papillomaviruses 732, 1129, 1130T, 
1131-1132 
parvoviruses 1274 
poxviruses 1274, 1280, 1289 
rabies 1273T, 1274F, 1279 
RNA viruses 1273, 1274F, 1278, 1289 
SV40 simian virus 128F, 732, 1279 
see also bacteriophages; HIV; 
retroviruses; tumor viruses 
viruses (generally) 
bacterial defenses 433-434 
budding 738F, 1274, 1275F, 1286, 
1287F 
causing human diseases 1273T 
class I MHC suppression 1304-1305 
cross-presentation and 1329 
detecting in samples 475F 
entry points and mechanisms 576, 
732, 1280-1281 
and gene transfer 18 
host-cell responses 1303-1305 
infection and membrane fusion 709 
introducing altered genes 495, 
1105-1106 
latent phases 1131 
and microtubular transport 939 
morphology 1274F 
negative staining 560 
receptor-mediated endocytosis 1281 
replication errors and evolution 1291 
RNA interference and 431 
self-assembly 128-129 
transdifferentiation vectors 1258 
use of conservative site-specific 
recombination 293 
use of host cell machinery 
1273-1275, 1288-1289 
use of host cytoskeleton 1286-1288 
use of IRES 425F, 426 
viewed as mobile genetic elements 
290 
vision 825, 843-846, 1204-1206 
visual system electrical activity 
1211-1212 
vitamins 
as coenzymes 146-148 
vitamin A 876 
vitamin D 874-876 
vitamin D3 876F 
see also biotin 
vitreous ice 556, 560 
voltage-gated channels 614, 621-622, 
623F, 625F, 626-630, 632-635 
voltage-gated Ca2+ channels 626, 
628, 632-633, 635, 745 
voltage-gated cation channels 621, 
626-627, 629-630 
voltage-gated K+ channels 622, 625F, 
627, 634 
voltage-gated Na+ channels 621-622, 
623F, 626-627, 633-634 
voltage sensors 622 
VSG (variant-specific glycoprotein) 1290 


W 


Warburg effect 1098-1099, 1115 
WASp proteins 957, 958F, 1278F, 1281, 
1288, 1289F 
water 
as an electron donor 762, 788, 790 
cage structures 568, 569F 
content of cells 535 
excluding from binding sites 136 
formation of acids and bases in 45-46 
hydrogen bonding in 44, 92, 94, 136 
hydrophilic and hydrophobic 
molecules 569F 
proton behavior in 45-46, 613, 773 
as a solvent 93 
structure 92 
vitreous ice 556, 560 
see also hydrophilicity; hydrophobicity 
water channels see aquaporins 
water-soluble proteins, ER capture 672 
wavelength 
of electrons 554 
excitation and emission, of fluorescent 
dyes 537F 
fluorescent protein excitation 545 
multiphoton microscopes 542 
resolution limits and 529-530, 533F, 
554 
Wee kinase 970, 973T, 979 
Western blotting 455 
wheat 559F, 783F 
white blood cells 
chemokine attraction of 1241 
inflammatory response and 1240 
leukemias as cancers of 1092 
major categories of 1239 
matrix degradation 1072 
selectin and integrin action 1055, 
1077 
use of lymphatic system 1236 
White gene, Drosophila 195F 
whole-genome analysis see genome 
sequencing 
whole-genome duplications 35, 228 
whooping cough (pertussis) 834, 1277 
wing hairs 1189 
Wingless gene 536F, 869, 1161 
Wingless protein 1160 
Wiskott-Aldrich Syndrome 967 
Wnt proteins 868-871, 1199, 1200F 
Wnt11 1156-1157 
see also Wingless 
Wnt signaling pathways 1124, 1157, 
1160, 1178, 1190 
canonical Wnt pathway 869-870 
planar-cell-polarity signaling pathway 
869, 1189F 
stem cell maintenance 1220-1221, 
1224 
Wnt/β-catenin pathway 869-870, 
1117 
wobble base-pairing 335, 336-337F, 342 
in mitochondrial genomes 804 
worms see Ascaris; Caenorhabditis 
wound healing 504, 953, 1011, 1059 
VEGF and 1237 
see also tissue remodeling/ 
regeneration 
wound repair 
fibroblasts in 1228 
plasma membrane enlargement 748 
wounds, and infection 1276 
writer proteins 199-200, 406F 


X 


X-chromosome sample sequence 300F 
X-inactivation 410-411, 412F, 1252 
x-ray crystallography/x-ray diffraction 
of ADP/ATP carrier protein 780F 
amyloid fibrils 131F 
of an SH2 domain 853F 
of ATP synthase 777F 
of bacterial K+ channel 617 
of bacteriorhodopsin 587 
of Bcr-Abl-imatinib 1136F 
of Ca2+/calmodulin/CaM-kinase II 
841-842F 
coiled coils 117 
of a connexin 1052F 
of a cyclin/Cdk complex 970F 
of DNA and RNA polymerases 304 
electron microscopy compared 562 
of Ig domains 1319F 
of integrins 1077 
of lysozyme 144 
of nucleosomes 189F, 192 
of protein conformations 120F, 121, 
144 
protein structure from 460, 461F 
of respiratory chain complexes 768 
of ribosomes 346F, 460 
ribulose bisphosphate carboxylase 
461F 
of transcription regulators 375 
of transfer RNA 335F 
of transmembrane proteins 579-581 
viral capsids 128-129F 
x-ray exposure see radiation 
Xenopus microtubule-associated protein 
(XMAP215) 933-935 
Xenopus ruwenzoriensis 35 
Xenopus tropicalis 35 
Xenopus (X. laevis) 
cell cycle model 966 
embryo cortical rotation 1156, 1167 
embryo patterning 1167 
embryo reprogramming experiments 
205, 1252 
gene duplication among 35 
globin gene family in 230 
as laboratory species 35 
nuclear organelles 213F 
oocyte asymmetries 1156 
oocyte extracts 448, 545 
oocyte nuclear lamina 656F 
rRNA genes 328 
xeroderma pigmentosum (XP) 266T, 267, 
1094 
XIAP protein 1029 
XIC (X-inactivation center) 411, 412F 
Xist lncRNA 411, 412F, 435 
XMAP215 (Xenopus microtubule- 
associated protein) 933-935 


Y 


Yap/Yorkie gene 1197F 
yeasts 
barcoded mutants 499F 
characteristic transposon types 292 
control of cell cycle in 186, 1018F 
electron microscope image 556F 
endoplasmic reticulum 671F 
MAP kinases 857 
mitochondrion 451F 
mold transitions 1271 
and prion diseases 131 
protein interaction mapping 166-167, 
168F 
protein purification 451 
as single-celled eukaryotes 30 
yeasts, budding 925, 949, 950F, 966F 
cell cycle proteins 969T, 971 
cyclins and Cdks 969T 
see also Saccharomyces 
yeasts, fission see Schizosaccharomyces 
Yersinia pestis 1276, 1281 
Yersinia pseudotuberculosis 1281 
Z 
Z discs 918-920, 948 
Z-ring 896 
Z schemes 789-790, 792-793 
zebrafish (Danio rerio) 
convergent extension 1189F 
genome size 35 
germ cell migration 1185F 
Her gene expression 1179F 
lakritz mutant 488 
maternal-zygotic transition 1181 
as model organism 29T, 33, 35, 36F 
neural map 1204F 
whole-genome duplications 228 
Zellweger syndrome 668 
zigzag model, chromatin 192 
zinc finger proteins 377 
Zinnia elegans 783F 
zipper mechanism of invasion 1281, 
1282F, 1284 
ZO proteins (zonula occludens) 1049 
zonula adherens (adhesion belt) 1036F, 
1044, 1045F 
zonula occludens/ZO proteins 1049 
zygotene 1006, 1007F 
zygotes 
as beginning of development 1145, 
1147 
as pluripotent or totipotent 1253 
zygotic-effect genes 1159, 1165, 1181F, 
1182 
zymogens 736 